automod: test capture framework #470
fwiw about packages and cyclic imports: since that issue comes up a lot as code grows, and moving files is such a bummer for git history, I have kind of a semi-standard convention for preemptively avoiding that:
There are lots of ways to lay out packages, but I like the above because people usually start looking for docs and examples at the shorter paths into the package tree. Putting all the most essential types in the rootwards packages is a natural thing to do... but almost invariably creates frustration in the long run. Having the highest-level usages and the coremost type defns in the same package is pretty much guaranteed to result in cycle issues when attempting to extract things. Aliasing (especially nowadays that we have type aliasing) also makes it pretty easy to have the rootmost package expose "everything" a downstream consumer needs, without necessarily exposing them to the internal package graph details. (So for example, (Maybe this is all familiar old hat to you, but, 2c :))
package structure: makes sense! I'd be open to refactoring automod to fit that pattern. I think we should wait until a moment when there are no other PRs in flight, though.
Manually resolved conflicts: automod/engine_test.go automod/rules/fixture_test.go
Merged |
LGTM, including all the caveats you mentioned. One question about the schema.
@@ -198,12 +200,42 @@ var runCmd = &cli.Command{
	},
}

// for simple commands, not long-running daemons
func configEphemeralServer(cctx *cli.Context) (*Server, error) {
👍
// Test helper which processes all the records from a capture. Intentionally exported, for use in other packages.
//
// This method replaces any pre-existing directory on the engine with a mock directory.
func ProcessCaptureRules(e *Engine, capture AccountCapture) error {
Definitely feeling the lack of packages here, or other strong organizational cues for what's testing-land and what's not. Understood that that's a future PR topic, but just want to ratify that out loud :)
The comment being explicit that this function is rewiring the engine is definitely very very good and appreciated 👍, because I wouldn't necessarily presume that from the signature or name otherwise.
type AccountCapture struct {
	CapturedAt  syntax.Datetime `json:"capturedAt"`
	AccountMeta AccountMeta     `json:"accountMeta"`
Okay, main question: if there will be rules that want to look at multiple identities, should we brace for that now by making this a list or a map? Or is that overkill?
That is a reasonable question, but I think it is less ergonomic and maybe overkill. For simple rules it is nice to have a direct struct field to get access to the account meta for the owner/creator/author of the repo/record, which is by far the common access case.
For fetching more account meta, I think the pattern we should use is tiered caching. The rule code should just call evt.GetAccountMeta(did-or-handle), and the engine should read through whatever layers of cache to get that info. That caching may even include per-event caching (not implemented currently!) which would 1) be fast and 2) ensure consistent behavior between rules executed on the same event (aka, the whole event should see the same account meta for all accounts).
This follows the Facebook FXL functional pattern of basically memoizing all function calls within the scope of an event.
(side note: still mulling a better name than "event")
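To make the per-event caching idea concrete, here is a minimal sketch of a read-through memo scoped to one event. Only the `GetAccountMeta` name comes from the comment above; its signature, the `Directory` interface, `countingDirectory`, and the trimmed-down `AccountMeta` are assumptions for illustration, not the actual automod types.

```go
package main

import "fmt"

// AccountMeta is a trimmed-down stand-in for the real struct (assumption).
type AccountMeta struct {
	DID    string
	Handle string
}

// Directory is a hypothetical slower lookup layer (shared cache, network, etc).
type Directory interface {
	Lookup(id string) (*AccountMeta, error)
}

// countingDirectory is a fake directory that counts how often it is hit.
type countingDirectory struct {
	calls int
}

func (d *countingDirectory) Lookup(id string) (*AccountMeta, error) {
	d.calls++
	return &AccountMeta{DID: "did:example:" + id, Handle: id}, nil
}

// Event carries a memo table that lives only as long as the event,
// so every rule run against this event sees the same account meta.
type Event struct {
	dir  Directory
	memo map[string]*AccountMeta
}

func NewEvent(dir Directory) *Event {
	return &Event{dir: dir, memo: make(map[string]*AccountMeta)}
}

// GetAccountMeta checks the per-event memo first, then reads through
// to the directory and remembers the result for later calls.
func (e *Event) GetAccountMeta(id string) (*AccountMeta, error) {
	if am, ok := e.memo[id]; ok {
		return am, nil
	}
	am, err := e.dir.Lookup(id)
	if err != nil {
		return nil, err
	}
	e.memo[id] = am
	return am, nil
}

func main() {
	dir := &countingDirectory{}
	evt := NewEvent(dir)
	a, _ := evt.GetAccountMeta("atproto.com")
	b, _ := evt.GetAccountMeta("atproto.com")
	// Both calls return the same pointer, and the directory is hit once.
	fmt.Println(a == b, dir.calls)
}
```

Because the memo hangs off the event rather than the engine, it is dropped when the event is done, which is what gives consistency within an event without pinning stale data across events.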
💭 I guess we could have an internal accountMetaMap and have a helper method (CurrentAccount()) which would access the current account?
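A rough sketch of what that suggestion might look like, keeping the direct-access ergonomics for the common case. `accountMetaMap` and `CurrentAccount()` are the names floated above; `RecordEvent`, `NewRecordEvent`, `AddAccount`, `Account`, and the `AccountMeta` fields are hypothetical.

```go
package main

import "fmt"

// AccountMeta is a trimmed-down stand-in for the real struct (assumption).
type AccountMeta struct {
	DID    string
	Handle string
}

// RecordEvent is a hypothetical event type holding several accounts,
// keyed by DID, plus a marker for which one authored the record.
type RecordEvent struct {
	currentDID     string
	accountMetaMap map[string]AccountMeta
}

func NewRecordEvent(current AccountMeta) *RecordEvent {
	return &RecordEvent{
		currentDID:     current.DID,
		accountMetaMap: map[string]AccountMeta{current.DID: current},
	}
}

// CurrentAccount keeps the common case a one-liner for rule authors.
func (e *RecordEvent) CurrentAccount() AccountMeta {
	return e.accountMetaMap[e.currentDID]
}

// AddAccount registers meta for another account involved in the event.
func (e *RecordEvent) AddAccount(am AccountMeta) {
	e.accountMetaMap[am.DID] = am
}

// Account looks up any known account; ok is false if it was never added.
func (e *RecordEvent) Account(did string) (AccountMeta, bool) {
	am, ok := e.accountMetaMap[did]
	return am, ok
}

func main() {
	evt := NewRecordEvent(AccountMeta{DID: "did:example:alice", Handle: "alice.example.com"})
	evt.AddAccount(AccountMeta{DID: "did:example:bob", Handle: "bob.example.com"})
	fmt.Println(evt.CurrentAccount().Handle)
	_, ok := evt.Account("did:example:carol")
	fmt.Println(ok)
}
```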
Manually resolved conflicts: automod/event.go
This PR is currently rebased on top of #466, to demonstrate testing that rule. UPDATE: that PR merged, so now against main
Adds a hepa command to "capture" the current state of a real-world account: currently some account metadata (identity, profile, etc), plus some recent post records. This gets serialized to JSON for easy dumping to file, like:
go run ./cmd/hepa/ capture-recent atproto.com > automod/testdata/capture_atprotocom.json
Then, a test helper function which loads this file, and processes all the post records using an engine fixture.
Combined, these fixtures make it easy to do test-driven-development of new rules. You find an account which recently sent spam or violated some policy, take a capture snapshot, set up a test case, and then write a rule which triggers and satisfies the test.
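The capture-then-test workflow described above could be sketched roughly as follows. This is not the PR's actual helper: the `postRecords` field, the `PostRule` signature, `LoadCapture`, `ProcessCapture`, `spamLinkRule`, and the sample JSON are all made up for illustration, and `CapturedAt` is a plain string here rather than `syntax.Datetime` to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Simplified stand-ins for the capture types (field names are assumptions).
type AccountMeta struct {
	Handle string `json:"handle"`
}

type PostRecord struct {
	Text string `json:"text"`
}

type AccountCapture struct {
	CapturedAt  string       `json:"capturedAt"`
	AccountMeta AccountMeta  `json:"accountMeta"`
	PostRecords []PostRecord `json:"postRecords"`
}

// PostRule is a hypothetical rule shape: it returns a flag name and
// whether the rule triggered for this post.
type PostRule func(am AccountMeta, post PostRecord) (flag string, hit bool)

// LoadCapture parses a JSON capture, e.g. the bytes of a testdata file.
func LoadCapture(raw []byte) (*AccountCapture, error) {
	var c AccountCapture
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, fmt.Errorf("parsing capture: %w", err)
	}
	return &c, nil
}

// ProcessCapture runs every rule over every captured post and collects flags.
func ProcessCapture(c *AccountCapture, rules []PostRule) []string {
	var flags []string
	for _, post := range c.PostRecords {
		for _, rule := range rules {
			if flag, hit := rule(c.AccountMeta, post); hit {
				flags = append(flags, flag)
			}
		}
	}
	return flags
}

// spamLinkRule is the "new rule" being developed against the capture.
func spamLinkRule(_ AccountMeta, post PostRecord) (string, bool) {
	return "spam-link", strings.Contains(post.Text, "free-coins.example")
}

// sampleCapture is made-up data standing in for a real capture file.
const sampleCapture = `{
  "capturedAt": "2023-12-01T00:00:00Z",
  "accountMeta": {"handle": "spammer.example.com"},
  "postRecords": [
    {"text": "hello world"},
    {"text": "claim now at free-coins.example/win"}
  ]
}`

func main() {
	c, err := LoadCapture([]byte(sampleCapture))
	if err != nil {
		panic(err)
	}
	fmt.Println(ProcessCapture(c, []PostRule{spamLinkRule}))
}
```

In a real test, the bytes would come from a file under `automod/testdata/` and the assertion would check that the expected flag was (or was not) raised for the captured account.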
Some notes:
indigo/automod/automodtest) but hit a circular import, so left where it is