Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

automod: extracting some packages, and concurrency safety #464

Merged
merged 8 commits into from
Dec 10, 2023

Conversation

warpfork
Copy link
Contributor

@warpfork warpfork commented Dec 6, 2023

A few minor hopefully-advancements to the automod/hepa packages:

  • countstore is now extracted as a package.
  • Synchronization primitives are added to MemCountStore. This fixes crashes when running hepa without the use of Redis.
  • Additional testing, to make sure that MemCountStore gets enough exercise that the race detector has something to work on in tests.
  • Attempted to sneak in some docs along the way!

The diff looks a little larger than it is, because of git and file moves... but individual commits are broken down as much as possible, so those should also be fine to review in isolation if preferred.

Also separate the mem implementation and the interface into their own
files.

No other changes in this commit but moves, to keep history as legible
as possible.

This is just a feature grouping package, to reduce how much is in the
main automod package.  Implementations are not separted into further
subpackages because there's no visible need to do that as yet (e.g.
we're not particular concerned with minimizing dep graphs, etc).

Some other parts of automod might deserve similar extraction (the
cachestore is a likely example) but that can be pursued separately.
Yet others (namely, the redis identity directory) might deserve
extraction, but that one might already be general well beyond
automod, perhaps deserving to live out in the atproto packages,
so that will definitely deserve its own full review.
… interfaces and shared logic.

Also, no clear reason for it to be exported, so it no longer is.
…aces.

The existing tests were linear enough that they didn't provoke race
conditions; the new test does just enough concurrency to give the race
detector a chance to shine.  It also tests final count accuracy.

This fixes a real issue: the hepa command would crash periodically if
run without redis, due to the optimistic race checker that golang now
uses by default even in production builds.  (Concurrency is introduced
in the hepa command by the `events.HandleRepoStream` system and the
autoscaling scheduler it's given.)

Why github.com/puzpuzpuz/xsync/v3 instead of stdlib sync.Map?
Primarily, this package is generic, and the stdlib (oddly) still isn't.
It also seems to have impressive benchmarks.  And ultimately, it passes
the functional tests here.  (Do note the "v3".  If you try to use this
module without the major version suffix, you'll get "v1.5.2", which is
nonviable -- it had not yet introduced the compare-and-swap primitive!)
@warpfork warpfork requested a review from bnewbold December 6, 2023 19:23
Copy link
Collaborator

@bnewbold bnewbold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

Left some comments, but nothing merge-blocking from my perspective. If you can add a note about memory usage for MemCountStore that would be great, but I can also add that just before merging.

As a process note, we (bluesky) often leave "approve" review status on PRs if there is only a small amount of cleanup to finish, indicating that anybody could merge if urgent, or that the original author can self-merge after small changes. We tend to keep "request changes" to indicate a blocking/veto issue ("revise and resubmit for review"), and comment for drive-by comments (if review wasn't requested).

automod/countstore/countstore_mem.go Show resolved Hide resolved
func (s MemCountStore) IncrementDistinct(ctx context.Context, name, bucket, val string) error {
for _, p := range []string{PeriodTotal, PeriodDay, PeriodHour} {
k := periodBucket(name, bucket, p)
s.DistinctCounts.Compute(k,func(nested *xsync.MapOf[string, bool], _ bool) (*xsync.MapOf[string, bool], bool) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the nested map need to be a sync primitive as well? my intuition is that the sync control of the outer map should protect the inner map.

this isn't a review-blocking issue, just curious

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...Okay, that's a good question. I've been nerdsnipped and I'm actually still a bit unsure.

My theoretical understanding: I don't think it would be safe. (And not just like "aw that's a minor inaccuracy", like, I think the race detector should flag it.)

  • I read the source of this syncx package:
    • This map does use some mutexing during writes.
    • And that mutexing does cover the compute function callback body.
    • It does not use mutexting during reads. (Reads are done purely with atomic load operations.)
    • (And obviously, once you've done a read, you've... got the value; even if there was mutexing during the read (and there's not), it couldn't still be in effect after you hold the value.)
  • So do we do anything on the value after a read? Yes, we do have a read where we extract the inner map... and then call .Size() on it (or would call len() if it was a native map).
  • Is that operation safe?
    • xsync's .Size(): yes, defintely. It's another atomic read.
    • native len(): No. len() of native maps is considered a read for race detector purposes. Performing one outside of a mutex (even if all writes are mutexed) will trigger the race detector. (I even wrote a little side test to be sure -- separate from our nested maps situation here. Yep, easy to trigger.)

Practically verifying the whole thing: I wrote a quick variant with a plain native map on the inner, and... I'm unable to get the race detector to flag it, nor get the test to fail.

🤯

Overall: I'm suspicious my testing is somehow not aggressive enough, rather than confident that a plain map on the inner is fully safe.

automod/countstore/countstore_test.go Show resolved Hide resolved
automod/countstore/countstore.go Outdated Show resolved Hide resolved
@warpfork warpfork merged commit c54aff4 into main Dec 10, 2023
7 checks passed
@bnewbold bnewbold deleted the warpfork/automodularizing branch December 11, 2023 08:05
bnewbold added a commit that referenced this pull request Dec 14, 2023
Two enabling features:

- cheap consistent non-cryptographic-strength hashing of strings for use
in counter keys (went with uint64 murmur3, which was already in
dependency tree)
- ability to increment a counter for a single time period, to control
counter key space growth (for redis)

This initial version of the rule counts replies to any other user in the
same bucket, not distinct-accounts-with-same-reply-text. I'm a little
worried about redis memory growth if we have a HyperLogLog for each
author+text combination (as opposed to simple counter int). Maybe the
redis implementation is clever and efficient for the small-distinct
case? Or maybe RAM is cheap enough?

This branch will conflict with
#464. Plan to merge that
one first, then i'll rebase this one.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants