automod: identical reply rule #466
Changes from 3 commits
1f49fc7
7489349
8d6486d
8df80a3
002a887
b5b3175
32c2df3
c10dfce
@@ -27,3 +27,24 @@ func ReplyCountPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error
	evt.IncrementDistinct("reply-to", did, parentURI.Authority().String())
	return nil
}

var identicalReplyLimit = 5

// Looks for accounts posting the exact same text multiple times. Does not currently count the number of distinct accounts replied to, just counts replies at all.
//
// There can be legitimate situations that trigger this rule, so in most situations should be a "report" not "label" action.
func IdenticalReplyPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error {
	if post.Reply == nil || IsSelfThread(evt, post) {
		return nil
	}

	// use a specific period (IncrementPeriod()) to reduce the number of counters (one per unique post text)
	period := automod.PeriodDay
	bucket := evt.Account.Identity.DID.String() + "/" + HashOfString(post.Text)
	if evt.GetCount("reply-text", bucket, period) >= identicalReplyLimit {
		evt.AddAccountFlag("multi-identical-reply")
Not new to this PR, but I think some docs on the semantic distinctions between
mmm! added doc comments to all the fields for most of the RepoEvent variants
	}

	evt.IncrementPeriod("reply-text", bucket, period)
	return nil
}
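The counting logic in the rule above can be tried in isolation with a minimal sketch. It assumes an in-memory map in place of the automod counter store, ignores the per-day period, and uses FNV-1a from the standard library as a stand-in for murmur3. `replyCounter`, `observe`, and `hashOfString` are hypothetical names for illustration, not part of the automod package.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// identicalReplyLimit mirrors the threshold used in the diff.
const identicalReplyLimit = 5

// hashOfString is a hypothetical stand-in for HashOfString. It uses FNV-1a
// (standard library) rather than murmur3.
func hashOfString(s string) string {
	h := fnv.New64a()
	h.Write([]byte(s))
	return fmt.Sprintf("%016x", h.Sum64())
}

// replyCounter is a hypothetical in-memory replacement for the counter
// store. It does not expire counts after a day.
type replyCounter struct {
	counts map[string]int
}

func newReplyCounter() *replyCounter {
	return &replyCounter{counts: make(map[string]int)}
}

// observe checks the existing count first and increments afterwards, the
// same order as the rule. It reports whether the account would be flagged.
func (c *replyCounter) observe(did, text string) bool {
	bucket := did + "/" + hashOfString(text)
	flagged := c.counts[bucket] >= identicalReplyLimit
	c.counts[bucket]++
	return flagged
}

func main() {
	c := newReplyCounter()
	for i := 1; i <= 7; i++ {
		fmt.Println(i, c.observe("did:example:alice", "check out my site"))
	}
}
```

Because the count is read before it is incremented, this sketch first flags on the sixth identical reply, not the fifth. Whether the real rule behaves the same way depends on when `GetCount` sees earlier increments.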
Right now, this is proceeding to act immediately on a hash collision. Do you think it would be reasonable to do a more expensive check to see if things are actually identical?
I don't know how bad a false positive is here. Maybe if the count threshold for identical replies is moderate, the rule is unlikely to trigger on reasonable real human behavior?
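One way to picture the "more expensive check" is a counter that remembers the first text seen for each bucket and only counts a reply when the text matches exactly. This is a sketch under assumptions: it keeps everything in memory, and `exactCounter`, `entry`, and `observe` are hypothetical names. The hash function is injectable so a collision can be forced for demonstration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnv64 is a stand-in hash from the standard library (not murmur3).
func fnv64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// entry stores the first text seen for a bucket plus its count.
type entry struct {
	text  string
	count int
}

// exactCounter is a hypothetical counter that confirms exact text equality
// before counting a hash match.
type exactCounter struct {
	hash func(string) uint64
	seen map[string]entry
}

func newExactCounter(hash func(string) uint64) *exactCounter {
	return &exactCounter{hash: hash, seen: make(map[string]entry)}
}

// observe returns the count for the bucket and whether the reply collided
// with a different text (in which case it is not counted).
func (c *exactCounter) observe(did, text string) (int, bool) {
	key := fmt.Sprintf("%s/%016x", did, c.hash(text))
	e, ok := c.seen[key]
	if ok && e.text != text {
		return e.count, true
	}
	e.text = text
	e.count++
	c.seen[key] = e
	return e.count, false
}

func main() {
	// A constant hash forces every text into the same bucket.
	bad := func(string) uint64 { return 0 }
	c := newExactCounter(bad)
	fmt.Println(c.observe("did:example:alice", "hello"))
	fmt.Println(c.observe("did:example:alice", "goodbye"))

	// With a real hash, distinct texts land in distinct buckets.
	ok := newExactCounter(fnv64)
	fmt.Println(ok.observe("did:example:alice", "hello"))
}
```

The trade-off is storage: keeping the text (or fetching it again) costs more than a bare counter per hash.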
I think the naive false-positive rate (for the 64-bit variant of murmur3) is low enough that we don't need to worry about it or make secondary network requests to check for exact matches.
This isn't a cryptographic hash, so deliberately engineered collisions could be a concern for some rules.
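For a rough sense of the naive rate, the birthday approximation gives the chance that at least two of n distinct texts share a hash value. The sketch below assumes uniformly distributed hash outputs and no adversary, so it says nothing about engineered collisions. `collisionProbability` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the probability that at least two of n
// distinct inputs share a hash of the given bit width, using the birthday
// approximation 1 - exp(-n(n-1) / 2^(bits+1)).
func collisionProbability(n float64, bits int) float64 {
	exponent := n * (n - 1) / math.Exp2(float64(bits+1))
	return -math.Expm1(-exponent)
}

func main() {
	// n is the number of distinct texts that share one counter namespace.
	for _, n := range []float64{1e3, 1e6, 1e9} {
		fmt.Printf("n=%.0e p=%.3e\n", n, collisionProbability(n, 64))
	}
}
```

Since the bucket in the rule is prefixed with the account DID, the relevant n is the number of distinct reply texts a single account posts within the counting period, which is typically small.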
Generally, I feel like all the counters we are using here should be treated as a bit fuzzy, at least for record-level counts. It is totally possible for events to get partially-processed (and partially persisted) and then re-processed again after a crash. I think the semantics and kinds of rules and actions we write are generally resilient to this: doing things like reporting for human review, or having large margins before taking fully-automated action.
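The "large margins" idea can be sketched as tiered thresholds, where a count that is off by one or two after a re-processed event can only change whether a report for human review is filed, not whether automated action is taken. The label threshold here is a hypothetical value chosen for illustration; only the report threshold comes from the diff.

```go
package main

import "fmt"

const (
	reportThreshold = 5  // matches identicalReplyLimit in the diff
	labelThreshold  = 50 // hypothetical: a wide margin before automated action
)

// actionFor maps a possibly inexact count to an action tier.
func actionFor(count int) string {
	switch {
	case count >= labelThreshold:
		return "label"
	case count >= reportThreshold:
		return "report"
	default:
		return "none"
	}
}

func main() {
	// Simulate a count that was inflated by one re-processed event.
	trueCount, replayed := 5, 6
	fmt.Println(actionFor(trueCount), actionFor(replayed))
}
```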