automod: mentions spike rule #482

warpfork · 2023-12-15T13:44:18Z

Introduce a simple new rule for detecting and flagging spikes of mentions within a period.

warpfork · 2023-12-15T13:49:35Z

automod/rules/spammentions.go

+	if !newMentions {
+		return nil
+	}
+	if mentionHourlyThreshold <= evt.GetCountDistinct("mentions", did, automod.PeriodHour) {


Interestingly, I think it's basically impossible for this to trigger on a single message, due to the order of events: counter increments get buffered until persist time, so if we read the counts for the same key during the same rule function, we get the data from before this event was processed.

That's maybe fine, but it seems like we should spawn a TODO to write this up in a doc block somewhere, as something to expect as a rule author?

Yup! And in particular with limits, most rules will trigger on "limit plus one", not on the limit number itself. Agree this should be included in a rule-writing guide.
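To make the "limit plus one" effect concrete, here is a minimal sketch of the ordering described above. The `counterStore` and `event` types are hypothetical stand-ins, not the automod engine's real types, and plain counts are used instead of distinct counts to keep it short. The only behaviour modelled is that increments are buffered and flushed after the rule has run.

```go
package main

import "fmt"

// counterStore is a hypothetical stand-in for the persisted counter backend.
type counterStore struct {
	persisted map[string]int
}

// event buffers increments until Persist is called (assumed behaviour,
// modelled on the description in this thread).
type event struct {
	store   *counterStore
	pending map[string]int
}

func newEvent(s *counterStore) *event {
	return &event{store: s, pending: map[string]int{}}
}

// Increment only records the change; reads in the same event do not see it.
func (e *event) Increment(key string) { e.pending[key]++ }

// GetCount reads persisted state only.
func (e *event) GetCount(key string) int { return e.store.persisted[key] }

// Persist flushes the buffered increments after the rule has run.
func (e *event) Persist() {
	for k, n := range e.pending {
		e.store.persisted[k] += n
	}
}

// firstFlaggedEvent returns the 1-based index of the first event on which an
// "increment, then check threshold <= count" rule fires, or 0 if it never does.
func firstFlaggedEvent(threshold, maxEvents int) int {
	store := &counterStore{persisted: map[string]int{}}
	for i := 1; i <= maxEvents; i++ {
		evt := newEvent(store)
		evt.Increment("mentions")
		flagged := threshold <= evt.GetCount("mentions")
		evt.Persist()
		if flagged {
			return i
		}
	}
	return 0
}

func main() {
	fmt.Println(firstFlaggedEvent(20, 100)) // prints 21
}
```

With a threshold of 20, the 20th event still reads a persisted count of 19, so the rule first fires on the 21st event.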

automod/rules/spammentions.go

bnewbold

This looks great! It is basically how I would write it.

Before I'd deploy this, I'd probably want to test it a bit, either with a capture of real-world behavior, or by reducing the threshold and testing against local or staging.

bnewbold · 2023-12-15T14:02:43Z

automod/rules/spammentions.go

+	if !newMentions {
+		return nil
+	}


this skip-a-network-fetch-if-nothing-new is an interesting little idiom. I think my rules are mostly the other way around (read first, increment later), but this is potentially more efficient.
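A rough sketch of the idiom, assuming a hypothetical `fakeCounters` type in place of the real counter store. It counts reads so the saving is visible. Unlike the real engine, increments here are visible immediately; only the early return is being illustrated.

```go
package main

import "fmt"

// fakeCounters is a hypothetical in-memory stand-in for a remote counter store.
type fakeCounters struct {
	distinct map[string]map[string]struct{}
	reads    int // number of GetCountDistinct calls, i.e. simulated fetches
}

func newFakeCounters() *fakeCounters {
	return &fakeCounters{distinct: map[string]map[string]struct{}{}}
}

// IncrementDistinct records val in the set for (name, key).
func (c *fakeCounters) IncrementDistinct(name, key, val string) {
	k := name + "/" + key
	if c.distinct[k] == nil {
		c.distinct[k] = map[string]struct{}{}
	}
	c.distinct[k][val] = struct{}{}
}

// GetCountDistinct returns the set size and counts the read.
func (c *fakeCounters) GetCountDistinct(name, key string) int {
	c.reads++
	return len(c.distinct[name+"/"+key])
}

// mentionRule increments first and only reads the counter back when the post
// actually contained a mention.
func mentionRule(c *fakeCounters, did string, mentions []string, threshold int) bool {
	newMentions := false
	for _, m := range mentions {
		c.IncrementDistinct("mentions", did, m)
		newMentions = true
	}
	if !newMentions {
		return false // nothing changed, so skip the read entirely
	}
	return threshold <= c.GetCountDistinct("mentions", did)
}

func main() {
	c := newFakeCounters()
	fmt.Println(mentionRule(c, "did:example:alice", nil, 2), c.reads)
	fmt.Println(mentionRule(c, "did:example:alice", []string{"did:example:bob", "did:example:carol"}, 2), c.reads)
}
```

Posts without mentions never touch the read path, which is where the potential efficiency gain comes from.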

bnewbold · 2023-12-15T14:06:27Z

automod/rules/spammentions.go

+
+var _ automod.PostRuleFunc = SpamMentionsRule
+
+var mentionHourlyThreshold = 20


FWIW, my intuition is that we'll need to bump this a lot. one example is follow-friday type threads where people list out a bunch of recommended follows, easily a few dozen.

in theory we could do some data science, but I think what probably works best is what you're doing here implicitly: start with just a "flag" which will result in a slack notification, and we can see how legit the hits are.

one workflow thing that will need improvement is flag naming. right now once a flag gets set, it sticks around forever. if we change the threshold, or re-write the rule entirely, it would make sense to purge all the old flags. we could use a manual process to work around this (prefix new flags with "dev-" or "beta-"), but there are probably better options.

Hm, daily numbers might be useful too then. Follow-friday behavior could tip the hourly count if that number is relatively low, but if we also track a daily number that's considerably higher, it should be easy for follow-friday to stay below that while real spam still goes above it.

A mechanism for counting hours-today-that-crossed-the-line would also buff that out. Follow-friday shouldn't go above 1 or 2.
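A sketch of how the two ideas could combine. Everything here is hypothetical: the `dayStats` type, the function names, and the limits (20 hourly, 100 daily, at most 2 hot hours) are illustration values, not numbers proposed in this PR.

```go
package main

import "fmt"

// dayStats is a hypothetical summary of one account's mention activity today.
type dayStats struct {
	HourlyDistinct []int // distinct mentions seen in each hour so far
	DailyDistinct  int   // distinct mentions over the whole day
}

// hotHours counts the hours in which the hourly threshold was reached.
func hotHours(s dayStats, hourlyLimit int) int {
	n := 0
	for _, c := range s.HourlyDistinct {
		if c >= hourlyLimit {
			n++
		}
	}
	return n
}

// shouldFlag fires when the daily limit is reached, or when the hourly limit
// was reached in more than maxHotHours separate hours.
func shouldFlag(s dayStats, hourlyLimit, dailyLimit, maxHotHours int) bool {
	return s.DailyDistinct >= dailyLimit || hotHours(s, hourlyLimit) > maxHotHours
}

func main() {
	followFriday := dayStats{HourlyDistinct: []int{0, 35, 5, 0}, DailyDistinct: 40}
	spam := dayStats{HourlyDistinct: []int{25, 30, 22, 28}, DailyDistinct: 105}
	fmt.Println(shouldFlag(followFriday, 20, 100, 2)) // false
	fmt.Println(shouldFlag(spam, 20, 100, 2))         // true
}
```

A single busy hour stays under both limits, while sustained activity trips either the daily limit or the hot-hour count.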

bnewbold · 2023-12-15T14:08:46Z

automod/rules/spammentions.go

+	for _, facet := range post.Facets {
+		for _, feature := range facet.Features {
+			mention := feature.RichtextFacet_Mention
+			if mention == nil {
+				continue
+			}


there is an ExtractFacets helper function, though in this case the code looks pretty tight without it:
https://pkg.go.dev/github.com/bluesky-social/indigo/automod/rules#ExtractFacets

bnewbold · 2023-12-15T14:17:35Z

automod/rules/spammentions.go

+			if mention == nil {
+				continue
+			}
+			evt.IncrementDistinct("mentions", did, mention.Did)


we use the IncrementDistinctPeriod variant to reduce the number of hyperloglog data counters in memory. on the other hand, distinct mentions do seem like the kind of thing we might want to persist long-term, or change our mind about, so seems good to keep as-is.
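The trade-off can be sketched with a toy store. This is not the automod implementation: `distinctStore`, the period names, and the method signatures are assumptions for illustration. Exact sets stand in for HyperLogLog sketches, and time-window bucketing is ignored.

```go
package main

import "fmt"

// Hypothetical period names for this sketch.
const (
	periodTotal = "total"
	periodDay   = "day"
	periodHour  = "hour"
)

// distinctStore keeps one set per (name, key, period) bucket. A real store
// would likely use an approximate structure such as HyperLogLog per bucket.
type distinctStore struct {
	buckets map[string]map[string]struct{}
}

func newDistinctStore() *distinctStore {
	return &distinctStore{buckets: map[string]map[string]struct{}{}}
}

// add records val in the bucket for (name, key, period).
func (s *distinctStore) add(name, key, period, val string) {
	b := name + "/" + key + "/" + period
	if s.buckets[b] == nil {
		s.buckets[b] = map[string]struct{}{}
	}
	s.buckets[b][val] = struct{}{}
}

// IncrementDistinct updates every period bucket (assumed behaviour).
func (s *distinctStore) IncrementDistinct(name, key, val string) {
	for _, p := range []string{periodTotal, periodDay, periodHour} {
		s.add(name, key, p, val)
	}
}

// IncrementDistinctPeriod updates only the requested period bucket.
func (s *distinctStore) IncrementDistinctPeriod(name, key, val, period string) {
	s.add(name, key, period, val)
}

// BucketCount reports how many separate counters are held in memory.
func (s *distinctStore) BucketCount() int { return len(s.buckets) }

func main() {
	all := newDistinctStore()
	all.IncrementDistinct("mentions", "did:example:alice", "did:example:bob")

	hourOnly := newDistinctStore()
	hourOnly.IncrementDistinctPeriod("mentions", "did:example:alice", "did:example:bob", periodHour)

	fmt.Println(all.BucketCount(), hourOnly.BucketCount()) // prints 3 1
}
```

Keeping all periods costs more counters per account, but leaves the daily and total numbers available if the rule later needs them.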

automod: spam mentions rule.

c05826d

warpfork requested a review from bnewbold December 15, 2023 13:44

warpfork commented Dec 15, 2023

bnewbold approved these changes Dec 15, 2023

automod: distinct mentions rule rename to be more neutral.

8d5de74

A rule's name should describe what it does, not presumptively describe the valence of what it hopes that it's detecting.

bnewbold merged commit 998c066 into main Dec 19, 2023
7 checks passed

bnewbold deleted the warpfork/automod-mentions-spike-rule branch December 19, 2023 07:24
