Add support for hard/soft bloom filter base + stash #22828

Merged
merged 2 commits into from
Nov 27, 2024

KevinMind
Contributor

@KevinMind KevinMind commented Nov 6, 2024

Fixes: mozilla/addons#15014
Relates to: mozilla/addons#15155

Description

Adds logic to generate and write bloom filters for both soft- and hard-blocked add-ons. Additionally, this PR introduces logic to determine whether we should update one or both bloom filters and/or a stash, since multiple outcomes are now possible. Finally, we clean up files at a more granular level in both local storage and remote settings.

Context

Now when we run the upload_mlbf_to_remote_settings cron job we will check for both hard and soft blocked items. It is possible to:

  • do nothing
  • upload a hard block filter only
  • upload a hard block filter and a stash (for soft blocks)
  • upload a soft block filter
  • upload a soft block filter and a stash (for hard blocks)
  • upload both filters

This adds a bit of complexity we need to address.
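
The outcomes above can be sketched as a small decision function. This is only an illustration, not the code in this PR: the names plan_upload and UploadPlan are hypothetical, and the rule that a block type gets a new filter when its change count exceeds the threshold (and is stashed otherwise) is an assumption.

```python
from dataclasses import dataclass

# Low value from the Setup section below; the real constant lives in
# src/olympia/constants/blocklist.py.
BASE_REPLACE_THRESHOLD = 1


@dataclass(frozen=True)
class UploadPlan:
    filters: tuple  # block types that get a fresh base filter
    stash: tuple    # block types whose changes go into a stash


def plan_upload(changes, threshold=BASE_REPLACE_THRESHOLD):
    """Decide what to upload from a {block_type: changed_count} mapping.

    Assumption: a type gets a new filter when its change count exceeds
    the threshold; smaller non-zero change sets are stashed instead.
    """
    filters = tuple(bt for bt, n in changes.items() if n > threshold)
    # A type that is getting a new filter is never stashed as well.
    stash = tuple(bt for bt, n in changes.items() if 0 < n <= threshold)
    return UploadPlan(filters=filters, stash=stash)


print(plan_upload({"hard": 2, "soft": 1}))
```

Under these assumptions, two hard blocks and one soft block produce a hard block filter plus a stash for the soft block, and no changes at all produce an empty plan.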

Additionally, instead of deleting all records from remote settings, we need to check for the current set of block filters and only delete records older than the older of the two.
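
A minimal sketch of that cleanup rule, assuming each record is a plain dict with hypothetical "id" and "timestamp" keys (the real remote settings records may be shaped differently), and that filter_times maps each block type to the timestamp of its current filter:

```python
def stale_record_ids(records, filter_times):
    """Return ids of records older than the older of the current filters.

    `records` and `filter_times` use hypothetical shapes chosen for this
    illustration only.
    """
    # The older of the two filters sets the cutoff: anything generated
    # before it can no longer be needed by either filter.
    cutoff = min(filter_times.values())
    return [r["id"] for r in records if r["timestamp"] < cutoff]


records = [
    {"id": "a", "timestamp": 5},
    {"id": "b", "timestamp": 10},
    {"id": "c", "timestamp": 20},
]
print(stale_record_ids(records, {"hard": 10, "soft": 15}))
```

Here only record "a" predates both filters, so it is the only one selected for deletion.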

Finally, since it is also possible to run the cron when no updates have occurred, we can safely delete mlbf cache files when that happens as there is no benefit from diffing an empty array.

Testing

This is gonna suck to test. First some preparation work.

Setup

  • Set up a local remote server here
  • Set the base replace threshold to a low number (so you can trigger re-uploading of filters without creating a bunch of blocks)
  • Enable the enable-soft-blocking and blocklist_mlbf_submit waffle switches

src/olympia/constants/blocklist.py

BASE_REPLACE_THRESHOLD = 1

See the test scenarios

from olympia.amo.tests import addon_factory, block_factory
from olympia.blocklist.models import BlockType
from olympia.users.models import UserProfile

# Any existing user works as the author of the blocks.
user = UserProfile.objects.first()

def _blocked_addon(block_type=BlockType.BLOCKED, **kwargs):
    # Create an add-on, then a block of the given type for its guid.
    addon = addon_factory(**kwargs)
    block = block_factory(
        guid=addon.guid, updated_by=user, block_type=block_type
    )
    return addon, block

Now you can call the _blocked_addon function to create an add-on with a block/version of the specified type.

Ex:

_blocked_addon(block_type=BlockType.BLOCKED)
_blocked_addon(block_type=BlockType.BLOCKED)
_blocked_addon(block_type=BlockType.SOFT_BLOCKED)

If you run the cron job now, you'd expect a hard block filter and a stash containing the soft-blocked version.

Checklist

  • Add #ISSUENUM at the top of your PR to an existing open issue in the mozilla/addons repository.
  • Successfully verified the change locally.
  • The change is covered by automated tests, or otherwise indicated why doing so is unnecessary/impossible.
  • Add before and after screenshots (Only for changes that impact the UI).
  • Add or update relevant docs reflecting the changes made.

@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 4 times, most recently from 1c5160b to aeb1b36 Compare November 7, 2024 11:15
@KevinMind KevinMind mentioned this pull request Nov 7, 2024
5 tasks
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 7 times, most recently from f30bf73 to f387a3d Compare November 12, 2024 10:22
src/olympia/blocklist/cron.py Outdated Show resolved Hide resolved
src/olympia/blocklist/cron.py Outdated Show resolved Hide resolved
src/olympia/blocklist/mlbf.py Show resolved Hide resolved
src/olympia/blocklist/tasks.py Outdated Show resolved Hide resolved
@KevinMind KevinMind requested a review from willdurand November 12, 2024 12:13
@KevinMind KevinMind marked this pull request as ready for review November 12, 2024 12:13
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch from f39e370 to 08d3851 Compare November 12, 2024 13:11
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 3 times, most recently from 779be6e to b2007e3 Compare November 13, 2024 17:15
@willdurand
Member

Fixes: mozilla/addons#15014
Fixes: mozilla/addons#15166
Relates to: mozilla/addons#15155

we should almost never fix two issues with a single PR, so please fix mozilla/addons#15166 in a different PR.

@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 6 times, most recently from 227f07e to 3917ee4 Compare November 13, 2024 20:10
@KevinMind
Contributor Author

Fixes: mozilla/addons#15014
Fixes: mozilla/addons#15166
Relates to: mozilla/addons#15155

we should almost never fix two issues with a single PR, so please fix mozilla/addons#15166 in a different PR.

I've never heard of this rule and regularly do this. Why should that be a rule?

@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 2 times, most recently from 5492d61 to 598ef06 Compare November 25, 2024 18:50
Base automatically changed from soft-block-cron-job to master November 25, 2024 20:42
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch from 598ef06 to d1c4088 Compare November 25, 2024 20:46
Member

@willdurand willdurand left a comment

I am done for today, but that makes it much easier to see what's going on, thanks.

To be continued...

src/olympia/blocklist/mlbf.py Outdated Show resolved Hide resolved
src/olympia/blocklist/tasks.py Outdated Show resolved Hide resolved
src/olympia/blocklist/mlbf.py Outdated Show resolved Hide resolved
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch from c357bc7 to 5379148 Compare November 26, 2024 11:12
@KevinMind KevinMind requested a review from willdurand November 26, 2024 11:13
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 6 times, most recently from 99a0e05 to 3c32be4 Compare November 26, 2024 14:04
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 3 times, most recently from 7cfa66a to fb2db99 Compare November 26, 2024 16:41
Member

@willdurand willdurand left a comment

Some more feedback. This is coming together.

src/olympia/blocklist/tests/test_cron.py Outdated Show resolved Hide resolved
src/olympia/blocklist/mlbf.py Outdated Show resolved Hide resolved
src/olympia/blocklist/mlbf.py Outdated Show resolved Hide resolved
src/olympia/blocklist/mlbf.py Outdated Show resolved Hide resolved
src/olympia/blocklist/mlbf.py Show resolved Hide resolved
src/olympia/blocklist/tests/test_cron.py Outdated Show resolved Hide resolved
src/olympia/blocklist/tests/test_cron.py Outdated Show resolved Hide resolved
src/olympia/blocklist/tests/test_cron.py Outdated Show resolved Hide resolved
src/olympia/blocklist/tests/test_cron.py Outdated Show resolved Hide resolved
src/olympia/blocklist/tests/test_cron.py Outdated Show resolved Hide resolved
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch 2 times, most recently from 4e83dcf to 8c0030a Compare November 26, 2024 18:09
Remove unnecessary and redundant code

Fix ordering of cache/stash + increase validity of tests

Upload multiple filters

More logs + correct handling of attachment_type

Verify cron passes correct args to task

TMP: Ignore soft blocks

Add waffle switch

Fix invalid class reference

Update to correct waffle switch

Update to fix the test

refactoring

add missing tests

Apply suggestions from code review

Co-authored-by: William Durand <[email protected]>

Updates from review

Ensure blocks of type X are excluded from stash if filter of type X is also being uploaded

TMP: squash

Better exclusion of stashes from updated filters + more comment resolution

Delete correct files from remote settings:
- delete any existing attachments matching filters we are reuploading
- delete any stashes that are older than the oldest filter

Add tests for shape of stash

Add test for the craziness that is stash

Simple is better than complex

Compare to base filter when determining stash

Use block type level base filter

Apply suggestions from code review

Co-authored-by: William Durand <[email protected]>
@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch from faa29ba to c02eff7 Compare November 26, 2024 18:13
@KevinMind KevinMind requested a review from willdurand November 26, 2024 18:15
Member

@willdurand willdurand left a comment

r+wc, thanks!

src/olympia/blocklist/tests/test_mlbf.py Show resolved Hide resolved
src/olympia/blocklist/mlbf.py Show resolved Hide resolved
@willdurand
Member

(note: it'd be good if you could clear the commit message before merging)

@KevinMind KevinMind force-pushed the soft-block-bloom-filter-filter branch from 0a13ed7 to 67be5ff Compare November 27, 2024 09:39
@KevinMind KevinMind merged commit fe36896 into master Nov 27, 2024
31 checks passed
@KevinMind KevinMind deleted the soft-block-bloom-filter-filter branch November 27, 2024 09:53
Development

Successfully merging this pull request may close these issues.

[Task]: Generate separate bloomfilter for soft-blocks
2 participants