[py-tx][mlh][tracking issue] Upgrade py-tx CLI storage implementation to dbm #1687

Open
Dcallies opened this issue Nov 8, 2024 · 0 comments
Labels
mlh Related to Major League Hacking Fellowship

Dcallies commented Nov 8, 2024

This is a large, multistep issue. The goal is to replace the custom file-and-pickle-based storage in the python-threatexchange CLI with a more standardised solution built on Python's dbm module.
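
For orientation, here is a minimal sketch of what dbm-backed storage can look like. It is not the existing py-tx code: the helper names `put` and `get` are hypothetical, and pickling the values is an assumption carried over from the current approach. It only relies on the standard-library `dbm.open` call and on the fact that dbm stores keys and values as bytes.

```python
import dbm
import pickle
import typing as t


def put(db_path: str, key: str, value: t.Any) -> None:
    """Store a pickled value under `key` (hypothetical helper)."""
    # Flag "c" opens the database read/write and creates it if missing.
    with dbm.open(db_path, "c") as db:
        db[key.encode("utf-8")] = pickle.dumps(value)


def get(db_path: str, key: str, default: t.Any = None) -> t.Any:
    """Return the unpickled value for `key`, or `default` if absent."""
    raw_key = key.encode("utf-8")
    with dbm.open(db_path, "c") as db:
        if raw_key not in db:
            return default
        return pickle.loads(db[raw_key])
```

Depending on which dbm backend Python selects, the database may be written as one file or as several files with added extensions, so code should treat `db_path` as a name handed to `dbm.open` rather than as a guaranteed file path.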

This requires backwards-compatibility with the current version, with the intent to deprecate the old storage in a future 2.0.0 release.

We can do each bit piecemeal, but it will require a lot of refactoring:

For each of the major interface types in https://github.com/facebook/ThreatExchange/blob/main/hasher-matcher-actioner/src/OpenMediaMatch/storage/interface.py:

  1. Copy the inner interface over to a new file in python-threatexchange/threatexchange/storage/interface.py (own PR)
  2. Release a new hotfix version of pytx (own PR)
  3. Change HMA to import the pytx version of the interface instead of its own copy
  4. Create a new file in python-threatexchange/threatexchange/cli/storage, and implement an equivalent to the old logic in https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/cli/cli_state.py (own PR). Make sure to add unit tests
  5. Remove the old copy of the functionality in https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/cli/cli_state.py and very carefully refactor all existing locations to use the new interface (own PR)
  6. Create a second implementation of the interface that uses dbm as the storage backing instead. You can test with a fresh copy of the CLI to make sure it works. Add tests (own PR)
  7. Add compatibility logic: first check the local state to see whether state already exists. If it does, use the historic code; if not, use your new dbm code
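
Steps 4 through 7 can be sketched roughly as follows. Everything here is illustrative: `KeyValueStore`, `LegacyPickleStore`, `DbmStore`, `open_store`, the `state` database name, and the one-`.pickle`-file-per-key layout are all hypothetical stand-ins, not the real interfaces from interface.py or the real on-disk format used by cli_state.py. The point is only the shape: one interface, two implementations, and a selector that keeps using the historic code when old state is found.

```python
import dbm
import pickle
import typing as t
from abc import ABC, abstractmethod
from pathlib import Path


class KeyValueStore(ABC):
    """Hypothetical stand-in for one of the storage interfaces."""

    @abstractmethod
    def load(self, key: str) -> t.Optional[t.Any]: ...

    @abstractmethod
    def store(self, key: str, value: t.Any) -> None: ...


class LegacyPickleStore(KeyValueStore):
    """Assumed historic layout: one pickle file per key."""

    def __init__(self, state_dir: Path) -> None:
        self.state_dir = state_dir

    def _path(self, key: str) -> Path:
        return self.state_dir / f"{key}.pickle"

    def load(self, key: str) -> t.Optional[t.Any]:
        path = self._path(key)
        if not path.is_file():
            return None
        with path.open("rb") as f:
            return pickle.load(f)

    def store(self, key: str, value: t.Any) -> None:
        self.state_dir.mkdir(parents=True, exist_ok=True)
        with self._path(key).open("wb") as f:
            pickle.dump(value, f)


class DbmStore(KeyValueStore):
    """Same interface, backed by a single dbm database."""

    DB_NAME = "state"  # hypothetical database name

    def __init__(self, state_dir: Path) -> None:
        self.state_dir = state_dir

    def _open(self):
        self.state_dir.mkdir(parents=True, exist_ok=True)
        # "c" opens read/write and creates the database if missing.
        return dbm.open(str(self.state_dir / self.DB_NAME), "c")

    def load(self, key: str) -> t.Optional[t.Any]:
        raw_key = key.encode("utf-8")
        with self._open() as db:
            return pickle.loads(db[raw_key]) if raw_key in db else None

    def store(self, key: str, value: t.Any) -> None:
        with self._open() as db:
            db[key.encode("utf-8")] = pickle.dumps(value)


def open_store(state_dir: Path) -> KeyValueStore:
    """Step 7: existing legacy state keeps the historic code, else use dbm."""
    if state_dir.is_dir() and any(state_dir.glob("*.pickle")):
        return LegacyPickleStore(state_dir)
    return DbmStore(state_dir)
```

Because callers only see `KeyValueStore`, the refactor in step 5 and the backend swap in step 6 can land in separate PRs, and removing the legacy path in 2.0.0 would come down to deleting one class and one branch in the selector.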

Here are the sub-issues for each interface:

@Dcallies Dcallies converted this from a draft issue Nov 8, 2024
@Dcallies Dcallies added the mlh Related to Major League Hacking Fellowship label Nov 8, 2024
@Dcallies Dcallies changed the title [py-tx][mlh] Upgrade py-tx CLI storage implementation to dbm [py-tx][mlh][tracking issue] Upgrade py-tx CLI storage implementation to dbm Nov 8, 2024