[py-tx][tat] TAT API implementation doesn't work correctly with HMA #1668
Comments
Hi @aokolish, our hash list API delivers a JSON file that is updated nightly, as you have seen in our documentation. In the current implementation running I'm happy to go over this further to help you find a solution, or if there is something I'm misunderstanding.
Hey @aokolish, @Bruce-Pouncey-TAT - I looked into this briefly, and this might be missing functionality in HMA, which currently can't handle APIs like TAT's that don't provide deltas but whose contents still change over time. Since there is no efficient way to discover updated records with this kind of API, we'd need to write something new in HMA that loads all the previously downloaded TAT records into memory and then computes the diff itself (especially removals). We could also force-clear all the hashes on every fetch, but as the number of hashes grows over time, this creates weird inconsistencies in the database (hashes disappearing from and reappearing in the index) that might have real production impact. This feature doesn't exist today, so by default TAT isn't correctly supported by HMA, hence the logs that Alex is seeing. To evaluate the potential solutions:
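The "load the previous records and compute the diff ourselves" idea from the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: `diff_snapshots`, the `Record` shape, and the `algorithm` field are hypothetical names made up here, not the TAT JSON schema or any existing HMA / python-threatexchange API.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: hash value -> metadata about that hash.
# The real TAT hash list schema may differ.
Record = Dict[str, str]


def diff_snapshots(
    previous: Dict[str, Record],
    current: Dict[str, Record],
) -> Tuple[Dict[str, Record], List[str]]:
    """Compare two full snapshots and return (upserts, removals).

    upserts:  records that are new or whose metadata changed
    removals: keys present in the previous snapshot but gone from the new one
    """
    upserts = {
        key: record
        for key, record in current.items()
        if previous.get(key) != record  # missing key -> None, so it counts as new
    }
    removals = sorted(key for key in previous if key not in current)
    return upserts, removals


# Example: one hash was dropped and one was added between nightly files.
old = {"abc": {"algorithm": "pdq"}, "def": {"algorithm": "md5"}}
new = {"abc": {"algorithm": "pdq"}, "123": {"algorithm": "md5"}}
upserts, removals = diff_snapshots(old, new)
```

Applying only `upserts` and `removals` to the stored data, instead of clearing everything, would avoid the window in which unchanged hashes vanish from the index. The cost is that both snapshots are held in memory at once, which is the trade-off mentioned in the comment.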
I'm testing out hasher-matcher-actioner (HMA) with the Tech Against Terrorism (TAT) API and noticed a log line from the server:
This leads me to believe that HMA would never pull new hashes from the TAT API. Can this somehow be fixed in the TAT exchange implementation?
Here are their API docs - https://www.terrorismanalytics.org/docs/hash-list-v1
There are probably easier steps to reproduce, but my steps were...