Files should be tracked and stored as first-class objects with their own attributes. #15
Labels: enhancement, good first issue, help wanted
Can borrow some of this thinking from Skluma --
Files should be treated as first-class entities alongside groups (and families). For instance, we no longer check whether files should be inflated (because in MDF, some files are pushed into workflows as deflated objects). Moreover, we are not tracking individual file size and similar per-file attributes throughout.
This matters because we need more granular bookkeeping of files in the system before we can extend it to other extractors. Is the file compressed? Decompress it. Is it near other interesting files? Check its context.
Each file should have its own object with keyword args:
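As a minimal sketch only, such an object might look like the following. Every name here (`FileRecord`, `make_file_record`, `detect_compression`, and the fields `path`, `size_bytes`, `extension`, `compression`, `group_id`) is a hypothetical choice for illustration and is not taken from the project. Compression is guessed from a small, non-exhaustive table of magic bytes.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Leading bytes of a few common compressed formats.
# Illustrative and not exhaustive; a real system would likely need more.
COMPRESSION_MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"PK\x03\x04": "zip",
    b"\xfd7zXZ\x00": "xz",
}


@dataclass
class FileRecord:
    """Hypothetical per-file object; all field names are assumptions."""
    path: str
    size_bytes: int
    extension: str
    compression: Optional[str] = None  # e.g. "gzip", or None if not detected
    group_id: Optional[str] = None     # group (or family) the file belongs to

    @property
    def is_compressed(self) -> bool:
        return self.compression is not None


def detect_compression(path: str) -> Optional[str]:
    """Return a compression name if the file starts with a known magic prefix."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for magic, name in COMPRESSION_MAGIC.items():
        if head.startswith(magic):
            return name
    return None


def make_file_record(path: str, group_id: Optional[str] = None) -> FileRecord:
    """Build a FileRecord from a file on disk."""
    return FileRecord(
        path=path,
        size_bytes=os.path.getsize(path),
        extension=os.path.splitext(path)[1].lower(),
        compression=detect_compression(path),
        group_id=group_id,
    )
```

With something like this, an extractor could check `record.is_compressed` before deciding whether to inflate a file, and `record.group_id` gives a hook for looking at neighbouring files.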
Then we should also create a separate database of files and their metadata, so that we can answer questions like "how many of this type of file do you have?"
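One possible shape for that database, sketched with Python's built-in `sqlite3` module. The table name `files`, its columns, and the helper names `open_file_db`, `add_file`, and `count_by_extension` are all assumptions made for illustration; the real schema would depend on which metadata the extractors need.

```python
import sqlite3


def open_file_db(db_path=":memory:"):
    """Open (or create) a hypothetical file-metadata database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            path        TEXT PRIMARY KEY,
            extension   TEXT NOT NULL,
            size_bytes  INTEGER NOT NULL,
            compression TEXT,
            group_id    TEXT
        )
        """
    )
    return conn


def add_file(conn, path, extension, size_bytes, compression=None, group_id=None):
    """Insert one file row, replacing any existing row for the same path."""
    conn.execute(
        "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?, ?)",
        (path, extension, size_bytes, compression, group_id),
    )
    conn.commit()


def count_by_extension(conn):
    """Answer 'how many of this type of file do you have?' per extension."""
    rows = conn.execute(
        "SELECT extension, COUNT(*) FROM files GROUP BY extension"
    ).fetchall()
    return dict(rows)
```

Keying the table on `path` keeps one row per file, so re-crawling a directory updates existing rows instead of double counting them.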