Malware detection models rely on features extracted from executable files (PE, ELF, Mach-O, OAT, DEX, VDEX, and ART formats). The LIEF library (https://github.com/lief-project/LIEF) parses these formats, and the parsed fields can be turned into features for model training.
Features for PE files are based on EMBER (paper: https://arxiv.org/abs/1804.04637, code: https://github.com/endgameinc/ember).
Extracts general features from PE files, such as file size and import/export counts.
- Single text column which contains full paths to PE files on the same machine running DAI
- Multiple numerical columns
No limitations
- lief
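The transformers in this list expect a single text column of full paths to PE files that are readable on the machine running DAI. The following is a minimal sketch of how such a column might be prepared using only the standard library. The helper names and the `pe_path` column name are hypothetical, and the `MZ` check is only a cheap heuristic (legacy DOS executables start with the same two bytes), so LIEF may still reject some of the files it lets through.

```python
import csv
import os


def looks_like_pe(path):
    """Return True if the file starts with the 'MZ' magic that opens a PE file."""
    try:
        with open(path, "rb") as f:
            return f.read(2) == b"MZ"
    except OSError:
        # Unreadable files are skipped rather than reported.
        return False


def collect_pe_paths(root):
    """Walk `root` and return the sorted absolute paths of files starting with 'MZ'."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if looks_like_pe(full):
                found.append(full)
    return sorted(found)


def write_path_column(paths, csv_path, column="pe_path"):
    """Write the paths as a one-column CSV (the column name is an assumption)."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([column])
        writer.writerows([p] for p in paths)
```

Calling `write_path_column(collect_pe_paths("/data/samples"), "pe_paths.csv")` would produce a dataset with one text column; the directory shown is a placeholder.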
Features derived from the PE file header and optional header.
- Single text column which contains full paths to PE files on the same machine running DAI
- 63 numerical columns
No limitations
- lief
Extracts section characteristics from PE files.
- Single text column which contains full paths to PE files on the same machine running DAI
- Multiple numerical columns
No limitations
- lief
Counts how often each of the 256 possible byte values occurs in a PE file, then normalizes the counts.
- Single text column which contains full paths to PE files on the same machine running DAI
- 256 numerical columns
No limitations
- lief
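The 256 output columns of the byte-count transformer correspond to the 256 possible byte values. The sketch below shows one way to compute such a histogram with the standard library. It assumes normalization means dividing each count by the total number of bytes, as EMBER does; the text above does not specify the scheme, and the function names are hypothetical.

```python
def byte_histogram(data):
    """Return 256 floats: the fraction of `data` taken up by each byte value."""
    counts = [0] * 256
    for b in data:  # iterating over bytes yields integers in 0..255
        counts[b] += 1
    total = len(data)
    if total == 0:
        # An empty file has no bytes to count; return all zeros.
        return [0.0] * 256
    return [c / total for c in counts]


def byte_histogram_from_file(path):
    """Read a file in binary mode and return its normalized byte histogram."""
    with open(path, "rb") as f:
        return byte_histogram(f.read())
```

With this normalization the 256 values sum to 1 for any non-empty file, so files of different sizes become directly comparable.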
Features derived from the PE file data directories.
- Single text column which contains full paths to PE files on the same machine running DAI
- 30 numerical columns
No limitations
- lief
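One plausible reading of the 30 output columns is the EMBER layout: 15 named data directories, each contributing its size and its virtual address. The sketch below flattens such entries into a fixed-length vector. It assumes the recipe follows EMBER here, and the input format (a dict of name to `(size, virtual_address)`) is a hypothetical stand-in for what a parser such as LIEF would provide.

```python
# The 15 data directory names used by EMBER, in order.
DATA_DIRECTORY_NAMES = [
    "EXPORT_TABLE", "IMPORT_TABLE", "RESOURCE_TABLE", "EXCEPTION_TABLE",
    "CERTIFICATE_TABLE", "BASE_RELOCATION_TABLE", "DEBUG", "ARCHITECTURE",
    "GLOBAL_PTR", "TLS_TABLE", "LOAD_CONFIG_TABLE", "BOUND_IMPORT",
    "IAT", "DELAY_IMPORT_DESCRIPTOR", "CLR_RUNTIME_HEADER",
]


def data_directory_vector(directories):
    """Flatten {name: (size, virtual_address)} into 30 floats.

    Directories missing from the input contribute (0, 0), so the
    output length is always 2 * 15 = 30.
    """
    vector = []
    for name in DATA_DIRECTORY_NAMES:
        size, virtual_address = directories.get(name, (0, 0))
        vector.extend([float(size), float(virtual_address)])
    return vector
```

Keeping a fixed name order means each column always refers to the same directory, even for files that omit some of them.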
Features derived from the imported libraries and functions of the PE file.
- Single text column which contains full paths to PE files on the same machine running DAI
- 1280 numerical columns
No limitations
- lief
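A fixed width of 1280 columns for a variable-length import list suggests the hashing trick. EMBER hashes library names into 256 buckets and `library:function` pairs into 1024 buckets (256 + 1024 = 1280), and the sketch below assumes this recipe does the same. EMBER uses scikit-learn's `FeatureHasher`; the SHA-256 based hasher here is a standard-library substitute for illustration and will not reproduce the same bucket assignments. The function names and the input format are hypothetical.

```python
import hashlib


def hash_tokens(tokens, n_buckets):
    """Map a list of strings to a fixed-length vector via signed feature hashing."""
    vector = [0.0] * n_buckets
    for token in tokens:
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % n_buckets
        # One extra digest bit picks the sign, so collisions tend to cancel.
        sign = 1.0 if digest[8] & 1 == 0 else -1.0
        vector[index] += sign
    return vector


def import_features(imports):
    """Build 1280 features from {library_name: [function_name, ...]}."""
    libraries = sorted({lib.lower() for lib in imports})
    functions = [
        f"{lib.lower()}:{fn}" for lib, fns in imports.items() for fn in fns
    ]
    return hash_tokens(libraries, 256) + hash_tokens(functions, 1024)
```

For example, `import_features({"KERNEL32.dll": ["CreateFileW", "ReadFile"]})` returns a list of 1280 floats regardless of how many functions the file imports.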
Features derived from the export data section of the PE file.
- Single text column which contains full paths to PE files on the same machine running DAI
- 128 numerical columns
No limitations
- lief