- Complete the C interface / dylib: deserialize, save/load file, map with bytes output
- Test compressed maps in CFFI
- Test (de)serialization in CFFI
- Test that CFFI doesn't leak memory
- Rename the C FFI types?
- Test on Armv7, x86, RISC-V, and at least one big-endian system
- Demo apps
- Optimize results for small maps? Here storing the hash key and rounding up to one block (=32 bits) are costly.
  - First possible approach: add linear constraints that certain bits of the last block must match (functions of) bits elsewhere.
    - Then encode the other bits of the last block separately, in packed form.
    - On decode, fill in the extra bits.
  - Second possible approach: arrange that fewer entries end up in the last block, and then directly constrain some of its bits (e.g. to be zero).
  - Third possible approach: make ribbon offsets in bytes (but still block-sized), and change the solver.
- Distinguish resource failures ("out of memory", "can't create thread", etc.) from "matrix is not invertible"
- Enable mmap?
- Allow other implementations such as binary fuse filters?
- Reduce C dylib size?
- Make a one-shot C vector to Rust map compression call.
- Why is SipHasher so slow on Intel?
- Multithread hashing even if we aren't multithreading bucketsort. (Using Rayon??)
- Improve optimization of the threaded version
- Improve pseudoinverse
- More profile-driven optimization
- Compare which parts are still faster in C
- Investigate why LTO speeds up nonuniform maps but regresses uniform queries.
- Whitepaper
- Make production-quality (1.0).
- no_std core for embedded systems
- Add SSSE3 version? ARM SVE??
- Better interface for tile matrices; release as its own crate?
- Test on very large data sets (e.g. CompressedRandomMap with 1 billion entries; needs lots of memory to build)
- Prove correctness; it may also give insights on optimal matrix shapes.
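
For the item on distinguishing failure modes, one possible shape is a small error enum. This is only a sketch: `BuildError`, its variants, and `is_retryable` are hypothetical names, not part of the current API. It also assumes that a non-invertible matrix is the only failure worth retrying with a fresh hash key.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical build-failure type; these names are illustrative
/// and do not come from the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildError {
    /// An allocation failed.
    OutOfMemory,
    /// A worker thread could not be created.
    ThreadSpawn,
    /// The linear system could not be solved (matrix is not invertible).
    SingularMatrix,
}

impl BuildError {
    /// Assumption: only a failed solve is worth retrying with a fresh
    /// hash key; resource failures are reported to the caller unchanged.
    pub fn is_retryable(self) -> bool {
        matches!(self, BuildError::SingularMatrix)
    }
}

impl fmt::Display for BuildError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let msg = match self {
            BuildError::OutOfMemory => "out of memory",
            BuildError::ThreadSpawn => "can't create thread",
            BuildError::SingularMatrix => "matrix is not invertible",
        };
        f.write_str(msg)
    }
}

impl Error for BuildError {}
```

With a split like this, a caller could loop on `is_retryable()` and surface resource failures immediately. The C interface would need its own mapping to integer error codes.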