I may be wrong, but I believe that SequenceId is unnecessary. It eats up 8 bytes per entry in memory, and at least 16 bytes per entry on disk (8 bytes for each copy of an entry in the tombstone/index/data files).
My reasoning:
One reason SequenceId is required is so that, during initialization, tombstones (deletes) and index updates (put/replace) are not processed out of order. Tombstones are processed last, but must not apply to keys that were updated 'after' them.
If tombstones were instead written into the index file in the order they occur, then rebuilding the index in the order it was written would resolve the above without a SequenceId.
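A minimal sketch of what I mean, with hypothetical names (`IndexFileEntry`, `Location`, the tombstone flag) since none of this matches the current file format: replaying index entries strictly in write order makes "last entry wins" fall out of the replay itself, so no per-entry sequence number is needed.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SequentialRebuildSketch {
    record IndexFileEntry(byte[] key, int fileId, int offset, boolean tombstone) {}
    record Location(int fileId, int offset) {}

    static Map<ByteBuffer, Location> rebuild(List<IndexFileEntry> entriesInWriteOrder) {
        Map<ByteBuffer, Location> index = new HashMap<>();
        for (IndexFileEntry e : entriesInWriteOrder) {
            // ByteBuffer gives content-based equals/hashCode for the raw key.
            ByteBuffer key = ByteBuffer.wrap(e.key());
            if (e.tombstone()) {
                // The tombstone sits at the correct position in the stream,
                // so removing here cannot clobber a put that happened later.
                index.remove(key);
            } else {
                index.put(key, new Location(e.fileId(), e.offset()));
            }
        }
        return index; // ordering is carried by the file layout, not a SequenceId
    }
}
```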
The other reason SequenceId is required is to support concurrent initialization by multiple threads. For simple 'puts', the fileId is enough to resolve which update should win. But if tombstones are interleaved with index loading as I suggest above, then concurrent threads replaying different files, doing puts and deletes on the same key, will have race conditions: if the thread for file 2 applies a delete on key X, and the thread for file 1 then applies a put, it would not know that file 2 has already removed it.
I see two solutions to the above, assuming tombstones are sequenced in the index/data file in the order they happened relative to the 'puts':
When a delete happens, leave the key in the in-memory map with the tombstone's fileId, marked as deleted (possible with closed addressing, but not with the current data structures).
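A minimal sketch of this first option (all names hypothetical): the deleted key stays in the map, tagged with the fileId of the tombstone, so a racing put from an older file can compare fileIds and lose. Markers are purged once all files are loaded.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;

class TombstoneMarkerSketch {
    record Slot(int fileId, int offset, boolean deleted) {}

    private final ConcurrentHashMap<ByteBuffer, Slot> map = new ConcurrentHashMap<>();

    void applyPut(byte[] key, int fileId, int offset) {
        map.merge(ByteBuffer.wrap(key), new Slot(fileId, offset, false),
                // Newer file wins, whether the existing slot is a live
                // entry or a tombstone marker left by a delete.
                (existing, incoming) ->
                        incoming.fileId() >= existing.fileId() ? incoming : existing);
    }

    void applyDelete(byte[] key, int fileId) {
        map.merge(ByteBuffer.wrap(key), new Slot(fileId, -1, true),
                (existing, incoming) ->
                        incoming.fileId() >= existing.fileId() ? incoming : existing);
    }

    // After all files have been replayed, drop the tombstone markers.
    void purgeTombstones() {
        map.values().removeIf(Slot::deleted);
    }
}
```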
Initialize the database in order, from oldest file to newest file, and split the data by hash so that one thread per segment can apply the updates. This would be limited by how fast a single thread can compute the hashes of the keys in the tombstone/index file. One optimization would be for compaction to merge and split files so that the resulting files cover disjoint key-hash ranges. For example, if compacting files 1, 2, 3, and 4 together yielded four files "4.a, 4.b, 4.c, 4.d", each representing a distinct hash range (perhaps split by the top two bits of the hash into 4 buckets), then initialization could read all four in parallel, since their updates would be disjoint and apply to different Segments.
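A sketch of that second option, under the stated assumption that compaction has already split output files by the top two bits of the key hash (everything here, including `replayIntoSegment`, is hypothetical): each bucket's files touch a disjoint set of segments, so each bucket can be replayed by its own thread with no cross-thread races.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelInitSketch {
    static final int BUCKETS = 4; // top two bits of the hash -> 4 disjoint ranges

    static int bucketOf(long keyHash) {
        return (int) (keyHash >>> 62); // top two bits select the bucket
    }

    // compactedFiles.get(b) holds the files whose keys all hash into bucket b,
    // e.g. "4.a", "4.b", "4.c", "4.d" from compacting files 1-4 together.
    static void initialize(List<List<String>> compactedFiles) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(BUCKETS);
        for (int b = 0; b < BUCKETS; b++) {
            final List<String> files = compactedFiles.get(b);
            pool.submit(() -> {
                // Within a bucket, files are still replayed oldest-to-newest,
                // sequentially, so ordering inside the bucket is preserved.
                for (String file : files) {
                    replayIntoSegment(file); // hypothetical per-file replay
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }

    static void replayIntoSegment(String file) { /* ... */ }
}
```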
My biggest concern with such a change is that it is a major overhaul of the file format and in-memory layout, and would not easily share code with the older implementation. The work to make the code support the old and new formats simultaneously could easily be most of the effort.