I may be wrong, but I believe that SequenceId is unnecessary. It eats up 8 bytes per entry in memory, and at least 16 bytes per entry on disk (8 bytes for each copy of an entry in the tombstone/index/data files).
My reasoning:
One reason SequenceId is required is so that, during initialization, tombstones (deletes) and index updates (put/replace) are not processed out of order. Tombstones are processed last, but must not apply to keys that were updated 'after' them.
If tombstones were instead written into the index file in the order they occur, then rebuilding the index in the order it was written would resolve the above without a SequenceId.
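A minimal sketch of what I mean, with hypothetical names (`IndexFileEntry`, `Location`, the tombstone flag) since none of this matches the current file format: replaying index entries strictly in write order makes "last entry wins" fall out of the replay itself, so no per-entry sequence number is needed.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SequentialRebuildSketch {
    record IndexFileEntry(byte[] key, int fileId, int offset, boolean tombstone) {}
    record Location(int fileId, int offset) {}

    static Map<ByteBuffer, Location> rebuild(List<IndexFileEntry> entriesInWriteOrder) {
        Map<ByteBuffer, Location> index = new HashMap<>();
        for (IndexFileEntry e : entriesInWriteOrder) {
            // ByteBuffer gives content-based equals/hashCode for the raw key.
            ByteBuffer key = ByteBuffer.wrap(e.key());
            if (e.tombstone()) {
                // The tombstone sits at the correct position in the stream,
                // so removing here cannot clobber a put that happened later.
                index.remove(key);
            } else {
                index.put(key, new Location(e.fileId(), e.offset()));
            }
        }
        return index; // ordering is carried by the file layout, not a SequenceId
    }
}
```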
The other reason SequenceId is required is to support concurrent initialization by multiple threads. For simple 'puts', the fileId is enough to resolve which update should win. But if tombstones are interleaved with index loading as I suggest above, then concurrent threads replaying different files, doing puts and deletes on the same key, will have race conditions: if the thread for file 2 applies a delete on key X, and the thread for file 1 then applies a put, it would not know that file 2 has already removed it.
I see two solutions to the above, assuming tombstones are sequenced in the index/data file in the order they happened relative to the 'puts':
When a delete happens, leave the key in the in-memory map with the tombstone's fileId, marked as deleted (possible with closed addressing, but not with the current data structures).
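A minimal sketch of this first option (all names hypothetical): the deleted key stays in the map, tagged with the fileId of the tombstone, so a racing put from an older file can compare fileIds and lose. Markers are purged once all files are loaded.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;

class TombstoneMarkerSketch {
    record Slot(int fileId, int offset, boolean deleted) {}

    private final ConcurrentHashMap<ByteBuffer, Slot> map = new ConcurrentHashMap<>();

    void applyPut(byte[] key, int fileId, int offset) {
        map.merge(ByteBuffer.wrap(key), new Slot(fileId, offset, false),
                // Newer file wins, whether the existing slot is a live
                // entry or a tombstone marker left by a delete.
                (existing, incoming) ->
                        incoming.fileId() >= existing.fileId() ? incoming : existing);
    }

    void applyDelete(byte[] key, int fileId) {
        map.merge(ByteBuffer.wrap(key), new Slot(fileId, -1, true),
                (existing, incoming) ->
                        incoming.fileId() >= existing.fileId() ? incoming : existing);
    }

    // After all files have been replayed, drop the tombstone markers.
    void purgeTombstones() {
        map.values().removeIf(Slot::deleted);
    }
}
```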
Initialize the database in order, from oldest file to newest file, and split the data by hash so that one thread per segment can apply the updates. This would be limited by how fast a single thread can compute the hashes of the keys in the tombstone/index file. One optimization would be for compaction to merge and split files so that the resulting files cover disjoint key-hash ranges. For example, if compacting files 1, 2, 3, and 4 together yielded four files "4.a, 4.b, 4.c, 4.d", each representing a distinct hash range (perhaps split by the top two bits of the hash into 4 buckets), then initialization could read all four in parallel, since their updates would be disjoint and apply to different Segments.
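A sketch of that second option, under the stated assumption that compaction has already split output files by the top two bits of the key hash (everything here, including `replayIntoSegment`, is hypothetical): each bucket's files touch a disjoint set of segments, so each bucket can be replayed by its own thread with no cross-thread races.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelInitSketch {
    static final int BUCKETS = 4; // top two bits of the hash -> 4 disjoint ranges

    static int bucketOf(long keyHash) {
        return (int) (keyHash >>> 62); // top two bits select the bucket
    }

    // compactedFiles.get(b) holds the files whose keys all hash into bucket b,
    // e.g. "4.a", "4.b", "4.c", "4.d" from compacting files 1-4 together.
    static void initialize(List<List<String>> compactedFiles) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(BUCKETS);
        for (int b = 0; b < BUCKETS; b++) {
            final List<String> files = compactedFiles.get(b);
            pool.submit(() -> {
                // Within a bucket, files are still replayed oldest-to-newest,
                // sequentially, so ordering inside the bucket is preserved.
                for (String file : files) {
                    replayIntoSegment(file); // hypothetical per-file replay
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }

    static void replayIntoSegment(String file) { /* ... */ }
}
```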
My biggest concern with such a change is that it is a major overhaul of the file format and in-memory layout, and would not easily share code with the older implementation. The work to make the code support the old and new formats simultaneously could easily be most of the effort.