Page layout
Slightly outdated visualisation is available here.
Each page holds one B+tree node. There are two types of nodes:
- internal nodes, which hold links to other nodes; the number of links is N+1, where N is the number of keys
- leaf nodes, which hold values; the number of values is exactly the same as the number of keys
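The two count rules above can be written down as a small check. This is a minimal sketch, not the project's actual code: the `node` struct, its field names and the `validate` method are hypothetical and only model the counts, not the on-disk layout.

```go
package main

import "fmt"

// node is a hypothetical in-memory view of a page; the field names are
// assumptions made for this illustration.
type node struct {
	leaf   bool
	keys   [][]byte
	links  []uint32 // references to child pages, used by internal nodes only
	values [][]byte // used by leaf nodes only
}

// validate checks the count rules: an internal node with N keys has
// N+1 links, and a leaf node with N keys has exactly N values.
func (n *node) validate() error {
	if n.leaf {
		if len(n.values) != len(n.keys) {
			return fmt.Errorf("leaf: %d values for %d keys", len(n.values), len(n.keys))
		}
		return nil
	}
	if len(n.links) != len(n.keys)+1 {
		return fmt.Errorf("internal: %d links for %d keys, want %d",
			len(n.links), len(n.keys), len(n.keys)+1)
	}
	return nil
}
```

An internal node with one key and two links passes the check, while the same node with a single link is rejected.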
Both node types hold keys. For internal nodes the keys are the main binary payload, and we avoid copying them by using indirection in the form of offsets. It is convenient to have different layouts for internal and leaf nodes:
- internal nodes hold links separately from the keys: the corresponding array of big-endian uint32 values is located in the page right after the offsets array (the links array is one element longer than the offsets array)
- leaf nodes hold values close to the keys, in the cells that the offsets point to
This design is a good fit because it allows us to reorder links separately from the keys. The downside is that we have to rewrite all of the links after, say, inserting one in the middle. But since links have a fixed size, this is acceptable for the sake of simplicity.
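To illustrate the cost of inserting a link in the middle, here is a minimal sketch that treats the links array as a flat byte slice of big-endian uint32 values. The helper names `insertLink` and `linkAt` are hypothetical; only the encoding (big-endian, 4 bytes per link) comes from the description above.

```go
package main

import "encoding/binary"

// linkSize is the width of one link: a big-endian uint32.
const linkSize = 4

// insertLink returns a new links array with link placed at position idx.
// Every link at or after idx moves by linkSize bytes, so the links array
// has to be written out again after the insertion.
func insertLink(links []byte, idx int, link uint32) []byte {
	out := make([]byte, len(links)+linkSize)
	at := idx * linkSize
	copy(out, links[:at])
	binary.BigEndian.PutUint32(out[at:], link)
	copy(out[at+linkSize:], links[at:])
	return out
}

// linkAt decodes the link stored at position idx.
func linkAt(links []byte, idx int) uint32 {
	return binary.BigEndian.Uint32(links[idx*linkSize:])
}
```

Because every link has the same width, the shift is a plain byte copy and needs no per-link bookkeeping, which is the simplicity the trade-off above buys.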
Modifications of the page can take the following forms:
- insertion of a key (and a link) into an internal page takes 2 writes:
  - one to persist the header, offsets and links (around 1221 bytes for a page with 100 keys)
  - one to write the key (an arbitrary number of bytes)
- insertion of a key (and a value) into a leaf page also takes 2 writes
- adding a brand new page after a split takes 1 write of the whole page
- we can buffer added keys and flush them in one write, as they will most likely occupy consecutive offsets
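The size of the first write for an internal page can be estimated as header plus offsets plus links. The sketch below is only an illustration: the text gives the link width (uint32, 4 bytes) but not the header or offset entry sizes, so `headerSize` and `offsetSize` are hypothetical values picked because they reproduce the 1221-byte figure for 100 keys. The real layout may differ.

```go
package main

const (
	headerSize = 17 // hypothetical: not stated in the text
	offsetSize = 8  // hypothetical: not stated in the text
	linkWidth  = 4  // uint32 link, as described for internal nodes
)

// metaWriteSize estimates the number of bytes in the write that persists
// the header, the offsets array (one entry per key) and the links array
// (one more entry than keys) of an internal page.
func metaWriteSize(numKeys int) int {
	return headerSize + numKeys*offsetSize + (numKeys+1)*linkWidth
}
```

Under these assumed sizes, `metaWriteSize(100)` is 17 + 800 + 404 = 1221 bytes.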
The first implementation won't have overflow pages, which limits the key/value pair size, so it can be guaranteed that a page will always have room for a maximum of K keys, where K is defined by the B+tree configuration. This limit should be eliminated.