Object Ids and Namespaces
Everything in Querki is a Thing -- Spaces, Models and Properties are all Things deep down. And each Thing has a Name. But those Names aren't unique across the whole world, just inside of your Space. (Otherwise, we'd have name conflicts all over the place.)
So each Thing also has an Object ID (OID), which is its true identifier. Within an entire Querki installation, there can be only one Thing with a given OID. (If we ever get into federation we'll have to come up with meta-identifiers, but that's a problem for another day.)
While Things are generally displayed by Name, internally we mostly refer to them by OID. For instance, when we show the Properties of a Thing, we display each Property by Name. But what we actually store is the OID of that Property. Indeed, while you write QL expressions mainly in terms of Names, those get translated to OIDs when we save the expressions.
OIDs are displayed as six-to-eight-character strings externally, and are often stored that way. They are actually composed of two elements: a Shard ID and a Local Object ID.
The Local Object ID is the lower 32 bits of the OID -- this allows about 4 billion total rows per Shard, which seems like a good limit.
The Shard ID is the index of the database itself. In principle, we are allowing 2^32 Shards, although obviously we don't expect the reality to come within a fraction of that.
Shard 0 is special and reserved: that is a pseudo-Shard for hardcoded Things built into the system. All of the System Space will be hardcoded, with predefined OIDs in Shard 0. No runtime-created objects may be placed in Shard 0, so that we have plenty of namespace to continue adding Things to the code.
Shard 1 will be the initial namespace for querki.net proper. This should be enough room for the first year or so, but we do expect to run out of it eventually.
Shard 2 may be reserved for tests, but that remains to be seen.
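To make the layout concrete, here is a minimal sketch of packing a Shard ID and a Local Object ID into one 64-bit value, and pulling them apart again. The function and constant names (`make_oid`, `split_oid`, `is_system_oid`, `SYSTEM_SHARD`) are made up for illustration and are not Querki's actual API. The one assumption beyond the text is that the Shard ID sits in the upper 32 bits.

```python
from typing import Tuple

LOCAL_BITS = 32                      # the Local Object ID is the lower 32 bits
LOCAL_MASK = (1 << LOCAL_BITS) - 1   # 0xFFFFFFFF
SYSTEM_SHARD = 0                     # pseudo-Shard reserved for hardcoded Things


def make_oid(shard_id: int, local_id: int) -> int:
    """Combine a Shard ID and a Local Object ID into a single 64-bit OID."""
    if not 0 <= shard_id <= LOCAL_MASK:
        raise ValueError("shard_id must fit in 32 bits")
    if not 0 <= local_id <= LOCAL_MASK:
        raise ValueError("local_id must fit in 32 bits")
    return (shard_id << LOCAL_BITS) | local_id


def split_oid(oid: int) -> Tuple[int, int]:
    """Return (shard_id, local_id) for an OID."""
    return oid >> LOCAL_BITS, oid & LOCAL_MASK


def is_system_oid(oid: int) -> bool:
    """True if the OID was allocated in the reserved Shard 0."""
    return split_oid(oid)[0] == SYSTEM_SHARD
```

For example, `make_oid(1, 5)` gives `4294967301`, and `split_oid` of that value gives back `(1, 5)`.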
As mentioned above, OIDs will be displayed to the user as short text strings. These will be the base-36 encoding of the binary 64-bit OIDs. Base-36 means the digits 0-9 plus the letters a-z, case-insensitive.
(I had originally planned on using base-62, case-sensitive, which would save a little database space since the strings would be shorter. But I have decided that this is potentially inconvenient for external uses: case-sensitive is a bloody pain in the ass for end users if they ever need to type these things manually, and some things such as URLs are not reliably preserved as case-sensitive. So we'll survive slightly longer string representations in the name of not making life harder.)
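The base-36 encoding can be sketched roughly as follows. The names `oid_to_string` and `string_to_oid` are hypothetical, and emitting lowercase letters is an assumption (the text only says the encoding is case-insensitive). Decoding leans on Python's built-in `int(text, 36)`, which accepts either case.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def oid_to_string(oid: int) -> str:
    """Encode a non-negative OID as a base-36 string (lowercase by assumption)."""
    if oid < 0:
        raise ValueError("OIDs are non-negative")
    if oid == 0:
        return "0"
    chars = []
    while oid:
        oid, remainder = divmod(oid, 36)
        chars.append(DIGITS[remainder])
    return "".join(reversed(chars))


def string_to_oid(text: str) -> int:
    """Decode a base-36 OID string; upper and lower case are treated alike."""
    return int(text, 36)
```

With this sketch, the lowest possible OID in Shard 1 (Shard ID 1, Local Object ID 0, i.e. 2^32) encodes as `1z141z4`, and `string_to_oid("1Z141Z4")` decodes to the same number.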
As mentioned earlier, we expect the initial OIDs to be around eight characters long, but you are not allowed to assume that. When we expand beyond a few Shards (which is likely if the company is successful enough), these strings will grow as the namespace does. For most purposes, simply treat them as opaque strings.
Note that each Shard maintains an internal counter for generating OIDs, but this does not guarantee that the OID lives on that Shard. In principle, a Space can be moved from Shard to Shard for load-balancing purposes down the road, and this might result in a Space containing OIDs with different Shard prefixes. Again, it is best to treat an OID as opaque once it is generated.
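As an illustration of why the prefix should not be read as a location, here is a small sketch of per-Shard counters. `ShardAllocator` is a hypothetical name. Starting the counter at 1 and keeping it in memory are assumptions made for the example (a real counter would presumably live in the Shard's database). The Shard ID is again assumed to occupy the upper 32 bits.

```python
import itertools


class ShardAllocator:
    """Hands out OIDs for one Shard from a simple in-memory counter."""

    def __init__(self, shard_id: int, start: int = 1):
        if shard_id == 0:
            raise ValueError("Shard 0 is reserved for hardcoded Things")
        self.shard_id = shard_id
        self._counter = itertools.count(start)

    def next_oid(self) -> int:
        local_id = next(self._counter)
        if local_id >= 1 << 32:
            raise OverflowError("Shard has run out of Local Object IDs")
        return (self.shard_id << 32) | local_id


# A Space starts out on Shard 1, so its first Thing gets a Shard-1 prefix.
shard1 = ShardAllocator(1)
shard3 = ShardAllocator(3)
space_oids = [shard1.next_oid()]

# Suppose the Space is later moved to Shard 3; new Things now get that prefix.
space_oids.append(shard3.next_oid())

# The Space now holds OIDs minted by two different Shards.
prefixes = {oid >> 32 for oid in space_oids}
```

After the move, `prefixes` contains both 1 and 3, even though every Thing in the Space now lives on Shard 3. The prefix only records where the OID was minted.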
We will use this stringified OID most of the time. Indeed, assuming that the data store for Things requires String keys (which all the options I know of do), we will use the string form for that. The binary form of an OID is likely to only be used as the internal id field for the Thing's row/document in the database.