Channel Data Model

Dave Rigby edited this page Jan 7, 2016 · 5 revisions

This is a high-level overview of how channel and access information is stored in the _sync metadata.

What are channels?

Functionally, a channel defines read-side security for documents. Documents are assigned to zero or more channels, and users are granted access to zero or more channels. In order for a user to read a document through the Sync Gateway REST API (and by extension, via client replication), the document needs to be assigned to a channel that the user has access to.

Channels are associated with a particular revision of a document. All channel information for a document is stored in Sync Gateway's private _sync metadata block in the document itself. The _sync metadata isn't accessible to clients for read or write - it's managed by Sync Gateway, and is stripped from documents before they are replicated.

Channel information stored in Sync Gateway metadata

1. Channel information in the revision tree

Sync Gateway stores channel assignments in the revision tree for the document. For each revision in the rev tree, we store the list of channels that are assigned to that revision of the document.
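As a rough illustration, a rev tree carrying per-revision channel lists could be modelled as below. This is a sketch only: the field names (`parent`, `channels`), the revision IDs, and the helper `channelsForRevision` are hypothetical and do not reflect Sync Gateway's actual storage layout.

```javascript
// Hypothetical in-memory model of a rev tree: revision ID -> node.
// Field names and revision IDs are illustrative, not Sync Gateway's format.
const revTree = {
  "1-abc": { parent: null, channels: ["channel_A"] },
  "2-def": { parent: "1-abc", channels: ["channel_B"] },
};

// Return the channels assigned to one revision, or an empty list if the
// revision is not present in the tree.
function channelsForRevision(tree, revId) {
  const node = tree[revId];
  return node ? node.channels : [];
}

console.log(channelsForRevision(revTree, "2-def"));
```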

2. Channel presence history

In addition to the information in the rev tree, we also store channel history for the active revision independently. The channels property in the _sync metadata lists all channels the document has ever been assigned to. For channels that aren't assigned to the active revision, the channels property records the revision at which the document was removed from that channel.

This property makes it possible to track a document's channel assignment without a full rebuild of the rev tree, which matters particularly for views and for performance-sensitive Sync Gateway processing.

Sample data:

"channels": {
      "channel_B": null,
      "channel_A": {
        "seq": 22,
        "rev": "2-f38e39675f20d50803192f14232f2226"
      }
    },

In the above example, the current (active) revision of the document belongs to channel channel_B. It previously was assigned to channel_A, but that assignment was removed at revision 2-f38e39675f20d50803192f14232f2226.
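The reading of the sample data above can be sketched as a small function that separates active channels (null entries) from removed ones. The helper name `splitChannels` is an assumption made for illustration and is not part of Sync Gateway.

```javascript
// Sample `channels` property, copied from the example above.
const channelsProperty = {
  channel_B: null,
  channel_A: { seq: 22, rev: "2-f38e39675f20d50803192f14232f2226" },
};

// Split the channel history into channels on the active revision (null
// entries) and channels the document has been removed from.
function splitChannels(channels) {
  const active = [];
  const removed = [];
  for (const name of Object.keys(channels)) {
    const entry = channels[name];
    if (!entry) {
      active.push(name);
    } else {
      removed.push({ name: name, seq: entry.seq, rev: entry.rev });
    }
  }
  return { active: active, removed: removed };
}

console.log(splitChannels(channelsProperty));
```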

If a document is added to and removed from a channel multiple times, only the most recent removal is tracked in the channels property.
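One way to picture the "most recent removal only" behaviour is the sketch below, which computes the next channels property when a new revision is saved. The function `updateChannelHistory` and its parameters are hypothetical; Sync Gateway's real update logic may differ.

```javascript
// Hypothetical helper: compute the next `channels` property for a new revision.
// `previous` is the current channels property, `assigned` the channel names
// for the new revision, `seq` and `rev` identify the new revision.
function updateChannelHistory(previous, assigned, seq, rev) {
  const assignedSet = new Set(assigned);
  const next = {};
  // Channels on the new revision are active and stored as null.
  for (const name of assignedSet) {
    next[name] = null;
  }
  for (const name of Object.keys(previous)) {
    if (assignedSet.has(name)) continue; // re-added: earlier removal is dropped
    // A channel that was active until now records this revision as its removal;
    // a channel removed earlier keeps its existing removal entry.
    next[name] = previous[name] === null ? { seq: seq, rev: rev } : previous[name];
  }
  return next;
}

// channel_A is removed at revision 2, re-added at 3, and removed again at 4.
// Only the removal at revision 4 remains in the history.
let history = { channel_A: null };
history = updateChannelHistory(history, ["channel_B"], 22, "2-x");
history = updateChannelHistory(history, ["channel_A"], 30, "3-y");
history = updateChannelHistory(history, [], 40, "4-z");
console.log(history);
```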

Channel metadata usage

1. Simple document retrieval

On a simple GET of a document through the SG REST API, the set of channels for the document will be compared to the set of channels granted to the user.
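The comparison amounts to a set-intersection test. The sketch below assumes channels are plain arrays of names and ignores any special cases Sync Gateway may handle; `canRead` is a hypothetical name.

```javascript
// Return true if the document is in at least one channel granted to the user.
// `docChannels` and `userChannels` are plain arrays of channel names.
function canRead(docChannels, userChannels) {
  const granted = new Set(userChannels);
  return docChannels.some((name) => granted.has(name));
}

console.log(canRead(["channel_B"], ["channel_A", "channel_B"]));
console.log(canRead(["channel_B"], ["channel_A"]));
```

A document assigned to zero channels is never readable under this check, since there is nothing to intersect.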

2. Channel Index processing

Sync Gateway maintains its own channel index (i.e. identifying the set of documents that belong to a particular channel). This is done based on the Couchbase Server DCP feed: Sync Gateway nodes monitor the feed, and use the channel metadata included inline in the document to build the channel index. Prior to SG 1.2, this was an in-memory index supplemented by MR views (see #3). From 1.2 onward, SG also supports a persistent index that's shared across an SG cluster. In both cases, the processing of the DCP feed and indexing of documents into channels is a very performance-sensitive operation. The channel index is the main driver for replication to Couchbase Lite.
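To make the idea concrete, here is a minimal sketch of an in-memory channel index fed by mutation events. The class `ChannelIndex`, the event shape `{ docId, seq, channels }`, and the method names are all assumptions for illustration; they are not Sync Gateway's implementation, and removal entries are skipped for brevity.

```javascript
// Hypothetical in-memory channel index: channel name -> entries in feed order.
class ChannelIndex {
  constructor() {
    this.byChannel = new Map();
  }

  // Process one feed event. `event.channels` is assumed to have the same shape
  // as the `channels` property in the _sync metadata. Events are assumed to
  // arrive in increasing sequence order.
  addEvent(event) {
    for (const name of Object.keys(event.channels)) {
      if (event.channels[name] !== null) continue; // removal entries skipped in this sketch
      if (!this.byChannel.has(name)) this.byChannel.set(name, []);
      this.byChannel.get(name).push({ docId: event.docId, seq: event.seq });
    }
  }

  // Return the entries in a channel with a sequence greater than `since`.
  changesSince(name, since) {
    return (this.byChannel.get(name) || []).filter((entry) => entry.seq > since);
  }
}

const index = new ChannelIndex();
index.addEvent({ docId: "doc1", seq: 1, channels: { channel_A: null } });
index.addEvent({ docId: "doc2", seq: 2, channels: { channel_A: null, channel_B: null } });
console.log(index.changesSince("channel_A", 1));
```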

3. Sync Gateway internal views

Sync Gateway maintains a set of internal MR views. The channels view is used to supplement the indexing work described in #2: it returns the set of documents belonging to a given channel, ordered by sequence. The relevant part of the map function in the channels view is below. It processes the channel timed set from the document's _sync metadata, and handles both documents actively in a channel (the first emit) and documents that have been removed from a channel at a given revision (the second emit).

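// Excerpt from the channels view map function; sync, sequence and value
// are defined earlier in the function. The %d placeholders are replaced
// with the numeric channels.Removed/Deleted flag values when the view is built.
var channels = sync.channels;
if (channels) {
	for (var name in channels) {
		var removed = channels[name]; // null while the document is in the channel
		if (!removed) {
			// Document is currently in the channel.
			emit([name, sequence], value);
		} else {
			// Document was removed from the channel at removed.rev.
			var flags = removed.del ? %d : %d; // channels.Removed/Deleted
			emit([name, removed.seq], {rev: removed.rev, flags: flags});
		}
	}
}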
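
To see what the excerpt above emits, its logic can be wrapped in a standalone function with a stub emit callback. In this sketch the constants `FLAG_DELETED` and `FLAG_REMOVED` stand in for the `%d` placeholders with arbitrary values, and the function name `mapChannels`, the current sequence, and the emitted value are hypothetical.

```javascript
// Stand-ins for the %d placeholders. The values are arbitrary and are NOT
// Sync Gateway's real flag constants.
const FLAG_DELETED = 2;
const FLAG_REMOVED = 1;

// Standalone re-creation of the excerpt's logic, taking emit as a parameter.
function mapChannels(sync, sequence, value, emit) {
  const channels = sync.channels;
  if (!channels) return;
  for (const name in channels) {
    const removed = channels[name];
    if (!removed) {
      emit([name, sequence], value);
    } else {
      const flags = removed.del ? FLAG_DELETED : FLAG_REMOVED;
      emit([name, removed.seq], { rev: removed.rev, flags: flags });
    }
  }
}

// Collect the rows emitted for the sample channels property shown earlier.
const rows = [];
mapChannels(
  { channels: { channel_B: null, channel_A: { seq: 22, rev: "2-f38e39675f20d50803192f14232f2226" } } },
  23, // hypothetical sequence of the active revision
  { rev: "3-example" }, // hypothetical value emitted for the active revision
  (key, val) => rows.push({ key: key, val: val })
);
console.log(JSON.stringify(rows, null, 2));
```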

4. Sync Gateway custom views

When end users create a view through the Sync Gateway REST API, we need to apply channel security to the results of that view. Currently this is done by modifying any user-created views to also emit the channel metadata, and then filtering on that metadata when returning view results.
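
The filtering step might look like the sketch below. It assumes a hypothetical wrapped row shape in which each row's value carries the document's active channels next to the user's original value; the names `filterViewRows`, `channels`, and `original` are illustrative and not Sync Gateway's actual format.

```javascript
// Keep only rows whose document is in a channel granted to the user, and
// unwrap the user's original value before returning it.
// Assumed row shape: { key, value: { channels: [names], original: any } }.
function filterViewRows(rows, userChannels) {
  const granted = new Set(userChannels);
  return rows
    .filter((row) => row.value.channels.some((name) => granted.has(name)))
    .map((row) => ({ key: row.key, value: row.value.original }));
}

const wrappedRows = [
  { key: "a", value: { channels: ["channel_A"], original: 1 } },
  { key: "b", value: { channels: ["channel_B"], original: 2 } },
];
console.log(filterViewRows(wrappedRows, ["channel_B"]));
```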
