Home

Scribe

This is a forked repository of Facebook's Scribe, a logging system. This branch differs from the Facebook's in a number of important ways.

ZooKeeper-based network store discovery.

Sites that require high-availability logging may wish to discover remote scribes at runtime instead of hard-coding a remote scribe. Sites without hardware load balancers can provide significantly better logging system uptime, and operational flexibility, by discovering a downstream scribe at connection time.

Discovery is achieved with ephemeral ZooKeeper registrations. Scribes who wish to be discovered (typically an aggregator tier) register themselves at a given znode and discovered by (typically by frontend) scribes at connection time.

Global discovery options:

zk_server - ZooKeeper cluster connection string. Any scribe wishing to discover, or be discovered, must set this option.

zk_registration_prefix - Path to registration znode, such as /scribe/aggregator. To reduce ZooKeeper load, only scribes that need to be discovered should register themselves.

Upstream scribes discover aggregators by specifying a znode as the network store remote host. Note remote_port is specified, however, that value is overridden by the discovered port.
```
<store>
  category=travis
  type=network
  remote_host=zk://zk.foo.com/scribe/aggregator
  remote_port=1463
</store>
```
Load Balancing

Over time scribe aggregators processes accumulate connections, leading to large connection count imbalances. This "rich get richer" scenario leads to painful failure modes when a long-running process ends (through planned maintenance or other means) as a large number of connections must be reestablished at other aggregators. Should a large number of connections land on an already connection-rich aggregator its likely to exceed its memory limit and cause a cascading failure.

Two network store options have been added to balance connections through periodic disconnects. A feedback-based system was implemented (via counters published to the registration znode) but did not perform as well as this periodic disconnect method.

Network store configuration options:

default_max_msg_before_reconnect - Number of messages to send before reconnecting. If connection pools are used, the lowest value will be used.

allowable_delta_before_reconnect - Avoid "thundering herds" by staggering disconnects by this amount.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Home

Scribe

Clone this wiki locally