Alexey Zhiltsov committed Dec 1, 2017
2 parents 3ff9d41 + f3f6c69 commit 0cb4e82
Showing 1,139 changed files with 13,277 additions and 678 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1 +1,3 @@
*~
*.pickle
*.line
52 changes: 52 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,52 @@
# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.4.0] - 2017-08-17
### Added

* Support for FNV1a hashing compatible with [carbon-c-relay][1] hash method
`fnv1a_ch`. Issue #17
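
As a rough illustration of the hash function involved, the sketch below computes a 32-bit FNV-1a digest of a metric name with Go's standard `hash/fnv` package. The helper name `fnv1a32` and the metric names are made up for this example, and the sketch covers only the digest itself. How carbon-c-relay's `fnv1a_ch` method or buckytools maps a digest onto hash ring positions is not shown here.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnv1a32 returns the 32-bit FNV-1a digest of a metric name.
// Hypothetical helper for illustration; not taken from buckytools.
func fnv1a32(metric string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(metric)) // Write on a hash.Hash never returns an error
	return h.Sum32()
}

func main() {
	// Example metric names are assumptions, not real cluster data.
	names := []string{
		"carbon.agents.graphite010-g5-a.metricsReceived",
		"1min.ipvs.example",
	}
	for _, name := range names {
		fmt.Printf("%-50s 0x%08x\n", name, fnv1a32(name))
	}
}
```
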

### Changed

* Server, Port, and Instance that uniquely identify a carbon-cache daemon
in the hash ring (and tune how the hashring works) are now always specified
by `SERVER[:PORT][=INSTANCE]`. This is backwards incompatible, but fixes
issues where the port and instance values could be confused. Issue #17
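
A minimal sketch of how a `SERVER[:PORT][=INSTANCE]` string could be split into its parts, assuming the port is numeric and the instance follows the first `=`. The `Member` type and `parseMember` function are hypothetical names for this example and are not taken from the buckytools source.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Member is a hypothetical type for this sketch; buckytools' own types may differ.
type Member struct {
	Server   string
	Port     int    // 0 means the port was not specified
	Instance string // "" means the instance was not specified
}

// parseMember splits a SERVER[:PORT][=INSTANCE] string into its parts.
func parseMember(s string) (Member, error) {
	var m Member
	// Split off the optional "=INSTANCE" suffix first.
	if i := strings.Index(s, "="); i >= 0 {
		m.Instance = s[i+1:]
		s = s[:i]
	}
	// Then split off the optional ":PORT" suffix.
	if i := strings.LastIndex(s, ":"); i >= 0 {
		port, err := strconv.Atoi(s[i+1:])
		if err != nil {
			return Member{}, fmt.Errorf("invalid port in %q: %v", s, err)
		}
		m.Port = port
		s = s[:i]
	}
	if s == "" {
		return Member{}, fmt.Errorf("empty server name")
	}
	m.Server = s
	return m, nil
}

func main() {
	// Port 2004 is an arbitrary example value.
	args := []string{
		"graphite010-g5",
		"graphite010-g5:2004",
		"graphite010-g5:2004=a",
		"graphite010-g5=a",
	}
	for _, arg := range args {
		m, err := parseMember(arg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", m)
	}
}
```

Because the port and the instance use different separators, a value such as `graphite010-g5=a` cannot be mistaken for a port.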

## [0.3.2] - 2017-06-21

### Fixed

* Support both Tuples and Lists which are now handled differently in the
updated ogorek vendored package

## [0.3.1] - 2017-06-21

### Added

* Unit tests for bucky-pickle-relay

### Changed

* Inverted delete option in bucky rebalance. Delete is now off by default.
* Conform to Go best practices for repo layout
* Update vendored packages

### Fixed

* Fix tar/restore after Snappy changes

## [0.3.0] - 2017-04-27

### Added

* Use Snappy framing format for Whisper data over the wire. This makes
transfer of time series databases significantly faster.

[1]: https://github.com/grobian/carbon-c-relay
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
The MIT License (MIT)

Copyright (c) 2014 42 Lines, Inc.
Copyright (c) 2014 - 2017 42 Lines, Inc.
Original author: Jack Neely <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining a copy
30 changes: 0 additions & 30 deletions Makefile

This file was deleted.

122 changes: 100 additions & 22 deletions README.md
@@ -62,7 +62,7 @@ These are the tools included and their functionality.
* **rebalance** -- Move inconsistent metrics to the correct location
and delete the source immediately after successful backfill.
* **restore** -- Restore from a tar archive.
* **severs** -- List each server's known hash ring and verify that
* **servers** -- List each server's known hash ring and verify that
all hash rings are consistent.
* **tar** -- Make an archive of a list or regular expression of metric
names and dump it in tar format to STDOUT.
@@ -99,16 +99,26 @@ pass to the daemon as arguments the members of the consistent hash ring.

setuid graphite

exec /path/to/buckyd --node graphite010-g5 \
-b 192.168.1.1:5678 \
exec /path/to/buckyd -node graphite010-g5 \
-sparse -hash carbon -b 192.168.1.1:5678 \
graphite010-g5:a graphite010-g5:b \
graphite011-g5:a graphite011-g5:b \
graphite012-g5:a graphite012-g5:b

Here `--node` is the name of this Graphite node (if different from what
is derived from the host name). `-b` or `--bind` is the address to bind
to. You can also specify `--prefix`` where your Whisper data store is
and `--tmpdir` where the daemon can write temporary files.
Here `-node` is the name of this Graphite node in the hashring (if different
from what is derived from the host name). `-b` or `-bind` is the address to
bind to. You can also specify `-prefix` where your Whisper data store is and
`-tmpdir` where the daemon can write temporary files. The `-sparse` option
instructs buckyd to create sparse whisper files that take less disk space.
The `-hash` option chooses the hashring algorithm.

The non-option arguments
are the servers and instances that make up the hashring. Order is important.
The hashring members can be specified in the following formats:

* `SERVER`
* `SERVER:INSTANCE`
* `SERVER:PORT:INSTANCE`

This exposes a REST API that is documented in REST_API_NOTES.md.

@@ -140,30 +150,92 @@ Other common flags are:
Examples
========

Rebalance a cluster with newly added storage nodes:
Rebalance a cluster with newly added storage nodes. Check if you need to
use the `-no-delete` flag. The default behavior is to move metrics and
delete the source after a successful copy.

GOMAXPROCS=4 bucky rebalance -h graphite010-g5:4242 \
-w 100 2>&1 | tee rebalance.log
$ bucky rebalance -h graphite010-g5:4242 \
-w 25 2>&1 | tee rebalance.log

Discover the exact storage used by a set of metrics:

export BUCKYHOST=-h graphite010-g5:4242
bucky du -r '^1min\.ipvs\.'
$ export BUCKYHOST=-h graphite010-g5:4242
$ bucky du -r '^1min\.ipvs\.'

To Do / Bugs
============
Make a backup of all of the metrics in the `carbon` namespace, using the
[pigz][2] parallel gzip compression tool. (Normal gzip would otherwise bottleneck
the process.)

$ bucky tar -w 25 -r '^carbon\.' | pigz > filename.tar.gz

Backfill or rename metrics with a JSON hash of old name to new name. This
does not delete the source metric. It is a copy/fill operation.

$ bucky backfill -w 25 foo.json

Find inconsistent metrics or metrics that are in the wrong place in the
cluster according to the hashring:

$ bucky inconsistent

Building from Source
====================

To build from the Go source:

* Make sure your `GOPATH` environment variable is set to your Go
[workspace][3].
* Run: `go get github.com/jjneely/buckytools`
* Change directory into `$GOPATH/src/github.com/jjneely/buckytools`
* Run: `go install ./...`
* Binaries should now be installed to `$GOPATH/bin`

This can also be built as a Debian/Ubuntu package. (Tested on Ubuntu Trusty
and Xenial.) I use [git-buildpackage][4] to produce builds.
This requires the `golang` Debian packages.

* Run: `gbp buildpackage`

Notes
=====

Deleting Metrics
----------------

The daemon makes no effort to remove possibly empty directories when deleting
a metric. Removing them could race with carbon-cache.py creating a new metric
in a directory that is about to be deleted: once carbon-cache.py closes its
file handle to a file in a deleted directory, that file is deleted as well.
The delete action must not cause harm to other metrics.

To prune old or empty directories from your Graphite whisper store use a
cron job similar to this:

/usr/bin/find ${prefix}/storage/whisper -type d -empty -mtime +1 -delete

This checks that the directory has not been modified in more than 1 day
which, in most cases, avoids race conditions.

Google Snappy Compression
-------------------------

To speed up moving metric data from one location to another, this tool uses
Snappy compression by default. This can be disabled with the `-no-encoding`
flag. With many workers, compression can double (or more) the throughput.
The Snappy framing format also includes CRC checks for data integrity.

To Do / Bugs / Contributing
===========================

Contributions are welcome! Please make a GitHub pull request. Below are
some low-hanging fruit (and some more annoying issues) that need help.

* Unit tests with Go's `net/http/httptest` package. Test that the buckyd
daemon manipulates the on disk Whisper files correctly.
* Authentication -- Negotiate and Kerberos support. Probably Basic as well.
* Code clean up -- I wrote a lot of this quickly, it needs love in a
lot of places.
* tests
* Make all modules aware of possible duplicate metrics.
* Speed testing and improvements.
* Move the lower level GET, POST, DELETE, HEAD functions into a single
common file/place.
* Retries
* Rebalance needs to optionally be aware of machines not in the hash ring that
the rebalance should vacate.
* Graceful restarts and shutdowns? https://github.com/facebookgo/grace
* graphite-project/carbon's master branch contains this change:

@@ -172,4 +244,10 @@ To Do / Bugs
This will cause a few metrics to be assigned a different position in the
hash ring. We need to account for this algorithm change somehow.

Buckytools supports multiple hashing algorithms, and this can be
set up as an additional supported hashing type.

[1]: https://github.com/grobian/carbon-c-relay
[2]: http://zlib.net/pigz/
[3]: https://golang.org/doc/code.html
[4]: http://honk.sigxcpu.org/projects/git-buildpackage/manual-html/gbp.html
6 changes: 6 additions & 0 deletions REST_API_NOTES.md
@@ -40,6 +40,12 @@ Methods:
data point is null. See Carbonate's whisper-fill.py.
* DELETE - Remove this metric from the file system.

GET requests will encode the response with Google's Snappy compression
algorithm when the header "Accept-Encoding: snappy" is present in the
headers of the GET request. PUT and POST accept "Content-Encoding: snappy"
for Snappy compressed Whisper data as well. Otherwise, the identity
encoding is assumed. Encoding requests have no affect on HEAD or DELETE.

/hashring
---------

33 changes: 2 additions & 31 deletions buckytools.go
@@ -1,43 +1,14 @@
package buckytools

import (
"encoding/json"
)

const (
// Buckytools suite version
Version = "0.1.9"
Version = "0.4.0"
)

// SupportedHashTypes is the string identifiers of the hashing algorithms
// used for the consistent hash ring. This slice must be sorted.
var SupportedHashTypes = []string{
"carbon",
"fnv1a",
"jump_fnv1a",
}

// MetricStatType A JSON marshalable FileInfo type
type MetricStatType struct {
Name string // Filename
Size int64 // file size
Mode uint32 // mode bits
ModTime int64 // Unix time
}

// JSONRingType is a datastructure that identifies the name of the server
// buckdy is running on and contains a slice of nodes which are
// "server:instance" (where ":instance" is optional) formatted strings
type JSONRingType struct {
Name string
Nodes []string
Algo string
Replicas int
}

func (j *JSONRingType) String() string {
blob, err := json.Marshal(j)
if err != nil {
return err.Error()
}
return string(blob)
}
File renamed without changes.
File renamed without changes.