Hashring alignment and cache reads #26
I don't think that it will work like this.
It makes no sense - if a carbon daemon gets no writes, then it will have no cache to read from.
Hmm, judging by the docs, it seems that instance names may solve this...
After taking this for a test drive, it seems that buckyd doesn't like it:
You need to give buckyd the exact same host/port/instance strings as you do for carbon-c-relay. So:
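As a hypothetical illustration of what "the exact same strings" means in practice (hostnames, ports and instance names below are invented, not taken from this thread), the carbon-c-relay side might look like:

```
# carbon-c-relay side: each destination is HOST:PORT=INSTANCE
cluster graphite
    fnv1a_ch
        srv1:2103=a
        srv1:2203=b
        srv2:2103=c
        srv2:2203=d
    ;
match * send to graphite stop;
```

buckyd on each server would then be handed those identical member strings (srv1:2103=a srv1:2203=b srv2:2103=c srv2:2203=d) as its hashring arguments; see the buckytools README for the exact invocation.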
Ah, thx @jjneely. The readme seems a bit misleading on this syntax. Does it require an update?
Also, when I use
But when I use
Not sure what is causing the cluster to go inconsistent or if it is something to be concerned about.
You need to run buckyd on all instances, so e.g. two buckyd daemons on the same port on srv1
@deniszh One buckyd per carbon-cache instance, not per server? So, would the typical deployment only run one carbon-cache per server?
I only run one carbon-cache / go-carbon daemon per server. The way replication/load balancing works, I want to make sure I have 2 copies of the same metric on different servers, and not assigned to 2 different daemons that happen to live on the same host. (I'll hopefully have some replication support in buckytools in the next month or so.) In the far distant past I did run multiple carbon-cache daemons per server to handle my throughput, but the storage requirements grew so much that I had more disk IO than the ingestion could ever keep up with.
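For illustration only (hostnames and ports are assumed, not from this thread), a minimal carbon-c-relay sketch of the one-daemon-per-server layout described here, using the relay's replication option so the two copies of a metric always land on different servers:

```
cluster persist
    fnv1a_ch replication 2
        srv1:2003=srv1
        srv2:2003=srv2
        srv3:2003=srv3
    ;
match * send to persist stop;
```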
Thx, @jjneely. Let me provide some more transparency regarding my goal. Below is the carbon-c-relay config. I am not using a replication factor, just duplicating the metrics to two separate clusters. I would like to use bucky to manage each cluster independently. Is reducing the number of carbon-cache instances on each server to one the only reasonable way to integrate bucky?
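The actual config was not preserved in this thread; the sketch below only illustrates the shape of the setup being described - every metric duplicated to two independent clusters, no replication factor - with hostnames, ports and instance names made up:

```
cluster siteA
    fnv1a_ch
        srvA1:2103=a1
        srvA1:2203=a2
        srvA2:2103=a3
        srvA2:2203=a4
    ;
cluster siteB
    fnv1a_ch
        srvB1:2103=b1
        srvB1:2203=b2
        srvB2:2103=b3
        srvB2:2203=b4
    ;
match * send to siteA;
match * send to siteB stop;
```

Under that shape, a bucky ring for either cluster would carry two entries per physical server, which is where the friction discussed below comes from.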
At this point, yes, that's the easiest way to reach that goal. Although, I guess the real bug here is making bucky aware of multiple instances on the same physical host.
There are presumably two things going on here, @jjneely:
I suppose my thinking is more along the lines of ignoring the fact that multiple instances are on the same physical host, except for hashring purposes.
I made some tweaks to bucky client to support multi-instance. I removed the check to verify that the cluster members length equals the hashring length, since this would not be true if a cluster member has multiple hashring entries. I also removed duplicates from the servers slice. I have no idea if this is a breaking change for anything else, but
I'd like to point out that unlike carbon_ch, fnv1a_ch does include the port in its hash-key. Since you use that hash, I think there should be no such thing as "duplicate" cluster members. I believe @jjneely wrote this in #26 (comment).
Let me see if my assumptions are correct, @grobian. Please let me know if any of this is amiss. Bucky client derives the list of cluster hosts from the destinations in the hashring. Regardless of the hash, the same cluster host can appear more than once in the hashring, and bucky client doesn't seem to like it when it derives the same host more than once from that ring. This raises a question: can the same destination appear more than once in the hashring? It seems like it could provide a weighting factor for heterogeneous hardware. I am trying to figure out how to shoehorn carbon-c-relay and buckytools into a preexisting cluster which was scaled up with multiple instances of carbon-cache.
I wonder if this is solvable in carbon-c-relay with a cluster of clusters approach.
Does it? Not really sure.
IMO no - by the definition of a hashring.
That's also a problem - I don't really understand what your problem is or what you're trying to achieve.
@deniszh, allow me to illustrate with a truncated example from the buckytools readme.
It shows the same host, graphite010-g5, appearing multiple times in the hashring, once for each carbon-cache instance on the host. This is precisely the carbon-cache deployment that I have. The challenge I am having is that
Could you perhaps describe your setup from the initial relay down to the carbon-cache processes? Getting a good idea of the flow of your metrics is key to getting this right, IMO.
Sure, @grobian. Metrics -> load balancer -> multiple VMs with carbon-c-relay -> multiple physical boxes, each running multiple instances of carbon-cache. The carbon-c-relay config is identical on all VMs--consistent hash to all carbon-cache instances. I believe this is all working as intended--each graphite key is sent to a specific carbon-cache instance. Now, I am trying to integrate the cluster management piece.
Just summing up what has been said above to ensure we're all on the same page:
Due to 5, bucky and other tools get a tough job, because you probably share the /var/lib/carbon/whisper directory among the multiple instances. It also makes future scaling out or down of instances on each server impossible, because it will change the hash allocation (due to jump). To solve this, people typically use a c-relay on the storage server that simply any_of's all incoming metrics to a set of carbon-cache instances on the local host, thereby hiding any of this from tools like bucky. Your best start would be to implement this to be able to do 6, but it will cause metrics to move between srv1 and srv2 (and similar for srv3 and srv4, of course).
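A sketch of that per-storage-server relay, assuming local carbon-cache instances listening on ports 2103/2203/2303 (ports are made up for illustration):

```
# Runs on each storage server; the front relay only ever sees this one endpoint,
# while any_of spreads the metrics over the local carbon-cache instances.
cluster local_caches
    any_of
        127.0.0.1:2103
        127.0.0.1:2203
        127.0.0.1:2303
    ;
match * send to local_caches stop;
```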
1, correct. 2, correct. 3, I am not familiar enough with the inner workings of the hashes to say whether I completely understand your point regarding the final ordering--an example would certainly clarify this for me. 4, awesome. 5, correct. 6, correct.

To put this in terms of configuration, it seems you are suggesting the following (leaving the mirror cluster out for brevity). Will this achieve hashring alignment across c-relay and bucky?

Front Relay
Back Relay srv1:2052
Back Relay srv2:2052
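The original configs were not preserved here, but assuming the back relays listen on port 2052 as labelled above and use the any_of sketch shown after the previous comment, the front-relay half might look roughly like this (hostnames and instance names are assumed):

```
# Front relay: the consistent hash contains exactly one entry per physical server.
cluster backrelays
    fnv1a_ch
        srv1:2052=srv1
        srv2:2052=srv2
    ;
match * send to backrelays stop;
```

buckyd would then be given the same two member strings, so its ring matches the front relay's ring regardless of how many carbon-cache processes run behind each back relay.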
Also, I am curious if the use of multiple carbon-cache instances per host is common enough to solve the problem without the use of a second layer of relays. It seems like it would be trivial to support two layers of hashing in a single c-relay instance. Thoughts, @grobian?
You want to avoid having multiple tiers of (c-)relays, is that correct? While I understand the rationale, it currently isn't possible, and I don't see it as a high priority to implement double-hashing or anything like that.
Sounds good, @grobian. Thanks for the guidance!
@dolvany we had 12 carbon-cache processes per host, with carbon-c-relay on the same host in front of them in order to distribute the load. At a certain point it stopped working performance-wise and we switched to go-carbon, which in our current setup can easily handle 300K points/sec, and with some tuning and external iSCSI storage up to 1000K points/sec sustained. It also eliminates carbon-c-relay on the host, and you will be able to reduce the number of destinations in your relay configs. It plays nice with bucky. Just have a look.
I would concur with azhiltsov's suggestion. carbon-cache.py isn't multi-threaded, hence running multiple in parallel. A c-relay in front of it is just a workaround; in reality it should have been multi-threaded by itself. go-carbon solves that nicely (and avoids the need for a local c-relay).
@azhiltsov @grobian So, would go-carbon be fronted with c-relay? It looks like go-carbon is a replacement for carbon-cache. What would the design look like?
@grobian If I use |
sender -> c-relay -> go-carbon

Wrt CARBONLINK_HOSTS, I think that doesn't work at all anyway, because fnv1a_jump_ch is not understood by graphite-web. This is the reason why we started carbonzipper. This "smart proxy" acts as a single carbon-cache to graphite-web; later we also replaced the latter with carbonapi.
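As a rough sketch of that flow (hostnames and ports assumed, not from this thread), the relay tier would hash straight to one go-carbon per server, with reads going through carbonzipper/carbonapi rather than graphite-web's CARBONLINK:

```
# c-relay tier: one go-carbon destination per storage server,
# so the relay ring and the bucky ring have one entry per host.
cluster gocarbon
    fnv1a_ch
        srv1:2003=srv1
        srv2:2003=srv2
        srv3:2003=srv3
    ;
match * send to gocarbon stop;
```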
@grobian But I could use
No, only if you use a single cluster with
I would like to consult the trifecta of graphite wisdom @jjneely @deniszh @grobian regarding some general questions.
First, I would like to understand how to align the carbon-c-relay hashring with the buckytools hashring given the following configurations.
Would this result in aligned hashrings even though carbon-c-relay is sending to multiple cache processes on each server?
Does it make sense to have cache instances dedicated to writing and cache instances dedicated to reading? Would this make reads and writes more performant?