Can I use buckytools to rebalance a fnv1a_ch cluster? #17
Comments
I presently support the carbon and jump_fnv1a hash rings. However, it probably wouldn't take much code if you are interested. I've already got the fnv1a hashing function coded in and working as part of the jump hash. You'd just need to implement a hashing.HashRing interface; plugging it into the buckyd and bucky commands should be fairly straightforward. Glad to give you a hand getting some patches together. |
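For context, here is a rough Python sketch of a consistent-hash ring built on fnv1a. buckytools itself is written in Go, and the real hashing.HashRing interface and fnv1a_ch ring layout (replicas, position math, instance names) differ, so the names and structure below are purely illustrative:

import bisect

def fnv1a_32(data):
    # Standard 32-bit FNV-1a: xor each byte, multiply by the FNV prime.
    h = 0x811c9dc5
    for b in data.encode("utf-8"):
        h ^= b
        h = (h * 0x01000193) & 0xffffffff
    return h

class Fnv1aRing:
    # Toy ring: one position per server derived from its name. The real
    # fnv1a_ch ring uses many replicas per server and different position math.
    def __init__(self, servers):
        self.entries = sorted((fnv1a_32(s), s) for s in servers)
        self.positions = [p for p, _ in self.entries]

    def get_node(self, metric):
        # First ring position >= the metric's hash, wrapping past the end.
        i = bisect.bisect_left(self.positions, fnv1a_32(metric)) % len(self.entries)
        return self.entries[i][1]

ring = Fnv1aRing(["graphite-%02d" % i for i in range(12)])   # placeholder hostnames
print(ring.get_node("some.example.metric"))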
I'm definitely interested. I've got a 12-node cache cluster that's hitting the ceiling in terms of throughput. I need to add a 13th node, but I can't do it unless I can rebalance the cluster. Do you need anything from me in terms of putting together a patch? |
I'd like to gently encourage some progress on this.... We just got notice that one of the instances in our relay cluster is scheduled for termination in the next week. We've got to get the data off of it and onto a new node. Hopefully I can do this with buckytools if it's ready by then. Otherwise I'm going to have to use rsync or some similar brute-force method and cut over to the new host when most or all of that data has been transferred. |
I wasn't aware of this bug; I took a stab at it in PR #18. |
I'll test this PR ... |
@grobian your patch didn't work for me. When I run something like |
Sorry for the delay here.... I should be able to spend some time on this in the coming week, although I know that cuts it close for that EC2 termination. |
@jjneely I already completed the migration, but I definitely still need buckytools to support fnv1a_ch, for two reasons: (a) there are duplicate metrics spread around the cluster that need to be consolidated, and (b) if we ever need to scale horizontally I need to be able to add more hosts and rebalance the cluster. |
@mwtzzz can you explain exactly how you set up bucky? Here's what I did to test this in a very simple manner. On the server:
then from the client
that returned on the client something like:
Does something like this work for you at all? I admit I don't fully understand the hostnames and how they are used by bucky, but it looks as if buckyd tells bucky where to connect to, so ensure buckyd has a correct list of hostnames for the hash-ring hosts. |
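If it helps with debugging, one way to check that each buckyd instance is reachable and serving its metric list is to query its HTTP API directly. A small Python sketch; the port (4242), the /metrics path and the JSON-list response are assumptions based on a default buckyd setup and the "get /metrics" log lines mentioned later in this thread, so adjust to your install:

import json
import urllib.request

# Placeholder hostnames for the hash-ring members.
hosts = ["graphite-%02d" % i for i in range(12)]

for host in hosts:
    # Assumption: buckyd listens on port 4242 and serves its metric list,
    # as a JSON array of names, at /metrics.
    url = "http://%s:4242/metrics" % host
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            metrics = json.load(resp)
        print("%s: %d metrics" % (host, len(metrics)))
    except Exception as exc:
        # buckyd may not answer usefully while it is still building its metric cache.
        print("%s: not ready or unreachable (%s)" % (host, exc))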
FNV1a support is now merged in as 0.4.0. Bug reports appreciated. Also note the change in how hashrings are specified by a list of |
Ah! ... Just getting around to seeing this (I got pulled away on other stuff at work)... Sorry I missed this earlier. Let me take a look at it today. |
If I run any other command (list, inconsistent, etc), the following appears:
|
I see that file descriptor 5 of the buckyd process is iterating through the whisper files.... Perhaps it just takes time before buckyd has results ready? On a different note, the
|
If you are asking about the sleeping bits, yes, it takes a bit for buckyd to build a cache. The bucky CLI will wait for them. |
@mwtzzz : could you please share your relay config and buckyd command line options? |
This |
@deniszh My relay config looks like this with 12 nodes:
My buckyd command line (which I am running identically on each of the 12 hosts) looks like this (notice I'm not using
|
Why not using |
@deniszh mostly because it's more work for me to include it and I thought it was optional (?). But if it's necessary, I'll definitely include it.... Should I put it in? |
Ah, my bad. I just noticed the documentation, |
OK, this is looking much better. |
@azhiltsov I'm assuming the rebalance would make use of bucky-fill at some point, which could possibly corrupt some of my archive sums? .... It looks like Civil made a PR with his fix, I might just go ahead and merge that into my local copy. |
I'm having an issue running
The output from the command shows a bunch of "Results from radar-be-x not available. Sleeping", then shows a metrics count for only six of the twelve nodes:
The buckyd log file doesn't show much, some "get /metrics" and:
I ran it a second time. This time only 4 of the nodes returned metrics before "Killed." |
Check your syslog. It looks like the OOM killer: bucky consumes too much memory, which is totally possible for 12-15 mln metrics x 12 nodes.... |
Was about to write the same thing. The client you are running the bucky CLI on doesn't have enough memory. |
Ah, good suggestion. Indeed it was oom-killer that nuked it. It looks like increasing the RAM on these instances is not an option. Do you have any suggestions on how I can get it to work? Does |
Spawn another instance with enough ram? It should host |
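As a back-of-envelope check of why the client runs out of memory (the average metric-name length is an assumption, and bucky's real per-entry overhead will be higher than raw string bytes):

metrics_per_node = 13_000_000   # middle of the 12-15 mln range mentioned above
nodes = 12
avg_name_len = 60               # assumed average metric name length in bytes

total_names = metrics_per_node * nodes
raw_bytes = total_names * avg_name_len
print("metric names : %d" % total_names)                 # ~156 million
print("raw name data: %.1f GiB" % (raw_bytes / 2**30))   # ~8.7 GiB before any overhead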
Copy/paste error. The full paste is:
(Note that "radar-be-f" is the shortened hostname; I removed the domain before posting.) |
bucky rebalance --no-op shows this:
|
Are there any tools we can use to see what carbon-relay is doing to calculate the hashring node, to see what bucky is doing, and to find out why they're giving different results? I see buckytools has a |
Yes, if you launch your carbon-c-relay with the -t (test) flag, it will prompt for data input and show you how it would route the input (assuming it is a metric string). So, in your case, just start the relay with |
Thanks for letting me know about the test flag. Very useful. Now I know what's going on. First of all, it's not bucky's fault. Bucky is computing the correct placement of each metric based on the information it's given. The problem is that we are rewriting the metric name on the backend right before it goes to the local cache on that node. Our front-end relay is sending the metric with "agg" prepended to the name. The backend relay receives this metric and then removes "agg" before writing it to its local cache. Bucky doesn't know about this rewrite, so it thinks the metric is on the wrong node. Technically it is on the wrong node given the metric name, but it is on the right node if the name has "agg" prepended to it. So my problem is: how do I rebalance this cluster, placing metrics whose name contains xxxx.sum_all.hosts onto the node where they would go if the name contained agg.xxxx.sum_all.hosts? Any thoughts? Here are the details:
The metric arrives at the radar-be-i node, where it is summed again, "agg" is stripped from the metric name, and it is written to a local whisper file as atlantic_exchange.usersync.cookiepartner.TAPAD.syncs.sum_all.hosts:
The culprits here are the following rules in the relay conf file:
They rewrite the name to \2 and then send it to the local cache. I suppose right before the last rule I could insert a new rule that sends anything with "sum_all.hosts" back to the relay so that it gets routed to the correct host according to the hash. This is the only thing I can think of, unless bucky has (or could have?) a way to balance a cluster based on some rewrite rules. |
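Continuing the toy ring sketch from earlier in the thread (again purely illustrative, not the exact fnv1a_ch layout), this is why the rewrite breaks placement: the "agg."-prefixed name and the stripped name hash to different ring positions and therefore, in general, to different owners.

# Reuses fnv1a_32 and Fnv1aRing from the earlier sketch (illustrative only).
ring = Fnv1aRing(["radar-be-%s" % c for c in "abcdefghijkl"])

original = "agg.atlantic_exchange.usersync.cookiepartner.TAPAD.syncs.sum_all.hosts"
rewritten = original[len("agg."):]

# The relay routes on the prefixed name, but the whisper file ends up on disk
# under the stripped name, so bucky computes a different owner for it.
print("relay routes on:", original, "->", ring.get_node(original))
print("bucky sees     :", rewritten, "->", ring.get_node(rewritten))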
Indeed. You should send the new metric back to the relay and not to the local cache. |
The good news is that bucky is doing things correctly. I'm looking forward to being able to add more nodes to the cluster and using it to rebalance. |
In testing this out, I came across a new, unrelated issue. My carbon-cache instances write their own metrics to /media/ephemeral0/carbon/storage/whisper/carbon.radar702 as per a directive I set in the carbon-cache config file:
Is there a way to deal with this? |
You are correct about carbon-cache.py. It writes its own metrics directly to disk and they cannot go through the relay. Usually, these are prefixed with "carbon." |
Made the change; excellent tool. I used it on our small QA cluster and it rebalanced 45,000 out of 320,000 metrics in a matter of a couple of seconds. |
Okay, so what issues remain here? The rebalance and corruption? |
No issues remaining. It seems to be working correctly. Thanks for your help on this, much appreciated! |
@jjneely I've noticed a new issue. I completed a rebalance on our main production cluster. Everything is great, except there are a handful (about 700) of metrics that the relays are putting on node "radar-be-k" while bucky thinks they should be on node "radar-be-i". The curious thing is that this is only happening on the one node; the other eleven nodes don't have this discrepancy. I ran some of the metric names through carbon-c-relay -t, and in this case it seems bucky is incorrect about the placement. |
We'd need the exact metric name, so we can debug the hash on both c-relay and bucky. |
That's what I figured. The metric names include our EC2 instance hostnames. Can I private-message you directly? |
Yes of course. email is fine too. |
@grobian I just sent you an email from my gmail account. |
All three metrics you sent me end up on the same hash position (4379), and more annoyingly, that hash position is occupied by two of your servers, f and g. Now you mention k and i, so that's slightly odd, but it could very well be that carbon-c-relay is choosing the last matching entry whereas the bucky implementation picks the first. This situation is exactly graphite-project/carbon@024f9e6, which I chose NOT to implement in carbon-c-relay because that would make the input order of servers define the outcome of the ring positions. |
The likely reason is that carbon-c-relay nowadays uses a binary search, which means it approaches the duplicates from the right instead of from the left, as the original bisect-left did. |
Python implements this:
def bisect_left(a, x, lo=0, hi=None):
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo
IoW if carbon-c-relay uses a binary search, it does it wrong, because it should select the leftmost matching key. |
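A small illustration of the difference (the positions and owners below are made up; the real positions come from the fnv1a hash): with a duplicate position in the ring, bisect_left and a right-biased binary search return different servers for the same key.

import bisect

# Sorted ring positions with a collision at 4379, as in the case above.
positions = [1021, 2500, 4379, 4379, 6100, 8000]
servers   = ["a",  "b",  "f",  "g",  "c",  "d"]

key = 4379
left  = bisect.bisect_left(positions, key)        # first index with positions[i] >= key
right = bisect.bisect_right(positions, key) - 1   # last index with positions[i] == key

print("bisect_left picks:", servers[left])        # -> f
print("right-biased pick:", servers[right])       # -> g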
Thanks Fabian, is there anything I should do on my end to correct it? |
It seems to me both bucky and carbon-c-relay implement bisectLeft wrongly. The focus for me is now to align at least bucky and carbon-c-relay (which are not aligned at the moment). I'm thinking about how to align the algorithm in carbon-c-relay with the bucky implementation (which equals the old carbon-c-relay implementation, iirc), as I suspect this will result in minimal (perhaps zero) changes in routing, at least for some of my boundary tests. Yesterday I got stuck trying to understand why my algorithm is so extremely complex compared to the simplicity the Python folks came up with. Switching to their implementation would be a downright disaster because a lot of metrics would change destination. |
Thanks for looking at it. On my end, there are currently about 800 metrics that bucky has misplaced. That's 800 out of 150 million, so a very low percentage. As you mentioned, if any changes are made to the relay it would be great if they result in minimal changes in routing. |
When collisions occur, we would stable-sort them such that the ring would always be the same, regardless of input order. However, the binary search method (historical mistake) could end up on a duplicate pos, and take the server as response, clearly not honouring the contract of returning the /first/ >= pos match. This change ensures collisions on pos are voided, and basically restores pre-binary-search distribution introduced in v3.1. This change should match the ring output with what bucky expects jjneely/buckytools#17
Now the only thing necessary is for bucky to ensure that servers are sorted/ordered in the same way carbon-c-relay does; then the output should be the same. |
@grobian do you want me to test that commit? |
Yes please, it should result in those 800 metrics being sent to the nodes bucky wants them to be. |
Is it going to preserve the locations of the other metrics? |
If you want to be sure, try building a list of metric names (you can get it off disk with find, replace all / with . and strip .wsp), then run it through carbon-c-relay -t and ensure all of them return the host you grabbed the conf from. Put differently, if you do this for your currently running version and for the latest HEAD, you should find that diff -u old new is rather small; the differing entries are the ones where new points to the box you grabbed the conf from and old points to another. |
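A quick Python sketch of the "build a list of metric names from disk" step (the whisper root below is the path mentioned earlier in the thread; adjust to your setup). Feeding the resulting list to carbon-c-relay -t for both the old and the new binary, and diffing the two outputs, gives the comparison described above.

import os

WHISPER_ROOT = "/media/ephemeral0/carbon/storage/whisper"  # path from earlier in this thread

def metric_names(root):
    # Walk the whisper tree, turn relative paths into dotted metric names,
    # and strip the .wsp extension.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".wsp"):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            yield rel[:-len(".wsp")].replace(os.sep, ".")

if __name__ == "__main__":
    for metric in metric_names(WHISPER_ROOT):
        print(metric)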
The developers at carbon-c-relay mentioned that I could use this to rebalance a fnv1a_ch hashring. But when I run buckyd, I get the following message:
[root@ec2-xxx radar123 bin]$ setuidgid uuu ./buckyd -node ec2-xxx.compute-1.amazonaws.com -hash fnv1a_ch
2017/06/25 22:08:54 Invalide hash type. Supported types: [carbon jump_fnv1a]
Does buckytools support this type of hash? If not, do you know how I can rebalance my cluster upon adding a new cache host?