Breaking up haret into a few libraries? #115

erickt · 2017-06-08T01:56:28Z

In the why.md document, there is a discussion about how haret was designed to isolate the protocol from the client-facing and data-storage parts of the system. Would there be any interest in formalizing this into multiple libraries? I'm personally interested in exploring a ZooKeeper wire-compatible client frontend a la zetcd, so looser coupling between the subsystems would make this a bit easier to do.

andrewjstone · 2017-06-08T03:23:54Z

The intention of that section was to allow plugging in different protocol implementations, not to allow different APIs. We always intended the API to differ from what ZooKeeper provides, with some new primitives baked in and other things intentionally left out. E.g., instead of allowing ephemeral, auto-incrementing nodes, which are often used for leader election, we'd provide a leader election primitive. The goal was to provide a ready-made, opinionated system that lets users safely coordinate their systems, not to build a toolkit, as the toolkit approach makes it harder to use and debug the different combinations of setups in production.

In practice, isolating even the protocol hasn't been perfect. While most of the VR-specific code lives in /src/vr, artifacts of the fact that Haret uses VR are littered throughout the codebase. For instance, the namespace manager knows about VrCtxs and the 3 different start modes (startup, recovery, reconfiguration) of replicas. Implementing a consensus system like VR on top of a lightweight process architecture requires a management layer like the namespace manager (using gossip in this case) in order to start replicas on different nodes and learn of new consensus groups after a partition, so it was easy to fall into this trap. Ideally this management layer would be agnostic of the consensus protocol as well, but I haven't spent the time to go back and fix it. It isn't high on the priority list right now, although it may help provide cleaner, more structured code.
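To make the idea of a protocol-agnostic management layer concrete, here is a minimal sketch. It is not Haret's code: `ReplicaLauncher`, `VrLauncher`, `VrStartMode`, and this `NamespaceManager` are hypothetical names, and the rule used to pick a start mode is an assumption made purely for illustration. The point is only that the manager sees a trait, while the start modes stay inside the VR-specific type.

```rust
use std::collections::HashSet;

/// Start modes named in the comment above. In this sketch they are private
/// to the VR side and never seen by the manager. (Hypothetical type.)
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum VrStartMode {
    Startup,
    Recovery,
    Reconfiguration,
}

/// Hypothetical protocol-agnostic interface the management layer talks to.
trait ReplicaLauncher {
    /// Start a replica for `namespace` on `node`; returns a description.
    fn launch(&mut self, namespace: &str, node: &str) -> String;
}

/// VR-specific launcher. The mode selection rule below is an assumption:
/// a namespace seen before is "recovered", a new one is "started".
#[derive(Default)]
struct VrLauncher {
    known: HashSet<String>,
}

impl ReplicaLauncher for VrLauncher {
    fn launch(&mut self, namespace: &str, node: &str) -> String {
        let mode = if self.known.contains(namespace) {
            VrStartMode::Recovery
        } else {
            VrStartMode::Startup
        };
        self.known.insert(namespace.to_string());
        format!("{}@{}: {:?}", namespace, node, mode)
    }
}

/// The manager holds only a trait object, so it carries no VR knowledge.
struct NamespaceManager {
    launcher: Box<dyn ReplicaLauncher>,
}

impl NamespaceManager {
    fn new(launcher: Box<dyn ReplicaLauncher>) -> Self {
        NamespaceManager { launcher }
    }

    fn start(&mut self, namespace: &str, node: &str) -> String {
        self.launcher.launch(namespace, node)
    }
}
```

With this shape, swapping in another consensus protocol would mean writing another `ReplicaLauncher` implementation rather than touching the manager, at the cost of one dynamic dispatch per launch.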

I'm actually in the middle of a major refactoring of the FSMs that I hope to open a PR for in the next week or two. It's possible that all this code could live in its own VRR library, but I'm not sure how useful it would be outside of Haret. I'm also hesitant to version it independently at this early stage, when I'm the primary developer, as it just adds another layer of management for me. The likelihood of adding another consensus protocol at this juncture is very remote, so taking the time to do this right now is not a priority.

As far as decomposition of the system goes, a bunch of things are already in their own libraries. Haret relies on rabble for the cluster system and lightweight processes, and on vertree for the trie-based backend.

It is possible to also abstract out the front end API, but it is harder, as the API is heavily tied to the capabilities of the backend. It is also useless in and of itself.

It appears that zetcd is an independent proxy process that sits in front of etcd. That doesn't require splitting up the code at all, but it does require features, such as subscriptions, that aren't yet built into Haret. It will also require either emulating other non-native features, such as ephemeral nodes, or not implementing them at all. That all seems doable, but again it isn't really a priority for me right now. My chief goal is building a correct and stable system. Once there is some stability, it will be much more practical to talk about extensions and different front end APIs.

In summary, I'm not fully opposed to this idea, but feel it is a bit of a distraction at this early time. If however you see specific parts of the code that you feel are not properly abstracted and should be split out into their own libraries, I am definitely willing to consider that.

This starts pulling apart haret (vmware-archive#115), specifically by moving the CLI client binary into its own module. There are a few reasons to do this. First, it allows us to start framing out a higher-level library interface for Haret (haret-client). Second, it allows us to shave off some dependencies if we only need a subset for a particular application. haret-client will need a lot more attention over time, since right now it just responds with string output. Note that I've added a .gitignore to `haret-client`, as the standard practice in the community is to lock down dependencies only in application crates and leave it up to library consumers to decide which dependency versions they want to use.

erickt · 2017-06-09T15:15:44Z

Hi @andrewjstone! You are of course welcome to move at your own pace and turn all this down :) As I was starting to go through the code, it seemed like there was a natural decoupling between the internal communication among the nodes and the client/server communication. At least for me, it seemed like it'd be a little easier to contribute to those portions of haret without needing a deep understanding of how VR works. As best as I can tell, it doesn't seem too hard to pull the client/server out of the core library, and it has the nice benefit of reducing dependencies and revealing what needs to be public and private.

Regarding the ZooKeeper-compatible client interface, that's more of a toy experiment to compare and contrast some workloads. I thought it might be a nice way to get some people from that community to pay attention to the project. I don't think you should feel compelled to add any features to support it.

andrewjstone · 2017-06-09T15:28:20Z

Hi @erickt,

I can't tell you how much I appreciate you taking an interest in Haret. After looking at your recent changes to the cli-client and thinking more about this, I am less concerned about pulling things apart. I was never really that concerned about separating the code, but more about having to support multiple APIs. However, there is no reason I have to support multiple APIs :) Community projects are completely fine and reasonable.

Additionally, you are correct that the client/server API part is well isolated from the internal communication, so separation shouldn't be that hard. As you state, it is also certainly an easier way to start contributing. Furthermore, it could be very useful to have an HTTP interface using JSON in addition to protobuf. Implementing that for the admin client in particular would be most useful.

With all that said, have at it! I will be happy to review any changes you are interested in making, and from what I've seen so far will likely merge them in quickly. If you want to discuss complex things before implementation we can do that also.

Cheers!

jrgarcia · 2017-08-29T22:39:11Z

@erickt @andrewjstone Should this be closed now that things have been separated accordingly?

andrewjstone · 2017-08-30T14:51:22Z

Let's leave it open for now. I still want to split out the VR code and possibly other things into their own crates. I'm working on the successor to rabble, which uses boxed Any. This will make separation of the internal code possible, as right now everything is tightly coupled to the parameterized Msg type. I tried separating out the VR code a month or so back and ran into some serious hiccups. In another month or so I hope to have everything ported over to the new system and nicely separated.
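As a rough illustration of the difference between a parameterized message type and boxed Any, here is a minimal sketch. None of these names come from rabble or Haret: `VrMsg`, `AdminMsg`, `Msg`, `TypedEnvelope`, `AnyEnvelope`, and `handle_vr` are all hypothetical. It only uses `std::any::Any` and `Box::downcast` from the standard library.

```rust
use std::any::Any;

// Hypothetical message types owned by two different subsystems.
#[derive(Debug, PartialEq)]
struct VrMsg {
    op: u64,
}

#[derive(Debug, PartialEq)]
struct AdminMsg {
    cmd: String,
}

// Coupled style: one application-wide enum must list every subsystem's
// messages, so any crate that touches the envelope must know all of them.
#[allow(dead_code)]
enum Msg {
    Vr(VrMsg),
    Admin(AdminMsg),
}

// Envelope generic over the message type, instantiated as TypedEnvelope<Msg>.
#[allow(dead_code)]
struct TypedEnvelope<T> {
    to: String,
    msg: T,
}

// Decoupled style: the payload is type-erased, so the envelope type does
// not mention any subsystem's message type.
#[allow(dead_code)]
struct AnyEnvelope {
    to: String,
    msg: Box<dyn Any + Send>,
}

// A VR-only handler recovers its own type and ignores everything else.
fn handle_vr(env: AnyEnvelope) -> Option<u64> {
    env.msg.downcast::<VrMsg>().ok().map(|m| m.op)
}
```

The trade-off in this sketch is that a mismatched payload is only detected at runtime by the failed downcast, whereas the enum version catches it at compile time.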


jrgarcia · 2017-08-30T14:52:57Z

Sounds good. I was just looking through here to pick something up and came across this.

erickt mentioned this issue Jun 8, 2017

Extract haret-cli-client into a lib and bin crate #116

Merged

erickt mentioned this issue Jun 10, 2017

Extract haret-{admin,devconfig,server} from haret lib #121

Merged
