-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expansion of serialization feature #97
Comments
Is there any reason why this new higher-level API would have to be part of the |
There's no reason why it couldn't be separate, except I needed to make modifications to Bin which cannot be made outside of the current crate. I had initially intended to make it separate, but couldn't due to Bin's lifetime requirements (as noted above). |
The only problem i see is that we tried to keep the rust API as close as possible to the other clients (Go/Java/C).
I guess the main reason for that is because the client was not really actively maintained for a long time. That might also be the reason for the over complicated usage on some parts. It was build with a pretty old rust version and i guess nobody had the time to keep it up to the changes of the rust language. |
I appreciate the idea of keeping all of the clients in-line. But that said, this is a Rust crate and it should also work Rust's idioms. One of the major things that people use Rust for is (de)serialization performance especially when moving data around. This change makes it much easier for the library user not to have to rebuild that code. I don't think that every language would have something like this because it makes much more sense for Rust itself. However, I actually kept the changes for the Bin separate (and locally it's in a separate branch), so it's very possible we could just use that instead of the above. I'd be fine with that. I didn't think that it would make sense to present the changes to Bin in isolation without showing why. I do think that the changes to lifetimes are very much a Rust implementation problem and not something that would be found across all of the implementations of the library. Also, and I lost the reference to this, but the standard library within Rust had a similar problem with lifetimes in at least a couple of APIs and they were dropped because they just don't work when you start to really use them as a library. |
From my perspective, if you can remove the lifetimes without producing a lot of memory allocation/copying (explicit or implicit) in the process, go for it. The logic behind the current design is that years ago when I was prototyping it, I wanted to avoid the |
That would be great. But i dont really see that in this client. The client is sync from its base. So switching to async would again be a big change leaving many traces of the old parts. The client has bigger problems right now as far as i see ( #8 #9 #10 ). |
Nothing I said negates what you say. I also didn't say we should convert the client to async, just add async API in addition to the sync API. I don't see how the issues you noted above are design problems. They are just missing features and require work, not a redesign. |
Ok then i misunderstood that. I thought you want to switch to async completely. |
This is all sort of devolving from my original thought, but ... Do we have a roadmap of features/reworks for this client? Even having a checkbox of a list of things that need to be done in some sort of order would be helpful for organization. I'm not all that familiar with aerospike in general and the differences between implementations. Async definitely felt like a hole when I was evaluating (this) Aerospike client. I ended up heavily using Rayon for my needs (which is primarily loading large amounts of batch data). |
There are missing features we can list (corporate wikis for the win) but I don't think that's what this ticket is about. If there are idiomatic Rust features missing in the client, or the client needs minor redesign/clean up work to make it more suitable for users in Rust ecosystem, we'd like to have them as tickets and work them out before adding more features to avoid reworks. While the actual idea expressed in this ticket is beyond the scope of this library, I think the general idea of supporting direct de/serialization in the client is very appealing. |
I'm just unclear what this means. I've got a trait above, but you've mentioned that's beyond the scope - so what do you mean by supporting serde more directly? Also where is the correct place to have more discussion based things? |
Your trait actually performs operations. My idea of serialization would be a trait more or less like this (off the top of my head, don't take it literally. Been a while since I wrote actual Rust code): pub trait AerospikeSerializer: ... {
// this would directly write the FieldHeader and data bytes to the command buffer
fn aerospike_serialize(&self, bins: Vec<String>, buf: &mut Option<&mut Buffer>) -> Result<usize>;
}
pub struct RecordHeader {
/// Record modification count.
pub generation: u32,
/// Date record will expire, in seconds from Jan 01 2010, 00:00:00 UTC.
pub expiration: u32,
}
pub trait AerospikeDeserializer: ... {
// this would directly assign values to the struct fields, skipping Record, HashMap and the rest
fn aerospike_deserialize(&self, bin_name: String, value: Value);
// this would allow a callback for metadata values; Conversely, RecordHeader could become
// a struct and just passed once.
fn aerospike_deserialize_header(&self, header: &RecordHeader);
} Now, we can have API like this: pub fn get_obj(&self, policy: &ReadPolicy, key: &Key, obj: mut AerospikeDeserializer) -> Result<()>;
pub fn put_obj(&self, policy: &WritePolicy, key: &Key, bins: Option<Vec<String>>, obj: AerospikeSerializer) -> Result<()>; No need for |
I get that you don't want to overallocate, but as a user of the library, this just forces me (the plural me...users of the aerospike client) to do allocate. And that means that the plural me now has many different implementations of this kind of get/set and likely many of these are non-optimal. It doesn't make sense to me to micro-optimize the client at the expense of the user base. Rust has Generics and they are fantastic in Traits. Your version still puts all of the onus on actually building Aerospike structures in the user level code (which as noted above are not great for users due to the lifetimes). You can argue that as a user, I should understand the allocations and should want to control them... but ultimately, I want an API that does this: my_custom_structures.iter().for_each(|my_custom_data| set(&my_custom_data, ...).expect("Struct was not pushed")); Your proposal is still so low level that I'd need to build the above Likewise, I want something that pulls data out. let records: Vec<MyStruct> = get(...).expect(...); Note that |
I'd like to point out that there is no single me. There are very many different use cases of the Aerospike database. Some use it as the highest performance distributed Key-Value store, some use it as a document store, etc. The optimal API needed for each use case would be different. The only way we could meet the demand is for us to provide a performant client, so it would be possible to build other APIs on top of it. In that case, if a user would want to trade some CPU cycles or memory space for convenience, they could easily do that. |
I never really had problems related to the Bin lifetime before. Thats mainly because i never really had to work with that directly. |
I understand the motivation to be performant, and to reduce memory allocations. I worked on embedded systems for over a decade and a half, and a full decade of that was flight controls for military aircraft. So I really do understand constraints especially related to memory and performance. I equally think that engineering decisions should also be data driven. Before I posted my original ask, I did my homework on performance. There's no notable change in actual performance on my local machine by removing the bin lifetime requirement (which uses a |
That's one question I had as well. @brianbruggeman you mentioned this earlier as well:
If we do remove the lifetimes on I'm generally not opposed to making these changes, esp. if benchmarking shows no significant performance impact. @brianbruggeman how did you test the performance impact of your changes? Using the bundled client benchmark? |
@jhecking I did. With the Makefile change I made, it's also included as a make option. I didn't look very closely at the benchmarking, but it's pretty slim. I suspect the benchmarking may also need to be expanded for a better comparison. |
I realize that there's a small team, but it feels like this project is pretty defunct in terms of being able to move it forward from a community perspective. If there's a lack of adoption and requests on the Rust side, I suspect this is at least partially why. At this point, I need to get things done on my side, and I can't really wait for navigating through some sort of arbitrary political minefield. So, I'm just going to state - hey I tried to play ball, but I'm forking and moving on. |
I think we were very clear as to what we would consider changing, and were waiting for you to come back with the life times you'd like to see removed from the library. The trait is something that you'd have to implement by yourself after we accommodated your requests. Since this was a breaking change, we wanted to have a healthy debate before the change. |
I have structured data and I'd like a convenient mechansim for keeping that structure while pushing and pulling to and from aerospike. It looks like there may have been some initial work done here, but I'd like to push that bar further.
I'm proposing a trait, AerospikeRecord, that can be applied to an arbitrary Rust structure provided it meets certain criteria. The API would look something like this:
The trait would have a variety of different methods to support the API above
For a default behavior, the only thing that would need to be implemented for a given structure would be the
aerospike_record_key
. I think this makes sense because the key could be any kind of value within a given record. It might even need to be a composite key made of several values or a calculated set of values. But this allows the maintainer to decide how to build that key explicitly.Any of the other traits can easily be overwritten:
One caveat is that I'd need to make changes to the Bin lifetime. Right now, the way that it's implemented, it's impossible to do this:
Bin<'a> requires that the control of the lifetime be outside of its use. As a result, you'd need to carry that lifetime all the way up to the top of the call stack or set it to Bin<'static> and remain for the lifetime of the program. It's hard to know why the lifetimes are there in the first place as they only seem to make the API more complicated than is necessary. As a more general statement, I suspect some of the other lifetime requirements could also be removed but I don't have a use case for them right now. And, I wonder if this may be why the API doesn't get as much usage.
The text was updated successfully, but these errors were encountered: