Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replacement of Value and Bins with Writable/Readable derive macros #138

Open
wants to merge 13 commits into
base: async
Choose a base branch
from

Conversation

jonas32
Copy link
Contributor

@jonas32 jonas32 commented Jun 20, 2023

New PR instead of #126 with the actual changes. I think its time to close the other one.

Writable:

  • Derive Implementation for Writable Bins
  • Derive Implementation for Writable Values
  • Implementations for the most common data types as Writable Bins/Values
  • Implementations for CDT List/Maps as Writable Bins/Values

Readable:

  • Derive Implementation for Readable Bins
  • Derive Implementation for Readable Values
  • Implementations for the most common data types as Readable Bins/Values
  • Implementations for CDT List/Maps as Readable Bins/Values

Client Update:

  • All references within code updated to accept generic Types
  • All references within docs updated to accept generic Types
  • Update Client functions for better usability with generics

Exp/Op:

  • Implement readable/writable generics for operations and expressions

CDT:

  • GeoJSON solution besides of using old Value Enum
  • HLL solution besides of using old Value Enum
  • OrderedMap solution besides of using old Value Enum

@jonas32
Copy link
Contributor Author

jonas32 commented Jul 2, 2023

Currently reads return a Result<Record<T>>. The T only replaces what used to be bins: HashMap<String, Value> with bins: T.
Is that the wanted behavior or should Record also be replaced so that the T also includes record meta?

@CMoore-Darwinium
Copy link

Hi @jonas32, may I confirm that the purpose of this patch is to directly deserialize from the wire format (and serialize to the wire format) rather than using the Value enum?

If so, it looks like exactly what I need.

@jonas32
Copy link
Contributor Author

jonas32 commented Jul 5, 2023

Hey @CMoore-Darwinium, yes. Goal is to directly build structs from the wire and write to it without intermediate steps and as low redundant allocations as possible. In the end, the target would be to more or less deprecate Bins and Value.

@CMoore-Darwinium
Copy link

Hey @CMoore-Darwinium, yes. Goal is to directly build structs from the wire and write to it without intermediate steps and as low redundant allocations as possible. In the end, the target would be to more or less deprecate Bins and Value.

That's exactly what I've been hoping for. Profiling shows that creating and destroying aerospike::Value is our current limiting factor.

Just playing devil's advocate here and I see you've already invested a lot of time into this patch.

You've got these traits, ReadableValue / WritableValue that interact directly with PreParsedValue, which is essentially aerospike's wire format.

I was wondering if we could use serde::Serialize and serde::Deserialize instead. That way implementations of serde::Deserializer and serde::Serializer, MapAccess, SeqAccess, etc could be provided for structpack, CDT, bins, formats etc could be passed as required.

It would have the advantages:

  1. Dealing with weird cases like, "it might be a string in old records and a sequence in new records" can be done nicely using a visitor instead of wire format and looking at particle.
  2. Serde's derive macros already have all of the magic sugar to handle defaults, multiple ways of handling enums, rename_all, etc.
  3. Wire format is hidden within this crate rather than being exposed in the public interface.

If you're open to the idea, I would be able to help with the coding.

@jonas32
Copy link
Contributor Author

jonas32 commented Jul 6, 2023

i thought about this too initially. But there are a lot of tricky details with the wire format that might become hard to handle without working in the scope of the whole buffer. Lists/Maps are a good example, where you basically get X amount of bins with the same name and X amount of entries inside each and then have to join them together after parsing both. And on top, a different encoding for CDT returns.

Not minding about particle type would also be hard, because that often has implications on the way values are encoded.
According to @khaf's answer to my question in the other PR, automatic casting of values to other types is not wanted behavior.

Also having serde that deep inside of the client would build a big dependency on that crate, since there would be no way to interact with the Server outside of serde.

Is it easily possible to add additional magic sugar for client specific things? Like ordered maps, geojson, hll etc.?

In general I'm still open for that idea if it does not impact on performance/allocations. But i would like to hear @khaf's opinion on that first before i start digging into that.
Most of the work i spent here was figuring out how the wire format works and directly implementing it for each type. So it would not be a lot of wasted time to switch that now.

@CMoore-Darwinium
Copy link

With Lists / Map serde works quite well.
It provides these two traits that would expose the underlying buffer to the type being deserialized as an iterator.
serde::de::SeqAccess
serde::de::MapAccess
It gives a lot of flexibility to the deserializer to provide various typed key-values. So we would be implementing it on either the buffer type itself, or a wrapper type around it.

The particle type would be used. The idea is, the particle would be used in the deserializer which would call the right function on a Visitor specific to the type being destreamed https://docs.rs/serde/latest/serde/de/trait.Visitor.html the type itself would be responsible for either implementing or not implementing the appropriate visit function depending on whether the streamed data type makes sense to what you're streaming into.

Serde itself has no dependencies. Its mostly just a set of traits that gets compiled down to essentially nothing so I don't think there is a huge problem in introducing it.

Magic sugar for geojson and hll is more of a concern. I'll need to look into that. Some serde deserializers have their own special target types but it might be hacky. It could certainly be made to work, but finding a clean way might be harder.

Serde does not allocate apart for a few very specific situations. Flattening structures is the only one I can think of off hand. It is mostly just a few layers of generics without much actual logic actually happening.

@jonas32
Copy link
Contributor Author

jonas32 commented Jul 6, 2023

Sounds good.
Ill try to read a bit deeper into how serde could be implemented over the next days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants