Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add #[speedy(non_exhaustive)] struct attribute #20

Open
merlleu opened this issue Jun 17, 2022 · 6 comments
Open

Add #[speedy(non_exhaustive)] struct attribute #20

merlleu opened this issue Jun 17, 2022 · 6 comments

Comments

@merlleu
Copy link
Contributor

merlleu commented Jun 17, 2022

Hello,

I think we could add a non_exhaustive attribute for structs:

  • a u8 containing the number of fields inside the struct is added before the struct fields when serializing
  • when the struct is read, the parser first reads the first byte (n): and knows how many fields are present in the serialized data
    If n is lesser than the current count of fields for this struct, we read the first n elements of the struct, fill others with default values.
    This allows users to add fields to structs without breaking compatibility (even if the struct is in a Vec...)

Example of code describing the current issue:

use speedy::{Writable, Readable};

#[derive(Writable, Readable, Debug)]
pub struct Struct1 {
    pub field1: bool,
    pub field2: Struct2,
    pub field3: Vec<Struct3>
}

#[derive(Writable, Readable, Debug)]
pub struct Struct2 {
    pub field2_1: u8
}

#[derive(Writable, Readable, Debug)]
pub struct Struct3 {
    pub field3_1: u64
}

fn dump() {
    let s = Struct1{field1: true, field2: Struct2 { field2_1: 8 }, field3: vec![Struct3 { field3_1: 99999 }] };
    s.write_to_file("struct1.bin").unwrap();
}

fn read() {
    println!("{:?}", Struct1::read_from_file("struct1.bin").unwrap());
}

Adding a field at the end of Struct2 or Struct3 will break compatibility with previously serialized data.
(Adding a field to the end of Struct1 with the default_on_eof attribute will still work because he is at the end of the buffer).

Whereas with speed(non_exhaustive) attribute:

pub use speedy::{Writable, Readable};

#[derive(Writable, Readable)]
#[speedy(non_exhaustive)]
pub struct Struct1 {
    pub field1: bool,
    pub field2: Struct2,
    pub field3: Vec<Struct3>
}

#[derive(Writable, Readable)]
#[speedy(non_exhaustive)]
pub struct Struct2 {
    pub field2_1: u8
}

#[derive(Writable, Readable)]
#[speedy(non_exhaustive)]
pub struct Struct3 {
    pub field3_1: u64
}

fn dump() {
    let s = Struct1{field1: true, field2: Struct2 { field2_1: 8 }, field3: vec![Struct3 { field3_1: 99999 }]};
    s.write_to_file("struct1.bin").unwrap();
}

fn read() {
    Struct1::read_from_file("struct1.bin").unwrap();
}

In this scenario you can append a field to any struct with the speedy(non_exhaustive) attribute and it will still be able to read those structs.

Pros:

  • It allows users to append fields to existing structs without breaking everything.
  • It would still be way faster than using crates like rmp-serde.

Cons:

  • Adding this to a struct will break compatibility with previously serialized data
  • It adds one byte for each serialized struct
  • It will result in slightly slower serialization/deserialization

I don't know if this is doable directly on the speedy crate, otherwise I think I'll just create a fork.
Don't hesitate to give any feedbacks.

@merlleu merlleu changed the title Add #[speedy(non_exhaustive)] attribute Add #[speedy(non_exhaustive)] struct attribute Jun 17, 2022
@merlleu
Copy link
Contributor Author

merlleu commented Jun 17, 2022

I've done a basic implementation on https://github.com/merlleu/speedy, still need to write tests for this.

@jeromegn
Copy link
Contributor

jeromegn commented Oct 7, 2022

This would be pretty interesting for us for forward compatibility.

@merlleu
Copy link
Contributor Author

merlleu commented Oct 26, 2022

Hello @koute ,
Is there any chance for this pull request to be merged (after conflicts are resolved), or do you think it should be a separate crate ?

Thanks.

@koute
Copy link
Owner

koute commented Oct 28, 2022

I do want to add something like this, but I'm not necessarily convinced that this is the right approach.

You can already achieve forward compatibility today without any extra changes necessary by just using an enum and then having each version of the struct be a new variant of that enum (so essentially the tag of the enum would be the "version" of the struct), and then manually implement the From trait to just convert it to the most recent version of the struct. So I have been thinking that maybe some sort of an approach based on that would be the most appropriate.

So instead of having the first implicit field be the number of fields it'd be the version number, and then the fields would be optionally tagged with an attribute specifying the range of the version numbers for which the field is present. This kind of approach would also allow you to remove fields from your structure definition once you migrate your data. (With just writing the number of fields you can't really do that.) I'm not yet sure what to do when writing the data; I guess it'd always write only the newest version of the struct, but then that'd mean it'd have to either implicitly ignore old fields, or generate a runtime error if they're not empty, and I don't really like either of those. I guess maybe some sort of a procedural attribute that'd automatically convert a versioned struct into an enum + automatically generate the From impl to convert to the newest version would be the best here?

@merlleu
Copy link
Contributor Author

merlleu commented Oct 28, 2022

Thanks for the comment,
I think you have a good point and versioning structs would be better. However, I'm not a big fan of the enum approach for this (requires a lot of user code for this).

What do you think of something like this:

#[derive(Writable, Readable)]
#[speedy(version = "4")]
pub struct Struct1 {
    #[speedy(before_version = "2")] // only present in versions 1
    pub field0: u8,
    // no version attribute, so always expected
    pub field1: bool,
    #[speedy(after_version = "2", before_version = "4")] // only present in versions 2 & 3
    pub field2: Struct2,
    #[speedy(after_version = "3")] // present in version 3,4,...
    pub field3: Vec<Struct3>
}

This approach lets us add & remove fields as we want without using enums at all. (requires a bit more code than using an enum & a From impl, but it should have a bit less overhead).

Then we could just implement almost the same logic used for the enum but under the hood. (Unless you have something against this).

@hecatia-elegua
Copy link

hecatia-elegua commented Apr 10, 2023

At that point you could also use semantic version numbers and use whatever lib Cargo uses for parsing and handling them, so:

#[speedy(version = "<2")]
pub field0: u8,
#[speedy(version = ">1, <4")]
pub field2: Struct2,

...although in this case there are only major version changes.
Thinking further, patches could also be supported, similar to other ser/de frameworks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants