diff --git a/README.md b/README.md
index c0f2602..d786ba0 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,6 @@
 [![license: MIT/Apache-2.0](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue.svg)](LICENSE-MIT)
 [![crates.io](https://img.shields.io/crates/v/merde_json.svg)](https://crates.io/crates/merde_json)
 [![docs.rs](https://docs.rs/merde_json/badge.svg)](https://docs.rs/merde_json)
-[![cursed? mildly](https://img.shields.io/badge/cursed%3F-not%20really-b56f1b.svg)](https://github.com/bearcove/merde_json)

 # merde_json

@@ -9,16 +8,12 @@ Do you want to deal with JSON data?

 Are you not _that_ worried about the performance overhead? (ie. you're writing a backend in Rust, but if it was written in Node.js nobody would bat an eye).

-Are you tired of waiting for proc macros to compile, and dealing with super
-generic traits?
+Do you value short build times at the expense of some comfort?

-Do you not care about any formats other than JSON?
+Then head over to the crate documentation:

-Are you ready to give up the comforts of `#[serde(rename_all)]`, `#[serde(flatten)]`, etc.?
-
-Then the bag of compromises known as `merde_json` might just work for you!
-
-Head over to the [Rust docs](https://docs.rs/merde_json) to learn more.
+ * [merde_json](./merde_json/README.md)
+ * [merde_json_types](./merde_json_types/README.md)

 ## FAQ

diff --git a/merde_json/Cargo.toml b/merde_json/Cargo.toml
index 3cc0c19..1ad4641 100644
--- a/merde_json/Cargo.toml
+++ b/merde_json/Cargo.toml
@@ -5,7 +5,7 @@ edition = "2021"
 authors = ["Amos Wenger "]
 description = "Serialize and deserialize JSON with jiter and declarative macros"
 license = "Apache-2.0 OR MIT"
-readme = "README.md"
+readme = "../README.md"
 repository = "https://github.com/bearcove/merde_json"
 keywords = ["json", "serialization", "deserialization", "jiter"]
 categories = ["encoding", "parser-implementations"]
diff --git a/merde_json/README.md b/merde_json/README.md
new file mode 100644
index 0000000..735d646
--- /dev/null
+++ b/merde_json/README.md
@@ -0,0 +1,425 @@
+[![license: MIT/Apache-2.0](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue.svg)](LICENSE-MIT)
+[![crates.io](https://img.shields.io/crates/v/merde_json.svg)](https://crates.io/crates/merde_json)
+[![docs.rs](https://docs.rs/merde_json/badge.svg)](https://docs.rs/merde_json)
+
+# merde_json
+
+`merde_json` covers the "90% use case" for JSON manipulation via traits, declarative macros, and a bit of discipline.
+
+It optimizes for low compile-times and avoiding copies (but not all allocations). It's well-suited
+for use in web servers, if you're willing to give up some of the comforts of [proc macros](https://crates.io/crates/serde).
+
+The underlying JSON parser is [jiter](https://crates.io/crates/jiter), which provides an event-based interface
+you can choose to use when merde_json's performance simply isn't enough.
+
+## Conventions + migrating from `serde_json`
+
+[serde](https://crates.io/crates/serde) lets you derive `Serialize` and `Deserialize` traits using
+a proc macro:
+
+```rust
+use serde::{Deserialize, Serialize};
+
+#[derive(Debug, PartialEq, Serialize, Deserialize)]
+struct MyStruct {
+    name: String,
+    age: u8,
+}
+```
+
+By contrast, merde_json provides declarative macros:
+
+```rust
+use merde_json::Fantome;
+use std::borrow::Cow;
+
+#[derive(Debug, PartialEq)]
+struct MyStruct<'src, 'val> {
+    _boo: Fantome<'src, 'val>,
+
+    name: Cow<'val, str>,
+    age: u8,
+}
+
+merde_json::derive! {
+    impl(JsonSerialize, JsonDeserialize) for MyStruct {
+        name,
+        age
+    }
+}
+```
+
+Declarative macros = less work to do at compile-time, as long as we follow a couple of rules:
+
+ * All structs have exactly two lifetime parameters: `'src` and `'val`
+ * All structs have a `_boo` field, so that structs that don't otherwise use their lifetime parameters still compile
+ * Field names are listed twice: in the struct and in the macro (limitation of declarative macros)
+ * Use `Cow<'val, str>` for all your strings, instead of choosing between `&str` and `String` on a case-by-case basis
+
+Read [The Secret Life Of Cows](https://deterministic.space/secret-life-of-cows.html) for a good introduction to Rust's "Copy-on-Write" types.
+
+## Deserializing
+
+[from_str][] is a thin wrapper over the API of jiter, the underlying JSON parser.
+It gives you a `JsonValue`, which you can then destructure into a Rust value
+via the [JsonDeserialize] trait:
+
+```rust
+# use merde_json::{Fantome, JsonDeserialize, JsonSerialize, ToRustValue};
+# use std::borrow::Cow;
+#
+# #[derive(Debug, PartialEq)]
+# struct MyStruct<'src, 'val> {
+#     _boo: Fantome<'src, 'val>,
+#
+#     name: Cow<'val, str>,
+#     age: u8,
+# }
+#
+# merde_json::derive! {
+#     impl(JsonSerialize, JsonDeserialize) for MyStruct { name, age }
+# }
+#
+# fn main() -> Result<(), merde_json::MerdeJsonError> {
+let input = String::from(r#"{"name": "John Doe", "age": 30}"#);
+let value = merde_json::from_str(&input)?;
+let my_struct = MyStruct::json_deserialize(Some(&value));
+println!("{:?}", my_struct);
+# Ok(())
+# }
+```
+
+For convenience, you can use [ToRustValue::to_rust_value]:
+
+```rust
+# use merde_json::{Fantome, JsonDeserialize, JsonSerialize, ToRustValue};
+# use std::borrow::Cow;
+#
+# #[derive(Debug, PartialEq)]
+# struct MyStruct<'src, 'val> {
+#     _boo: Fantome<'src, 'val>,
+#     name: Cow<'val, str>,
+#     age: u8,
+# }
+#
+# merde_json::derive! {
+#     impl(JsonSerialize, JsonDeserialize) for MyStruct { name, age }
+# }
+#
+# fn main() -> Result<(), merde_json::MerdeJsonError> {
+let input = String::from(r#"{"name": "John Doe", "age": 30}"#);
+let value = merde_json::from_str(&input)?;
+// Note: you have to specify the binding's type here.
+// We can't use a turbofish any more than we can with `Into::into`.
+let my_struct: MyStruct = value.to_rust_value()?;
+println!("{:?}", my_struct);
+# Ok(())
+# }
+```
+
+However, don't lose sight of the fact that `my_struct` borrows from `value`, which borrows from `input`.
+
+We _need_ three explicit bindings, as tempting as it would be to try and
+inline one of them. This fails to compile with a "temporary value dropped while borrowed" error:
+
+```compile_fail
+# use merde_json::{Fantome, JsonDeserialize, JsonSerialize, ToRustValue};
+# use std::borrow::Cow;
+#
+# #[derive(Debug, PartialEq)]
+# struct MyStruct<'src, 'val> {
+#     _boo: Fantome<'src, 'val>,
+#     name: Cow<'val, str>,
+#     age: u8,
+# }
+#
+# merde_json::derive! {
+#     impl(JsonSerialize, JsonDeserialize) for MyStruct { name, age }
+# }
+#
+# fn main() -> Result<(), merde_json::MerdeJsonError> {
+let input = String::from(r#"{"name": "John Doe", "age": 30}"#);
+let value = merde_json::from_str(&input).unwrap();
+let my_struct = MyStruct::json_deserialize(Some(&merde_json::from_str(&input).unwrap()));
+println!("{:?}", my_struct);
+# Ok(())
+# }
+```
+
+## Moving deserialized values around
+
+How do you return a freshly-deserialized value, with those two annoying lifetimes?
+
+Set them both to `'static`! However, this fails because the deserialized value is
+not `T<'static, 'static>` — it still borrows from the source (`'src`) and the
+`JsonValue` that was deserialized (`'val`).
+
+This code fails to compile:
+
+```compile_fail
+# use merde_json::{Fantome, JsonDeserialize, JsonSerialize, ToRustValue};
+# use std::borrow::Cow;
+#
+# #[derive(Debug, PartialEq)]
+# struct MyStruct<'src, 'val> {
+#     _boo: Fantome<'src, 'val>,
+#     name: Cow<'val, str>,
+#     age: u8,
+# }
+#
+# merde_json::derive! {
+#     impl(JsonSerialize, JsonDeserialize) for MyStruct { name, age }
+# }
+#
+fn return_my_struct() -> MyStruct<'static, 'static> {
+    let input = String::from(r#"{"name": "John Doe", "age": 30}"#);
+    let value = merde_json::from_str(&input).unwrap();
+    let my_struct: MyStruct = value.to_rust_value().unwrap();
+    my_struct
+}
+# fn main() -> Result<(), merde_json::MerdeJsonError> {
+let my_struct = return_my_struct();
+println!("{:?}", my_struct);
+# Ok(())
+# }
+```
+
+...with:
+
+```text
+---- src/lib.rs - (line 157) stdout ----
+error[E0515]: cannot return value referencing local variable `value`
+   --> src/lib.rs:177:5
+    |
+21  |     let my_struct: MyStruct = value.to_rust_value().unwrap();
+    |                               ----- `value` is borrowed here
+22  |     my_struct
+    |     ^^^^^^^^^ returns a value referencing data owned by the current function
+```
+
+Deriving the [ToStatic] trait lets you go from `MyStruct<'src, 'val>` to `MyStruct<'static, 'static>`:
+
+```rust
+# use merde_json::{Fantome, JsonDeserialize, JsonSerialize, ToRustValue, ToStatic};
+# use std::borrow::Cow;
+#
+# #[derive(Debug, PartialEq)]
+# struct MyStruct<'src, 'val> {
+#     _boo: Fantome<'src, 'val>,
+#     name: Cow<'val, str>,
+#     age: u8,
+# }
+#
+merde_json::derive! {
+    //                                    👇
+    impl(JsonSerialize, JsonDeserialize, ToStatic) for MyStruct { name, age }
+}
+
+fn return_my_struct() -> MyStruct<'static, 'static> {
+    let input = String::from(r#"{"name": "John Doe", "age": 30}"#);
+    let value = merde_json::from_str(&input).unwrap();
+    let my_struct: MyStruct = value.to_rust_value().unwrap();
+    my_struct.to_static()
+}
+# fn main() -> Result<(), merde_json::MerdeJsonError> {
+let my_struct = return_my_struct();
+println!("{:?}", my_struct);
+# Ok(())
+# }
+```
+
+Of course, [ToStatic::to_static] often involves heap allocations. If you're just temporarily
+processing some JSON payload, consider accepting a callback instead and passing it a shared
+reference to your value — that works more often than you'd think!
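The callback approach suggested above can be sketched with only the standard library. This is an illustration, not part of the patch and not merde_json's API: `Person`, `parse_person`, and `with_person` are hypothetical names, and the "parser" is a toy that splits a `name,age` string. The function owns the input, lends the borrowed value to a closure, and only owned data leaves it.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a deserialized value that borrows from its input.
#[derive(Debug, PartialEq)]
struct Person<'a> {
    name: Cow<'a, str>,
    age: u8,
}

/// Toy "parser" (an assumption for this sketch): splits `name,age`
/// and borrows `name` straight from `input` without copying it.
fn parse_person(input: &str) -> Option<Person<'_>> {
    let (name, age) = input.split_once(',')?;
    Some(Person {
        name: Cow::Borrowed(name),
        age: age.trim().parse::<u8>().ok()?,
    })
}

/// Owns the input, parses it, and lends the result to `f`.
/// Nothing borrowed escapes this function, so no owned copy is needed.
fn with_person<R>(f: impl FnOnce(&Person<'_>) -> R) -> Option<R> {
    let input = String::from("John Doe, 30");
    let person = parse_person(&input)?;
    Some(f(&person))
}

fn main() {
    // Only the owned `usize` and `u8` leave the closure.
    let summary = with_person(|p| (p.name.len(), p.age));
    assert_eq!(summary, Some((8, 30)));
}
```

The trade-off is that the caller's logic has to fit inside the closure; when the value must outlive the input, an owned copy is still the way to go.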
+
+## Deserializing mixed-type arrays
+
+Real-world JSON payloads can have arrays with mixed types. You can keep them as a [Vec] of [JsonValue]
+until you know what to do with them:
+
+```rust
+use merde_json::{Fantome, JsonDeserialize, JsonSerialize, ToRustValue, JsonValue, MerdeJsonError};
+
+#[derive(Debug, PartialEq)]
+struct MixedArray<'src, 'val> {
+    _boo: Fantome<'src, 'val>,
+    items: Vec<&'val JsonValue<'src>>,
+}
+
+merde_json::derive! { impl(JsonDeserialize) for MixedArray { items } }
+
+fn main() -> Result<(), merde_json::MerdeJsonError> {
+    let input = r#"{
+        "items": [42, "two", true]
+    }"#;
+    let value = merde_json::from_str(input)?;
+    let mixed_array: MixedArray = value.to_rust_value()?;
+
+    println!("Mixed array: {:?}", mixed_array);
+
+    // You can then process each item based on its type
+    for (index, item) in mixed_array.items.iter().enumerate() {
+        match item {
+            JsonValue::Int(i) => println!("Item {} is an integer: {}", index, i),
+            JsonValue::Str(s) => println!("Item {} is a string: {}", index, s),
+            JsonValue::Bool(b) => println!("Item {} is a boolean: {}", index, b),
+            _ => println!("Item {} is of another type", index),
+        }
+    }
+
+    Ok(())
+}
+```
+
+_Note: that's why we need both lifetimes: `JsonValue<'s>` is invariant over `'s`. `JsonValue<'src>` is not
+a subtype of `JsonValue<'val>` even when `'src: 'val`._
+
+Other options here would have been to keep `items` as a [JsonArray], or even a [JsonValue]. Or, `items` could
+be of type `Items` which has a manual implementation of [JsonDeserialize]. See the `mixed` example for inspiration.
+
+## Deserializing types from other crates
+
+You're going to need to use newtype wrappers: you can't implement `JsonSerialize`
+(a trait outside your crate) for `time::OffsetDateTime` (a type outside your crate),
+as per the [orphan rules](https://github.com/Ixrec/rust-orphan-rules).
+
+But you can implement it on `YourType`, and that works
+especially well with date-time types: I like RFC3339, but you may want
+to do something else.
+
+The [merde_json_types](https://crates.io/crates/merde_json_types) crate aims to collect such wrapper
+types: it's meant to be pulled unconditionally, and has a `merde_json` feature that conditionally
+implements the relevant traits for the wrapper types, making it a cheap proposition if someone
+wants to use your crate without using `merde_json`.
+
+## Serializing
+
+Serializing typically looks like:
+
+```rust
+# use merde_json::{Fantome, JsonSerialize, JsonDeserialize, ToRustValue};
+# use std::borrow::Cow;
+#
+# #[derive(Debug, PartialEq)]
+# struct MyStruct<'src, 'val> {
+#     _boo: Fantome<'src, 'val>,
+#     name: Cow<'val, str>,
+#     age: u8,
+# }
+#
+# merde_json::derive! {
+#     impl(JsonSerialize, JsonDeserialize) for MyStruct { name, age }
+# }
+#
+# fn main() -> Result<(), merde_json::MerdeJsonError> {
+let original = MyStruct {
+    _boo: Default::default(),
+    name: "John Doe".into(),
+    age: 30,
+};
+
+let serialized = original.to_json_string();
+println!("{}", serialized);
+
+let ms = merde_json::from_str(&serialized)?;
+let ms: MyStruct = ms.to_rust_value()?;
+assert_eq!(original, ms);
+# Ok(())
+# }
+```
+
+## Reducing allocations when serializing
+
+If you want more control over the buffer, for example you'd like to re-use the same
+`Vec` for multiple serializations, you can use [JsonSerializer::from_vec]:
+
+```rust
+# use merde_json::{Fantome, JsonSerialize, JsonDeserialize, ToRustValue};
+# use std::borrow::Cow;
+#
+# #[derive(Debug, PartialEq)]
+# struct MyStruct<'src, 'val> {
+#     _boo: Fantome<'src, 'val>,
+#     name: Cow<'val, str>,
+#     age: u8,
+# }
+#
+# merde_json::derive! {
+#     impl(JsonSerialize, JsonDeserialize) for MyStruct { name, age }
+# }
+#
+# fn main() -> Result<(), merde_json::MerdeJsonError> {
+let original = MyStruct {
+    _boo: Default::default(),
+    name: "John Doe".into(),
+    age: 30,
+};
+
+let mut buffer = Vec::new();
+for _ in 0..3 {
+    buffer.clear();
+    let mut serializer = merde_json::JsonSerializer::from_vec(buffer);
+    original.json_serialize(&mut serializer);
+    buffer = serializer.into_inner();
+
+    let ms = merde_json::from_slice(&buffer)?;
+    let ms = ms.to_rust_value()?;
+    assert_eq!(original, ms);
+}
+# Ok(())
+# }
+```
+
+Note that serialization is infallible, since it targets a memory buffer rather than
+a Writer, and we assume allocations cannot fail (like most Rust code out there currently).
+
+Keep in mind that a `Vec` that grows doesn't give its memory back unless you ask for it
+explicitly via [Vec::shrink_to_fit] or [Vec::shrink_to], for example.
+
+## Caveats & limitations
+
+Most of this crate is extremely naive, on purpose.
+
+For example, deep data structures _will_ blow up the stack, since deserialization is recursive.
+
+Deserialization round-trips through [jiter::JsonValue], which contains types like [std::sync::Arc],
+small vecs, lazy hash maps, etc. — building them simply to destructure from them is a waste of CPU
+cycles, and if it shows up in your profiles, it's time to move on to jiter's event-based parser,
+[jiter::Jiter].
+
+If you expect a `u32` but the JSON payload has a floating-point number, it'll get rounded.
+
+If you expect a `u32` but the JSON payload is greater than `u32::MAX`, you'll get a
+[MerdeJsonError::OutOfRange] error.
+
+There's no control over allowing Infinity/NaN in JSON numbers: you can work around that
+by calling [jiter::JsonValue::parse] yourself.
+
+Serialization can't be pretty: it never produces unnecessary spaces, newlines, etc.
+If your performance characteristics allow it, you may look into [formatjson](https://crates.io/crates/formatjson).
+
+Serialization may produce JSON payloads that other parsers will reject or parse incorrectly,
+specifically for numbers above 2^53 or below -2^53.
+
+There is no built-in facility for serializing/deserializing strings from numbers.
+
+If `merde_json` doesn't work for you, it's very likely that your use case is not supported, and
+you should look at [serde](https://crates.io/crates/serde) instead.
+
+## FAQ
+
+### What's with the `Option` in the `JsonDeserialize` interface?
+
+This allows `Option` to ignore missing values. All other implementations should
+return `MerdeJsonError::MissingValue` if the option is `None` — this is later turned
+into `MerdeJsonError::MissingProperty` with the field name.
+
+### What do I do about `#[serde(rename_all = "camelCase")]`?
+
+Make your actual struct fields `camelCase`, and slap `#[allow(non_snake_case)]` on
+top of your struct. Sorry!
+
+### What do I do about `#[serde(borrow)]`?
+
+That's the default and only mode — use `Cow<'a, str>` for all strings, do `.to_static()`
+if you need to move the struct.
diff --git a/merde_json/src/error.rs b/merde_json/src/error.rs
index c420438..97e4611 100644
--- a/merde_json/src/error.rs
+++ b/merde_json/src/error.rs
@@ -76,7 +76,7 @@ pub enum MerdeJsonError {
     /// While calling out to [FromStr::from_str](std::str::FromStr::from_str) to build a [HashMap](std::collections::HashMap), we got an error.
     InvalidKey,

-    /// While parsing a [time::Date] or [time::Time], we got an error.
+    /// While parsing a datetime, we got an error.
     InvalidDateTimeValue,

     /// An I/O error occurred.
diff --git a/merde_json_types/README.md b/merde_json_types/README.md
new file mode 100644
index 0000000..df1f55c
--- /dev/null
+++ b/merde_json_types/README.md
@@ -0,0 +1,92 @@
+[![license: MIT/Apache-2.0](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue.svg)](LICENSE-MIT)
+[![crates.io](https://img.shields.io/crates/v/merde_json_types.svg)](https://crates.io/crates/merde_json_types)
+[![docs.rs](https://docs.rs/merde_json_types/badge.svg)](https://docs.rs/merde_json_types)
+
+# merde_json_types
+
+`merde_json_types` is a companion crate to [merde_json](https://crates.io/crates/merde_json),
+providing wrapper types that solve two problems at once.
+
+## Problem 1: Most crates have types that do not implement the `merde_json` traits
+
+I'm thinking about the [time crate](https://crates.io/crates/time), the [chrono crate](https://crates.io/crates/chrono), [camino](https://crates.io/crates/camino), etc.
+
+If you have, say, a `time::OffsetDateTime` in one of your structs,
+then merde_json's derive macro will not work. You _are_ going to need
+a wrapper of some sort, and that's the kind of type this crate provides.
+
+If you enable the `time-serialize`, `time-deserialize`, and `merde_json`
+features, you can do this:
+
+```rust
+use merde_json::{from_str, JsonSerialize, ToRustValue};
+use merde_json_types::time::Rfc3339;
+
+let dt = Rfc3339(time::OffsetDateTime::now_utc());
+let serialized = dt.to_json_string();
+let deserialized: Rfc3339 =
+    merde_json::from_str(&serialized).unwrap().to_rust_value().unwrap();
+assert_eq!(dt, deserialized);
+```
+
+## Problem 2: Keeping `merde_json` optional
+
+The [time::Rfc3339] type is exported by this crate as soon as the `time-types`
+feature is enabled. But `merde_json_types` doesn't even depend on `merde_json`
+(or provide serialization/deserialization implementations) unless you activate
+its `merde_json` feature!
+
+That means, you can have your crate unconditionally depend on `merde_json_types`,
+and use `Rfc3339` in your public structs:
+
+```rust
+use merde_json::{Fantome, JsonSerialize, ToRustValue};
+use merde_json_types::time::Rfc3339;
+
+#[derive(Debug, PartialEq, Eq)]
+pub struct Person<'src, 'val> {
+    pub name: String,
+    pub birth_date: Rfc3339,
+
+    pub _boo: Fantome<'src, 'val>,
+}
+
+merde_json::derive! {
+    impl (JsonSerialize, JsonDeserialize) for Person { name, birth_date }
+}
+```
+
+And still only depend on `merde_json` when your _own_ feature gets activated:
+
+```toml
+[dependencies]
+merde_json_types = "2"
+merde_json = { version = "2", optional = true }
+
+[features]
+merde_json = ["dep:merde_json", "merde_json_types/merde_json"]
+```
+
+Of course, for that to work, we need to get rid of any unconditional mention of
+`merde_json` in our code, which would become something like:
+
+```rust
+use std::marker::PhantomData;
+use merde_json_types::time::Rfc3339;
+
+#[derive(Debug, PartialEq, Eq)]
+pub struct Person<'src, 'val> {
+    pub name: String,
+    pub birth_date: Rfc3339,
+
+    /// This field still _has_ to be named `_boo`, but we can't use
+    /// the `Fantome` type here without pulling in `merde_json`: so,
+    /// we use `PhantomData` instead.
+    pub _boo: PhantomData<(&'src (), &'val ())>,
+}
+
+#[cfg(feature = "merde_json")]
+merde_json::derive! {
+    impl (JsonSerialize, JsonDeserialize) for Person { name, birth_date }
+}
+```
diff --git a/merde_json_types/src/lib.rs b/merde_json_types/src/lib.rs
index fdc91c0..5e8c556 100644
--- a/merde_json_types/src/lib.rs
+++ b/merde_json_types/src/lib.rs
@@ -1,41 +1,5 @@
 #![deny(missing_docs)]
-
-//! `merde_json_types` is a companion crate to `merde_json`, providing wrapper
-//! types, solving two problems at once:
-//!
-//! - not all crates implement the `merde_json` traits, so a newtype
-//!   is required anyway.
-//! - we might want to have some structs be part of our public interface,
-//!   but only conditionally implement the `merde_json` traits (to avoid
-//!   polluting the dependents' tree with `merde_json` if they don't need it).
-//!
-//! As a result, have your crate depend on `merde_json_types` unconditionally (which has
-//! zero dependencies), and forward your own `merde_json` cargo feature to `merde_json_types/merde_json`, like so:
-//!
-//! ```toml
-//! [dependencies]
-//! merde_json_types = { version = "0.1", features = ["merde_json"] }
-//!
-//! [features]
-//! merde_json = ["merde_json_types/merde_json"]
-//! ```
-//!
-//! Then, in your crate, you can use the `merde_json_types` types, and they will
-//! be conditionally implemented for you.
-//!
-//! For example, if you have a crate `my_crate` that depends on `merde_json_types`,
-//! and you want to use the `time` crate's `OffsetDateTime` type, you can do:
-//!
-//! ```rust
-//! use merde_json::{from_str, JsonSerialize, ToRustValue};
-//! use merde_json_types::time::Rfc3339;
-//!
-//! let dt = Rfc3339(time::OffsetDateTime::now_utc());
-//! let serialized = dt.to_json_string();
-//! let deserialized: Rfc3339 =
-//!     merde_json::from_str(&serialized).unwrap().to_rust_value().unwrap();
-//! assert_eq!(dt, deserialized);
-//! ```
+#![doc = include_str!("../README.md")]

 #[cfg(feature = "time-types")]
 pub mod time;
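As a side note on the `_boo: PhantomData<(&'src (), &'val ())>` field used in the `merde_json_types` README above, here is a minimal standard-library sketch of what that marker does. It is not part of the patch; `Person` is trimmed down and `owned_person` is a hypothetical helper. The marker lets a struct declare two lifetime parameters it does not otherwise use, at no cost in size.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Trimmed-down struct with no borrowed fields: without `_boo`, the compiler
/// would reject the unused `'src` and `'val` parameters (error E0392).
#[derive(Debug, PartialEq, Eq)]
pub struct Person<'src, 'val> {
    pub name: String,
    pub _boo: PhantomData<(&'src (), &'val ())>,
}

/// Hypothetical helper: builds a `Person` that owns all of its data,
/// so both lifetimes can be `'static`.
pub fn owned_person(name: &str) -> Person<'static, 'static> {
    Person {
        name: name.to_owned(),
        _boo: PhantomData,
    }
}

fn main() {
    let person = owned_person("John Doe");
    assert_eq!(person.name, "John Doe");
    // The marker is zero-sized, so it adds nothing to the struct's size.
    assert_eq!(size_of::<PhantomData<(&'static (), &'static ())>>(), 0);
    assert_eq!(size_of::<Person<'static, 'static>>(), size_of::<String>());
}
```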