diff --git a/src/functional/optics.md b/src/functional/optics.md
index e78cce37..31ca20d8 100644
--- a/src/functional/optics.md
+++ b/src/functional/optics.md
@@ -1,11 +1,11 @@
 # Functional Language Optics

-Optics is a type of API design that is common to functional languages.
-This is a pure functional concept that is not frequently used in Rust.
+Optics is a type of API design that is common to functional languages. This is a
+pure functional concept that is not frequently used in Rust.

-Nevertheless, exploring the concept may be helpful to understand other
-patterns in Rust APIs, such as [visitors](../patterns/behavioural/visitor.md).
-They also have niche use cases.
+Nevertheless, exploring the concept may be helpful to understand other patterns
+in Rust APIs, such as [visitors](../patterns/behavioural/visitor.md). They also
+have niche use cases.

 This is quite a large topic, and would require actual books on language design
 to fully get into its abilities. However their applicability in Rust is much
@@ -16,14 +16,13 @@ example, as it is one that is difficult for many to to understand from simply
 the API documentation.

 In the process, different specific patterns, called Optics, will be covered.
-These are _The Iso_, _The Poly Iso_, and _The Prism_.
+These are *The Iso*, *The Poly Iso*, and *The Prism*.

 ## An API Example: Serde

-Trying to understand the way _Serde_ works by only reading the API is a
-challenge, especially the first time.
-Consider the `Deserializer` trait, implemented by any library
-which parses a new data format:
+Trying to understand the way *Serde* works by only reading the API is a
+challenge, especially the first time. Consider the `Deserializer` trait,
+implemented by any library which parses a new data format:

 ```rust,ignore
 pub trait Deserializer<'de>: Sized {
@@ -66,19 +65,20 @@ pub trait Visitor<'de>: Sized {
 There is a lot of type erasure going on here, with multiple levels of associated
 types being passed back and forth.

-But what is the big picture? Why not just have the `Visitor` return the pieces the
-caller needs in a streaming API, and call it a day? Why all the extra pieces?
+But what is the big picture? Why not just have the `Visitor` return the pieces
+the caller needs in a streaming API, and call it a day? Why all the extra
+pieces?

 One way to understand it is to look at a functional languages concept called
-_optics_.
+*optics*.

 This is a way to do composition of behavior and proprieties that is designed to
 facilitate patterns common to Rust: failure, type transformation, etc.[^1]

-The Rust language does not have very good support for these directly.
-However, they appear in the design of the language itself, and their concepts
-can help to understand some of Rust's APIs.
-As a result, this attempts to explain the concepts with the way Rust does it.
+The Rust language does not have very good support for these directly. However,
+they appear in the design of the language itself, and their concepts can help to
+understand some of Rust's APIs. As a result, this attempts to explain the
+concepts with the way Rust does it.

 This will perhaps shed light on what those APIs are achieving: specific
 properties of composability.
@@ -94,10 +94,9 @@ As an example, suppose that we have a custom Hash table structure used as a
 concordance for a document.[^2] It uses strings for keys (words) and a list of
 indexes for values (file offsets, for instance).

-A key feature is the ability to serialize this format to disk.
-A "quick and dirty" approach would be to implement a conversion to and
-from a string in JSON format. (Errors are ignored for the time being, they
-will be handled later.)
+A key feature is the ability to serialize this format to disk. A "quick and
+dirty" approach would be to implement a conversion to and from a string in JSON
+format. (Errors are ignored for the time being, they will be handled later.)

 To write it in a normal form expected by functional language users:

@@ -144,8 +143,8 @@ But that is where our next subject comes in: Poly Isos.
 The previous example was simply converting between values of two fixed types.
 This next block builds upon it with generics, and is more interesting.

-Poly Isos allow an operation to be generic over any type while
-returning a single type.
+Poly Isos allow an operation to be generic over any type while returning a
+single type.

 This brings us closer to parsing. Consider what a basic parser would do ignoring
 error cases. Again, this is its normal form:
@@ -159,8 +158,8 @@ case class Serde[T] {

 Here we have our first generic, the type `T` being converted.

-In Rust, this could be implemented with a pair of traits in the standard library:
-`FromStr` and `ToString`. The Rust version even handles errors:
+In Rust, this could be implemented with a pair of traits in the standard
+library: `FromStr` and `ToString`. The Rust version even handles errors:

 ```rust,ignore
 pub trait FromStr: Sized {
@@ -177,8 +176,8 @@ pub trait ToString {
 Unlike the Iso, the Poly Iso allows application of multiple types, and returns
 them generically. This is what you would want for a basic string parser.

-At first glance, this seems like a good option for writing a parser.
-Let's see it in action:
+At first glance, this seems like a good option for writing a parser. Let's see
+it in action:

 ```rust,ignore
 use anyhow;
@@ -213,8 +212,8 @@ That seems quite logical. However, there are two problems with this.

 First, `to_string` is to a very good way to explain "this is JSON." Every type
 would need to agree on a JSON representation, and many of the types in the Rust
-standard library already don't.
-Using this is a poor fit. This can easily be resolved with our own trait.
+standard library already don't. Using this is a poor fit. This can easily be
+resolved with our own trait.

 But there is a second, subtler problem: scaling.

@@ -224,8 +223,8 @@ and possibly different JSON libraries -- to do it themselves, it will turn into
 a mess very quickly!

 The answer is one of Serde's two key innovations: an independent data model to
-represent Rust data in structures common to data serialization languages.
-The result is that it can use Rust's code generation abilities to create an
+represent Rust data in structures common to data serialization languages. The
+result is that it can use Rust's code generation abilities to create an
 intermediary conversion type it calls a `Visitor`.

 This means, in normal form (again, skipping error handling for simplicity):
@@ -242,8 +241,8 @@ case class Visitor[T] {
 }
 ```

-The result is one Poly Iso and one Iso (respectively).
-Both of these can be implemented with traits:
+The result is one Poly Iso and one Iso (respectively). Both of these can be
+implemented with traits:

 ```rust
 trait Serde {
@@ -294,8 +293,8 @@ It's wonky, but it works... until we get to the elephant in the room.

 The only format currently supported is JSON. How would we support more formats?

-The current design requires completely re-writing all of the code generation
-and creating a new Serde trait. That is quite terrible and not extensible at all!
+The current design requires completely re-writing all of the code generation and
+creating a new Serde trait. That is quite terrible and not extensible at all!

 In order to solve that, we need something more powerful.

@@ -317,14 +316,12 @@ Unfortunately because `Visitor` is a trait (since each incarnation requires its
 own custom code), this would require a kind of generic type boundary that Rust
 does not support.

-Fortunately, we still have that `Visitor` type from before.
-What is the `Visitor` doing? It is attempting to allow each data structure to
-define the way
+Fortunately, we still have that `Visitor` type from before. What is the
+`Visitor` doing? It is attempting to allow each data structure to define the way
 it is itself parsed.

-Well what if we could add one more interface for the generic format?
-Then the `Visitor` is just an implementation detail, and it would "bridge" the
-two APIs.
+Well what if we could add one more interface for the generic format? Then the
+`Visitor` is just an implementation detail, and it would "bridge" the two APIs.

 In normal form:

@@ -350,16 +347,17 @@ as traits!

 Thus we have the Serde API:

-1. Each type to be serialized implements `Deserialize` or `Serialize`, equivalent
-   to the `Serde` class
+1. Each type to be serialized implements `Deserialize` or `Serialize`,
+   equivalent to the `Serde` class
 1. They get a type (well two, one for each direction) implementing the `Visitor`
-   trait, which are usually (but not always) through macro-generated code.
-   This contains the logic to construct or destruct between the data type and the
+   trait, which are usually (but not always) through macro-generated code. This
+   contains the logic to construct or destruct between the data type and the
    format of the Serde data model.
 1. The type implementing the `Deserializer` trait handles all details specific
    to the format, being "driven by" the `Visitor`.

-This splitting and Rust type erasure is really to achieve a Prism through indirection.
+This splitting and Rust type erasure is really to achieve a Prism through
+indirection.

 You can see it on the `Deserializer` trait

@@ -413,20 +411,22 @@ pub trait Deserialize<'de>: Sized {

 This has been abstract, so let's look at a concrete example.

-How does actual Serde deserialize a bit of JSON into `struct Concordance` from earlier?
+How does actual Serde deserialize a bit of JSON into `struct Concordance` from
+earlier?

 1. The user would call a library function to deserialize the data. This would
    create a `Deserializer` based on the JSON format.
-1. Based on the fields in the struct, a `Visitor` would be created (more on
-   that in a moment) which knows how to create each type in a generic data
-   model that was needed to represent it: `Vec` (list), `u64` and `String`.
+1. Based on the fields in the struct, a `Visitor` would be created (more on that
+   in a moment) which knows how to create each type in a generic data model that
+   was needed to represent it: `Vec` (list), `u64` and `String`.
 1. The deserializer would make calls to the `Visitor` as it parsed items.
 1. The `Visitor` would indicate if the items found were expected, and if not,
    raise an error to indicate deserialization has failed.

 For our very simple structure above, the expected pattern would be:

-1. Begin visiting a map (_Serde_'s equivalent to `HashMap` or JSON's dictionary).
+1. Begin visiting a map (*Serde*'s equivalent to `HashMap` or JSON's
+   dictionary).
 1. Visit a string key called "keys".
 1. Begin visiting a map value.
 1. For each item, visit a string key then an integer value.
@@ -442,11 +442,11 @@ For our very simple structure above, the expected pattern would be:
 But what determines which "observation" pattern is expected?

 A functional programming language would be able to use currying to create
-reflection of each type based on the type itself.
-Rust does not support that, so every single type would need to have its own
-code written based on its fields and their properties.
+reflection of each type based on the type itself. Rust does not support that, so
+every single type would need to have its own code written based on its fields
+and their properties.

-_Serde_ solves this usability challenge with a derive macro:
+*Serde* solves this usability challenge with a derive macro:

 ```rust,ignore
 use serde::Deserialize;
@@ -461,39 +461,37 @@ struct IdRecord {
 That macro simply generates an impl block causing the struct to implement a
 trait called `Deserialize`.

-This is the function that determines how to create the struct itself.
-Code is generated based on the struct's fields.
-When the parsing library is called - in our example, a JSON parsing library -
-it creates a `Deserializer` and calls `Type::deserialize` with it as a
-parameter.
+This is the function that determines how to create the struct itself. Code is
+generated based on the struct's fields. When the parsing library is called - in
+our example, a JSON parsing library - it creates a `Deserializer` and calls
+`Type::deserialize` with it as a parameter.

 The `deserialize` code will then create a `Visitor` which will have its calls
-"refracted" by the `Deserializer`.
-If everything goes well, eventually that `Visitor` will construct a value
-corresponding to the type being parsed and return it.
+"refracted" by the `Deserializer`. If everything goes well, eventually that
+`Visitor` will construct a value corresponding to the type being parsed and
+return it.

-For a complete example, see the [_Serde_ documentation](https://serde.rs/deserialize-struct.html).
+For a complete example, see the
+[*Serde* documentation](https://serde.rs/deserialize-struct.html).

 The result is that types to be deserialized only implement the "top layer" of
-the API, and file formats only need to implement the "bottom layer".
-Each piece can then "just work" with the rest of the ecosystem, since generic
-types will bridge them.
+the API, and file formats only need to implement the "bottom layer". Each piece
+can then "just work" with the rest of the ecosystem, since generic types will
+bridge them.

-In conclusion,
-Rust's generic-inspired type system can bring it close to these concepts and
-use their power, as shown in this API design.
-But it may also need procedural macros to create bridges for its generics.
+In conclusion, Rust's generic-inspired type system can bring it close to these
+concepts and use their power, as shown in this API design. But it may also need
+procedural macros to create bridges for its generics.

-If you are interested in learning more about this topic, please check the following
-section.
+If you are interested in learning more about this topic, please check the
+following section.

 ## See Also

 - [lens-rs crate](https://crates.io/crates/lens-rs) for a pre-built lenses
   implementation, with a cleaner interface than these examples
-- [Serde](https://serde.rs) itself, which makes these concepts intuitive for
-  end users (i.e. defining the structs) without needing to understand the
-  details
+- [Serde](https://serde.rs) itself, which makes these concepts intuitive for end
+  users (i.e. defining the structs) without needing to understand the details
 - [luminance](https://github.com/phaazon/luminance-rs) is a crate for drawing
   computer graphics that uses similar API design, including procedural macros to
   create full prisms for buffers of different pixel types that remain generic