-
Notifications
You must be signed in to change notification settings - Fork 39
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #79 from fee1-dead/milestone-4-rust
Flow.rs SDK Milestone 4
- Loading branch information
Showing
1 changed file
with
215 additions
and
0 deletions.
There are no files selected for viewing
215 changes: 215 additions & 0 deletions
215
submissions/issue-20/milestone-4/fee1-dead/20211025-milestone-4.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,215 @@ | ||
# `Flow.rs`/`flow-sdk` Milestone 4 | ||
|
||
Since this is the last Milestone, this document serves to show what `flow-sdk` has to offer, and how the | ||
library is different to `flow-rust-sdk`, by MarshallBelles. | ||
|
||
## How is `flow-sdk` different? | ||
|
||
`flow-rust-sdk` uses `prost` for protobuf encoding and decoding. `flow.rs` uses `otopr`, which is a library I | ||
built from scratch. `otopr` is not capable of generating Rust sources from .proto definitions, but it works | ||
better than `prost` in some ways. | ||
|
||
### `otopr`: The protobuf library that focuses on speed and flexibility | ||
|
||
`otopr` has a `derive` macro, which can automatically generate implementations for `EncodableMessage` and | ||
`DecodableMessage` traits. Unlike `prost` which has one trait for both encoding and decoding, the library | ||
separates types that can be encoded and types that can be decoded. This design choice is for reducing the | ||
amount of copying needed for encoding `string`s or `bytes`. | ||
|
||
For example, in Rust, a literal string in the source code `"Hello world"` has type `&'static str`. The | ||
`'static` lifetime indicates the string is valid for the entirety of the program. On common targets, literals | ||
are compiled to a known location in the binary output. | ||
|
||
However, when you try to encode a `&str` in a message with `prost`, you will find that it is quite hard. | ||
This is because `prost` does not have good support for encoding references. `otopr` is built from scratch | ||
with support of encoding references in mind. | ||
|
||
#### Owned buffers vs shared buffers | ||
|
||
Rust has a very interesting concept of ownership. When you "own" a string, or a byte vector, you are responsible | ||
for deallocating it when you need to "drop" it. However, in times of serializing data, we don't need to "own" | ||
the string if we just want to read the data from it. That is why we just need to accept references. References | ||
are pointers in Rust, which an extra guarrantee that the data it points to is valid. | ||
|
||
Why is this important? Well, this is important because `prost` only supports owned buffers. This would be quite | ||
limited, as every time when someone wants to encode bytes but they only have a shared reference, they would need | ||
to allocate a new chunk of memory and copy the data the reference points to to the newly allocated memory. `otopr` | ||
removes the need of doing this. | ||
|
||
#### Preencoded field tags | ||
|
||
Protocol buffers records fields in a message with field tags (the wire type and the field number combined) followed | ||
by the value. | ||
|
||
Rust has a powerful compile-time function evaluation (CTFE) engine. With the type of a field and the field number, | ||
we can preencode the field tags to compile-time arrays. Ideally, this can reduce runtime overhead to just a memcpy. | ||
|
||
Similiar things can be done when decoding. We know the field tags beforehand, so we just read it and `match` (Rust | ||
equivalent of the `switch` statement) on it. | ||
|
||
I did not have time to write benchmarks to prove that `otopr` is indeed faster than `prost`, but I am confident with | ||
the reasons above. | ||
|
||
### Design choices | ||
|
||
`flow.rs` wants to ensure that users will be able to write code that are readable, performant, and non-repetitive. | ||
|
||
Consider an example to run a script: | ||
|
||
This is how to do it with `flow-rust-sdk`: | ||
|
||
```rust | ||
use std::error::Error; | ||
|
||
use flow_rust_sdk::*; | ||
|
||
#[tokio::main] | ||
async fn main() -> Result<(), Box<dyn Error>> { | ||
let mut connection = FlowConnection::new("http://access.mainnet.nodes.onflow.org:9000").await?; | ||
|
||
let script = b" | ||
pub struct User { | ||
pub var balance: UFix64 | ||
pub var address: Address | ||
pub var name: String | ||
init(name: String, address: Address, balance: UFix64) { | ||
self.name = name | ||
self.address = address | ||
self.balance = balance | ||
} | ||
} | ||
pub fun main(name: String): User { | ||
return User( | ||
name: name, | ||
address: 0x1, | ||
balance: 10.0 | ||
) | ||
} | ||
"; | ||
let arguments: Vec<Vec<u8>> = vec![to_vec(&Argument::string("John Doe".to_string())).unwrap()]; | ||
let result: Value = from_slice( | ||
&connection | ||
.execute_script(script.to_vec(), arguments) | ||
.await? | ||
.value, | ||
)?; | ||
println!("Script result: {}", result["value"]); | ||
Ok(()) | ||
} | ||
``` | ||
|
||
There are many problems with this example. | ||
|
||
First of all, I had to look up the mainnet access node URL in order to execute the script on mainnet. There could | ||
be a helper function that connects to the mainnet access node instead of having me to look it up and paste it. | ||
|
||
Secondly, the library doesn't make common uses as easy to write as possible. For example, I have to use | ||
`from_slice`, which is a function from the `serde_json` library that encodes and decodes JSON values. I would be | ||
out of idea if I was a beginner and I want to parse the response. | ||
|
||
Thirdly, *look at how many allocations I need to make*. I had to copy "John Doe" to the heap (w/ `.to_string()`), | ||
I also had to copy *the entire script* to the heap (w/ `to_vec()`). Internally, | ||
[it also copies the URL to the heap](https://github.com/MarshallBelles/flow-rust-sdk/blob/1fc3e89e9dfaa37a50e4fbad86668c1bda57b95b/src/lib.rs#L49) (via `to_owned()`). These allocations can be entirely | ||
avoided. | ||
|
||
Lastly, I can't really make use of the result. It is JSON, not Cadence JSON. This is what I get from running it: | ||
|
||
``` | ||
Script result: {"fields":[{"name":"balance","value":{"type":"UFix64","value":"10.00000000"}},{"name":"address","value":{"type":"Address","value":"0x0000000000000001"}},{"name":"name","value":{"type":"String","value":"John Doe"}}],"id":"s.ccbac9e72ee36be8881671e8939b970bb8bc81fb8cfd695c6fd848cf75248802.User"} | ||
``` | ||
|
||
Consider the `flow.rs` example: | ||
|
||
```rust | ||
use std::error::Error; | ||
|
||
use cadence_json::ValueRef; | ||
|
||
use flow_sdk::access::ExecuteScriptAtLatestBlockRequest; | ||
use flow_sdk::prelude::TonicHyperFlowClient; | ||
|
||
const COMPLEX_SCRIPT: &str = " | ||
pub struct User { | ||
pub var balance: UFix64 | ||
pub var address: Address | ||
pub var name: String | ||
init(name: String, address: Address, balance: UFix64) { | ||
self.name = name | ||
self.address = address | ||
self.balance = balance | ||
} | ||
} | ||
pub fun main(name: String): User { | ||
return User( | ||
name: name, | ||
address: 0x1, | ||
balance: 10.0 | ||
) | ||
} | ||
"; | ||
|
||
#[tokio::main] | ||
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> { | ||
let mut client = TonicHyperFlowClient::mainnet()?; | ||
client.ping().await?; | ||
|
||
let ret = client | ||
.send(ExecuteScriptAtLatestBlockRequest { | ||
script: COMPLEX_SCRIPT, | ||
arguments: [ValueRef::String("John Doe")], | ||
}) | ||
.await? | ||
.parse()?; | ||
|
||
println!("{:#?}", ret); | ||
|
||
Ok(()) | ||
} | ||
``` | ||
|
||
I can just use `::mainnet()` to connect to the mainnet access node, I can just use `.parse()` to parse the | ||
response, and I don't have to make any allocations, at least on my side. | ||
|
||
To be honest, sometimes allocations cannot be avoided. When encoding `ValueRef`s the library needs to allocate a | ||
JSON buffer and then write into the protobuf writer. The library also needs to allocate a buffer to encode the | ||
protobuf message before sending it to the network. But the goal is not to have no allocations at all, but to | ||
*eliminate unnecessary allocations*. In this example, only two allocations had to be made from `flow.rs`, one for | ||
encoding the argument, and another for encoding the whole message. | ||
|
||
And lastly, the result is much more clean: | ||
|
||
```rust | ||
Struct { | ||
id: "s.ccbac9e72ee36be8881671e8939b970bb8bc81fb8cfd695c6fd848cf75248802.User", | ||
balance: 10.00000000, | ||
address: 0x0000000000000001, | ||
name: "John Doe", | ||
} | ||
``` | ||
|
||
We have a cadence JSON library, and it handles UFix64 parsing and address parsing before returning to the user. | ||
|
||
### Security and dependency choices | ||
|
||
`flow.rs` has a number of carefully chosen dependencies. | ||
|
||
secp256k1 signing is provided by the `secp256k1` crate, which is an FFI wrapper around [libsecp256k1], | ||
a C library by Pieter Wuille that is used in many bitcoin projects. | ||
|
||
sha3 hashing is provided by the [`tiny-keccak`] crate, which claims to have [better performance] than | ||
the `sha3` crate used by `flow-rust-sdk`. | ||
|
||
[libsecp256k1]: https://github.com/bitcoin-core/secp256k1/ | ||
[`tiny-keccak`]: https://github.com/debris/tiny-keccak | ||
[better performance]: https://github.com/debris/tiny-keccak#benchmarks | ||
|
||
`flow-rust-sdk` has adopted their own fork of `p256`, this has two problems: 1. The `p256` crate is not production ready. It was never audited for | ||
security and thus has potential issues. 2. Maintaining a fork can be a burden and it makes it harder to recieve security updates from upstream. | ||
|
||
### Conclusion | ||
|
||
Competition fosters development. I respect my competitor and hope they take my feedback to improve. |