# `Flow.rs`/`flow-sdk` Milestone 4

Since this is the last milestone, this document serves to show what `flow-sdk` has to offer, and how the
library differs from `flow-rust-sdk` by MarshallBelles.

## How is `flow-sdk` different?

`flow-rust-sdk` uses `prost` for protobuf encoding and decoding. `flow.rs` uses `otopr`, which is a library I
built from scratch. `otopr` is not capable of generating Rust sources from .proto definitions, but it works
better than `prost` in some ways.

### `otopr`: The protobuf library that focuses on speed and flexibility

`otopr` has a `derive` macro, which can automatically generate implementations for the `EncodableMessage` and
`DecodableMessage` traits. Unlike `prost`, which has one trait for both encoding and decoding, the library
separates types that can be encoded from types that can be decoded. This design choice reduces the
amount of copying needed when encoding `string`s or `bytes`.
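
A sketch of what this split can look like in practice. The trait names come from the description above, but the
derive usage, field layout, and type names below are assumptions for illustration, not necessarily `otopr`'s exact
syntax:

```rust
use otopr::{DecodableMessage, EncodableMessage};

// Encoding-side message: borrows its data, so building it copies nothing.
// (Field-number attributes are omitted; the exact derive syntax is assumed.)
#[derive(EncodableMessage)]
pub struct ScriptRequest<'a> {
    pub script: &'a str,
    pub arguments: &'a [&'a str],
}

// Decoding-side message: owns its buffers, because decoded data has to
// outlive the network buffer it was read from.
#[derive(DecodableMessage)]
pub struct ScriptResponse {
    pub value: Vec<u8>,
}
```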

For example, in Rust, a string literal like `"Hello world"` in the source code has the type `&'static str`. The
`'static` lifetime indicates the string is valid for the entirety of the program. On common targets, literals
are compiled into a known location in the binary output.

However, when you try to encode a `&str` in a message with `prost`, you will find that it is quite hard,
because `prost` does not have good support for encoding references. `otopr` was built from scratch
with support for encoding references in mind.

#### Owned buffers vs shared buffers

Rust has a very interesting concept of ownership. When you "own" a string or a byte vector, you are responsible
for deallocating it when you "drop" it. However, when serializing data, we don't need to "own"
the string if we only want to read from it; accepting references is enough. References
are pointers in Rust, with an extra guarantee that the data they point to is valid.

Why is this important? Because `prost` only supports owned buffers. This is quite
limiting: every time someone wants to encode bytes but only has a shared reference, they need
to allocate a new chunk of memory and copy the data the reference points to into it. `otopr`
removes the need for this copy, as the sketch below shows.
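
To make the forced copy concrete, here is a minimal sketch using `prost`'s derive syntax (the message and field
names are made up for illustration): because string and bytes fields must be owned, a caller holding only
references has to allocate and copy before encoding.

```rust
use prost::Message;

// A prost message must own its buffers; &str / &[u8] fields will not compile.
#[derive(Clone, PartialEq, Message)]
pub struct ScriptRequest {
    #[prost(bytes = "vec", tag = "1")]
    pub script: Vec<u8>,
    #[prost(string, repeated, tag = "2")]
    pub arguments: Vec<String>,
}

fn main() {
    // We already have the data as references...
    let script: &str = "pub fun main() {}";
    let argument: &str = "John Doe";

    // ...but building the message forces a heap allocation and copy of each.
    let req = ScriptRequest {
        script: script.as_bytes().to_vec(),
        arguments: vec![argument.to_string()],
    };
    println!("encoded length: {}", req.encoded_len());
}
```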

#### Preencoded field tags

Protocol Buffers records each field in a message as a field tag (the wire type and the field number combined) followed
by the value.

Rust has a powerful compile-time function evaluation (CTFE) engine. With the type of a field and the field number,
we can preencode the field tags to compile-time arrays. Ideally, this can reduce runtime overhead to just a memcpy.

Similar things can be done when decoding: we know the field tags beforehand, so we just read the tag and `match` (the Rust
equivalent of the `switch` statement) on it.
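
As a generic sketch of the technique (not `otopr`'s actual code), a field tag can be computed entirely at compile
time with a `const fn`, and the same constant can be used as a `match` pattern when decoding:

```rust
// Protobuf field tag = (field_number << 3) | wire_type.
// For field number 1 with wire type 2 (length-delimited), that is 0x0A.
const fn field_tag(field_number: u32, wire_type: u32) -> u32 {
    (field_number << 3) | wire_type
}

// Evaluated by the compiler; encoding at runtime is just writing this value.
const SCRIPT_FIELD_TAG: u32 = field_tag(1, 2);

fn main() {
    assert_eq!(SCRIPT_FIELD_TAG, 0x0A);

    // Decoding can `match` on the same compile-time constant:
    let read_tag = 0x0Au32;
    match read_tag {
        SCRIPT_FIELD_TAG => println!("found the script field"),
        _ => println!("unknown field, skip it"),
    }
}
```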

I did not have time to write benchmarks to prove that `otopr` is indeed faster than `prost`, but I am confident in it
for the reasons above.

### Design choices

`flow.rs` wants to ensure that users are able to write code that is readable, performant, and non-repetitive.

Consider an example to run a script:

This is how to do it with `flow-rust-sdk`:

```rust
use std::error::Error;

use flow_rust_sdk::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let mut connection = FlowConnection::new("http://access.mainnet.nodes.onflow.org:9000").await?;

let script = b"
pub struct User {
pub var balance: UFix64
pub var address: Address
pub var name: String

init(name: String, address: Address, balance: UFix64) {
self.name = name
self.address = address
self.balance = balance
}
}

pub fun main(name: String): User {
return User(
name: name,
address: 0x1,
balance: 10.0
)
}
";
let arguments: Vec<Vec<u8>> = vec![to_vec(&Argument::string("John Doe".to_string())).unwrap()];
let result: Value = from_slice(
&connection
.execute_script(script.to_vec(), arguments)
.await?
.value,
)?;
println!("Script result: {}", result["value"]);
Ok(())
}
```

There are many problems with this example.

First of all, I had to look up the mainnet access node URL in order to execute the script on mainnet. There could
be a helper function that connects to the mainnet access node instead of requiring me to look the URL up and paste it.

Secondly, the library doesn't make common use cases as easy to write as possible. For example, I have to use
`from_slice`, a function from the `serde_json` library that decodes JSON values. If I were a beginner trying to
parse the response, I would have no idea where to start.

Thirdly, *look at how many allocations I need to make*. I had to copy "John Doe" to the heap (w/ `.to_string()`),
and I also had to copy *the entire script* to the heap (w/ `to_vec()`). Internally,
[it also copies the URL to the heap](https://github.com/MarshallBelles/flow-rust-sdk/blob/1fc3e89e9dfaa37a50e4fbad86668c1bda57b95b/src/lib.rs#L49) (via `to_owned()`). These allocations can be entirely
avoided.

Lastly, I can't really make use of the result. It is untyped JSON, not a parsed Cadence value. This is what I get from running it:

```
Script result: {"fields":[{"name":"balance","value":{"type":"UFix64","value":"10.00000000"}},{"name":"address","value":{"type":"Address","value":"0x0000000000000001"}},{"name":"name","value":{"type":"String","value":"John Doe"}}],"id":"s.ccbac9e72ee36be8881671e8939b970bb8bc81fb8cfd695c6fd848cf75248802.User"}
```

Consider the `flow.rs` example:

```rust
use std::error::Error;

use cadence_json::ValueRef;

use flow_sdk::access::ExecuteScriptAtLatestBlockRequest;
use flow_sdk::prelude::TonicHyperFlowClient;

const COMPLEX_SCRIPT: &str = "
pub struct User {
pub var balance: UFix64
pub var address: Address
pub var name: String

init(name: String, address: Address, balance: UFix64) {
self.name = name
self.address = address
self.balance = balance
}
}

pub fun main(name: String): User {
return User(
name: name,
address: 0x1,
balance: 10.0
)
}
";

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
let mut client = TonicHyperFlowClient::mainnet()?;
client.ping().await?;

let ret = client
.send(ExecuteScriptAtLatestBlockRequest {
script: COMPLEX_SCRIPT,
arguments: [ValueRef::String("John Doe")],
})
.await?
.parse()?;

println!("{:#?}", ret);

Ok(())
}
```

I can just use `::mainnet()` to connect to the mainnet access node, I can just use `.parse()` to parse the
response, and I don't have to make any allocations, at least on my side.

To be honest, sometimes allocations cannot be avoided. When encoding `ValueRef`s, the library needs to allocate a
JSON buffer and then write it into the protobuf writer. The library also needs to allocate a buffer to encode the
protobuf message before sending it to the network. The goal is not to have no allocations at all, but to
*eliminate unnecessary allocations*. In this example, only two allocations had to be made by `flow.rs`: one for
encoding the argument, and another for encoding the whole message.

And lastly, the result is much cleaner:

```rust
Struct {
id: "s.ccbac9e72ee36be8881671e8939b970bb8bc81fb8cfd695c6fd848cf75248802.User",
balance: 10.00000000,
address: 0x0000000000000001,
name: "John Doe",
}
```

We have a Cadence JSON library, which handles UFix64 and address parsing before returning the value to the user.
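
To illustrate what that parsing involves, here is a hypothetical sketch (not the `cadence_json` implementation):
Cadence JSON encodes a `UFix64` as a decimal string with eight fractional digits, which gets converted into the
underlying fixed-point integer.

```rust
// Hypothetical sketch: convert the Cadence JSON UFix64 string form,
// e.g. "10.00000000", into its fixed-point integer representation
// (value * 10^8). The real cadence_json crate may do this differently.
fn parse_ufix64(s: &str) -> Option<u64> {
    let (int_part, frac_part) = s.split_once('.')?;
    let int: u64 = int_part.parse().ok()?;
    // Cadence JSON always emits exactly eight fractional digits.
    let frac: u64 = frac_part.parse().ok()?;
    Some(int * 100_000_000 + frac)
}

fn main() {
    assert_eq!(parse_ufix64("10.00000000"), Some(1_000_000_000));
}
```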

### Security and dependency choices

`flow.rs` has a number of carefully chosen dependencies.

secp256k1 signing is provided by the `secp256k1` crate, which is an FFI wrapper around [libsecp256k1],
a C library by Pieter Wuille that is used in many Bitcoin projects.

sha3 hashing is provided by the [`tiny-keccak`] crate, which claims to have [better performance] than
the `sha3` crate used by `flow-rust-sdk`.

[libsecp256k1]: https://github.com/bitcoin-core/secp256k1/
[`tiny-keccak`]: https://github.com/debris/tiny-keccak
[better performance]: https://github.com/debris/tiny-keccak#benchmarks
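
For illustration, this is roughly how SHA3-256 hashing looks with `tiny-keccak` (a standalone sketch, not code
taken from `flow.rs`):

```rust
use tiny_keccak::{Hasher, Sha3};

/// Hash an arbitrary byte slice with SHA3-256.
fn sha3_256(data: &[u8]) -> [u8; 32] {
    let mut hasher = Sha3::v256();
    hasher.update(data);
    let mut output = [0u8; 32];
    hasher.finalize(&mut output);
    output
}

fn main() {
    let digest = sha3_256(b"hello");
    println!("{:02x?}", digest);
}
```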

`flow-rust-sdk` has adopted its own fork of `p256`, which has two problems: 1. The `p256` crate is not production-ready; it was never audited for
security and thus has potential issues. 2. Maintaining a fork can be a burden, and it makes it harder to receive security updates from upstream.

### Conclusion

Competition fosters development. I respect my competitor and hope they can use my feedback to improve.