-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Node Validator API #1771
Add Node Validator API #1771
Conversation
The `node-metrics` is designed to provide information about the block chain and the nodes connected to the block chain network. The name of the crate was suggested by Charles so that it might be used by other systems. However, in general it is intended to be used to serve information for the Node Validator Dashboard. This crates defines various messages and logic for consuming and distributing information about the Espresso Block Chain and the nodes connected to the Espresso Sequencer network. The current implementation is a basic implementation in an effort to get the simplest setup working. At the moment, there are two issues that prevent anything from building. - The first is that the `Sink` implementation of `Connection` from `tide-disco` requires the message being send to bea borrowed value instead of the value itself. I feel that this is unexpected, and it makes it difficult to actually send messages as a passthrough. - The second is that the `sequencer` crate is being consumed in an effort to extract the `SeqTypes` type. By doing this, the idea is that we should be able to use the actual types defined by the sequencer so that we are able to consume the exact information we will be given in a production environment. However, due to this import the tests that have been written cannot be evaluated. I believe that this means that this cannot work as expected, and will need to be rethought.
Remove bad test from main
Add more types for Server and Client message connection Refactor thread based blocking processing to async runtime processing Add several tests to ensure correct behavior
Rename some local variables in tests for better clarity
There will be a need to attempt to be resilient and be able to recover from certain classes of errors. As such it is helpful to be able to create a Stream from a given Client as needed. This will allow for future changes to be able to re create the stream in the event of a disconnection (which will happen).
By default `BitVec`s are `usize` in their representation. This works great for systems that are able to decode `u64` types from a JSON representation, but for languages that have a unified number system, such as Javascript, these are not representable accurately when decoded. In order to support these cases it is easier to just have the `BitVec` represent it's data in a format that is supported. `u32` could be used here, but for maximum compatibility, I've chosen to use `u16` instead.
In order to retrieve the StakeTable it is beneficial to have a function implementation to retrieve the StakeTable from a Sequencer.
The Node Identity stored in DataState was stored as a tuple of keys to `NodeIdentity`. However, `NodeIdentity` already stores the public key, so it is unnecessary, especially following the update where the `NodeIdentity` itself has every other field as being optional.
Due to the potential difficulty there can be in knowing your standing IP Address on cloud platforms, the requirement of knowing the ip address of a node has been replaced with a public url instead. This public url should be the base url for an API endpoint for the sequencer.
We intended to be able to retrieve Node Identity information from the prometheus metrics. As such we need a function that can handle this interaction. Using the plugin for `prometheus-parse`, and `reqwest` we can retrieve the data given a valid base URL to work with. From there we can parse the data that is available in the resulting `Scrape` object to fill in the missing pieces of Node Identity information.
A wallet address is able to be decoded from a string using the `FromStr` trait.
The key "latitude" is used for both latitude and longitude values. This corrects the mistake by using "longitude" for longitude. Fixes an off-by-one error.
The Wallet address is meant to be a 160 bit address, which is 20 bytes. So it's hexidecimal representation should be 40 characters instead of the 32 that have been being set up to this point. e
The logic for verifying the node identity in a Scrape, and then the data's population is a little large and daunting to consume all at once. It's better to split these pieces into separate functions so that the related parsing pieces are together.
The test `test_process_client_handling_stream_subscribe_voters` never completes and just runs forever. The reason for this is due to a flaw in the creation of the test itself. The test is intended to check on the real-time submission of voters data, and the distribution of that data to the subscribers. However, the setup has the users subscribe to `NodeIdentity` updates instead of `Voters` updates. To fix the issue we just need to change the subscribe calls to refer to voters instead.
In order to verify the behavior of the ConnectedNetwork under various assumptions specific scenario unit tests are desired in order to ensure that they behave as expected. Fix external event handler imports
sequencer/src/lib.rs
Outdated
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_NODE_NAME").unwrap_or("".into()), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_WALLET_ADDRESS").unwrap_or("".into()), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_COMPANY_NAME").unwrap_or("".into()), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_COMPANY_WEBSITE").unwrap_or("".into()), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_OPERATING_SYSTEM").unwrap_or("".into()), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_NODE_TYPE") | ||
.unwrap_or(format!("espresso-sequencer {}", Ver::VERSION)), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_NETWORK_TYPE").unwrap_or("".into()), | ||
]); | ||
|
||
// Expose Node Identity Location via the status/metrics API | ||
metrics | ||
.text_family( | ||
"node_identity_location".into(), | ||
vec!["country".into(), "latitude".into(), "longitude".into()], | ||
) | ||
.create(vec![ | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_COUNTRY_CODE").unwrap_or("".into()), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_LATITUDE").unwrap_or("".into()), | ||
std::env::var("ESPRESSO_SEQUENCER_IDENTITY_LONGITUDE").unwrap_or("".into()), | ||
]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of reading these directly from the environment, it would be better to get them from the clap options, that way they show up in the auto-generated help. For example, we could create a struct like
#[derive(clap::Parser)]
struct Identity {
#[clap(long = "identity-company-name", env = "ESPRESSO_SEQUENCER_IDENTITY_COMPANY_NAME")]
company_name: Option<String>,
// etc
}
and then add to Options
:
struct Options {
...
#[clap(flatten)]
identity: Identity,
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in ba8048c.
Though, to take advantage of std::env::consts::OS
, I ended up implementing Default
for the Identity
structure, and taking advantage of
#[clap(skip)]
to customize how it gets populated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: instead of #[clap(skip)]
/default
, try #[clap(..., default_value = std::env::consts::OS)]
on the operating_system
field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in e2a2180.
Also, apparently the Default::default
call prevented the program from being able to parse all other derived arguments. I'm not certain as to the exact reason for this, but either way by populating using default_value
, and reverting to flatten
, the issue is resolved.
node-metrics/src/api/node_validator/v0/create_node_validator_api.rs
Outdated
Show resolved
Hide resolved
node-metrics/src/api/node_validator/v0/create_node_validator_api.rs
Outdated
Show resolved
Hide resolved
node-metrics/src/api/node_validator/v0/create_node_validator_api.rs
Outdated
Show resolved
Hide resolved
The `InternalClientMessage` enum largely follows the `ClientMessage`, and as such it has many duplicate cases. As pointed out by @jbearer in this comment: #1771 (comment). It makes more sense to wrap the `ClientMessage`s instead of having duplicate cases for them. This also has the added benefit of cleaning up much of boilerplate implementation details. Additionally, this relocates the `PartialEq` implementation into the test module so that it is only defined for unit tests.
The `new` call for `DataState` takes a `stake_table` as part of its arguments. Currently, the `DataState` gets called with `Default::default()` arguments, and then the `stake_table` is replaced within it. This is unnecessary as one of the next steps of initialization is to retrieve the `stake_table` from the `stake_table_url_base` url. Instead this should just defer the creation of the `DataState` until after the `stake_table` is retrieved. Suggested by @jbearer in this discussion: #1771 (comment)
@jbearer has noted that the `leaf_chain` within the `Decide` hotshot event returns the highest `Leaf` first: #1771 (comment) The leaf sender is really wanting the leafs to arrive in ascending order. In order to achieve this, the iterator for the `leaf_chain` has been reversed.
As pointed out by @jbearer the documentation for the `node-validator` API was largely copied and then not modified to reflect the specific intentions and features of the `node-validator` endpoints: #1771 (comment) #1771 (comment) Modifies the documentation to reflect the purpose and intention behind the `node-validator` API.
As pointed out by @jbearer, the `drop` calls at the end of the block are not needed: #1771 (comment)
Proposed by @jbearer in comment: #1771 (comment)
It is not clear what valid values for `network_type` and `node_type` is or should be. In order to add clarity comments have been added to these two fields to provides some ideas for what could be valid values for these fields. Change suggested by @jbearer: #1771 (comment)
Add `create_roll_call_response` to handle message serialization Suggested by @jbearer: #1771 (comment) Replace dummy public key with actual public key Suggested by @jbearer: #1771 (comment) Replace `Vec` of `JoinHandle` and `Drop` code with `TaskList` suggested by @jbearer: #1771 (comment)
Based on feedback provided by @jbearer: #1771 (comment) #1771 (comment) Instead of querying all of the environment variables separately, and dynamically after the program has launched, we can take advantage of clap's ability to automatically populate the desired data points from either command line arguments or environment variables. As an added benefit the fields can be grouped together into a single `Identity` structure.
Based on feedback discussion with @jbearer: #1771 (comment) #1771 (comment) In the cases where we are no longer able to make meaningful progress, and in an effort to be better about not failing silently and then having our data slowly stagnate over time, these early task returns have been replaced with panics for improved failure indicators.
From discussion with @jbearer #1771 (comment) Based on the converesation with Jeb the default value can be utilized to populate the operating system value from the environment directly. While implementing this fix, it was also discovered that the previous way of populating the Identity using `Default::default` actually prevented the program from running at all due to bad initialization. I haven't determined the exact reason, but by switching back to default_value population, we can revert to using the `flatten` option, and the issue seems to be fixed as a result.
Discussion with @jbearer: #1771 (comment) Right now we don't have a meaningful definition of what an address for a Sequencer is. As a result it makes more sense to omit the wallet address field rather than to keep it in an effort to create a forward compatible definition. This commit removes the definition, population, and representation of the wallet address from both the sequencer, and from the node-validator api. This will require a change in the Node Validator UI, as it expects a wallet_address to be present.
Add /v0/ to node-validator in docker-compose.yaml Fix `depends_on` conditions in docker-compose.yaml
The task joins were causing the tests to fail due to the closed channels resulting in a panic. Since this is desired behavior, the tests for them have been removed from the tests so as not to cause failures due to desired panic behavior.
Closes #1778
This PR:
Provides a versioned implementation of the Node Validator API. This API is meant to be used in conjunction with the Node Validator UI that has been designed on the Block Explorer.
The create
node-metrics
was chosen overnode-validator
based on a recommendation from @clu8, as the crate and systems may be utilized without the node-validator aspects.Key places to review:
node-metrics/src/service/data_state/mod.rs
node-metrics/src/service/client_state/mod.rs
Usage example test is located in
node-metrics/src/api/node_validator/v0/create_node_validator_api.rs