-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve provided handling #381
Improve provided handling #381
Conversation
coordinator/src/tributary/scanner.rs
Outdated
@@ -114,12 +117,38 @@ pub(crate) async fn handle_new_blocks< | |||
processors: &Pro, | |||
publish_serai_tx: PST, | |||
spec: &TributarySpec, | |||
tributary: &TributaryReader<D, Transaction>, | |||
tributary: &Arc<Tributary<D, Transaction, P>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need to take &Arc
. &Arc::new(x)
should deref to &x
.
I'm unsure the impact of this (taking the Tributary) on latency. It shouldn't be atrocious with the recent localization I did, yet even if it's perfectly fine (which I'd hope), it shouldn't be necessary. We should add a method to the reader to check if an on-chain provided transaction was provided into the tributary ourselves yet.
IIRC, Provided
are perfectly ordered within some queue, yet aren't guaranteed to have distinct hashes. Accordingly, on local provision, you'd save Provided queue i hash
. Then, on-chain when you see a Provided, you'd check Provided queue i hash
.
There is a pseudo-unsafety here, where Provided queue i
is insufficient as what other validators came to may be distinct from what we came to. Under this, we'd likely enter a wait loop. The question is if the tributary should return an error in such a case. It likely should.
coordinator/src/tributary/scanner.rs
Outdated
return; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an O(n^2) algorithm which breaks safety. Provided are expected to be ordered, yet this disregards order.
IF we were going to have this unsafe algorithm, though no, we do need to keep the ordering checks, we could at least make it O(n) with a HashSet...
coordinator/src/tributary/scanner.rs
Outdated
let h: [u8; 32] = hash.try_into().unwrap(); | ||
h == tx_hash | ||
}) { | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure if this should be a return or a break. I believe a break.
txn.commit(); | ||
} | ||
|
||
pub(crate) fn provided_transactions(&self) -> Vec<u8> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it's irrelevant, due to my above commentary on how this should be handled, we shouldn't return a raw byte buffer with no meaning no context. The Tributary APIs should be self contained and safe, meaning this should've at least been a Vec<[u8; 32]>
.
self.verify_block::<N>(block, schema)?; | ||
// verify the block, ignore the NonLocalProvided error. | ||
let res = self.verify_block::<N>(block, schema); | ||
if res.as_ref().is_err_and(|e| !matches!(e, BlockError::NonLocalProvided(_))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unsafe as a NonLocalProvided
error may be raised midway through. If it is, the block would pass as valid here, despite half of the checks not actually executing.
We need to pass in a flag to not perform these checks specifically, yet still perform all other checks, and check that doesn't have any error.
self.provided.complete(&mut txn, order, tx.hash()); | ||
} | ||
// We used to call the provided.complete() here | ||
// but that is now moved to coordinator scanner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The coordinator
should not be acknowledge by the tributary
, a self-contained piece of code.
@@ -796,6 +796,9 @@ impl<N: Network + 'static> TendermintMachine<N> { | |||
if self.block.log.has_consensus(self.block.round().number, Data::Prevote(Some(block.id()))) { | |||
match self.network.validate(block).await { | |||
Ok(_) => (), | |||
// BlockError::Temporal is due to us not having the locally provided tx yet, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tendermint should not acknowledge Tributary.
Though yes, the logic
"If there's a temporal error yet no one else has it, we assume it's our fault"
is perfectly sound :)
ef24b82
to
d083b1e
Compare
1) Uses a single TXN in provided 2) Doesn't continue on non-local provided inside verify_block, skipping further execution of checks 3) Upon local provision of already on-chain TX, compares
3a4f7de
to
54e8a9b
Compare
Will merge upon CI passing :) |
Instead of preventing a node adding a tributary block that contains provided tx(s) that doesn't locally available, we add the block to tributary(since it would be acknowledged the by the network anyway) and let the node participates in the consensus as usual. But we stop handling(scanning) the block by the coordinator until it has the provided txs in the block locally available.
So provided tx completion now happens on coordinator scanner instead of within the tributary.
Tackles #366.