Deallocation scheduling for validators #394

Closed
kayabaNerve opened this issue Oct 12, 2023 · 5 comments
Labels
discussion This requires discussion runtime

Comments

@kayabaNerve
Member

kayabaNerve commented Oct 12, 2023

For a small network secured by just 4 validators, the following risk exists:

  1. Validators flood the pool with XYZ which doesn't actually exist, driving the price down until an unbounded amount of XYZ is valued at 1 SRI.
  2. Validators deallocate due to the now reduced stake requirement.
  3. Validators walk away with both the XYZ and the SRI.

Serai's continued security, having enough stake in the form of SRI w.r.t. lost XYZ, is premised on the handover protocol. If the prior validators are malicious, regardless of whether it's detected, the new validators will either refuse the handover or accept the handover and maintain the SRI needed.

Under the above flow however, the new validators accept the handover with an artificially low SRI requirement, breaking security.

The naive solution is to look at the highest valuation for the token ever created, and always use that in stake requirement valuations. That way draining the pools doesn't decrease the SRI requirement.

The next iteration is to use the highest valuation over a single session, which does offer some reflexivity. For a malicious validator set at session n, and a cooperating malicious validator set at session n+1, this would prevent n+1 from removing their SRI until n+2. Please note this scheme can be extended from a single session to a window of n sessions.
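As a rough illustration of the per-session peak idea, here is a minimal Python sketch. The class name `PeakValuation`, the `sessions_back` parameter, and the choice to value XYZ at the peak across the current session plus the preceding one are all assumptions made for illustration, not anything taken from the Serai codebase.

```python
from dataclasses import dataclass, field


@dataclass
class PeakValuation:
    """Tracks the highest XYZ price (in SRI base units) seen in each session.

    Hypothetical helper: names and structure are illustrative only.
    """
    peaks: dict = field(default_factory=dict)

    def observe(self, session: int, price: int) -> None:
        # Only ever raise the recorded peak for a session.
        self.peaks[session] = max(self.peaks.get(session, 0), price)

    def valuation(self, session: int, sessions_back: int = 1) -> int:
        # Assumption: value XYZ at the highest peak across the current
        # session and the preceding `sessions_back` sessions.
        first = session - sessions_back
        return max(self.peaks.get(s, 0) for s in range(first, session + 1))


tracker = PeakValuation()
tracker.observe(0, 100)  # honest price during session n
tracker.observe(0, 1)    # pool drained later in session n
tracker.observe(1, 1)    # session n+1 only ever sees the drained price
tracker.observe(2, 1)    # session n+2
```

With these observations, `tracker.valuation(1)` is still 100, so set n+1 cannot deallocate against the drained price, while `tracker.valuation(2)` falls to 1.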

This would require either:

  1. Other nodes to step in (such as the Serai validators censoring their TXs, preventing the handover, until a network upgrade)
  2. Other actors with economic stake to sign up to be in n+2, preventing the handover, trapping the SRI stake of n+1

What's likely best is for a supermajority of nodes to be able to pause handovers for any external network. This gives time to perform a network upgrade. The only question is how much time would be needed for a supermajority of nodes to decide to pause handovers?

A single session, or one week, should be fine.

Then there's the observation that a validator can decide to remove themselves from the validator set a block before the next validator set is decided, and in doing so, make a large amount of SRI suddenly available. This is compounded by an edge case for the Serai validator set itself: validator-sets tracks the upcoming set in InSet, not the current set, due to when pallet-session wants validators decided. This leads to a currently present issue within validator-sets where the current Serai validators are not considered active.

If we only let the prior validator set (n-1) actually reclaim their SRI once validator set n+1 has been handed over to, this would extend the time period for the above discussion to two sessions (two weeks) and partially correct the current issue within the validator-sets pallet. It would also ensure notice before large portions of SRI become unlocked.

It only partially corrects the current issue because it only retains the SRI stake until the current set retires (fixing the immediate issue, yet not giving any response time to issues raised before the set retired). For the full flow, written out:

  1. The next validator set is decided, and becomes InSet
  2. The next validator set becomes current, deciding the next validator set, and the new next set becomes InSet
  3. The current validator set retires, and the new next set becomes current, with the decision making for the new new next set
  4. With the new new next set, the first next set is relatively n-2. Since we lock SRI from n-1 until n+1 has been handed over, which is also a delta of 2, the first next set (which became the current set and is presumably malicious) has their SRI unlocked.
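The unlock rule described above (set n-1 may reclaim only once set n+1 has been handed over to) reduces to a delta-of-2 check. A minimal sketch, where `can_reclaim` and its parameters are hypothetical names rather than anything from the validator-sets pallet:

```python
def can_reclaim(set_session: int, latest_handed_over_to: int, delay: int = 2) -> bool:
    """Return True once a retired set's SRI may be unlocked.

    `set_session` is the session the set in question was current for, and
    `latest_handed_over_to` is the most recent session whose set has received
    the handover. Both names are illustrative assumptions.
    """
    return latest_handed_over_to >= set_session + delay


# Set n-1 = 4: locked while only set 5 has been handed over to,
# unlocked once set n+1 = 6 has been handed over to.
locked = can_reclaim(4, 5)
unlocked = can_reclaim(4, 6)
```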
@kayabaNerve kayabaNerve added critical This is critical discussion This requires discussion runtime labels Oct 12, 2023
@kayabaNerve
Member Author

Considering there are a few tasks here:

  1. Preventing deallocations if the deallocations take the current set below the requirement given by the prior set's peak price
  2. Preventing hand-overs upon vote
  3. Potentially delaying deallocations to give advance notice
  4. Fixing the handling of whether or not Serai validators are active according to the validator-sets pallet

This likely deserves to be broken into 2 or 3 issues.

kayabaNerve added a commit that referenced this issue Oct 12, 2023
…lusion starts, plus a one session cooldown period

Part of #394.
@kayabaNerve kayabaNerve removed the critical This is critical label Oct 12, 2023
@kayabaNerve
Member Author

Removing critical now that Serai validators can't deallocate while active.

This leaves:

  1. Preventing deallocations if the deallocations take the current set below the requirement given by the prior set's peak price
  2. Preventing hand-overs upon vote

@kayabaNerve
Member Author

The ability to drain the SRI used as liquidity does create a stake requirement:

((sri_liquidity + xyz_value) * 1.5) + margin

Which I just want noted.
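For concreteness, the formula can be evaluated with integer arithmetic as in the sketch below. The function name, the rounding up, and treating all amounts as integers denominated in SRI base units are assumptions for illustration.

```python
def stake_requirement(sri_liquidity: int, xyz_value: int, margin: int) -> int:
    """((sri_liquidity + xyz_value) * 1.5) + margin, in integer SRI units.

    The 1.5 multiplier is applied as 3/2 and rounded up (an assumption) so the
    requirement is never understated.
    """
    at_risk = sri_liquidity + xyz_value
    scaled = (at_risk * 3 + 1) // 2  # ceil(at_risk * 1.5)
    return scaled + margin


# 100 SRI of liquidity against XYZ worth 100 SRI, before any margin.
requirement = stake_requirement(100, 100, 0)
```

Here `requirement` is 300, i.e. 300 SRI plus whatever margin is chosen.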

@kayabaNerve
Member Author

Draining the SRI present as liquidity would be done via improper mintage of sriXYZ, hence #402 detailing a mechanism to prevent improper mintage of sriXYZ.

Alternatively, we can define all mintage of sriXYZ as acceptable so long as the validators' stake can be used to acquire that much XYZ upon slash. While validators could mint sriXYZ to drain the SRI liquidity, the increase in their stake requirement from minting the sriXYZ would exceed the SRI swapped for, making this attack unprofitable, and enabling reducing stake requirements to just the XYZ value.

Then the issue is determining the price of XYZ. If the quote price was used, then improper mintage of sriXYZ (which would reduce the quote price when swapped to SRI) would enable further mintage, making this scheme ineffective.

The highest price seen across a validator set could be used, yet this enables a DoS where a malicious attacker spikes the price (causing the on-chain oracle to always report the spiked price), preventing legitimate mints in the future.

I propose the highest sustained price. The oracle would provide security so long as:

  1. Legitimate peak prices are either sustained on-chain for some amount of time or only trivially higher than the highest sustained price
  2. Upon malicious actions causing the oracle to stop functioning (censorship, improperly minted sriXYZ being swapped artificially driving the quote price down), the value of XYZ does not non-trivially exceed the highest observed sriXYZ value

It would only allow the aforementioned DoS if a malicious attacker can artificially inflate the quote price and maintain it against arbitrageurs for several blocks. While the Serai validator set can trivially do this, they can already censor transactions from the XYZ set (causing them to suffer a DoS).

Triviality is defined by whatever margin is used in the stake requirement. A non-trivial divergence is any divergence which causes the margin to be ineffective and the validator set to lose its economic security.

This can be efficiently implemented on-chain via:

  • With each swap, tracking if it's the highest quote price for that block
  • Using a StorageMap whose keys are amounts, stored as big endian byte sequences with no hashing, to store the amounts from the most recent blocks in the window

This adds a flat cost to each swap and a flat cost to each new block (setting its storage value, clearing the storage value for the block now outside the window of time used).

The highest sustained price for the sliding window is simply the amount stored in the first key present in the map, when iterating in lexicographic order (as offered by Substrate with logarithmic performance). It is the lowest price observed over the window of blocks, and accordingly the highest price observed or exceeded by all of the blocks. If this exceeds the prior highest sustained price, it becomes the oracle value.

On new set, we would clear the oracle price (resetting it to the lowest price observed by the current window) to ensure responsiveness (the stake requirements don't go only up and can go back down).
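The oracle described above can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the pallet's implementation: the class and method names are hypothetical, a `deque` with `min()` stands in for the ordered StorageMap, and a block without swaps is assumed to carry forward the last quote price.

```python
from collections import deque


class SustainedPriceOracle:
    """Highest sustained price over a sliding window of blocks (sketch)."""

    def __init__(self, window: int):
        self.window = window
        self.block_highs = deque(maxlen=window)  # per-block highs, oldest first
        self.current_high = None  # highest quote seen in the block in progress
        self.last_quote = 0       # carried forward for blocks without swaps
        self.oracle_price = 0

    def on_swap(self, quote_price: int) -> None:
        # Flat per-swap cost: track the block's highest quote price.
        self.last_quote = quote_price
        if self.current_high is None or quote_price > self.current_high:
            self.current_high = quote_price

    def sustained(self) -> int:
        # Lowest per-block high in the window: the highest price which every
        # block in the window met or exceeded.
        return min(self.block_highs) if self.block_highs else 0

    def on_block_end(self) -> None:
        high = self.last_quote if self.current_high is None else self.current_high
        self.block_highs.append(high)  # deque drops the block leaving the window
        self.current_high = None
        if len(self.block_highs) == self.window:
            self.oracle_price = max(self.oracle_price, self.sustained())

    def on_new_set(self) -> None:
        # Reset so the stake requirement can come back down.
        self.oracle_price = self.sustained()


oracle = SustainedPriceOracle(window=3)
for price in (10, 12, 11, 50):  # one swap per block, ending in a one-block spike
    oracle.on_swap(price)
    oracle.on_block_end()
```

After these four blocks `oracle.oracle_price` is 11 rather than 50: the single-block spike is not sustained across the window, so it does not move the oracle.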

Alternatively, a TWAP could be constructed, offering more responsiveness at the cost of a more complicated security analysis and less security.


Please also note the above formula, ((sri_liquidity + xyz_value) * 1.5) + margin, is largely invalid in practice if the quote price is used, as it ignores the additional value capturable by effectively scamming arbitrageurs. For a pool of 100 SRI to 100 sriXYZ (creating a 300 SRI + margin stake requirement), improperly minted sriXYZ can be created to drain the SRI while only minimally increasing the stake requirement (as the quote price will trend to 0 as the swaps continue). This sets a bound that 67% of the stake requirement previously enforced (and still slashable due to deallocation scheduling) must be less than potential profit (ignoring the margin) in order to ensure this attack isn't profitable. This formula is also oblivious to how, as improper sriXYZ is minted and swapped, arbitrageurs will swap back, making even more SRI available as liquidity to drain. The margin would not only have to handle price fluctuations, but also all potential/likely swaps by arbitrage bots in response to the sriXYZ being sold for an amount significantly less than expected.
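To make the drain concrete, the sketch below assumes a fee-less constant-product pool (`x * y = k`). That pricing model is an assumption used purely for illustration, as are the function name and the 900 sriXYZ figure; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def swap_xyz_for_sri(sri_reserve: int, xyz_reserve: int, xyz_in: int):
    """Swap `xyz_in` sriXYZ into a fee-less constant-product pool.

    Returns (sri_out, new_sri_reserve, new_xyz_reserve).
    """
    k = sri_reserve * xyz_reserve
    new_xyz = xyz_reserve + xyz_in
    new_sri = Fraction(k, new_xyz)  # reserves must still multiply to k
    return sri_reserve - new_sri, new_sri, new_xyz


# 100 SRI to 100 sriXYZ pool; 900 improperly minted sriXYZ is swapped in.
sri_out, sri_left, xyz_left = swap_xyz_for_sri(100, 100, 900)
quote_price = sri_left / xyz_left  # SRI per sriXYZ after the swap
```

Under these assumptions the attacker extracts 90 of the 100 SRI while the quote price falls from 1 to 1/100, so any stake requirement valued at the quote price barely registers the newly minted sriXYZ.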


Such an oracle removes the benefit of #402. While #402 prevents improper minting, this oracle makes improper minting economically unprofitable. Accordingly, with this oracle being implemented, #402 can be closed.

kayabaNerve added a commit that referenced this issue Nov 25, 2023
Relevant to #394.

Prevents hand-over due to hand-over occurring via a `Batch` publication.

Expects a new protocol to restore functionality (after a retirement of the
current protocol).
@kayabaNerve
Member Author

All aforementioned tasks are now resolved.
