qip-0015: UTXO Trimming #45

Open
wants to merge 1 commit into
base: master
from

Conversation

jdowning100

No description provided.

qip-0015.md Outdated
## Motivation

The Qi UTXO set must be held on the disk of every node in the network in order for nodes to come into consensus, and the set can grow infinitely large. There are currently about [187.3 million UTXOs](https://www.blockchain.com/explorer/charts/utxo-count) in Bitcoin, which, at roughly 70 bytes per UTXO, take up about 13 GB of space on disk.
I fully believe that to build a monetary system for every individual, the UTXO set for a given zone could grow to as large as 20 billion, which is about 200 UTXOs per user for 100 million users per zone. Qi UTXOs are about 57 bytes: 2 bytes for the leveldb prefix, 32 bytes for the txhash, 2 bytes for the index, 20 bytes for the address that may spend the UTXO, and 1 byte for the denomination, plus 1 byte or more for the lock (if applicable). This means that the UTXO set on disk could grow to as large as 1.14 TB.
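
The storage figures in this paragraph can be reproduced with a few lines of arithmetic. The following is only a back-of-the-envelope sketch: the constant names are made up for illustration, the counts are the ones quoted in the text, and decimal units (1 GB = 10^9 bytes) are an assumption.

```python
# Back-of-the-envelope check of the storage figures quoted in the Motivation text.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 TB = 10**12 bytes).

BITCOIN_UTXO_COUNT = 187_300_000   # figure cited in the text
BITCOIN_BYTES_PER_UTXO = 70        # "~70ish bytes" in the text

QI_UTXOS_PER_USER = 200            # author's estimate
QI_USERS_PER_ZONE = 100_000_000    # author's estimate
# leveldb prefix + txhash + index + address + denomination (optional lock excluded)
QI_BYTES_PER_UTXO = 2 + 32 + 2 + 20 + 1

bitcoin_bytes = BITCOIN_UTXO_COUNT * BITCOIN_BYTES_PER_UTXO
qi_utxo_count = QI_UTXOS_PER_USER * QI_USERS_PER_ZONE
qi_bytes = qi_utxo_count * QI_BYTES_PER_UTXO

print(f"Bitcoin UTXO set: {bitcoin_bytes / 10**9:.1f} GB")
print(f"Qi UTXOs per zone: {qi_utxo_count:,}")
print(f"Qi UTXO set per zone: {qi_bytes / 10**12:.2f} TB")
```

Each UTXO that carries a lock adds at least one more byte, so the per-zone total is a lower bound under these estimates.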
Contributor

Need some math to back up your assumptions here. e.g. How do you determine every user needs 200 UTXOs? How did you arrive at 100M users per zone?

Author

I don't think it's worthwhile in this QIP to explain why a user needs a certain amount of UTXOs or how many users we should support. That debate should happen elsewhere. These numbers are reasonable. If you don't think they are reasonable, you may explain why, but again, it should be done in a forum post or something.

Contributor

@wizeguyy wizeguyy left a comment

I have two primary questions:

  1. Why time-based expiration? We were going to do fill-level expiration last we talked.
  2. This still needs a specification section:
  • Most of what you wrote is implementation-specific and doesn't belong in a protocol spec.
  • Some of it is useful as supporting rationale for a protocol design, but that material should go into a "rationale" section.
  • Nowhere do you lay out your proposed protocol changes. Please specify (in the specification section) exactly what data structures and protocol rules you propose to change.

qip-0015.md Outdated

Because the UTXO set can grow to be infinitely large, a well-funded attacker could launch a DDoS attack to grow the UTXO set to a size that is untenable for the average node to store on disk. To prevent this attack, this QIP proposes burning old, unused small-value UTXOs (below 1 Qi), such that the DDoS bloating attack would be exponentially more expensive (10 billion Qi vs 10 million Qi, for example).
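
One way to read the example figures is sketched below. It assumes an attacker who wants 10 billion UTXOs to persist in the set (a count inferred from the example, not stated in the QIP), ignores transaction fees, and takes 1 Qi as the smallest value that is never trimmed. The function and constant names are hypothetical.

```python
# Rough cost of permanently bloating the UTXO set, with and without trimming.
# Assumptions: fees are ignored, and the attacker targets 10 billion persistent
# UTXOs (a hypothetical count inferred from the QIP's example figures).

QITS_PER_QI = 1000  # from the QIP: 1 Qit = 0.001 Qi


def bloat_cost_qi(utxo_count: int, value_per_utxo_qits: int) -> int:
    """Total value, in whole Qi, that must be locked into `utxo_count` UTXOs."""
    return utxo_count * value_per_utxo_qits // QITS_PER_QI


ASSUMED_ATTACK_UTXOS = 10_000_000_000

# Without trimming, the cheapest persistent UTXO is worth 1 Qit.
cost_without_trimming = bloat_cost_qi(ASSUMED_ATTACK_UTXOS, 1)
# With trimming, only UTXOs worth at least 1 Qi persist indefinitely.
cost_with_trimming = bloat_cost_qi(ASSUMED_ATTACK_UTXOS, QITS_PER_QI)

cost_ratio = cost_with_trimming // cost_without_trimming
print(cost_without_trimming, cost_with_trimming, cost_ratio)
```

Under these assumptions the cost rises by a factor of 1,000, which is the ratio between 1 Qi and 1 Qit.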
Contributor

You claim here that the attack gets exponentially more expensive, but I don't see any math later to show that. Could you provide something showing the cost of attack before and after this QIP?

qip-0015.md Outdated

### UTXO Trimming

The following table displays small-value denominations and the timeframe after which each respective denomination should be trimmed (removed from the set). Note that denomination 0 is equivalent to 0.001 Qi or 1 "Qit" (the smallest unit of Qi).
Contributor

Time based expiration? I thought we were going to delete based on fill level. This feels more complicated to implement for a few reasons, but I'm curious why the expiration on fill level was abandoned?

Author

Expiration on fill level would require us to search through the entire set, determine the size each individual denomination takes up, and delete each UTXO of a respective denomination. It is at least an O(n) process.

qip-0015.md Outdated
| 5 | 0.25 Qi | 10 months
| 6 | 0.5 Qi | 12 months

Every block will trim UTXOs created by a block at the depth that approximately corresponds to each timeframe. That means that every block will trim UTXOs of a certain denomination from 6 different blocks. This can be compute-intensive, so it's recommended that the implementation perform the trimming of each block in parallel, which requires a CPU with 7 threads at minimum.
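
The mapping from timeframe to trim depth could be sketched as follows. The block time and month length are hypothetical placeholders (the QIP excerpt does not state them), only the two table rows visible in this excerpt are used, and all names are illustrative.

```python
# Sketch: convert each denomination's timeframe into the block height to trim.
# ASSUMED_BLOCK_TIME_SECONDS and ASSUMED_DAYS_PER_MONTH are hypothetical values.

SECONDS_PER_DAY = 86_400
ASSUMED_DAYS_PER_MONTH = 30
ASSUMED_BLOCK_TIME_SECONDS = 5


def trim_depth(months: int, block_time_seconds: int = ASSUMED_BLOCK_TIME_SECONDS) -> int:
    """Approximate number of blocks produced during `months`."""
    return months * ASSUMED_DAYS_PER_MONTH * SECONDS_PER_DAY // block_time_seconds


def heights_to_trim(current_height: int, timeframes_months: dict,
                    block_time_seconds: int = ASSUMED_BLOCK_TIME_SECONDS) -> dict:
    """Map each denomination to the height whose UTXOs expire at `current_height`."""
    targets = {}
    for denomination, months in timeframes_months.items():
        height = current_height - trim_depth(months, block_time_seconds)
        if height >= 0:  # the chain is not yet old enough to trim otherwise
            targets[denomination] = height
    return targets


# Only the rows visible in this excerpt: denomination -> months.
VISIBLE_TIMEFRAMES = {5: 10, 6: 12}
targets = heights_to_trim(7_000_000, VISIBLE_TIMEFRAMES)
print(targets)
```

Because each denomination resolves to a different height, the per-denomination trims touch disjoint blocks and can be dispatched to separate workers.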
Contributor

This feels cost-prohibitive to change-makers. They'll be sitting on mountains of small-denomination UTXOs and probably operating at razor-thin margins. Now we expect them to rotate all their 1 Qit UTXOs every 2 weeks, all their 5 Qit UTXOs every month, and so on?

Author

@jdowning100 jdowning100 Sep 18, 2024

Change makers (that don't exist yet) will create an implementation that accounts for this and will still be able to provide a service that users pay for. I expect 1 Qit UTXOs to be solely used for fees.

qip-0015.md Outdated

### UTXO Key Organization
Because we must trim blocks at depth, we must store the UTXOs created by each block on disk. The UTXO set is a flat database, not keyed by block hash, but rather keyed by transaction hash and index. One way to do this is to add the block number and denomination into the key as a prefix, and then iterate through all keys on disk with the block number and denomination as the prefix and trim the UTXOs under that prefix.
The issue with this approach is that it requires the sender of a transaction to include the block number and denomination in the input so the UTXO can be looked up for verification. Denomination is easy, as the user knows how much they will spend, but block number is slightly more difficult, as the block height that created a UTXO can change at the tip. A wallet might mistakenly mark that a UTXO was created in a certain block, only for that information to be invalidated 5 minutes later.
Contributor

> but block number is slightly more difficult

This is an understatement. It's practically impossible.

> 5 minutes later

Or 0.1 seconds later.

qip-0015.md Outdated
A different approach is to store the 9 bytes in a different way: store the key of every UTXO created by a block on disk, but shorten the key to 9 bytes, namely the first 8 bytes of the txhash and the 1-byte denomination. This is a separate index that maps block height to shortened UTXO keys. The storage requirements are similar to the prior approach, but the user does not have to provide extra information that takes up bandwidth in transaction propagation. The UTXO index is not necessary, as all indices of a transaction for a certain block and denomination will be trimmed. In addition, only keys for UTXOs of denominations 0 through 6 need to be stored, as all other UTXOs will never be trimmed.
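
A minimal in-memory sketch of this separate index is shown below. It is not the go-quai implementation: `UtxoSet`, `short_key`, and `trim_block` are hypothetical names, and plain dictionaries stand in for the on-disk database and its prefix iteration.

```python
import hashlib
from collections import defaultdict

SHORT_HASH_LEN = 8              # first 8 bytes of the txhash
MAX_TRIMMABLE_DENOMINATION = 6  # denominations 0 through 6 can be trimmed


def short_key(txhash: bytes, denomination: int) -> bytes:
    """9-byte trim key: 8-byte txhash prefix followed by the 1-byte denomination."""
    return txhash[:SHORT_HASH_LEN] + bytes([denomination])


class UtxoSet:
    """Hypothetical in-memory stand-in for the UTXO set and the trim index."""

    def __init__(self) -> None:
        self.utxos = {}                     # (txhash, index) -> denomination
        self.by_prefix = defaultdict(set)   # txhash prefix -> outpoints (emulates a prefix scan)
        self.trim_index = defaultdict(set)  # block height -> short keys created at that height

    def add(self, height: int, txhash: bytes, index: int, denomination: int) -> None:
        outpoint = (txhash, index)
        self.utxos[outpoint] = denomination
        self.by_prefix[txhash[:SHORT_HASH_LEN]].add(outpoint)
        if denomination <= MAX_TRIMMABLE_DENOMINATION:
            # The output index is not stored: every matching output is trimmed.
            self.trim_index[height].add(short_key(txhash, denomination))

    def trim_block(self, height: int) -> int:
        """Remove every UTXO matching a short key recorded for `height`."""
        removed = 0
        for key in self.trim_index.pop(height, set()):
            prefix, denomination = key[:SHORT_HASH_LEN], key[SHORT_HASH_LEN]
            for outpoint in list(self.by_prefix.get(prefix, ())):
                if self.utxos[outpoint] == denomination:
                    del self.utxos[outpoint]
                    self.by_prefix[prefix].discard(outpoint)
                    removed += 1
        return removed


# Usage: two trimmable outputs and one larger output from the same transaction.
utxo_set = UtxoSet()
tx = hashlib.sha256(b"example transaction").digest()
utxo_set.add(height=100, txhash=tx, index=0, denomination=0)
utxo_set.add(height=100, txhash=tx, index=1, denomination=0)
utxo_set.add(height=100, txhash=tx, index=2, denomination=8)
removed = utxo_set.trim_block(100)
```

In this sketch any UTXO that shares the 8-byte prefix and the denomination is removed, regardless of which block created it, since the short key alone cannot tell them apart.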

The birthday problem formula provides an estimate of how likely collisions are for an 8-byte hash. For about 5 billion random elements, the chance of at least one unintentional collision is roughly 50%. However, a malicious actor could intentionally create collisions. To account for this, I suggest a simple rule: if there is a duplicate key, and the denomination is the same, we delete both UTXOs. While this is incredibly unlikely, it would be very difficult to determine which UTXO was created in the block we are trying to trim without having the full historical block data, which most nodes will prune out.
Even in such an unlikely case, the value that is trimmed will be below 1 Qi, so it is a reasonable approach to take to reduce data storage across all nodes in the network. If a malicious actor intentionally creates more duplicate keys than can reasonably be trimmed in a single block, the duplicate key is stored to disk, and the next block will continue the trimming process to remove the rest of the duplicates.
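
The 50% figure for an 8-byte prefix can be checked with the standard birthday approximation, p ≈ 1 − exp(−n(n−1) / 2N) with N = 2^64. This is a sketch for uniformly random keys only; it says nothing about deliberately constructed collisions, and the function names are illustrative.

```python
import math

HASH_BITS = 64               # 8-byte txhash prefix
KEYSPACE = 2 ** HASH_BITS


def collision_probability(n: int, keyspace: int = KEYSPACE) -> float:
    """Approximate P(at least one collision) among n uniformly random keys."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * keyspace))


def elements_for_probability(p: float, keyspace: int = KEYSPACE) -> float:
    """Approximate number of random keys needed to reach collision probability p."""
    return math.sqrt(2 * keyspace * math.log(1 / (1 - p)))


p_5_billion = collision_probability(5_000_000_000)
n_half = elements_for_probability(0.5)
print(f"P(collision) for 5 billion keys: {p_5_billion:.3f}")
print(f"Keys needed for a 50% chance: {n_half:.3e}")
```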
Contributor

> If a malicious actor intentionally creates more duplicate keys than can reasonably be trimmed in a single block, the duplicate key is stored to disk, and the next block will continue the trimming process to remove the rest of the duplicates.

How do you determine what "can reasonably be done in a single block"? Either it is part of the block processing rules or it isn't.

Author

Added a suggested limit per key per block

@jdowning100 jdowning100 force-pushed the master branch 3 times, most recently from d862446 to fb47d83 on September 18, 2024 21:47