Keep data in fails cases in sync service #2361

Merged
AurelienFT merged 36 commits into master from sync_service/keep-data-on-stop
Dec 2, 2024

Contributor

@AurelienFT AurelienFT commented Oct 15, 2024

Linked Issues/PRs

Closes #2357

Description

This pull request introduces a caching mechanism to the sync service to avoid redundant data fetching from the network. The most important changes include adding a cache module, modifying the Import struct to include a cache, and updating related methods to utilize this cache.

Caching Mechanism:

  • crates/services/sync/src/import.rs: Added a new cache module and integrated it into the Import struct. Updated methods to use the cache for fetching and storing headers and blocks.
  • The cache mechanism allows us to retrieve a stream of batches, where each batch is either cached headers, cached full blocks, or a range of data to fetch.
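
As an illustration of the batching described above, here is a minimal sketch. It is not the code from this PR: `CachedData`, the `batches` function, the `String` payloads and the `max_batch` parameter are all hypothetical stand-ins, and only the `CachedDataBatch::None` variant name appears in the review thread. The sketch walks a height range and groups consecutive heights into batches of cached headers, cached blocks, or a range that still has to be fetched.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical cache entry: only the header is known, or the full block.
/// `String` stands in for the real header and block types.
#[derive(Clone, Debug, PartialEq)]
pub enum CachedData {
    Header(String),
    Block(String),
}

/// One batch produced by the cache. `None` carries a range of heights
/// that is not cached and still has to be fetched from the network.
#[derive(Debug, PartialEq)]
pub enum CachedDataBatch {
    Headers(Vec<String>),
    Blocks(Vec<String>),
    None(Range<u32>),
}

/// Walk `range` and group consecutive heights of the same kind into
/// batches of at most `max_batch` elements (assumed behaviour).
pub fn batches(
    cache: &BTreeMap<u32, CachedData>,
    range: Range<u32>,
    max_batch: usize,
) -> Vec<CachedDataBatch> {
    let mut out: Vec<CachedDataBatch> = Vec::new();
    for height in range {
        let entry = cache.get(&height);
        // Extend the last batch when it has the same kind and still has room.
        let extended = match (out.last_mut(), entry) {
            (Some(CachedDataBatch::Headers(v)), Some(CachedData::Header(h)))
                if v.len() < max_batch =>
            {
                v.push(h.clone());
                true
            }
            (Some(CachedDataBatch::Blocks(v)), Some(CachedData::Block(b)))
                if v.len() < max_batch =>
            {
                v.push(b.clone());
                true
            }
            (Some(CachedDataBatch::None(r)), None)
                if (r.end.saturating_sub(r.start) as usize) < max_batch =>
            {
                r.end = height.saturating_add(1);
                true
            }
            _ => false,
        };
        // Otherwise start a new batch of the right kind.
        if !extended {
            out.push(match entry {
                Some(CachedData::Header(h)) => CachedDataBatch::Headers(vec![h.clone()]),
                Some(CachedData::Block(b)) => CachedDataBatch::Blocks(vec![b.clone()]),
                None => CachedDataBatch::None(height..height.saturating_add(1)),
            });
        }
    }
    out
}
```

With blocks cached at heights 0 and 1, a header cached at height 2, and `max_batch = 2`, asking for `0..6` yields one batch of blocks, one batch of headers, and two ranges to fetch (`3..5` and `5..6`).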

Test Updates:

  • Updated the P2P port in the mocks to be async, so that the more complex tests needed for this feature can be simulated.

About 50% of the changes in this PR are in tests, including new tests for the cache.

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests
  • The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

@AurelienFT AurelienFT marked this pull request as ready for review October 16, 2024 16:44
@AurelienFT AurelienFT requested a review from a team October 16, 2024 16:44
Contributor

@netrome netrome left a comment

I don't understand the import task well enough to approve right now. I need clarification on the following points:

  1. How do we ensure this cache doesn't grow forever? Is the Import task short-lived? While the import task launches short-lived streams, it seems like a long-living task to me.
  2. How can we be sure we'll query exactly the same ranges as we have cached? Where is that invariant maintained?

Let me know if you want to jump on a call to chat about this, or just write if I'm missing something obvious here.

@AurelienFT
Contributor Author

AurelienFT commented Oct 16, 2024

@netrome Thanks for taking the time to review this. Regarding your questions:
1 - Yes, I think it will live for a long time, but all the requested data should eventually be fetched successfully and therefore cleared; otherwise we will only have batch_size elements in the cache. I'm not very sure about this, though, which is why I left a comment about it under "Interrogation" in the PR. Maybe we need some pruning management.
2 - I was thinking that we re-ask for exactly the same ranges because batch_size doesn't change, but the starting point can move to the last synced block, so the ranges can change. I think you are right that the ranges can change. I will ask @xgreenx a few questions.
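
To make the pruning idea in point 1 concrete, here is one possible shape, offered only as a sketch. The `Cache` type, its `prune_up_to` method and the `String` payload are hypothetical and are not taken from this PR. The assumption is that entries are keyed by block height and that everything at or below the last committed height can be dropped.

```rust
use std::collections::BTreeMap;

/// Hypothetical cache keyed by block height; `String` stands in for real data.
pub struct Cache {
    entries: BTreeMap<u32, String>,
}

impl Cache {
    pub fn new() -> Self {
        Self { entries: BTreeMap::new() }
    }

    pub fn insert(&mut self, height: u32, data: String) {
        self.entries.insert(height, data);
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Drop every entry at or below `committed` so the cache cannot grow
    /// without bound while the import task keeps running.
    pub fn prune_up_to(&mut self, committed: u32) {
        match committed.checked_add(1) {
            // `split_off` returns the entries with keys >= `first_kept`.
            Some(first_kept) => self.entries = self.entries.split_off(&first_kept),
            // `committed` is `u32::MAX`: nothing can be above it.
            None => self.entries.clear(),
        }
    }
}
```

Calling `prune_up_to` with the last committed height after each successful import would keep the cache limited to data that has not been committed yet.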

@AurelienFT AurelienFT changed the base branch from release/v0.40.0 to master October 16, 2024 21:21
Contributor

@rafal-ch rafal-ch left a comment

So far looks good, I need to have a deeper look at the tests though.

@AurelienFT AurelienFT marked this pull request as draft October 17, 2024 09:17
@AurelienFT
Contributor Author

Converted to draft because of a big refactor.

@AurelienFT AurelienFT marked this pull request as ready for review October 18, 2024 10:33
Collaborator

@xgreenx xgreenx left a comment

The change looks really good=)

crates/services/sync/src/import.rs
}
}
BlockHeaderData::Cached(CachedDataBatch::None(_)) => {
unreachable!()
Collaborator

While it is true, let's return an error and print a log saying that this place shouldn't be reachable.

Contributor Author

I have added a log and now return a malformed batch, which is what this whole process uses to signal an error. I don't want to change the architecture of the whole module for this one error. (The other solution is to panic, as is done here:

.expect("We checked headers are not empty above"),
)
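
The trade-off discussed in this thread can be sketched as follows. This is a simplified illustration, not the code from the PR: the enum shapes, the `BatchError` type and the `headers_of` function are assumptions, `u32` stands in for a real header, and `eprintln!` stands in for whatever logging the service actually uses. It also returns a `Result`, whereas the comment above describes returning a malformed batch.

```rust
use std::ops::Range;

/// Simplified stand-ins for the types named in the diff above.
#[derive(Debug, PartialEq)]
pub enum CachedDataBatch {
    Headers(Vec<u32>),
    None(Range<u32>),
}

#[derive(Debug, PartialEq)]
pub enum BlockHeaderData {
    Cached(CachedDataBatch),
    Fetched(Vec<u32>),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum BatchError {
    Malformed { range: Range<u32> },
}

/// Extract headers, logging and returning an error for the arm that
/// should never be reached instead of panicking with `unreachable!()`.
pub fn headers_of(data: BlockHeaderData) -> Result<Vec<u32>, BatchError> {
    match data {
        BlockHeaderData::Fetched(headers) => Ok(headers),
        BlockHeaderData::Cached(CachedDataBatch::Headers(headers)) => Ok(headers),
        BlockHeaderData::Cached(CachedDataBatch::None(range)) => {
            // Stand-in for a proper log call.
            eprintln!("unexpected empty cached batch for {range:?}; this should be unreachable");
            Err(BatchError::Malformed { range })
        }
    }
}
```

A broken invariant then surfaces as a logged, recoverable error on the sync path rather than as a panic in a long-running task.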

@AurelienFT AurelienFT requested a review from xgreenx November 27, 2024 14:09
@AurelienFT
Contributor Author

@xgreenx Thanks for the kind comment. I have addressed all of your concerns, though some may still need answers :)

Contributor

@netrome netrome left a comment

Nice stuff! Some minor questions and comments from me, but overall looks good.

Contributor

@netrome netrome left a comment

I missed this previously, but I saw the clippy error in CI regarding the arithmetic operation, and I want to ensure we avoid the array slicing as well, since that can also panic.
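
Both hazards mentioned here (arithmetic that can overflow and slicing that can go out of bounds) can be avoided with checked operations. The function below is a generic sketch with a hypothetical name and signature, not the code under review:

```rust
/// Return the `len` items starting at `start`, or `None` if the bounds
/// overflow or fall outside `items`. Nothing in this function can panic.
pub fn window(items: &[u32], start: usize, len: usize) -> Option<&[u32]> {
    // `checked_add` returns `None` on overflow instead of panicking in
    // debug builds or wrapping in release builds.
    let end = start.checked_add(len)?;
    // `get` with a range returns `None` where `items[start..end]` would panic.
    items.get(start..end)
}
```

The caller then decides what an out-of-range request means, for example by treating it as a malformed batch.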

@AurelienFT AurelienFT requested a review from netrome November 28, 2024 10:14
netrome previously approved these changes Nov 28, 2024
Contributor

@netrome netrome left a comment

Thanks for the update 🙏

@xgreenx xgreenx requested a review from netrome November 29, 2024 13:46
@AurelienFT
Contributor Author

@xgreenx Why do you prefer to make the check inside the loop, instead of doing the split at the end to simplify the logic in the loop?

@xgreenx
Collaborator

xgreenx commented Nov 29, 2024

Why do you prefer to make the check inside the loop, instead of doing the split at the end to simplify the logic in the loop?

It is much easier to read. It is also more performant, because the old implementation was doing into_iter, chunks, and collect for every batch_size blocks.
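
One reading of "the check inside the loop" is sketched below. The generic `into_batches` function is hypothetical and is not the code from the PR. Each item is pushed into the current batch, and the batch is closed as soon as it is full, so no second pass is needed to re-chunk and collect an already built vector.

```rust
/// Group `items` into batches of at most `batch_size` elements in a single
/// pass, checking the batch length inside the loop.
pub fn into_batches<T>(
    items: impl IntoIterator<Item = T>,
    batch_size: usize,
) -> Vec<Vec<T>> {
    // Treat a batch size of 0 as 1 so the loop always makes progress.
    let batch_size = batch_size.max(1);
    let mut batches = Vec::new();
    let mut current = Vec::new();
    for item in items {
        current.push(item);
        if current.len() >= batch_size {
            // Move the full batch out and leave an empty one in its place.
            batches.push(std::mem::take(&mut current));
        }
    }
    // Keep the final, possibly shorter, batch.
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```

Each element is moved exactly once, which fits the performance argument made above.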

@AurelienFT AurelienFT merged commit 28c4ae3 into master Dec 2, 2024
31 checks passed
@AurelienFT AurelienFT deleted the sync_service/keep-data-on-stop branch December 2, 2024 10:12
@xgreenx xgreenx mentioned this pull request Jan 15, 2025
xgreenx added a commit that referenced this pull request Jan 15, 2025
## Version v0.41.0

### Added
- [2547](#2547): Replace the
old Graphql gas price provider adapter with the ArcGasPriceEstimate.
- [2445](#2445): Added GQL
endpoint for querying asset details.
- [2442](#2442): Add
uninitialized task for V1 gas price service
- [2154](#2154): Added
`Unknown` variant to `ConsensusParameters` graphql queries
- [2154](#2154): Added
`Unknown` variant to `Block` graphql queries
- [2154](#2154): Added
`TransactionType` type in `fuel-client`
- [2321](#2321): New metrics
for the TxPool:
    - The size of transactions in the txpool (`txpool_tx_size`)
    - The time spent by a transaction in the txpool in seconds (`txpool_tx_time_in_txpool_seconds`)
    - The number of transactions in the txpool (`txpool_number_of_transactions`)
    - The number of transactions pending verification before entering the txpool (`txpool_number_of_transactions_pending_verification`)
    - The number of executable transactions in the txpool (`txpool_number_of_executable_transactions`)
    - The time it took to select transactions for inclusion in a block in microseconds (`txpool_select_transactions_time_microseconds`)
    - The time it took to insert a transaction in the txpool in microseconds (`transaction_insertion_time_in_thread_pool_microseconds`)
- [2385](#2385): Added new
histogram buckets for some of the TxPool metrics, optimize the way they
are collected.
- [2347](#2364): Add activity
concept in order to protect against infinitely increasing DA gas price
scenarios
- [2362](#2362): Added a new
request_response protocol version `/fuel/req_res/0.0.2`. In comparison
with `/fuel/req/0.0.1`, which returns an empty response when a request
cannot be fulfilled, this version returns more meaningful error codes.
Nodes still support the version `0.0.1` of the protocol to guarantee
backward compatibility with fuel-core nodes. Empty responses received
from nodes using the old protocol `/fuel/req/0.0.1` are automatically
converted into an error `ProtocolV1EmptyResponse` with error code 0,
which is also the only error code implemented. More specific error codes
will be added in the future.
- [2386](#2386): Add a flag to
define the maximum number of file descriptors that RocksDB can use. By
default it's half of the OS limit.
- [2376](#2376): Add a way to
fetch transactions in P2P without specifying a peer.
- [2361](#2361): Add caches to
the sync service to not reask for data it already fetched from the
network.
- [2327](#2327): Add more
services tests and more checks of the pool. Also add high-level
documentation for users of the pool and contributors.
- [2416](#2416): Define the
`GasPriceServiceV1` task.
- [2447](#2447): Use new
`expiration` policy in the transaction pool. Add a mechanism to prune
the transactions when they expired.
- [1922](#1922): Added support
for posting blocks to the shared sequencer.
- [2033](#2033): Remove
`Option<BlockHeight>` in favor of `BlockHeightQuery` where applicable.
- [2490](#2490): Added
pagination support for the `balances` GraphQL query, available only when
'balances indexation' is enabled.
- [2439](#2439): Add gas costs
for the two new zk opcodes `ecop` and `eadd` and the benches that allow
to calibrate them.
- [2472](#2472): Added the
`amountU128` field to the `Balance` GraphQL schema, providing the total
balance as a `U128`. The existing `amount` field clamps any balance
exceeding `U64` to `u64::MAX`.
- [2526](#2526): Add
possibility to not have any cache set for RocksDB. Add an option to
either load the RocksDB columns families on creation of the database or
when the column is used.
- [2532](#2532): Getters for
inner rocksdb database handles.
- [2524](#2524): Adds a new
lock type which is optimized for certain workloads to the txpool and p2p
services.
- [2535](#2535): Expose
`backup` and `restore` APIs on the `CombinedDatabase` struct to create
portable backups and restore from them.
- [2550](#2550): Add
statistics and more limits infos about txpool on the node_info endpoint

### Fixed
- [2560](#2560): Fix flaky
test by increasing timeout
- [2558](#2558): Rename `cost`
and `reward` to remove `excess` wording
- [2469](#2469): Improved the
logic for syncing the gas price database with on_chain database
- [2365](#2365): Fixed the
error during dry run in the case of race condition.
- [2366](#2366): The
`importer_gas_price_for_block` metric is properly collected.
- [2369](#2369): The
`transaction_insertion_time_in_thread_pool_milliseconds` metric is
properly collected.
- [2413](#2413): block
production immediately errors if unable to lock the mutex.
- [2389](#2389): Fix
construction of reverse iterator in RocksDB.
- [2479](#2479): Fix an error
on the last iteration of the read and write sequential opcodes on
contract storage.
- [2478](#2478): Fix proof
created by `message_receipts_proof` function by ignoring the receipts
from failed transactions to match `message_outbox_root`.
- [2485](#2485): Hardcode the
timestamp of the genesis block and version of `tai64` to avoid breaking
changes for us.
- [2511](#2511): Fix backward
compatibility of V0Metadata in gas price db.

### Changed
- [2469](#2469): Updated
adapter for querying costs from DA Block committer API
- [2469](#2469): Use the gas
price from the latest block to estimate future gas prices
- [2501](#2501): Use gas price
from block for estimating future gas prices
- [2468](#2468): Abstract
unrecorded blocks concept for V1 algorithm, create new storage impl.
Introduce `TransactionableStorage` trait to allow atomic changes to the
storage.
- [2295](#2295):
`CombinedDb::from_config` now respects `state_rewind_policy` with tmp
RocksDB.
- [2378](#2378): Use cached
hash of the topic instead of calculating it on each publishing gossip
message.
- [2438](#2438): Refactored
service to use new implementation of `StorageRead::read` that takes an
offset in input.
- [2429](#2429): Introduce
custom enum for representing result of running service tasks
- [2377](#2377): Add more
errors that can be returned as responses when using protocol
`/fuel/req_res/0.0.2`. The errors supported are
`ProtocolV1EmptyResponse` (status code `0`) for converting empty
responses sent via protocol `/fuel/req_res/0.0.1`,
`RequestedRangeTooLarge`(status code `1`) if the client requests a range
of objects such as sealed block headers or transactions too large,
`Timeout` (status code `2`) if the remote peer takes too long to fulfill
a request, or `SyncProcessorOutOfCapacity` if the remote peer is
fulfilling too many requests concurrently.
- [2233](#2233): Introduce a
new column `modification_history_v2` for storing the modification
history in the historical rocksDB. Keys in this column are stored in big
endian order. Changed the behaviour of the historical rocksDB to write
changes for new block heights to the new column, and to perform lookup
of values from the `modification_history_v2` table first, and then from
the `modification_history` table, performing a migration upon access if
necessary.
- [2383](#2383): The `balance`
and `balances` GraphQL query handlers now use index to provide the
response in a more performant way. As the index is not created
retroactively, the client must be initialized with an empty database and
synced from the genesis block to utilize it. Otherwise, the legacy way
of retrieving data will be used.
- [2463](#2463): The
`coinsToSpend` GraphQL query handler now uses index to provide the
response in a more performant way. As the index is not created
retroactively, the client must be initialized with an empty database and
synced from the genesis block to utilize it. Otherwise, the legacy way
of retrieving data will be used.
- [2556](#2556): Ensure that
the `last_recorded_height` is set for the DA gas price source.

#### Breaking
- [2469](#2469): Move from
`GasPriceServicev0` to `GasPriceServiceV1`. Include new config values.
- [2438](#2438): The
`fuel-core-client` can only work with new version of the `fuel-core`.
The `0.40` and all older versions are not supported.
- [2438](#2438): Updated
`fuel-vm` to `0.59.1` release. Check [release
notes](https://github.com/FuelLabs/fuel-vm/releases/tag/v0.59.0) for
more details.
- [2389](#2258): Updated the
`messageProof` GraphQL schema to return a non-nullable `MessageProof`.
- [2154](#2154): Transaction
graphql endpoints use `TransactionType` instead of
`fuel_tx::Transaction`.
- [2446](#2446): Use graphiql
instead of graphql-playground due to known vulnerability and stale
development.
- [2379](#2379): Change
`kv_store::Value` to be `Arc<[u8]>` instead of `Arc<Vec<u8>>`.
- [2490](#2490): Updated
GraphQL complexity calculation for `balances` query to account for
pagination (`first`/`last`) and nested field complexity
(`child_complexity`). Queries with large pagination values or deeply
nested fields may have higher complexity costs.
- [2463](#2463):
`CoinsQueryError::MaxCoinsReached` variant has been removed. The
`InsufficientCoins` variant has been renamed to
`InsufficientCoinsForTheMax` and it now contains the additional `max`
field.
- [2463](#2463): The number of
excluded ids in the `coinsToSpend` GraphQL query is now limited to the
maximum number of inputs allowed in transaction.
- [2463](#2463): The
`coinsToSpend` GraphQL query may now return different coins, depending
whether the indexation is enabled or not. However, regardless of the
differences, the returned coins will accurately reflect the current
state of the database within the context of the query.
- [2526](#2526): By default
the cache of RocksDB is now disabled instead of being `1024 * 1024 *
1024`.

## What's Changed
* Add metrics to TxPool by @acerone85 in
#2321
* Fix collection of gas price metric by @rafal-ch in
#2366
* Add documentation to run a ignition node in readme by @AurelienFT in
#2363
* Fix collection of tx pool insertion time metric by @rafal-ch in
#2369
* Add versioning to request response protocols by @acerone85 in
#2362
* Return reason of why proof cant be generated by @rafal-ch in
#2258
* p2p: use precalculated topic hash by @yaziciahmet in
#2378
* Remove ignore RUSTSEC-2024-0336 by @AurelienFT in
#2384
* Deal with negative feed back loop in DA gas price by @MitchTurner in
#2364
* Add new flag for maximum file descriptors in rocksdb. by @AurelienFT
in #2386
* Add codeowners for gas price algorithm crate by @rafal-ch in
#2404
* Weekly `cargo update` by @github-actions in
#2373
* chore(gas_price_service): initialize v1 metadata by @rymnc in
#2288
* chore(gas_price_service_v0): remove unused trait impl by @rymnc in
#2410
* Update tai64 to fix the wrong time offset by @AurelienFT in
#2409
* fix(block_producer): immediately return error if lock cannot be
acquired during production by @rymnc in
#2413
* Add a way to fetch transactions in P2P without specifying a peer by
@AurelienFT in #2376
* Add a new code owner for tx pool by @AurelienFT in
#2417
* Satisfy clippy in `gas-price-analysis` by @rafal-ch in
#2418
* Txpool metrics update by @rafal-ch in
#2385
* Improve TxPool tests and documentation by @AurelienFT in
#2327
* feat(gas_price_service_v1): define RunnableTask for GasPriceServiceV1
by @rymnc in #2416
* Return reason of why proof cant be generated (api change) by @rafal-ch
in #2389
* Fuel/Request_Response v0.0.2: More meaningful error messages by
@acerone85 in #2377
* Fix reverse iterator in RocksDB by @AurelienFT in
#2398
* Add test node herself in reserved nodes. by @AurelienFT in
#2390
* Weekly `cargo update` by @github-actions in
#2424
* Weekly `cargo update` by @github-actions in
#2440
* Resolve some falky tests and improve CI times by @AurelienFT in
#2401
* feat: handle `Unknown` transactions, blocks and consensus parameters
by @hal3e in #2154
* fix(p2p): cache responses to serve without roundtrip to db by @rymnc
in #2352
* Replace task `run()` return result with custom enum by @MitchTurner in
#2429
* Fix codeowners by @AurelienFT in
#2444
* fix(graphql_playground): use graphiql instead by @rymnc in
#2446
* Weekly `cargo update` by @github-actions in
#2453
* refactor: remove `Option<BlockHeight>` and use new enum where
applicable by @matt-user in
#2033
* Fixed the error during dry run by @xgreenx in
#2365
* Add decompression traits and a test case by @Dentosal in
#2295
* Versioned Storage for Modifications History by @acerone85 in
#2233
* Allow DA recorded blocks to come out-of-order by @MitchTurner in
#2415
* feat: Change `kv_store::Value` to be Arc<[u8]> instead of Arc<Vec<u8>>
by @netrome in #2411
* Optimize balance-related queries with a cache by @rafal-ch in
#2383
* fix: Add missing features to `fuel-core-tests` by @netrome in
#2467
* Keep data in fails cases in sync service by @AurelienFT in
#2361
* Weekly `cargo update` by @github-actions in
#2470
* Revert balances amount to `U64` and introduce new `amountU128` getter
by @rafal-ch in #2472
* Create uninitialized task for v1 gas price service by @MitchTurner in
#2442
* Port the 0.40.2 fix of TAI on master by @AurelienFT in
#2485
* Ignore RUSTSEC-2024-0421 by @AurelienFT in
#2489
* Ignore receipts from failed transactions in `message_receipts_proof`
by @AurelienFT in #2478
* Add unrecorded blocks abstraction to gas price algo by @MitchTurner in
#2468
* Fix last iteration in sequential opcode by @AurelienFT in
#2479
* fix(gas_price_service_v0): bring back removed fields, causing UB when
trying to access by @rymnc in
#2511
* Refactor fuel-core to use version of StorageRead::read with offset
(Full update to 0.59.1) by @acerone85 in
#2438
* Sync the version of the `fuel-core` with minor hot fixes by @xgreenx
in #2516
* fix(docs): typo preventing ci checks from passing by @rymnc in
#2525
* Integration test for balances and (non)retryable messages by @rafal-ch
in #2505
* Add document for launching Ignition node from source and Local network
from source by @AurelienFT in
#2502
* Make the rocksdb cache optional in config and add policy for column
opening by @AurelienFT in
#2526
* Weekly `cargo update` by @github-actions in
#2530
* chore(rocksdb): getter for inner database handle by @rymnc in
#2532
* Use gas prices from actual blocks to calculate estimate gas prices by
@MitchTurner in #2501
* chore(codeowners): gas price service codeowners by @rymnc in
#2534
* Add zk opcodes by @AurelienFT in
#2439
* Gas price simulation data retriever by @acerone85 in
#2533
* Shared sequencer integration by @Dentosal in
#1922
* Use expiration policy by @AurelienFT in
#2447
* Fixed TPS benchmark to work with latest changes by @xgreenx in
#2515
* Use indexation cache to satisfy "coins to spend" queries by @rafal-ch
in #2463
* feat(txpool|p2p): use seqlock instead of small copy-able RwLocks by
@rymnc in #2524
* Create new index for tracking Asset metadata by @maschad in
#2445
* feat(rocksdb): remove getters for internal rocksdb handles, expose
`backup` instead by @rymnc in
#2535
* Integrate with V1 algo for tests by @MitchTurner in
#2469
* Lock-free `latest_l2_height` in gas price service by @rafal-ch in
#2546
* chore(gas_price_service_v1): strictly ensure last_recorded_height is
set, to avoid initial poll of da source by @rymnc in
#2556
* Replace old Graphql Gas Price adapter with new latest gas price struct
by @MitchTurner in #2547
* Rename cost and rewards without 'excess' by @MitchTurner in
#2558
* Add current pool gas to the node info endpoint by @AurelienFT in
#2550
* Pagination queries for `balances` endpoint by @rafal-ch in
#2490
* 2559 Increase timeout for test by @MitchTurner in
#2560
* Add test expiration policy in executor by @AurelienFT in
#2563

## New Contributors
* @yaziciahmet made their first contribution in
#2378

**Full Changelog**:
v0.40.0...v0.41.0
Development

Successfully merging this pull request may close these issues.

fuel-core-sync should cache the result of responses instead of throwing them away
4 participants