Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(tx_index): fill transaction index gap on enabling automatic backfill #12353

Conversation

akaladarshi
Copy link
Contributor

@akaladarshi akaladarshi commented Aug 7, 2024

Related Issues

Fixes part of: #11007

Proposed Changes

  • Adds EnableAutomaticBackFill in index config in fullNode
  • Adds MaxAutomaticBackFillTxIndexHeight in index config in fullnode
  • Tries to fill the gap between the last populated from current tipset tx hashes till:
    • we reach MaxAutomaticBackFillTxIndexHeight or
    • we reach a conflict in the eth tx hash index database
  • Adds PopulateAfterSnapshot for eth tx hash index to populate data for while loading from snapshot

@akaladarshi
Copy link
Contributor Author

@aarshkshah1992 As per our discussion about waiting till the observer gets the latest head and process that,
I have a few ideas, let's discuss them once we verify this is correct.

Copy link
Contributor

@aarshkshah1992 aarshkshah1992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 1 of review. @akaladarshi Please can we also add logic to populate this from a snapshot just like we do for the MsgIndex ?

@@ -78,7 +84,7 @@ func EthModuleAPI(cfg config.FevmConfig) func(helpers.MetricsCtx, repo.LockedRep
}

// Tipset listener
_ = ev.Observe(&ethTxHashManager)
ev.Observe(&ethTxHashManager)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this change.

@@ -621,6 +621,9 @@ type IndexConfig struct {
// EXPERIMENTAL FEATURE. USE WITH CAUTION
// EnableMsgIndex enables indexing of messages on chain.
EnableMsgIndex bool

// EnableAutomaticBackFill enables automatic index back-filling
EnableAutomaticBackFill bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, we need to rename this to EnableAutomaticBackFillTxIndex as we're only doing it for the TX Index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return nil
}

// PopulateExistingMappings walk back from the current head to the minimum height and populate the eth transaction hash lookup database
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: check grammar

@@ -23,6 +23,49 @@ func (m *EthTxHashManager) Revert(ctx context.Context, from, to *types.TipSet) e
return nil
}

// FillIndexGap back-fills the gap between the current head and the last populated transaction index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "last populated transaction index in* the Index"


totalProcessedBlock := 0
ts := m.StateAPI.Chain.GetHeaviestTipSet()
for _, block := range ts.Blocks() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to keep going back in the chain to backfill. This loop terminates after backfilling messages in the heaviest tipset only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be careful on two fronts with going backward up the chain for two reasons:

  1. the last processed message CID may not be on the canonical chain anymore so you may never encounter it; and
  2. the last processed message may not even be in the blockstore due to a splitstore compaciton or some other happening (e.g. you've copied across someone else's txhash.db or restored an old one from disk, whatever)

I suspect ChainGetPath might be a good API to try here: get the tipset of the lastTxHash, get the latest tipset, find the path between. I'm just not sure what the error conditions are for that - how big a gap is too large, and how do we deal with splitstore GC problems (I suspect "find me the tipset for lastTxHash would error in that case if it was too old, in which case what's the strategy then?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rvagg I have been going through the ChainGetPath code, but I didn't see it returning a too old error, it only returns a not found error.

If it is not found then nodes have to import the snapshot.
Is there any other way to get the state?

Also ChainGetPath returns both revert and apply changes, so do we have to filter out apply changes and start backfilling?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discuss this with @akaladarshi async. We worked through multiple solutions here. He will post a note about it.

totalProcessedBlock++
}

ts, err = m.StateAPI.Chain.GetTipSetFromKey(ctx, ts.Parents())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to move to the loop above

for _, msg := range msgs {
if msg.Cid() == lastTxCid {
// We've reached the last indexed transaction
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log completion and then return. Add logging around how many entires were inserted, which tipset it started from and which one it ended at etc.

@akaladarshi
Copy link
Contributor Author

@aarshkshah1992 For the observer part, I thought we could do something like this:

  • Create a backfill chan
  • Provide the backfill chan to the observer
    • Once it gets a new block from ChainNotify is sends it to the backfill chan
    • Wait till the backfill chan is closed (means backfill is done)
  • Provide the same backfill chan to the FillIndexGap function in EthTxManager, also make it async
    • It will wait till it receives the new block through the backfill chan from the observer
    • Starts backfilling
    • Once backfilling is done close the backfill chan so the observer can proceed as usual

@akaladarshi akaladarshi force-pushed the akaladarshi/backfill-tx-index-gap branch from 0ec35b0 to 4b56a36 Compare August 12, 2024 04:22
@akaladarshi akaladarshi deleted the akaladarshi/backfill-tx-index-gap branch November 25, 2024 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants