Problem: tendermint tx indexer could miss block when restart #213

jun0tpyrc opened this issue Nov 16, 2021 · 7 comments

@jun0tpyrc

This is an older block at the time of query (119510)

[root@bprod-cronos-1 ~]# curl localhost:8545 -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params": ["0x1D2D6", true],"id":1}' -s | jq

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "difficulty": "0x0",
    "extraData": "0x",
    "gasLimit": "0xffffffff",
    "gasUsed": "0x24d2b8",
    "hash": "0x4324b6e0116d22f6ce615f227276f8f974659c237f71acbd73d02981e225a05c",
    "logsBloom": "0x002000000200000400080000800200020000000c340800001020000080200310c00050008000010000200000000004020840000040e00000200000040021400200201400008200801040000c00001020000002000241000082010040800020000810040002200000000001001080080000020000000816002400401000000000812002000000c004000000000102091000000801d0006008100000408000000006000802101200010020044000002000800602000108000012220020000000800000000200028000880001000102200000020008c010101400002242020020000010110000801000000000000001000000004000400000400800004000002000",
    "miner": "0x81e3e543647e466a5abc824f5844ab0a091b6c6c",
    "mixHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
    "nonce": "0x0000000000000000",
    "number": "0x1d2d6",
    "parentHash": "0xad2432f13c825d3a9163c9e6d518f08633b00ad6a1eb87c747ad7f2427dabe90",
    "receiptsRoot": "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
    "sha3Uncles": "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
    "size": "0x909c",
    "stateRoot": "0xea8aae6836bdc465215558bc1c7511717069c914eaae48e984af1415b6564a42",
    "timestamp": "0x6192aef2",
    "totalDifficulty": "0x0",
    "transactions": [],
    "transactionsRoot": "0x8b2974a39d3aeed1cbe3aa1654ab933a9bcb4d7215b411b8a6cf569c1783abde",
    "uncles": []
  }
}
^ Compared with the expected response, the transactions array is now empty.
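
A response like this can be flagged automatically by looking for a block that reports gas used but lists no transactions. The following is a minimal sketch, assuming the JSON-RPC result has already been decoded into a dict. The helper name `looks_unindexed` is hypothetical, and the heuristic assumes that a non-zero `gasUsed` implies the block contained at least one transaction.

```python
def looks_unindexed(block: dict) -> bool:
    """Return True when a block reports gas used but lists no transactions."""
    # JSON-RPC quantities are hex strings such as "0x24d2b8".
    gas_used = int(block["gasUsed"], 16)
    return gas_used > 0 and len(block["transactions"]) == 0


# Fields copied from the response above; the other fields are not needed here.
suspect = {"number": "0x1d2d6", "gasUsed": "0x24d2b8", "transactions": []}
print(int(suspect["number"], 16), looks_unindexed(suspect))  # prints: 119510 True
```

A block with `gasUsed` of `0x0` and an empty array is not flagged, since an empty block is a legitimate result.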

The node has newer blocks and keeps ingesting new blocks, though:

{
  "blockNumber": 125710,
  "blockHash": "0x887475723ba801bb1b2ce68709b74941223652f09317127aaf15a4d49bfeb85e",
  "blockTime": "2021-11-16T04:47:34.000Z",
  "checkTime": "2021-11-16T04:47:50.836Z",
  "timeDiff": 16836
}
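
For reference, the `timeDiff` value above is consistent with the gap between `blockTime` and `checkTime` in milliseconds. A small sketch of that arithmetic follows; the function name `lag_ms` is hypothetical, and the assumption that the monitoring tool computes `timeDiff` this way is inferred from the numbers, not confirmed by the thread.

```python
from datetime import datetime, timedelta


def lag_ms(block_time: str, check_time: str) -> int:
    """Milliseconds between a block's timestamp and the time it was checked."""
    # Older Python versions reject a trailing "Z" in fromisoformat(), so spell out UTC.
    start = datetime.fromisoformat(block_time.replace("Z", "+00:00"))
    end = datetime.fromisoformat(check_time.replace("Z", "+00:00"))
    # Integer division of timedeltas avoids floating-point rounding.
    return (end - start) // timedelta(milliseconds=1)


print(lag_ms("2021-11-16T04:47:34.000Z", "2021-11-16T04:47:50.836Z"))  # prints: 16836
```

A lag of about 17 seconds suggests the node was close to the chain head at the time of the check, so the empty array is not explained by the node simply being behind.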

To Reproduce
It occurs intermittently, and only on some of the nodes.
x86_64

@crypto-steve-ng

crypto-steve-ng commented Nov 18, 2021

Noticed this happened a few times yesterday as well; we had to manually resync the block on the subgraph whenever it happened.

@thomas-nguy
Collaborator

thomas-nguy commented Nov 18, 2021

Could be an error in tendermint.
@JayT106 is it possible that either e.clientCtx.Client.Block(e.ctx, &height) or e.clientCtx.Client.TxSearch returns different values on different nodes?

@JayT106
Collaborator

JayT106 commented Nov 18, 2021

First, the block data should be the same across all nodes in the network; otherwise, consensus would break. We might debug EthBlockFromTendermint to see why the block data transformation fails.

Second, TxSearch looks up transactions in the indexer, so the node must have the KV indexer enabled. The indexer might fail to index the transaction and block data if something goes wrong during indexing (when a new block is committed). So it's possible to get different results from different nodes.

@yihuang
Collaborator

yihuang commented Nov 24, 2021

Since the tendermint tx indexer service runs asynchronously with block commit, it can lag behind.
And because the eth_getBlockByNumber RPC API relies on /tx_search to find the txs, it's possible that it can't find them, especially for recent blocks.
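
One way to confirm this on a given node is to compare the number of txs stored in the block itself with the number the indexer returns for the same height. The sketch below assumes the Tendermint RPC is reachable at `http://localhost:26657` (an assumption, not stated in this thread) and uses the `/block` and `/tx_search` endpoints; the helper names `rpc_get`, `missing_from_index` and `check_height` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

RPC = "http://localhost:26657"  # assumption: Tendermint RPC address of the node


def rpc_get(path: str, **params: str) -> dict:
    """GET a Tendermint RPC endpoint and return its "result" object."""
    url = f"{RPC}/{path}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["result"]


def missing_from_index(block_result: dict, search_result: dict) -> int:
    """Number of txs present in the block but absent from the tx index."""
    # An empty block may carry a null tx list, hence the "or []".
    in_block = len(block_result["block"]["data"]["txs"] or [])
    indexed = int(search_result["total_count"])  # may be encoded as a string
    return in_block - indexed


def check_height(height: int) -> int:
    """Fetch both views of one height and return how many txs the index lacks."""
    block = rpc_get("block", height=str(height))
    # String parameters in URI requests must be wrapped in double quotes.
    search = rpc_get("tx_search", query=f'"tx.height={height}"')
    return missing_from_index(block, search)
```

A positive result from `check_height(119510)` would indicate that the block has txs the indexer never recorded, which is the condition under which eth_getBlockByNumber returns an empty transactions array.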

@yihuang
Collaborator

yihuang commented Nov 24, 2021

It was reproduced on one of our nodes, where the tendermint tx indexer failed to index a whole block and did not recover automatically, matching what the OP described.

@yihuang
Collaborator

yihuang commented Nov 24, 2021

tendermint/tendermint#7312
We found that when the node restarts, there's a chance that the tx indexer misses a block.
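
Because a missed block does not recover on its own, it can help to scan the heights around a restart for gaps. The sketch below is illustrative only: `find_index_gaps` is a hypothetical helper, and the two lookup callables stand in for whatever source provides per-height tx counts from the block store and from the indexer.

```python
from typing import Callable, Iterable, List


def find_index_gaps(
    heights: Iterable[int],
    txs_in_block: Callable[[int], int],
    txs_in_index: Callable[[int], int],
) -> List[int]:
    """Return the heights where the index holds fewer txs than the block."""
    return [h for h in heights if txs_in_index(h) < txs_in_block(h)]


# Toy data standing in for real lookups: height 102 was never indexed.
block_counts = {100: 3, 101: 0, 102: 5, 103: 2}
index_counts = {100: 3, 101: 0, 102: 0, 103: 2}
gaps = find_index_gaps(range(100, 104), block_counts.__getitem__, index_counts.__getitem__)
print(gaps)  # prints: [102]
```

Empty blocks (height 101 in the toy data) are not reported, since both counts are zero.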

@JayT106
Collaborator

JayT106 commented Nov 24, 2021

I tested it, and it has been fixed in this PR. Not sure whether it will be backported to v0.34.
tendermint/tendermint#7312 (comment)

yihuang changed the title from "Problem: transactions array probabilistically missing for old block" to "Problem: tendermint tx indexer could miss block when restart" on Mar 9, 2022