Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks #278

JaredTate · 2025-01-24T00:24:06Z

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks

Summary of the Problem

For years many users have encountered a memory-access error after first launching DigiByte-QT or digibyted during the indexing phase that begins loading/parsing DigiByte block headers. This bug appears most often with older (pre-multi-algo) blocks and certain blocks after the Odo PoW fork.

Symptom & LLDB Trace

During initial block loading and when indexing reaches around 40% the wallet’s memory usage can spike dramatically, sometimes causing crashes or out-of-memory conditions. In LLDB, a this memory error back trace looks like:

* thread #18, name = 'b-qt-init', stop reason = EXC_BAD_ACCESS (code=1, address=0x17d7d847d7d9d7e2)
  * frame #0: 0x000000010084dcc0 DigiByte-QtCBlockIndex::GetBlockHeader() const + 332
    frame #1: 0x0000000101280124 DigiByte-QtCBlockIndex::GetAlgo() const + 344
    …

This indicates a memory violation (EXC_BAD_ACCESS), typically meaning an invalid pointer or out-of-bounds array usage.

Diagnosis

Older Blocks Returning ALGO_UNKNOWN (-1)
In DigiByte v8.22.0, the function that determines a block’s proof-of-work algorithm looked like:

int CBlockHeader::GetAlgo() const
{
    switch (nVersion & BLOCK_VERSION_ALGO)
    {
        case BLOCK_VERSION_SCRYPT:   return ALGO_SCRYPT;
        case BLOCK_VERSION_SHA256D:  return ALGO_SHA256D;
        case BLOCK_VERSION_GROESTL:  return ALGO_GROESTL;
        case BLOCK_VERSION_SKEIN:    return ALGO_SKEIN;
        case BLOCK_VERSION_QUBIT:    return ALGO_QUBIT;
        case BLOCK_VERSION_ODO:      return ALGO_ODO;
    }
    return ALGO_UNKNOWN;  // -1
}

Many older (2014–2015) blocks, or blocks using a version that does not set one of the above algo bits, end up returning ALGO_UNKNOWN = -1.

DigiByte has had 4 different block versioning periods or eras throughout its 11-year history, so we have a variety of block versions that need to be properly accounted for when indexing the chain. Modern multi-algo code is much different than older versions and therefore historical chain data was getting misinterpreted by newer code. Below is where the code returning a ALGO_UNKNOWN = -1 comes from and other bitwise block versioning.

digibyte/src/primitives/block.h

Lines 16 to 44 in c5d9a01

    
           enum {  
        
               ALGO_UNKNOWN = -1, 
        
               ALGO_SHA256D  = 0, 
        
               ALGO_SCRYPT   = 1, 
        
               ALGO_GROESTL  = 2, 
        
               ALGO_SKEIN    = 3, 
        
               ALGO_QUBIT    = 4, 
        
               //ALGO_EQUIHASH = 5, 
        
               //ALGO_ETHASH   = 6, 
        
               ALGO_ODO      = 7, 
        
               NUM_ALGOS_IMPL }; 
        
           const int NUM_ALGOS = 5; 
        
           enum { 
        
               // primary version 
        
               BLOCK_VERSION_DEFAULT        = 2,  
        
               // algo 
        
               BLOCK_VERSION_ALGO           = (15 << 8), 
        
               BLOCK_VERSION_SCRYPT         = (0 << 8), 
        
               BLOCK_VERSION_SHA256D        = (2 << 8), 
        
               BLOCK_VERSION_GROESTL        = (4 << 8), 
        
               BLOCK_VERSION_SKEIN          = (6 << 8), 
        
               BLOCK_VERSION_QUBIT          = (8 << 8), 
        
               //BLOCK_VERSION_EQUIHASH       = (10 << 8), 
        
               //BLOCK_VERSION_ETHASH         = (12 << 8), 
        
               BLOCK_VERSION_ODO            = (14 << 8), 
        
           };

Out-of-Bounds Writes to lastAlgoBlocks[]
In older code, CBlockIndex did something like:
```
lastAlgoBlocks[GetAlgo()] = this;
```
If GetAlgo() returned -1 (unknown), we wrote to lastAlgoBlocks[-1], corrupting memory outside the array. This could go unnoticed until subsequent calls (e.g. GetBlockHeader()) triggered a crash.
Incorrect Mainnet-Only Logic on Testnet
Historically, mainnet scrypt was used up to block height 145k. Older code forced all blocks < 145,000 to be interpreted as scrypt—even on testnet. But on testnet, multi-algo and version bits activated earlier/later, so forcibly labeling testnet blocks as scrypt introduced further mismatch and potential memory corruption.

Root Cause in Brief

Unrecognized version bits → ALGO_UNKNOWN = -1
No check on unknown algos → writes to lastAlgoBlocks[-1]
Testnet logic mismatch → older blocks incorrectly forced as “scrypt”

Changes Implemented

Safer Handling of Unknown Block Versions
- In CBlockIndex::GetAlgo() and/or the constructor logic, any unrecognized version bits now yield a safe fallback (ALGO_UNKNOWN) without updating lastAlgoBlocks[].
- We log a warning if ALGO_UNKNOWN is encountered but avoid any array assignment that could go out of bounds.
Accurate Version Parsing Below 145k (Mainnet vs Testnet)
- Previously, code forced all pre-145,000 blocks to use scrypt. This is correct only for mainnet.
- We’ve now introduced a network check so that on mainnet we still label pre-145k blocks as scrypt, but on testnet (and other networks), we parse the actual version bits to determine the algo.
Robust Logging & Safety Checks
- In CBlockIndex::CBlockIndex(const CBlockHeader&), log a message if the block’s version maps to ALGO_UNKNOWN.
- Consistently handle suspicious block versions gracefully rather than forcing them into an invalid array index.

Why This Fixes the Memory Bug

By properly handling unrecognized or out-of-range block versions, we:

Prevent Out-of-Bounds Writes
- We no longer do lastAlgoBlocks[-1] = this; when GetAlgo() is ALGO_UNKNOWN.
Respect Different Network Rules
- Ensure testnet multi-algo transitions are parsed per actual version bits, preventing the forced-scrypt logic from causing further corruption.
Preserve Mainnet Behavior
- For mainnet, we still treat pre-145k blocks as scrypt, matching historical reality but avoiding the “unknown” index scenario.

Test Results

All 470 unit tests pass successfully.
All 215 functional tests pass, including p2p_dos_header_tree.py, which checks correct header acceptance/rejection for older heights.
The fix eliminates memory-access warnings/errors on startup with address sanitizer compiler flag. Multiple syncs & wallet restarts on Windows, macOS, and a Raspberry Pi 5 confirm stable memory usage (around 6.5–6.7 GB) without spikes or crashes.

Thank you for reviewing these changes! If you have any questions, please feel free to ask in this PR discussion.

gto90

ACK

ycagel

cACK

JaredTate added 2 commits January 23, 2025 08:53

Fix Getblockheader Get Algo Memory Bug

34ed08b

Fix p2p_dos_header_tree.py Failure

e578b20

ycagel requested review from SmartArray, ycagel, digicontributer and gto90 January 24, 2025 00:55

gto90 approved these changes Jan 24, 2025

View reviewed changes

ycagel approved these changes Jan 25, 2025

View reviewed changes

ycagel merged commit 90fab7b into DigiByte-Core:develop Jan 25, 2025
7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks #278

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks #278

JaredTate commented Jan 24, 2025

gto90 left a comment

ycagel left a comment

	enum {
	ALGO_UNKNOWN = -1,
	ALGO_SHA256D = 0,
	ALGO_SCRYPT = 1,
	ALGO_GROESTL = 2,
	ALGO_SKEIN = 3,
	ALGO_QUBIT = 4,
	//ALGO_EQUIHASH = 5,
	//ALGO_ETHASH = 6,
	ALGO_ODO = 7,
	NUM_ALGOS_IMPL };

	const int NUM_ALGOS = 5;

	enum {
	// primary version
	BLOCK_VERSION_DEFAULT = 2,

	// algo
	BLOCK_VERSION_ALGO = (15 << 8),
	BLOCK_VERSION_SCRYPT = (0 << 8),
	BLOCK_VERSION_SHA256D = (2 << 8),
	BLOCK_VERSION_GROESTL = (4 << 8),
	BLOCK_VERSION_SKEIN = (6 << 8),
	BLOCK_VERSION_QUBIT = (8 << 8),
	//BLOCK_VERSION_EQUIHASH = (10 << 8),
	//BLOCK_VERSION_ETHASH = (12 << 8),
	BLOCK_VERSION_ODO = (14 << 8),
	};

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks #278

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks #278

Conversation

JaredTate commented Jan 24, 2025

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks

Summary of the Problem

Symptom & LLDB Trace

Diagnosis

Root Cause in Brief

Changes Implemented

Why This Fixes the Memory Bug

Test Results

gto90 left a comment

Choose a reason for hiding this comment

ycagel left a comment

Choose a reason for hiding this comment