Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks #278

Merged
merged 2 commits into from
Jan 25, 2025

Conversation

JaredTate
Copy link

Fix Memory Access Bug & Improve GetAlgo Handling for Legacy & Testnet Blocks

Summary of the Problem

For years many users have encountered a memory-access error after first launching DigiByte-QT or digibyted during the indexing phase that begins loading/parsing DigiByte block headers. This bug appears most often with older (pre-multi-algo) blocks and certain blocks after the Odo PoW fork.

Symptom & LLDB Trace

During initial block loading and when indexing reaches around 40% the wallet’s memory usage can spike dramatically, sometimes causing crashes or out-of-memory conditions. In LLDB, a this memory error back trace looks like:

* thread #18, name = 'b-qt-init', stop reason = EXC_BAD_ACCESS (code=1, address=0x17d7d847d7d9d7e2)
  * frame #0: 0x000000010084dcc0 DigiByte-QtCBlockIndex::GetBlockHeader() const + 332
    frame #1: 0x0000000101280124 DigiByte-QtCBlockIndex::GetAlgo() const + 344
    …

This indicates a memory violation (EXC_BAD_ACCESS), typically meaning an invalid pointer or out-of-bounds array usage.

Diagnosis

  1. Older Blocks Returning ALGO_UNKNOWN (-1)
    In DigiByte v8.22.0, the function that determines a block’s proof-of-work algorithm looked like:

    int CBlockHeader::GetAlgo() const
    {
        switch (nVersion & BLOCK_VERSION_ALGO)
        {
            case BLOCK_VERSION_SCRYPT:   return ALGO_SCRYPT;
            case BLOCK_VERSION_SHA256D:  return ALGO_SHA256D;
            case BLOCK_VERSION_GROESTL:  return ALGO_GROESTL;
            case BLOCK_VERSION_SKEIN:    return ALGO_SKEIN;
            case BLOCK_VERSION_QUBIT:    return ALGO_QUBIT;
            case BLOCK_VERSION_ODO:      return ALGO_ODO;
        }
        return ALGO_UNKNOWN;  // -1
    }

    Many older (2014–2015) blocks, or blocks using a version that does not set one of the above algo bits, end up returning ALGO_UNKNOWN = -1.

DigiByte has had 4 different block versioning periods or eras throughout its 11-year history, so we have a variety of block versions that need to be properly accounted for when indexing the chain. Modern multi-algo code is much different than older versions and therefore historical chain data was getting misinterpreted by newer code. Below is where the code returning a ALGO_UNKNOWN = -1 comes from and other bitwise block versioning.

enum {
ALGO_UNKNOWN = -1,
ALGO_SHA256D = 0,
ALGO_SCRYPT = 1,
ALGO_GROESTL = 2,
ALGO_SKEIN = 3,
ALGO_QUBIT = 4,
//ALGO_EQUIHASH = 5,
//ALGO_ETHASH = 6,
ALGO_ODO = 7,
NUM_ALGOS_IMPL };
const int NUM_ALGOS = 5;
enum {
// primary version
BLOCK_VERSION_DEFAULT = 2,
// algo
BLOCK_VERSION_ALGO = (15 << 8),
BLOCK_VERSION_SCRYPT = (0 << 8),
BLOCK_VERSION_SHA256D = (2 << 8),
BLOCK_VERSION_GROESTL = (4 << 8),
BLOCK_VERSION_SKEIN = (6 << 8),
BLOCK_VERSION_QUBIT = (8 << 8),
//BLOCK_VERSION_EQUIHASH = (10 << 8),
//BLOCK_VERSION_ETHASH = (12 << 8),
BLOCK_VERSION_ODO = (14 << 8),
};

  1. Out-of-Bounds Writes to lastAlgoBlocks[]
    In older code, CBlockIndex did something like:

    lastAlgoBlocks[GetAlgo()] = this;

    If GetAlgo() returned -1 (unknown), we wrote to lastAlgoBlocks[-1], corrupting memory outside the array. This could go unnoticed until subsequent calls (e.g. GetBlockHeader()) triggered a crash.

  2. Incorrect Mainnet-Only Logic on Testnet
    Historically, mainnet scrypt was used up to block height 145k. Older code forced all blocks < 145,000 to be interpreted as scrypt—even on testnet. But on testnet, multi-algo and version bits activated earlier/later, so forcibly labeling testnet blocks as scrypt introduced further mismatch and potential memory corruption.

Root Cause in Brief

  • Unrecognized version bitsALGO_UNKNOWN = -1
  • No check on unknown algos → writes to lastAlgoBlocks[-1]
  • Testnet logic mismatch → older blocks incorrectly forced as “scrypt”

Changes Implemented

  1. Safer Handling of Unknown Block Versions

    • In CBlockIndex::GetAlgo() and/or the constructor logic, any unrecognized version bits now yield a safe fallback (ALGO_UNKNOWN) without updating lastAlgoBlocks[].
    • We log a warning if ALGO_UNKNOWN is encountered but avoid any array assignment that could go out of bounds.
  2. Accurate Version Parsing Below 145k (Mainnet vs Testnet)

    • Previously, code forced all pre-145,000 blocks to use scrypt. This is correct only for mainnet.
    • We’ve now introduced a network check so that on mainnet we still label pre-145k blocks as scrypt, but on testnet (and other networks), we parse the actual version bits to determine the algo.
  3. Robust Logging & Safety Checks

    • In CBlockIndex::CBlockIndex(const CBlockHeader&), log a message if the block’s version maps to ALGO_UNKNOWN.
    • Consistently handle suspicious block versions gracefully rather than forcing them into an invalid array index.

Why This Fixes the Memory Bug

By properly handling unrecognized or out-of-range block versions, we:

  1. Prevent Out-of-Bounds Writes
    • We no longer do lastAlgoBlocks[-1] = this; when GetAlgo() is ALGO_UNKNOWN.
  2. Respect Different Network Rules
    • Ensure testnet multi-algo transitions are parsed per actual version bits, preventing the forced-scrypt logic from causing further corruption.
  3. Preserve Mainnet Behavior
    • For mainnet, we still treat pre-145k blocks as scrypt, matching historical reality but avoiding the “unknown” index scenario.

Test Results

  • All 470 unit tests pass successfully.
  • All 215 functional tests pass, including p2p_dos_header_tree.py, which checks correct header acceptance/rejection for older heights.
  • The fix eliminates memory-access warnings/errors on startup with address sanitizer compiler flag. Multiple syncs & wallet restarts on Windows, macOS, and a Raspberry Pi 5 confirm stable memory usage (around 6.5–6.7 GB) without spikes or crashes.

Screenshot 2025-01-23 at 1 17 59 PM

Thank you for reviewing these changes! If you have any questions, please feel free to ask in this PR discussion.

Copy link
Member

@gto90 gto90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

Copy link
Member

@ycagel ycagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cACK

@ycagel ycagel merged commit 90fab7b into DigiByte-Core:develop Jan 25, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants