Skip to content

BFT snapshots#3

Open
lploom wants to merge 24 commits intomainfrom
bft-snapshots
Open

BFT snapshots#3
lploom wants to merge 24 commits intomainfrom
bft-snapshots

Conversation

@lploom
Copy link
Copy Markdown
Contributor

@lploom lploom commented Mar 26, 2026

Added BFT snapshotting as defined in the Yellowpaper.

New block structure
Removed "minerAddress" (20 bytes) field and added "payloadRoot" field (32 bytes), so the block headers grew from 100 bytes to 112 bytes. In addition, the block now contains variable length payload of hash of miner reward token (32 bytes) and UTB_cbor bytes (variable size depending on number of BFT nodes etc). The payloadRoot is calculated as H(reward_token_hash || UTB_hash).

Reward tokens
Replaced miner addresses with reward tokens. Every time a block is mined, a reward token is generated and stored locally, the hash of the reward token is appended to the block payload. By default the reward tokens are stored to:
$ cat /tmp/node1/reward_tokens.csv

Height,BlockHash,TokenID
1,010298102a9a88e9c483c2c9952532f433548ae0792780366b2ab00c72f44568,5196fd2c158dcd030f4a148c62590d6c80f610fff9230fd41c1b3ba3484b7093
2,38787a604501b567744d42056a7e2d6508a90e0357ddad3127de828aa1182f0b,b877190e6f7509eacf0756bd9db4c33ada2a8878bad0fe9da57d9e534b9e5093

BFT layer integration
If a block is mined then the miner queries BFT layer for new UTBs, if a new UTB is found then it is also appended to block payload. If new UTBs are not found then nothing is appended to the block and zero hash is used to calculate payloadRoot. The BFT node is configurable via --bftaddr flag which defaults to "http://127.0.0.1:25866" in testnet and mainnet. In REGTEST mode the BFT integration is disabled and can be enabled by explicitly setting the --bftaddr flag.

@lploom lploom requested a review from MastaP March 26, 2026 15:21
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the blockchain protocol to a headers-only structure with a fixed 112-byte header and a variable-length payload, integrating BFT trust base verification and a new miner reward token system. My review identified a potential issue with the specified git tag for cpp-httplib in CMakeLists.txt and an outdated seed corpus in the fuzzing build script that needs to be updated to match the new header size.

FetchContent_Declare(
httplib
GIT_REPOSITORY https://github.com/yhirose/cpp-httplib.git
GIT_TAG v0.38.0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The specified git tag v0.38.0 for cpp-httplib does not appear to exist in the official yhirose/cpp-httplib repository. The latest version seems to follow a v0.major.minor scheme (e.g., v0.15.3). This could be a typo and might cause build failures for users who do not have this specific commit cached. Please verify the tag and use a valid, existing tag to ensure build reproducibility.

# Seed corpus for block headers (112 bytes each)
mkdir -p $OUT/fuzz_block_header_seed_corpus
# Create a few valid block headers as seeds
echo -n "0100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" | xxd -r -p > $OUT/fuzz_block_header_seed_corpus/valid_header_1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The seed corpus for fuzz_block_header is outdated. The block header size has been increased to 112 bytes, but this seed is still 100 bytes long (200 hex characters). This will cause the fuzzer to start with an invalid input. The seed should be updated to be 112 bytes (224 hex characters) to match the new header size.

Suggested change
echo -n "0100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" | xxd -r -p > $OUT/fuzz_block_header_seed_corpus/valid_header_1
echo -n "010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" | xxd -r -p > $OUT/fuzz_block_header_seed_corpus/valid_header_1

@lploom lploom linked an issue Mar 29, 2026 that may be closed by this pull request
Copy link
Copy Markdown
Member

@MastaP MastaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: BFT Snapshots

24 inline comments below plus 3 cross-cutting notes. Summary by severity:

  • Critical (4): Broken submitheader/submitblock RPCs, no max payload size, quorum_threshold=0 bypass, getrandom() fails on macOS
  • High (7): Wire format incompatibility, persistence migration, total_stake overflow, HTTP response injection, unbounded CBOR, mining thread crash, unvalidated trust base loading from disk
  • Medium (6): Hot-path heap allocation, CBlockIndex memory bloat, httplib in public header, secp256k1 context leak, state file permissions, fragile initializer list
  • Low (7): Debug string typo, dead typedef, unused include, duplicate code, unnecessary deep copy, redundant member, inconsistent error handling

Cross-cutting issues (not pinned to a specific line):

25. Python functional tests are broken on this branch. consensus_asert_difficulty.py and consensus_timestamp_bounds.py still build 100-byte headers with 20-byte miner field. These will fail.

26. E2E integration tests do not exercise the new payload validation. All tests in test/integration-chain/ use SetBypassPOWValidation(true), which skips CheckBlockHeader entirely. No test verifies the full payload validation pipeline end-to-end.

27. Missing negative tests for payload validation. There is no test that verifies blocks with a mismatched payloadRoot (vs actual payload content) are correctly rejected by CheckBlockHeader.

if (hex.size() != 200) {
return util::JsonError("Invalid header length (expect 200 hex chars)");
// Expect exactly 224 hex chars (112 bytes)
if (hex.size() != 224) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: submitheader and submitblock RPCs are broken — external mining is impossible

Both RPCs enforce exactly 224 hex chars (112 bytes = bare header with no payload). However, CheckBlockHeader now requires vPayload.size() >= 32. Any block submitted via these RPCs will have an empty vPayload and fail validation with "bad-payload-size".

Additionally, getblocktemplate does not return payload_root, reward_token_id, or UTB data, so external miners have no way to construct valid blocks even if the size check were fixed.

Suggestion: Accept variable-length hex input (minimum 224 chars for the header + at least 64 chars for the 32-byte token ID hash):

if (hex.size() < 288) {  // 112-byte header + 32-byte minimum payload
    return util::JsonError("Invalid block data (minimum 288 hex chars: 112-byte header + 32-byte payload)");
}

And update getblocktemplate to return the payload fields.

// 4. Validate Payload Integrity
// Since we don't implement pruning initially, all blocks must have their payloads.
// The first 32 bytes of the payload is the Token ID Hash (leaf_0).
if (header.vPayload.size() < 32) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: No maximum payload size — resource exhaustion vector

There is a minimum (32 bytes) but no maximum. An attacker can craft blocks with multi-megabyte payloads that pass payloadRoot verification (they control both leaves). Every node stores vPayload in the in-memory CBlockIndex permanently.

Suggestion:

if (header.vPayload.size() < 32) {
    return state.Invalid("bad-payload-size", "block payload missing Token ID hash");
}
static constexpr size_t MAX_PAYLOAD_SIZE = 4096;  // 32 bytes token hash + UTB CBOR
if (header.vPayload.size() > MAX_PAYLOAD_SIZE) {
    return state.Invalid("bad-payload-size", "block payload exceeds maximum size");
}

return std::vector(h.begin(), h.end());
}

bool RootTrustBaseV1::IsValid(const RootTrustBaseV1* prev) const {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: quorum_threshold == 0 allows empty signatures to pass

At line 231, return total_stake >= trusted->quorum_threshold returns true with zero valid signatures if quorum_threshold is 0. Combined with IsValid() only checking epoch == 1 for genesis (no validation of threshold or root_nodes), an attacker-crafted trust base with threshold 0 and no signers passes all checks.

Suggestion: Add guards here at the top of IsValid():

bool RootTrustBaseV1::IsValid(const RootTrustBaseV1* prev) const {
  // Universal validity checks
  if (quorum_threshold == 0) return false;
  if (root_nodes.empty()) return false;
  ...

#else
size_t offset = 0;
while (offset < len) {
const ssize_t ret = getrandom(data + offset, len - offset, 0);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical: getrandom() is not available on macOS — build will fail

getrandom(2) is a Linux-specific syscall. The #else branch catches all non-Windows platforms, including macOS which the project explicitly supports (CMakeLists.txt, CI). The comment says "Linux: getrandom(2)" but the code path runs on macOS too.

Suggestion: Add macOS support:

#if defined(__APPLE__)
    // macOS: arc4random_buf has no failure mode
    arc4random_buf(data, len);
    return;
#else
    // Linux: getrandom(2)
    const ssize_t ret = getrandom(data + offset, len - offset, 0);
    ...
#endif

Or use getentropy() which works on both (limited to 256 bytes per call).

// Read total block size
const uint64_t block_size = d.read_varint();
if (block_size < CBlockHeader::HEADER_SIZE || block_size > protocol::MAX_PROTOCOL_MESSAGE_LENGTH) {
return false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High: Wire format is backward-incompatible with no protocol version negotiation

The HEADERS message changed from fixed-size 100-byte entries to length-prefixed variable-size entries. Old and new nodes cannot communicate. There is no version bump or feature bit to detect incompatible peers — they will fail with opaque deserialization errors.

Consider bumping the protocol version and rejecting peers with incompatible versions during the version handshake.

#include "util/uint.hpp"

#include <algorithm>
#include <iostream>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: Unused include

<iostream> is not used in this file. Likely a debugging leftover — remove it.


std::vector<uint8_t> data = nlohmann::json::to_cbor(j);

// Prepend tag 1013 (d9 03f5)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: Duplicate CBOR tag prepending in SigBytes() and ToCBOR()

The 0xd9 0x03 0xf5 tag-prepending logic is duplicated between here (lines 139-145) and ToCBOR() (lines 155-161). Extract a helper:

static std::vector<uint8_t> PrependCBORTag1013(std::vector<uint8_t> data) {
    data.insert(data.begin(), {0xd9, 0x03, 0xf5});
    return data;
}


std::vector<uint8_t> RootTrustBaseV1::SigBytes() const {
// Create copy with empty signatures
RootTrustBaseV1 copy = *this;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: SigBytes() deep-copies entire struct just to clear signatures

This copies all vectors, maps, and node lists. You can avoid the copy:

// Instead of copying the whole struct:
nlohmann::json j;
to_json(j, *this);
j[9] = nullptr;  // Override signatures to null
// Then serialize j directly

std::shared_ptr<BFTClient> bft_client_;
mutable std::mutex mutex_;
std::map<uint64_t, RootTrustBaseV1> trust_bases_;
std::optional<RootTrustBaseV1> latest_trust_base_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: Redundant latest_trust_base_ alongside sorted trust_bases_ map

Since trust_bases_ is a std::map<uint64_t, ...> (sorted by epoch), the latest is always trust_bases_.rbegin()->second. Storing a separate copy doubles memory and risks the two getting out of sync.

Consider:

const RootTrustBaseV1* GetLatestLocked() const {
    return trust_bases_.empty() ? nullptr : &trust_bases_.rbegin()->second;
}

}
}

if (!tb.IsValid(prev_tb)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: Inconsistent error handling — ProcessTrustBase mixes nullopt and exceptions

Returns std::nullopt for "already have this epoch" (line 100) but throws std::invalid_argument for validation failures (lines 105, 109). SyncTrustBases catches exceptions and returns {}, making it impossible for callers to distinguish "no new trust bases" from "sync failed due to malicious data".

Consider a consistent pattern — either return error codes for all failures, or make SyncTrustBases propagate errors instead of swallowing them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[TGE] BFT state snapshots to PoW chain

2 participants