Skip to content

coins: remove redundant and confusing CCoinsViewDB::HaveCoin#34320

Open
l0rinc wants to merge 4 commits intobitcoin:masterfrom
l0rinc:l0rinc/remove-HaveCoin
Open

coins: remove redundant and confusing CCoinsViewDB::HaveCoin#34320
l0rinc wants to merge 4 commits intobitcoin:masterfrom
l0rinc:l0rinc/remove-HaveCoin

Conversation

@l0rinc
Copy link
Contributor

@l0rinc l0rinc commented Jan 16, 2026

Split out of #34132 (comment).

Problem

We have two ways to check if a coin exists in the DB.
One tries to deserialize; the other just returns whether any value is associated with the key.
But we usually need the full value we probed for, and HaveCoin cannot cache the returned value, so it can trigger an extra lookup if HaveCoin uses Exists and the caller later needs the actual value.
We're already delegating most HaveCoin calls to GetCoin anyway for the above reasons, so the CCoinsViewDB implementation of HaveCoin is redundant.

Fix

Remove CCoinsViewDB::HaveCoin (falls back to the base CCoinsView::HaveCoin, which delegates to GetCoin).

Testing

A unit test was added to confirm that HaveInputs calls the backing view's ::GetCoin (not ::HaveCoin) on cache miss and checks the cache first.
Also added a benchmark to track throughput before/after changes to ensure no regression.


Note: The latest version of the change keeps most HaveCoin calls for convenience; a previous version removed all of them.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 16, 2026

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34320.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK Eunovo
Concept ACK sedited
Stale ACK andrewtoth

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34132 (coins: drop error catcher, centralize fatal read handling by l0rinc)
  • #34124 (validation: make CCoinsView a pure virtual interface by l0rinc)
  • #32427 ((RFC) kernel: Replace leveldb-based BlockTreeDB with flat-file based store by sedited)
  • #32317 (kernel: Separate UTXO set access from validation functions by sedited)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Copy link
Member

maflcko commented Jan 20, 2026

One tries to deserialize, the other just returns whether any value is associated with the key.
But we usually need the full value we probed for, and HaveCoin cannot cache the returned value, so it can trigger an extra lookup if HaveCoin uses Exists and the caller later needs the actual value.

Are there any flamegraphs or benchmarks, to see how this affects performance?

@l0rinc
Copy link
Contributor Author

l0rinc commented Jan 20, 2026

Are there any flamegraphs or benchmarks, to see how this affects performance?

I'm not sure what to measure exactly because the current code was already delegating to GetCoin in most places.
The pruned IBD and -reindex-chainstate and AssumeUTXO loading benchmarks I started haven't finished yet - but I don't expect any change there.
If you can describe a scenario that you would find helpful, please let me know.

@l0rinc
Copy link
Contributor Author

l0rinc commented Jan 22, 2026

Are there any flamegraphs or benchmarks, to see how this affects performance?

Added assumeutxo load, reindex-chainstate and pruned IBD measurements to the PR description

@andrewtoth
Copy link
Contributor

andrewtoth commented Jan 24, 2026

I'm not sure what to measure exactly because the current code was already delegating to GetCoin in most places.

I'm not sure about this. CCoinsViewCache::HaveCoin seems to be the implementation that is called in all non-test places outside of coins.cpp, and only delegates to GetCoin if there's a cache miss. In CCoinsViewCache::HaveCoin we avoid copying Coin:

bool CCoinsViewCache::HaveCoin(const COutPoint &outpoint) const {
    CCoinsMap::const_iterator it = FetchCoin(outpoint);
    return (it != cacheCoins.end() && !it->second.coin.IsSpent());
}

vs CCoinsViewCache::GetCoin where we return a copy of Coin:

std::optional<Coin> CCoinsViewCache::GetCoin(const COutPoint& outpoint) const
{
    if (auto it{FetchCoin(outpoint)}; it != cacheCoins.end() && !it->second.coin.IsSpent()) return it->second.coin;
    return std::nullopt;
}

@andrewtoth
Copy link
Contributor

andrewtoth commented Jan 24, 2026

we usually need the full value we probed for, and HaveCoin cannot cache the returned value, so it can trigger an extra lookup if HaveCoin uses Exists and the caller later needs the actual value.

I see 4 uses of HaveCoin

  1. In ConnectBlock, we ensure we do not already have outputs of a block for BIP30 checks.
  2. In ApplyTxInUndo, we ensure we do not already have the coin we are about to roll back to unspent.
  3. In MemPoolAccept::PreChecks, where we check that we have all inputs of a tx we are accepting to the mempool.
  4. In TxVerify, where we call HaveInputs.

For 1 and 2, we are checking for the non-existence of the coin so we won't need the full value. In this case not caching the value makes sense.
For 3, it is explicitly relying on the fact that the value is cached. The backend is thrown away after all input prevouts are added to the top level cache.
For 4, it is a peformance improvement that the value is cached.

I think 1 and 2 would benefit from CCoinsViewCache::HaveCoin delegating to the base's HaveCoin and not going through FetchCoin.
For 3 this seems like an abuse of the CCoinsView API. But, calling GetCoin instead there would result in unnecessary copying of the Coin. I'm not sure what the best solution is here.
For 4 the current behavior makes sense.

@l0rinc
Copy link
Contributor Author

l0rinc commented Jan 25, 2026

only delegates to GetCoin if there's a cache miss. In CCoinsViewCache::HaveCoin we avoid copying Coin

Disassembly with separate translation units indicates GetCoin() does a new + memcpy of the Coin and the caller immediately deletes it, even when only checking .has_value(). HaveCoin() does indeed avoid it, but the extra copy only happens on the unexpectedly-present coin path, since if it's expected, we usually want to cache it. If we want to avoid the extra copy, we can use AccessCoin, but we would duplicate the spentness check.

In this case not caching the value makes sense.

But there's nothing to cache since these will return an empty optional. For BIP30 and ApplyTxInUndo the missing-coin path dominates, so GetCoin() short-circuits on Read() failure without deserialization, so I/O is the same as Exists(), no coin-copy happens (and when it does we crash or reorg anyway). But given that validation until bip30-end takes less than 3 minutes on my benchmarking server (see #33817), I'd say it's irrelevant either way.

For 3 this seems like an abuse of the CCoinsView API. But, calling GetCoin instead there would result in unnecessary copying of the Coin. I'm not sure what the best solution is here.

This is a good point, the disassembly indicates there's no devirtualization and no inlining in CCoinsViewCache::HaveInputs and MemPoolAccept::PreChecks, so the copy is real - though a lot faster than disk access, so it's not actually measurable.
We can use AccessCoin here and with explicit IsSpent call instead which would avoid the call (according to the assembly):

if (AccessCoin(tx.vin[i].prevout).IsSpent()) {

and

if (m_view.AccessCoin(txin.prevout).IsSpent()) {

I have added a new benchmark as a first commit to check the performance of on-disk HaveInputs to make sure the performance is retained.

Benchmark baseline:

ns/tx tx/s err% total benchmark
998.95 1,001,046.69 0.8% 1.09 HaveInputsOnDisk

Benchmark with !GetCoin() instead of HaveCoin:

ns/tx tx/s err% total benchmark
985.57 1,014,640.63 0.8% 1.10 HaveInputsOnDisk

Replace remaining HaveCoin calls with GetCoin()/AccessCoin and remove the former

ns/tx tx/s err% total benchmark
983.17 1,017,118.55 0.3% 1.10 HaveInputsOnDisk

(ran it 5 times for each commit and copied the fastest (with smallest noise). The small difference is reproducible, but not very important as long as it's not measurably slower)

Benchmarking script and results on Linux
for commit in 3544b98927050956f68cb183bf2686f5c65e5ebb 8c568500442144fbcd457eccda5a1956aa98ca81; do \                                        
    git fetch origin $commit >/dev/null 2>&1 && git checkout $commit >/dev/null 2>&1 && echo "" && git log -1 --pretty='%h %s' && \
    rm -rf build >/dev/null 2>&1 && cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release >/dev/null 2>&1 && \
    cmake --build build -j$(nproc) >/dev/null 2>&1 && \
    for _ in $(seq 5); do \
      sleep 5; \
      ./build/bin/bench_bitcoin -filter='HaveInputsOnDisk' -min-time=1000; \
    done; \
done

resulting in

3544b98 bench: add on-disk HaveInputs benchmark

ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,137.20 467,902.92 0.0% 16,874.42 7,673.54 2.199 2,895.01 0.8% 1.09 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,153.41 464,380.56 0.0% 16,873.94 7,729.93 2.183 2,894.99 0.8% 1.10 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,148.36 465,471.81 0.1% 16,873.92 7,709.74 2.189 2,894.98 0.8% 1.10 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,150.10 465,095.33 0.1% 16,873.85 7,718.44 2.186 2,894.99 0.8% 1.10 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,135.95 468,175.11 0.0% 16,874.21 7,669.12 2.200 2,894.99 0.8% 1.10 HaveInputsOnDisk

8c56850 coins: replace remaining HaveCoin calls with GetCoin()/AccessCoin and remove the former

ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,120.50 471,586.34 0.0% 16,874.75 7,611.58 2.217 2,897.27 0.8% 1.10 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,178.86 458,955.50 0.5% 16,875.34 7,821.05 2.158 2,897.29 1.0% 1.08 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,153.54 464,352.06 0.0% 16,875.12 7,731.71 2.183 2,897.26 0.8% 1.10 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,137.40 467,857.52 0.0% 16,875.16 7,673.32 2.199 2,897.26 0.8% 1.10 HaveInputsOnDisk
ns/tx tx/s err% ins/tx cyc/tx IPC bra/tx miss% total benchmark
2,135.08 468,365.96 0.0% 16,875.23 7,663.44 2.202 2,897.27 0.8% 1.11 HaveInputsOnDisk

@l0rinc l0rinc force-pushed the l0rinc/remove-HaveCoin branch from 9b4e044 to 8c56850 Compare January 25, 2026 15:28
@andrewtoth
Copy link
Contributor

If we want to avoid the extra copy, we can use AccessCoin, but we would duplicate the spentness check.

This tells us that HaveCoin is the right level of abstraction then, and is a useful method, no?

In this case not caching the value makes sense.

But there's nothing to cache since these will return an empty optional

Same as above.

We can use AccessCoin here and with explicit IsSpent call instead which would avoid the call (according to the assembly)

I think everywhere we use AccessCoin we assert the unspentness. This would now break that pattern and use AccessCoin to check existence via spentness. I think HaveCoin conveys intent better here.

Can we instead just remove HaveCoin on the db then, and the default implementation would just forward to !!GetCoin for it?

@l0rinc l0rinc changed the title coins: replace remaining HaveCoin calls with GetCoin coins: remove redundant and confusing CCoinsViewDB::HaveCoin Feb 1, 2026
@l0rinc
Copy link
Contributor Author

l0rinc commented Feb 1, 2026

Thanks @andrewtoth, I didn't like the suggestion at first since we keep polluting the interface, but after applying your suggestion I'm fine with the results, it's simpler and better separated from the other similar PRs.

@l0rinc l0rinc force-pushed the l0rinc/remove-HaveCoin branch 2 times, most recently from 67068ee to 3d25cbf Compare February 1, 2026 13:42
@l0rinc l0rinc force-pushed the l0rinc/remove-HaveCoin branch from 3d25cbf to 4166ee8 Compare February 1, 2026 13:43
@DrahtBot DrahtBot removed the CI failed label Feb 1, 2026
@l0rinc l0rinc force-pushed the l0rinc/remove-HaveCoin branch 2 times, most recently from 23dd085 to 69fa15a Compare February 3, 2026 17:56
@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 3, 2026

🚧 At least one of the CI tasks failed.
Task lint: https://github.com/bitcoin/bitcoin/actions/runs/21641577498/job/62382439018
LLM reason (✨ experimental): Lint check failed: direct use of std::filesystem was flagged by the std_filesystem lint rule.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

Copy link
Contributor

@andrewtoth andrewtoth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d1a8a1b

CCoinsViewDB::HaveCoin is dead code, good to remove it and prevent confusion.
Also, adding test coverage and a benchmark to prevent regressions is good.

a.out == b.out;
}

Coin MakeTestCoin() { return Coin{CTxOut{1, CScript{}}, 1, false}; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not sure we really need this helper? It's a one liner called in two places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like duplication, but I agree it's not needed here

mtx.vin.emplace_back(prevout);
const CTransaction tx{mtx};
BOOST_CHECK(!tx.IsCoinBase());
BOOST_CHECK(cache.HaveInputs(tx));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a basic CCoinsViewCache test. Is this behavior not covered elsewhere? If not, should we extend this to test that AccessCoin, GetCoin, HaveCoin, SpendCoin do not access base when the coin is cached?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HaveInputs() had no unit coverage at all.
These new tests cover the "cache hit should not consult base" and "cache miss should consult base via GetCoin()" usecases - if you think there's a better location for these, let me know.
I've extended it with the mentioned methods, thanks

DataStream{benchmark::data::block413567} >> TX_WITH_WITNESS(block);

CCoinsViewCache cache{&db};
cache.SetBestBlock(uint256::ONE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: doesn't really matter, but we could set this to block.GetBlockHeader().GetHash().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks

@l0rinc l0rinc force-pushed the l0rinc/remove-HaveCoin branch 2 times, most recently from e4613ee to ff338fd Compare February 15, 2026 21:03
Copy link
Contributor

@Eunovo Eunovo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK ff338fd:

Left some comments.

}
}

BOOST_AUTO_TEST_CASE(ccoins_haveinputs_cache_miss_uses_base_getcoin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4b32181:

Consider merging both tests as done in the diff below. I also replaced the DB view with a dummy view because CCoinsViewDB is not under test here; the dummy is sufficient to test the cache.

diff --git a/src/test/coins_tests.cpp b/src/test/coins_tests.cpp
index ad182559a3..baba9c4e75 100644
--- a/src/test/coins_tests.cpp
+++ b/src/test/coins_tests.cpp
@@ -1050,23 +1050,13 @@ BOOST_FIXTURE_TEST_CASE(ccoins_flush_behavior, FlushTest)
     }
 }
 
-BOOST_AUTO_TEST_CASE(ccoins_haveinputs_cache_miss_uses_base_getcoin)
+BOOST_AUTO_TEST_CASE(ccoins_cache_behavior)
 {
-    const COutPoint prevout{Txid::FromUint256(m_rng.rand256()), 0};
-
-    CCoinsViewDB db{{.path = "test", .cache_bytes = 1_MiB, .memory_only = true}, {}};
-    {
-        CCoinsViewCache write_cache{&db};
-        write_cache.SetBestBlock(m_rng.rand256());
-        write_cache.AddCoin(prevout, Coin{CTxOut{1, CScript{}}, 1, false}, /*possible_overwrite=*/false);
-        write_cache.Flush();
-    }
-
     class CCoinsViewSpy final : public CCoinsViewBacked
     {
     public:
         const COutPoint expected;
-        mutable size_t havecoin_calls{0}, getcoin_calls{0};
+        mutable size_t getcoin_calls{0};
 
         explicit CCoinsViewSpy(CCoinsView* view, const COutPoint& out) : CCoinsViewBacked(view), expected{out} {}
 
@@ -1074,18 +1064,23 @@ BOOST_AUTO_TEST_CASE(ccoins_haveinputs_cache_miss_uses_base_getcoin)
         {
             ++getcoin_calls;
             BOOST_CHECK(out == expected);
-            return CCoinsViewBacked::GetCoin(out);
+            return Coin{CTxOut{1, CScript{}}, 1, false};
         }
 
         bool HaveCoin(const COutPoint& out) const override
         {
-            ++havecoin_calls;
-            BOOST_CHECK(out == expected);
-            return CCoinsViewBacked::HaveCoin(out);
+            // This should never be called, since HaveInputs() should call GetCoin()
+            // and not call HaveCoin() at all. If this is called, it means that
+            // HaveInputs() is calling HaveCoin() instead of GetCoin(), which
+            // is less efficient since it may require two lookups instead of one.
+            BOOST_FAIL("Base HaveCoin should never be called");
+            return false;
         }
     };
 
-    CCoinsViewSpy base{&db, prevout};
+    const COutPoint prevout{Txid::FromUint256(m_rng.rand256()), 0};
+    CCoinsView dummy;
+    CCoinsViewSpy base{&dummy, prevout};
     CCoinsViewCache cache{&base};
 
     CMutableTransaction mtx;
@@ -1095,46 +1090,14 @@ BOOST_AUTO_TEST_CASE(ccoins_haveinputs_cache_miss_uses_base_getcoin)
 
     BOOST_CHECK(cache.HaveInputs(tx));
     BOOST_CHECK_EQUAL(base.getcoin_calls, 1);
-    BOOST_CHECK_EQUAL(base.havecoin_calls, 0);
-}
 
-BOOST_AUTO_TEST_CASE(ccoins_cache_hit_does_not_call_base)
-{
-    class CCoinsViewNoCall final : public CCoinsView
-    {
-    public:
-        std::optional<Coin> GetCoin(const COutPoint&) const override
-        {
-            BOOST_FAIL("Base GetCoin should not be called when input is cached");
-            return std::nullopt;
-        }
-
-        bool HaveCoin(const COutPoint&) const override
-        {
-            BOOST_FAIL("Base HaveCoin should not be called when input is cached");
-            return false;
-        }
-    };
-
-    const COutPoint prevout{Txid::FromUint256(m_rng.rand256()), 0};
-    CCoinsViewNoCall base;
-    CCoinsViewCache cache{&base};
-
-    cache.AddCoin(prevout, Coin{CTxOut{1, CScript{}}, 1, false}, /*possible_overwrite=*/false);
-    BOOST_CHECK(cache.HaveCoinInCache(prevout));
-
-    BOOST_CHECK(!cache.AccessCoin(prevout).IsSpent());
-    BOOST_CHECK(cache.GetCoin(prevout));
-    BOOST_CHECK(cache.HaveCoin(prevout));
-
-    CMutableTransaction mtx;
-    mtx.vin.emplace_back(prevout);
-    const CTransaction tx{mtx};
-    BOOST_CHECK(!tx.IsCoinBase());
     BOOST_CHECK(cache.HaveInputs(tx));
-
-    BOOST_CHECK(cache.SpendCoin(prevout));
-    BOOST_CHECK(!cache.HaveCoinInCache(prevout));
+    BOOST_CHECK(cache.GetCoin(prevout));
+    BOOST_CHECK(!cache.AccessCoin(prevout).IsSpent());
+    // GetCoin should only have been called once,
+    // since the result should be cached after
+    // the first call to HaveInputs().
+    BOOST_CHECK_EQUAL(base.getcoin_calls, 1);
 }
 
 BOOST_AUTO_TEST_CASE(coins_resource_is_used)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hint, I simplified this into a single ccoins_cache_behavior test using a dummy spy base view. It covers the two behaviors that weren’t explicitly pinned elsewhere: on cache miss, HaveInputs() must use base GetCoin() (not HaveCoin()), and once cached (including prefilled cache), lookups stay in cache without extra base calls, with SpendCoin() updating cache state as expected. So this keeps the original intent while adopting your simplification and removing the DB setup/duplication.

Copy link
Contributor

@Eunovo Eunovo Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From 8c57687 commit message:

ExistsImpl was removed since it duplicates CDBWrapper::ReadImpl (except that it copies the resulting string on success, but that will be needed for caching anyway).

I was thrown off by the last part of this sentence because it's not clear why caching is being mentioned here. I believe you were referring to the fact that with this PR, the HaveCoin will lead to a Read call instead of an Exists call and will cache the result so we can ignore the cost of the copy when doing the first HaveCoin call.

If I am correct, can you make your justification clearer in the commit message? To state clearly that Read is used over Exists and the result is cached so the cost of the copy can be ignored.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c0c94ec:

New commit message looks good to me.

@l0rinc l0rinc force-pushed the l0rinc/remove-HaveCoin branch from ff338fd to dd76338 Compare February 20, 2026 15:56
@l0rinc l0rinc closed this Feb 20, 2026
@l0rinc l0rinc reopened this Feb 20, 2026
Copy link
Contributor

@Eunovo Eunovo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I have one more comment on the test.

BOOST_CHECK(cache.HaveInputs(tx));
BOOST_CHECK_EQUAL(base.getcoin_calls, 1);

base.allow_getcoin = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

277c57f:

I don't think we need base.allow_getcoin, because the BOOST_CHECK_EQUAL(base.getcoin_calls, 1); line below would fail if another base.GetCoin() call was made.

Copy link
Contributor Author

@l0rinc l0rinc Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

277c57f test: add HaveInputs call-path unit tests:

Good point, taken - also rebased in the meantime, you can diff with B=ff338fdb53a66ab40a36e1277e7371941fc89840 A=dd76338a57b9b1169ac27f7b783d6d0d4c6e38ab && git fetch origin $B $A && git range-diff --creation-factor=95 $B...$A

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c0c94ec:

New commit message looks good to me.

l0rinc and others added 4 commits February 23, 2026 11:39
`CCoinsViewCache::HaveInputs()` is used by `Consensus::CheckTxInputs()` and currently exercises the `HaveCoin()` path in the UTXO cache.
Performance could be affected due to `Coin` copying and different cache warming behavior.

Add a benchmark to track `HaveInputs()` throughput before and after the `HaveCoin()` changes in this series to make sure the performance is retained.
The benchmark populates an on-disk `CCoinsViewDB` with dummy unspent coins for all block inputs and then calls `HaveInputs()` for each non-coinbase transaction using a fresh `CCoinsViewCache` (to avoid cross-iteration caching).
It calls `HaveInputs()` twice to exercise the cache-hit path as well.

Benchmark baseline:
|               ns/tx |                tx/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|            1,004.58 |          995,437.07 |    0.7% |      1.10 | `HaveInputsOnDisk`
Add unit tests covering `CCoinsViewCache::HaveInputs()`.

The tests document that `HaveInputs()` consults the cache first and that a cache miss pulls from the backing view via `GetCoin()`.

Co-authored-by: Novo <[email protected]>
`ExistsImpl` was removed since it duplicates `CDBWrapper::ReadImpl`.
The only difference is that `ExistsImpl` discards the value while `ReadImpl` returns it as `std::string`.
In this series, `HaveCoin` database lookups go through `GetCoin`/`Read`, and successful reads are cached in `CCoinsViewCache`, so this copy is not a separate recurring cost.
`CDBWrapper::Exists()` and `::ReadRaw()` both issue a LevelDB `::Get`.

Introduce a small helper returning the raw value bytes (as `std::string`) for a given key.
Move `Exists` closer to the read methods.
Direct `CCoinsViewDB::HaveCoin` calls aren't called in production anyway, so let's just revert to delegating to `::GetCoin`.

Benchmark at this commit:
|               ns/tx |                tx/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|              995.62 |        1,004,398.82 |    0.9% |      1.10 | `HaveInputsOnDisk`
@l0rinc l0rinc force-pushed the l0rinc/remove-HaveCoin branch from dd76338 to 8bfd863 Compare February 23, 2026 11:22
@l0rinc l0rinc closed this Feb 23, 2026
@l0rinc l0rinc reopened this Feb 23, 2026
l0rinc added a commit to l0rinc/bitcoin that referenced this pull request Feb 23, 2026
When `git range-diff` cannot match a commit between two versions of a branch, it shows the old commit as removed (`<`) and the new commit as added (`>`), and it does not show the patch contents.
This output is easy to misread as "no code changes" because the diff for that commit is effectively empty.
It really means the commits were considered unrelated and the new commit should be reviewed from scratch.
This is analogous to rename detection in `git diff`: if similarity is too low, a rename shows up as delete+add.

Example (exact SHAs from PR bitcoin#34320):
    B=ff338fdb53a66ab40a36e1277e7371941fc89840; A=dd76338a57b9b1169ac27f7b783d6d0d4c6e38ab; git fetch upstream $B $A
    git range-diff 0ca4295..$B 139aa4b..$A

This produced output like:
    1:  4b32181 < -:  ---------- test: add `HaveInputs` call-path unit tests
    -:  ---------- > 1:  277c57f test: add `HaveInputs` call-path unit tests

Even though the subject matches, the first commit had no matching diff shown.
That should be treated as "unmatched" rather than "unchanged".

If you expected a match, try increasing the creation factor so `git range-diff` searches harder:
    git range-diff --creation-factor=95 <old_range> <new_range>
Copy link
Contributor

@Eunovo Eunovo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 8bfd863

@DrahtBot DrahtBot requested a review from andrewtoth February 24, 2026 10:24
@sedited
Copy link
Contributor

sedited commented Mar 6, 2026

Concept ACK

I'm a bit confused by this patch set. If the CCoinsViewDB's HaveCoin is only used in the coins tests to verify invariants (which can be checked by replacing its implementation with a throw), why is this adding a benchmark? I'm also not sure why this is also changing the implementation of Exists if it is not used by production code in the coins db path. Can't that just be left alone? Why not scope this PR to just remove the HaveCoin member function and add the unit test from the second commit?

@l0rinc
Copy link
Contributor Author

l0rinc commented Mar 6, 2026

why is this adding a benchmark?

I touched on that briefly in the commit message:

CCoinsViewCache::HaveInputs() is used by Consensus::CheckTxInputs() and currently exercises the HaveCoin() path in the UTXO cache.
Performance could be affected due to Coin copying and different cache-warming behavior.

The motivation was @andrewtoth's concern about a coin copy regression. See #34320 (comment)

It also checks the cached behavior, not just the DB query.

I'm also not sure why this is also changing the implementation of Exists

I tried to cover this in the commit message as well. Please let me know which part needs more detail:

ExistsImpl was removed since it duplicates CDBWrapper::ReadImpl.

Basically, we can remove it now since the underlying premise, trying to avoid deserialization cost, isn't warranted anymore. I could inline the Exists method now (I did so in a previous version), but this results in a smaller diff.

@sedited
Copy link
Contributor

sedited commented Mar 6, 2026

The motivation was @andrewtoth's concern about a coin copy regression. See #34320 (comment)

Why do these concerns apply to its removal if the coins db's HaveCoin is never called in production?

Basically, we can remove it now since the underlying premise, trying to avoid deserialization cost, isn't warranted anymore. I could inline the Exists method now (I did so in a previous version), but this results in a smaller diff.

Sure, but it would be good to mention why this cost isn't relevant to its current use in the indexes.

chriszeng1010 pushed a commit to chriszeng1010/bitcoin that referenced this pull request Mar 17, 2026
When `git range-diff` cannot match a commit between two versions of a branch, it shows the old commit as removed (`<`) and the new commit as added (`>`), and it does not show the patch contents.
This output is easy to misread as "no code changes" because the diff for that commit is effectively empty.
It really means the commits were considered unrelated and the new commit should be reviewed from scratch.
This is analogous to rename detection in `git diff`: if similarity is too low, a rename shows up as delete+add.

Example (exact SHAs from PR bitcoin#34320):
    B=ff338fdb53a66ab40a36e1277e7371941fc89840; A=dd76338a57b9b1169ac27f7b783d6d0d4c6e38ab; git fetch upstream $B $A
    git range-diff 0ca4295..$B 139aa4b..$A

This produced output like:
    1:  4b32181 < -:  ---------- test: add `HaveInputs` call-path unit tests
    -:  ---------- > 1:  277c57f test: add `HaveInputs` call-path unit tests

Even though the subject matches, the first commit had no matching diff shown.
That should be treated as "unmatched" rather than "unchanged".

If you expected a match, try increasing the creation factor so `git range-diff` searches harder:
    git range-diff --creation-factor=95 <old_range> <new_range>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants