fuzz: Add tx_pool fuzz target#21142
Conversation
ced2dbe to
cf801da
Compare
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
|
Concept ACK: very nice to see the mempool logic more thoroughly fuzzed! |
cf801da to
4e197bb
Compare
4e197bb to
faa542a
Compare
|
Rebased |
faa542a to
e4e253d
Compare
|
Tested ACK e4e253d
|
glozow
left a comment
There was a problem hiding this comment.
Concept ACK, very excited to see fuzzing in ATMP :) I have a lot of questions.
Got it running and have reviewed tx_pool.cpp so far. My biggest suggestion is to try to hit the RBF code by keeping track of outpoints in the mempool and potentially pulling input(s) from there as well.
I'm a fuzzing noob, ran into a sanitizer error on my mac so I ran it in docker instead
(Commands if anyone's interested)
docker run -it ubuntu:bionic /bin/bash
apt update
apt install -y git
apt install -y sudo
git clone https://github.com/bitcoin/bitcoin
cd bitcoin
git remote add marco https://github.com/MarcoFalke/bitcoin-core.git
git fetch marco 2102-fuzzPool
git checkout 2102-fuzzPool
sudo apt-get install build-essential libtool autotools-dev automake pkg-config bsdmainutils python3
sudo apt-get install libevent-dev libboost-system-dev libboost-filesystem-dev libboost-test-dev libboost-thread-dev
sudo apt-get install clang
./autogen.sh
CC=clang CXX=clang++ ./configure --enable-fuzz --with-sanitizers=address,fuzzer,undefined --without-gui --disable-wallet
make
FUZZ=tx_pool src/test/fuzz/fuzz
593c7af to
bcf96cd
Compare
|
I pushed an update for RBF. Will push some more stuff tomorrow 😴 |
|
ACK fa5382241f7943a7395f60a8916e0d71c10d2932 Reviewed the code and ran the fuzz coverage tests before & after running for a few hours. Definitely getting better coverage with this, especially with the mempool/validation files. BeforeAfter |
fae8741 to
fab3bee
Compare
fab6b16 to
fae1617
Compare
|
@MarcoFalke as you requested in the review club yesterday, I ran the if (accepted) {
txids.push_back(tx->GetHash());
+ std::cout << "\n\n\n*********************** SUCCESS ***************************\n\n\n";
} |
|
I quickly ran 15kk iterations (double the 7kk iterations of yours) and it did hit a valid tx. However, I pushed a patch, where you should see a valid tx within ~100k iterations. All feedback has been addressed, this is now ready for re-ACKs. |
|
Valid tx within a few seconds with latest push. Edit: and many valid txs. |
|
reACK faa9ef4 Getting a better overall coverage profile (though I have been running for a lot longer and probably have better seeds by now) Plus got 378 accepted transactions for the |
|
Tested ACK faa9ef4 Very nice fuzzing harness! Thanks @MarcoFalke! |
glozow
left a comment
There was a problem hiding this comment.
code review ACK faa9ef4, a bunch of comments but non blocking
@MarcoFalke why do you call it tx_pool, is it the cool way of saying mempool? I would've called this mempool_validation.cpp and s/tx_pool/mempool
| constexpr CAmount SUPPLY_TOTAL{COINBASE_MATURITY * 50 * COIN}; | ||
|
|
||
| CTxMemPool tx_pool_{/* estimator */ nullptr, /* check_ratio */ 1}; | ||
| MockedTxPool& tx_pool = *(MockedTxPool*)&tx_pool_; |
There was a problem hiding this comment.
Question: Why can't you use a static_cast here instead?
There was a problem hiding this comment.
Thanks, done in follow-up
| amount_view.GetCoin(outpoint, c); | ||
| Assert(!c.IsSpent()); |
There was a problem hiding this comment.
This can just be:
| amount_view.GetCoin(outpoint, c); | |
| Assert(!c.IsSpent()); | |
| Assert(amount_view.GetCoin(outpoint, c)); |
There was a problem hiding this comment.
Thanks, done in follow-up
There was a problem hiding this comment.
Isn't it better to not have side-effects from asserts?
| const auto& node = g_setup->m_node; | ||
|
|
||
| std::vector<uint256> txids; | ||
| for (const auto& outpoint : g_outpoints_coinbase_init) { |
There was a problem hiding this comment.
Question: is the same g_outpoints_coinbase_init used for both the tx_pool and tx_pool_standard targets?
| } | ||
| for (int i{0}; i <= 3; ++i) { | ||
| // Add some immature and non-existent outpoints | ||
| txids.push_back(g_outpoints_coinbase_init.at(i).hash); |
There was a problem hiding this comment.
Aren't the ones at the beginning the mature ones?
It could also be more clear to have two sets g_outpoints_coinbases_mature and g_outpoints_coinbases_immature.
| txids.push_back(g_outpoints_coinbase_init.at(i).hash); | |
| txids.push_back(g_outpoints_coinbase_init.at(COINBASE_MATURITY + i).hash); |
| auto pop = outpoints_rbf.begin(); | ||
| std::advance(pop, fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, outpoints_rbf.size() - 1)); | ||
| const auto outpoint = *pop; |
There was a problem hiding this comment.
Why not use PickValue for this? It could take a bool to PickAndMaybeDelete?
There was a problem hiding this comment.
Wouldn't that make PickValue unsuitable for collections that can't erase elements. E.g. std::array
| } | ||
|
|
||
| template <typename Collection> | ||
| const auto& PickValue(FuzzedDataProvider& fuzzed_data_provider, const Collection& col) |
There was a problem hiding this comment.
Why not make this a member function of FuzzedDataProvider like PickValueInArray is?
There was a problem hiding this comment.
FuzzedDataProvider is taken from upstream, so someone needs to submit this to upstream first.
There was a problem hiding this comment.
Oh, is it not src/test/fuzz/FuzzedDataProvider.h?
There was a problem hiding this comment.
It is, but it is only synced from upstream: https://github.com/llvm/llvm-project/blob/main/compiler-rt/include/fuzzer/FuzzedDataProvider.h
| class CTxMemPool | ||
| { | ||
| private: | ||
| protected: |
There was a problem hiding this comment.
It seems a bit invasive to change all of these mempool members... can you just call something that automatically does RollingFeeUpdate like tx_pool.GetMinFee()? Or just set the members you need to protected?
There was a problem hiding this comment.
tx_pool.GetMinFee exists early because of !blockSinceLastRollingFeeBump, so I can't use that. protected as a replacement for private seems fine generally, because it allows tests to mock any private member without having to mark each member individually or add a friend TestClass01 for each class that mocks the mempool. Since mempool isn't derived outside of tests, protected shouldn't come with any risks either.
| [[nodiscard]] CMutableTransaction ConsumeTransaction(FuzzedDataProvider& fuzzed_data_provider, const std::optional<std::vector<uint256>>& prevout_txids, const int max_num_in = 10, const int max_num_out = 10) noexcept; | ||
|
|
||
| [[nodiscard]] CScriptWitness ConsumeScriptWitness(FuzzedDataProvider& fuzzed_data_provider, const size_t max_stack_elem_size = 32) noexcept; | ||
|
|
||
| [[nodiscard]] CScript ConsumeScript(FuzzedDataProvider& fuzzed_data_provider, const size_t max_length = 4096, const bool maybe_p2wsh = false) noexcept; | ||
|
|
||
| [[nodiscard]] uint32_t ConsumeSequence(FuzzedDataProvider& fuzzed_data_provider) noexcept; |
There was a problem hiding this comment.
Add comments for the util functions? e.g. I think it'd be helpful to document that ConsumeTransaction will give a tx that's well-formed but not necessarily valid, and it takes prevout_txids as input but won't necessarily use a txid from there.
There was a problem hiding this comment.
If prevout_txids is passed, it should only pick from there. As none of the functions in this file have doxygen comments, so I'll skip them for now. Though, I am more than happy to review a pull adding docs :)
| tx_mut.nLockTime = fuzzed_data_provider.ConsumeIntegral<uint32_t>(); | ||
| const auto num_in = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_in); | ||
| const auto num_out = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_out); | ||
| for (int i = 0; i < num_in; ++i) { |
There was a problem hiding this comment.
In some places you do braced, and in some places you do = 🤷
| for (int i = 0; i < num_in; ++i) { | |
| for (int i{0}; i < num_in; ++i) { |
There was a problem hiding this comment.
Right, this isn't consistent. Though, I'll leave as is for now and remember to write for (int i{0}; i < ... in the future.
There was a problem hiding this comment.
Why not for (auto i{0} ...? 👀
There was a problem hiding this comment.
int is one character shorter 🩳
| // All RBF-spendable outpoints | ||
| std::set<COutPoint> outpoints_rbf; | ||
| // All outpoints counting toward the total supply (subset of outpoints_rbf) | ||
| std::set<COutPoint> outpoints_supply; |
There was a problem hiding this comment.
These comments are confusing 😢
outpoints_rbf includes all spendable outpoints including ones that would be RBFs, but "RBF-spendable" sounds like they're only the ones in the mempool.
outpoints_supply doesn't actually include all the outpoints that count towards total supply. It doesn't include the mempool inputs which is why we need to add mempool GetTotalFee to it...
There was a problem hiding this comment.
outpoints_rbf
Happy to rename if you have suggestions.
It doesn't include the mempool inputs which is why we need to add mempool GetTotalFee to it...
I don't understand what you mean with "mempool inputs". outpoints_supply does include all outpoints from the mempool. The reason that GetTotalFee needs to be added is that the fee in the mempool isn't assigned to an outpoint (yet). The fee is collected in the coinbase transaction in a block, which doesn't exist because this fuzz target doesn't mine any blocks with mempool txs.
There was a problem hiding this comment.
outpoints_supply does include all outpoints from the mempool.
A right 🤦 I was confoozed
| auto pop = outpoints_rbf.begin(); | ||
| std::advance(pop, fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, outpoints_rbf.size() - 1)); | ||
| const auto outpoint = *pop; |
There was a problem hiding this comment.
Wouldn't that make PickValue unsuitable for collections that can't erase elements. E.g. std::array
| constexpr CAmount SUPPLY_TOTAL{COINBASE_MATURITY * 50 * COIN}; | ||
|
|
||
| CTxMemPool tx_pool_{/* estimator */ nullptr, /* check_ratio */ 1}; | ||
| MockedTxPool& tx_pool = *(MockedTxPool*)&tx_pool_; |
There was a problem hiding this comment.
Thanks, done in follow-up
| amount_view.GetCoin(outpoint, c); | ||
| Assert(!c.IsSpent()); |
There was a problem hiding this comment.
Thanks, done in follow-up
| // All RBF-spendable outpoints | ||
| std::set<COutPoint> outpoints_rbf; | ||
| // All outpoints counting toward the total supply (subset of outpoints_rbf) | ||
| std::set<COutPoint> outpoints_supply; |
There was a problem hiding this comment.
outpoints_rbf
Happy to rename if you have suggestions.
It doesn't include the mempool inputs which is why we need to add mempool GetTotalFee to it...
I don't understand what you mean with "mempool inputs". outpoints_supply does include all outpoints from the mempool. The reason that GetTotalFee needs to be added is that the fee in the mempool isn't assigned to an outpoint (yet). The fee is collected in the coinbase transaction in a block, which doesn't exist because this fuzz target doesn't mine any blocks with mempool txs.
| const auto& node = g_setup->m_node; | ||
|
|
||
| std::vector<uint256> txids; | ||
| for (const auto& outpoint : g_outpoints_coinbase_init) { |
| } | ||
| for (int i{0}; i <= 3; ++i) { | ||
| // Add some immature and non-existent outpoints | ||
| txids.push_back(g_outpoints_coinbase_init.at(i).hash); |
| tx_mut.nLockTime = fuzzed_data_provider.ConsumeIntegral<uint32_t>(); | ||
| const auto num_in = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_in); | ||
| const auto num_out = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_out); | ||
| for (int i = 0; i < num_in; ++i) { |
There was a problem hiding this comment.
Right, this isn't consistent. Though, I'll leave as is for now and remember to write for (int i{0}; i < ... in the future.
| } | ||
|
|
||
| template <typename Collection> | ||
| const auto& PickValue(FuzzedDataProvider& fuzzed_data_provider, const Collection& col) |
There was a problem hiding this comment.
FuzzedDataProvider is taken from upstream, so someone needs to submit this to upstream first.
| class CTxMemPool | ||
| { | ||
| private: | ||
| protected: |
There was a problem hiding this comment.
tx_pool.GetMinFee exists early because of !blockSinceLastRollingFeeBump, so I can't use that. protected as a replacement for private seems fine generally, because it allows tests to mock any private member without having to mark each member individually or add a friend TestClass01 for each class that mocks the mempool. Since mempool isn't derived outside of tests, protected shouldn't come with any risks either.
| [[nodiscard]] CMutableTransaction ConsumeTransaction(FuzzedDataProvider& fuzzed_data_provider, const std::optional<std::vector<uint256>>& prevout_txids, const int max_num_in = 10, const int max_num_out = 10) noexcept; | ||
|
|
||
| [[nodiscard]] CScriptWitness ConsumeScriptWitness(FuzzedDataProvider& fuzzed_data_provider, const size_t max_stack_elem_size = 32) noexcept; | ||
|
|
||
| [[nodiscard]] CScript ConsumeScript(FuzzedDataProvider& fuzzed_data_provider, const size_t max_length = 4096, const bool maybe_p2wsh = false) noexcept; | ||
|
|
||
| [[nodiscard]] uint32_t ConsumeSequence(FuzzedDataProvider& fuzzed_data_provider) noexcept; |
There was a problem hiding this comment.
If prevout_txids is passed, it should only pick from there. As none of the functions in this file have doxygen comments, so I'll skip them for now. Though, I am more than happy to review a pull adding docs :)
faa9ef4 fuzz: Add tx_pool fuzz targets (MarcoFalke) Pull request description: ACKs for top commit: AnthonyRonning: reACK faa9ef4 practicalswift: Tested ACK faa9ef4 glozow: code review ACK faa9ef4, a bunch of comments but non blocking Tree-SHA512: 8d404398faa46d8e7bf93060a2fe9afd5c0c2bd6e549ff6588d2f3dd1b912dff6c5416d5477c18edecc2e85b00db4fdf4790c3e6597a5149b0d40c9d5014d82f
No description provided.