fuzz: check that ser+unser produces the same AddrMan#21129
fuzz: check that ser+unser produces the same AddrMan#21129maflcko merged 2 commits intobitcoin:masterfrom
Conversation
|
Concept ACK: very nice fuzzing work @vasild! :) |
|
Tested ACK b0f06d04c1a57b3b440ebccf8482fcee124b6337 Thanks for improving our fuzzing harnesses @vasild! :) |
|
Concept ACK |
|
Tested b0f06d0 with Works as expected. |
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
b0f06d0 to
0de22a0
Compare
|
|
jonatack
left a comment
There was a problem hiding this comment.
ACK 0de22a097e89af200d1c6125ba7e5c71fa71e53e
|
re-ACK 0de22a097e89af200d1c6125ba7e5c71fa71e53e |
0de22a0 to
fe9401b
Compare
|
|
|
Concept ACK |
fe9401b to
2fe1efa
Compare
|
|
Move the addrman init code from the test case to a newly added `CAddrManDeterministic` constructor. This way it can be reused by other tests.
2fe1efa to
1f00d22
Compare
|
|
1f00d22 to
6d5e55e
Compare
6d5e55e to
8765179
Compare
|
|
Nice! |
|
cr ACK 8765179 |
| // 0, 1, 2, 3 corresponding to 0%, 100%, 50%, 33% | ||
| const size_t n = m_fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, 3); | ||
|
|
||
| const size_t num_sources = m_fuzzed_data_provider.ConsumeIntegralInRange<size_t>(10, 50); |
There was a problem hiding this comment.
Is there a reason to exclude num_sources == 1?
…rMan 8765179 fuzz: check that ser+unser produces the same AddrMan (Vasil Dimov) 6408b24 fuzz: move init code to the CAddrManDeterministic constructor (Vasil Dimov) Pull request description: Add a fuzz test that fills addrman with a pile of randomly generated addresses, serializes it to a stream, unserializes the stream to another addrman object and compares the two. Some discussion of this already happened at jnewbery#18. ACKs for top commit: practicalswift: cr ACK 8765179 jonatack: ACK 8765179 rebased to current master, reviewed, fuzz build, ran `FUZZ=addrman_serdeser src/test/fuzz/fuzz` Tree-SHA512: 7eda79279f14f2649840bf752e575d7b02cbaad541f74f7254855ebd4a32da988f042d78aa9228983350283bb74dd0c71f51f04c0846889c3ba2f19f01a0c303
The new test was added in bitcoin/bitcoin#21129.
| for (size_t j = 0; j < num_addresses; ++j) { | ||
| const auto addr = CAddress{CService{RandAddr(), 8333}, NODE_NETWORK}; | ||
| const auto time_penalty = insecure_rand.randrange(100000001); | ||
| #if 1 |
There was a problem hiding this comment.
Change it to #if 0 to enable the equivalent #else branch which is much shorter and clear, but also slower. The #else snippet also shows what the faster snippet is trying to achieve by "manually" fiddling with addrman internals.
57ce203 fuzz: allow lower number of sources (Martin Zumsande) acf656d fuzz: Use public interface to fill addrman tried tables (Martin Zumsande) eb2e113 addrman: Improve performance of Good (Martin Zumsande) Pull request description: Currently, `CAddrman::Good()` is rather slow because the process of moving an addr from new to tried involves looping over the new tables twice: 1) In `Good_()`, there is a loop searching for a new bucket the addr is currently in, but this information is never used except for aborting if it is not found anywhere (since [this commit](e6b343d#diff-49d1faa58beca1ee1509a247e0331bb91f8604e30a483a7b2dea813e6cea02e2R263) it is no longer passed to `MakeTried`) This is unnecessary because in a non-corrupted addrman, an address that is not in New must be either in Tried or not at all in addrman, both cases in which we'd return early in `Good_()` and never get to this point. I removed this loop (and left a check for `nRefCount` as a belt-and-suspenders check). 2) In `MakeTried()`, which is called from `Good_()`, another loop removes all instances of this address from new. This can be spedup by stopping the search at `nRefCount==0`. Further reductions in `nRefCount` would only lead to an assert anyway. Moreover, the search can be started at the bucket determined by the source of the addr for which `Good` was called, so that if it is present just once in New, no further buckets need to be checked. While calls to `Good()` are not that frequent normally, the performance gain is clearly seen in the fuzz target `addman_serdeser`, where, because of the slowness in creating a decently filled addrman, a shortcut was created that would directly populate the tried tables by reaching into addrman's internals, bypassing `Good()` (#21129). I removed this workaround in the second commit: Using `Good()` is still slower by a factor of 2 (down from a factor of ~60 before), but I think that this compensated by the advantages of not having to reach into the internal structures of addrman (see jnewbery#18 (comment)). [Edit]: For benchmark results see #22974 (comment) and #22974 (comment) - the benchmark `AddrManGood` shows a significant speedup by a factor >100. ACKs for top commit: naumenkogs: ACK 57ce203 jnewbery: ACK 57ce203 laanwj: Code review ACK 57ce203 theStack: ACK 57ce203 vasild: ACK 57ce203 Tree-SHA512: fb6dfc198f2e28bdbb41cef9709828f22d83b4be0e640a3155ca42e771b6f58466de1468f54d773e794f780a79113f9f7d522032e87fdd75bdc4d99330445198
57ce203 fuzz: allow lower number of sources (Martin Zumsande) acf656d fuzz: Use public interface to fill addrman tried tables (Martin Zumsande) eb2e113 addrman: Improve performance of Good (Martin Zumsande) Pull request description: Currently, `CAddrman::Good()` is rather slow because the process of moving an addr from new to tried involves looping over the new tables twice: 1) In `Good_()`, there is a loop searching for a new bucket the addr is currently in, but this information is never used except for aborting if it is not found anywhere (since [this commit](bitcoin@e6b343d#diff-49d1faa58beca1ee1509a247e0331bb91f8604e30a483a7b2dea813e6cea02e2R263) it is no longer passed to `MakeTried`) This is unnecessary because in a non-corrupted addrman, an address that is not in New must be either in Tried or not at all in addrman, both cases in which we'd return early in `Good_()` and never get to this point. I removed this loop (and left a check for `nRefCount` as a belt-and-suspenders check). 2) In `MakeTried()`, which is called from `Good_()`, another loop removes all instances of this address from new. This can be spedup by stopping the search at `nRefCount==0`. Further reductions in `nRefCount` would only lead to an assert anyway. Moreover, the search can be started at the bucket determined by the source of the addr for which `Good` was called, so that if it is present just once in New, no further buckets need to be checked. While calls to `Good()` are not that frequent normally, the performance gain is clearly seen in the fuzz target `addman_serdeser`, where, because of the slowness in creating a decently filled addrman, a shortcut was created that would directly populate the tried tables by reaching into addrman's internals, bypassing `Good()` (bitcoin#21129). I removed this workaround in the second commit: Using `Good()` is still slower by a factor of 2 (down from a factor of ~60 before), but I think that this compensated by the advantages of not having to reach into the internal structures of addrman (see jnewbery#18 (comment)). [Edit]: For benchmark results see bitcoin#22974 (comment) and bitcoin#22974 (comment) - the benchmark `AddrManGood` shows a significant speedup by a factor >100. ACKs for top commit: naumenkogs: ACK 57ce203 jnewbery: ACK 57ce203 laanwj: Code review ACK 57ce203 theStack: ACK 57ce203 vasild: ACK 57ce203 Tree-SHA512: fb6dfc198f2e28bdbb41cef9709828f22d83b4be0e640a3155ca42e771b6f58466de1468f54d773e794f780a79113f9f7d522032e87fdd75bdc4d99330445198
Add a fuzz test that fills addrman with a pile of randomly generated addresses, serializes it to a stream, unserializes the stream to another addrman object and compares the two.
Some discussion of this already happened at jnewbery#18.