Headers-first synchronization#4468
Conversation
|
While syncing with this code from a pair of local peers: 2014-07-06 04:26:28 UpdateTip: new best=000000001ee1d3053357a374d6d9746e80d56a666f3827c92e80e0d1b3f2f2a1 height=1888 log2_work=42.883429 tx=1917 date=2009-01-26 03: (not the Misbehaving) Also, it seems to only be pulling from one so far: It started pulling from it eventually, perhaps when a block came in? |
|
2014-07-06 04:38:07 Leaving block file 0: CBlockFileInfo(blocks=119950, size=134216038, heights=0...309426, time=2009-01-03...2014-07-06) The first range looks wrong. :) |
|
@gmaxwell The 'prev block not found' should be fixed; there is a code path to fetch blocks directly (ignoring the headers-based fetching), in case we're very close to being synced (to avoid an extra roundtrip for newly inv'ed blocks)... but the time comparison used < instead of >. |
|
3hr 46 minute resync over the network here. I'm a bit confused that the ping time to the two peers I was pulling from was still >10 seconds even when the sync was well into the cpu bound part (for their own part, the peers were fairly low on cpu utilization). |
|
4h21m here, from random peers, and default dbcache. |
|
Woohoo! |
There was a problem hiding this comment.
Why wait until a complete block is received before marking it as not stalling? Wouldn't it be better to consider it as not stalling as long as a block is being downloaded (even if it takes a while on a slow connection)?
There was a problem hiding this comment.
"Stalling" is already a rather strong condition: it means all blocks in
flight are from a single peer, and we can't ask any peer for other blocks
because they are outside the download window. It does not just mean that
we're not receiving anything from this peer, it means we can't download
anything from anyone because of this peer. The 2 second delay before
actually connecting is to give us some chance to act on blocks that were
already received while we were busy.
That said, this is certainly not perfect. We'll need some tracking of
peers' speed and latency to adjust window sizes, for example. I really just
want something that reasonably well in now.
|
Synced in 3:43 here, and with lots of other things running on the same computer. I do get this error when running with Oh I get the same without any checklevel options. I'll keep this copy of the blocks data and database in case it's interesting for debuging. |
src/main.cpp
Outdated
There was a problem hiding this comment.
We already download every block that peers advertize. In PR changes it to
first ask for headers, and only asks for the block immediately if we're
close to being synced. Validating the header first would require an extra
roundtrip, which can hurt network propagation time.
|
Changed the code to use block-first-seen rather than header-first-seen to distinguish between equal-work branches. |
|
Moved RPC changes to a separate commit (now the actual headers-first commit is a net negative in lines, while adding comments!). |
|
Needs rebase. |
|
Rebased. |
|
I retried my testing, now with a non-corrupting destination, and found no issues. I tickled it in various ways with -checklevel -checkblocks and it completed fine. |
|
Ok, I have a thought/question. With this pull, won't it effectively mean that the average height of the average node be less than without this pull? I.e. detrimental to the bitcoin network? |
|
@rebroad I can't figure out why you'd think that. Once synchronized the heights of all nodes will be the best available to them, and this change makes the system synchronize much faster. |
|
It's not the headers first aspect that makes them sync faster but rather the use of concurrent block downloads. The getting of headers first and the downloading of blocks that aren't necessarily in the best chain will delay nodes getting up to date. |
|
I'm not convinced by your claim @rebroad. Let's look at the evidence here: with this code, new nodes get up to speed much faster. With a decent internet connection this is scarcely longer than a -reindex would take. I don't see one single case of a user reporting that this makes a node get up to date slower. |
|
@rebroad Part of the whole point is that it can use the headers to determine the best chain (with very high probability— only invalidated if the majority hashrate chain is invalidated) very fast, then it only downloads blocks in the best chain. New blocks at the tip are downloaded like they've always been. |
|
Right, this mostly avoids downloading blocks not on the best chain unlike before. No more orphans... |
|
I added a commit that AFAICT should allow >2000 deep reorgs. I tried to enable the reorg test in Travis, but that fails due to >4Mbyte of produced log. |
|
Tested ACK |
|
Tested ACK. I've reviewed this at many points in time and tried many cases (and hit a number of issues which have been fixed). I expect that it still has some remaining bugs, but our behavior without it is almost uselessly bad in the common case... getting some wider usage (in master) will help shake out issues. We know we're going to have a long cycle of testing in master on this version, so lets get that started. |
|
Seems to function fine on ARMv6. |
e11b2ce Fix large reorgs (Pieter Wuille) afc32c5 Fix rebuild-chainstate feature and improve its performance (Pieter Wuille) 16d5194 Skip reindexed blocks individually (Pieter Wuille) ad96e7c Make -reindex cope with out-of-order blocks (Wladimir J. van der Laan) e17bd58 Rename setBlockIndexValid to setBlockIndexCandidates (Pieter Wuille) 1af838b Add height to "Requesting block" debug (R E Broadley) 1bcee67 Better logging of stalling (R E Broadley) 4c93322 Improve getheaders (sending) logging (R E Broadley) f244c99 Remove CheckMinWork, as we always know all parent headers (Pieter Wuille) ad6e601 RPC additions after headers-first (Pieter Wuille) 341735e Headers-first synchronization (Pieter Wuille)
|
|
||
| if (nCount == MAX_HEADERS_RESULTS && pindexLast) { | ||
| // Headers message had its maximum size; the peer may have more headers. | ||
| // TODO: optimize: if pindexLast is an ancestor of chainActive.Tip or pindexBestHeader, continue |
There was a problem hiding this comment.
continue form "there"? there being chainActive.Tip() or pindexBestHeader?
There was a problem hiding this comment.
ah, I get it... ok, coding this...
pindexBestHeader Needed since bitcoin#4468
Here's a first (well, second) version of a fully-functional headers-first synchronization.
Many changes: