Replace libevent with our own HTTP and socket-handling implementation#32061
Replace libevent with our own HTTP and socket-handling implementation#32061pinheadmz wants to merge 27 commits intobitcoin:masterfrom
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32061. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. LLM Linter (✨ experimental)Possible places where named args for integral literals may be used (e.g.
2026-03-16 16:09:17 |
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
|
Concept ACK, nice work |
|
Concept ACK |
|
Concept ACK My understanding from the low-level networking discussion at CoreDev was that this wouldn't build on top of sockman. I guess the devil is in the details but can you address that in what sense the current approach follows what was discussed there? Thanks! |
Sure, by coredev I had already written most of this implementation (based on sockman) but the performance was bad, and that was part of the motivation behind the deep-dive talk. However, by the end of the week I had reviewed that code in person with smart attendees and not only improved the performance of my code but started to improve performance vs master branch as well! Those updates came in the days just after the deep-dive discussion. SOME kind of sockman is needed to replace libevent. The one @vasild wrote does actually seem to work well for this purpose as well as for p2p, and it would be "nice" to only have to maintain one I/O loop structure in bitcoind. @theuni is investigating how a sockman for http could be optimized if it had no other purpose, and I think that is the kind of feedback that will help us decide which path to take. |
640f38f to
49784b3
Compare
💯 |
When DynSock is used to represent a connected socket (e.g. a client) the data I/O pipes are needed but not the m_accepted_sockets Queue, because connected sockets do not create more connected sockets. When DynSock is used to represent a listening socket, the Queue is necessary to create connected sockets upon mocked connection, but the Queue does not need to be a std::shared_ptr as long as it is guaranteed to live as long as the DynSock. Co-Authored by: Hodlinator <[email protected]>
This commit is a no-op to isolate HTTP methods and objects that depend on libevent. Following commits will add replacement objects and methods in a new namespace for testing and review before switching over the server.
pinheadmz
left a comment
There was a problem hiding this comment.
push to fe18c0b:
Address feedback from @hodlinator. Wow, lots of really great comments thank you. Almost everything was code clean ups and C++ style improvements like avoiding data copying and shared_ptr's, using string_view instead of string, and using somewhat newer codebase utilities like hex literals and Expected.
Also addressed feedback from @theStack which overlapped a bit with hodlinator.
Some of these improvements require extra pre-factor commits, which are at the start of the PR now. With enough foresight they perhaps could have been applied in #34242 and #30205. These commits should be easy to review on their own PR which I'll open soon.
Meta note: this PR is getting LOOOooooong and github is struggling in some cases to find comments. So I hope I didn't miss anything. Might be time to open a fresh pull request soon?
src/httpserver.cpp
Outdated
| const std::string& line = *maybe_line; | ||
|
|
||
| // An empty line indicates end of the headers section https://www.rfc-editor.org/rfc/rfc2616#section-4 | ||
| if (line.length() == 0) break; |
There was a problem hiding this comment.
switch to empty() is cleaner thanks
src/rpc/protocol.h
Outdated
|
|
||
| //! Mapping of HTTP status codes to short string explanation. | ||
| //! Copied from libevent http.c success_phrases[] and client_error_phrases[] | ||
| const std::map<HTTPStatusCode, std::string> HTTPReason{ |
There was a problem hiding this comment.
Took the extra step here to create a struct with its own lookup method. The table is compile-time and the getter handles missing values to address #32061 (comment)
src/test/util/setup_common.h
Outdated
| decltype(CreateSock) m_create_sock_orig; | ||
|
|
||
| //! Queue of connected sockets returned by listening socket (represents network interface) | ||
| std::shared_ptr<DynSock::Queue> m_accepted_sockets{std::make_shared<DynSock::Queue>()}; |
There was a problem hiding this comment.
Good call, I know how much you hate shared_ptr's already ;-) I'll open a small PR with changes to DynSock. So far only this PR and #30988 use it, nothing on master! I'm going to make the Queue optional as well since it's not always needed.
src/test/util/str.h
Outdated
| * Returns a byte vector filled with data from a string. Used to test string-encoded | ||
| * data from a socket like HTTP headers. |
There was a problem hiding this comment.
Ok removed the example from the comment.
src/test/httpserver_tests.cpp
Outdated
| const std::vector<std::byte> full_request{TryParseHex<std::byte>( | ||
| "504f5354202f20485454502f312e310d0a486f73743a203132372e302e302e310d" | ||
| "0a436f6e6e656374696f6e3a20636c6f73650d0a436f6e74656e742d547970653a" | ||
| "206170706c69636174696f6e2f6a736f6e0d0a417574686f72697a6174696f6e3a" | ||
| "204261736963205831396a6232397261575666587a6f354f4751354f4451334d57" | ||
| "4e6d4e6a67304e7a417a59546b7a4e32457a4e7a6b305a44466c4f4451314e6a5a" | ||
| "6d5954526b5a6a4a694d7a466b596a68684f4449345a4759344d6a566a4f546735" | ||
| "5a4749344f54566c0d0a436f6e74656e742d4c656e6774683a2034360d0a0d0a7b" | ||
| "226d6574686f64223a22676574626c6f636b636f756e74222c22706172616d7322" | ||
| "3a5b5d2c226964223a317d0a").value()}; |
There was a problem hiding this comment.
Amazing, thanks! Switching to hex literals, and will apply elsewhere as well.
src/httpserver.h
Outdated
| int m_version_major; | ||
| int m_version_minor; |
There was a problem hiding this comment.
thanks, made this always uint8_t i think this was an artifact of a previous rebase.
src/httpserver.h
Outdated
| { | ||
| return AcceptConnection(*m_listen.front(), addr); | ||
| } | ||
| std::shared_ptr<HTTPClient> GetFirstConnection() { return m_connected.front(); } |
There was a problem hiding this comment.
Switched this from ptr to ref, thanks. Also had to change the argument to CloseConnection() which is not a temporary method like GetFirstConnection() but it makes sense there as well. That one could even just accept a HTTPServer::Id but I'll keep it as a HTTPClient& for now. (also addresses #32061 (comment))
There was a problem hiding this comment.
Ok I took the commit message change and reverted the change to RPC logging. I think it is still appropriate to remove UpdateHTTPServerLogging() and all calls to event_set_log_callback() and event_enable_debug_logging() from httpserver.cpp so what I left behind can be removed in #34411 or other follow-up, essentially:
static RPCHelpMan logging()
...
if (libevent thing) {
// Comment that this is where libevent logging code would go but nothing uses it now
}
src/httpserver.cpp
Outdated
| for (const auto &item : m_headers) { | ||
| if (CaseInsensitiveEqual(key, item.first)) { | ||
| return item.second; | ||
| } | ||
| } |
There was a problem hiding this comment.
I think I'll keep as is for now, but feel free to push back
src/httpserver.cpp
Outdated
|
|
||
| // Add response code and look up reason string | ||
| res.m_status = status; | ||
| res.m_reason = HTTPReason.find(status)->second; |
There was a problem hiding this comment.
Fixed with a custom lookup function that returns "" if code is missing, as part of addressing #32061 (comment)
hodlinator
left a comment
There was a problem hiding this comment.
Reviewed fe18c0b
Thanks for taking so many of my suggestions!
Nice that you were able to get rid of StringToBuffer() entirely!
Meta note: this PR is getting LOOOooooong and github is struggling in some cases to find comments. So I hope I didn't miss anything. Might be time to open a fresh pull request soon?
In favor. 😅
src/httpserver.h
Outdated
| /** | ||
| * @param[in] key The field-name of the header to search for and delete | ||
| */ | ||
| void RemoveAll(const std::string_view& key); |
There was a problem hiding this comment.
nit: Convention is to take string_view by value as you do elsewhere:
| void RemoveAll(const std::string_view& key); | |
| void RemoveAll(std::string_view key); |
There was a problem hiding this comment.
thanks, removed the const reference
src/test/httpserver_tests.cpp
Outdated
| res.m_version_minor = 1; | ||
| res.m_status = HTTP_OK; | ||
| res.m_reason = HTTPReason::FromHTTPStatusCode(res.m_status); | ||
| std::span<const std::byte>result{StringToBytes(R"({"result":865793,"error":null,"id":null})")}; |
There was a problem hiding this comment.
nit: Missing whitespace between type and variable name:
| std::span<const std::byte>result{StringToBytes(R"({"result":865793,"error":null,"id":null})")}; | |
| std::span<const std::byte> result{StringToBytes(R"({"result":865793,"error":null,"id":null})")}; |
src/test/httpserver_tests.cpp
Outdated
| // Given that the mock client is itself a mock socket | ||
| // with hard-coded data it should only take a fraction of that. | ||
| attempts = 6000; | ||
| do { |
There was a problem hiding this comment.
nit: Easier to read as an infinite loop when front-loading the fact instead of having to scan to the end?
| do { | |
| while (true) { |
There was a problem hiding this comment.
Thanks, front-loaded here, can't remember why this ended up looking like that.
src/test/util/setup_common.h
Outdated
|
|
||
| /** | ||
| * Connect to the socket with a mock client (a DynSock) and send pre-loaded data. | ||
| * Returns the I/O pipes from the mock client so we can read response datasent to it. |
There was a problem hiding this comment.
typo: "datasent":
| * Returns the I/O pipes from the mock client so we can read response datasent to it. | |
| * Returns the I/O pipes from the mock client so we can read response data sent to it. |
src/test/util/setup_common.h
Outdated
| decltype(CreateSock) m_create_sock_orig; | ||
|
|
||
| //! Queue of connected sockets returned by listening socket (represents network interface) | ||
| std::shared_ptr<DynSock::Queue> m_accepted_sockets{std::make_shared<DynSock::Queue>()}; |
There was a problem hiding this comment.
Thank you for killing this shared_ptr! :)
Good call on adding the assert in DynSock::Accept().
Hm.. you want to add the extra constructor to document the two ways in which to use DynSock?
Maybe could inline the new constructor in the header and forward to the main one?
explicit DynSock(std::shared_ptr<Pipes> pipes) : DynSock{std::move(pipes), /*accept_sockets=*/nullptr} {}
src/httpserver.cpp
Outdated
| { | ||
| // Create socket for listening for incoming connections | ||
| sockaddr_storage storage; | ||
| auto sa = static_cast<sockaddr*>(static_cast<void*>(&storage)); |
There was a problem hiding this comment.
Please respond to my comment in #32061 (comment), I think this is cheating the type system.
There was a problem hiding this comment.
Ah thanks sorry I missed that on last rebase.
src/rpc/protocol.h
Outdated
|
|
||
| //! Mapping of HTTP status codes to short string explanation. | ||
| //! Copied from libevent http.c success_phrases[] and client_error_phrases[] | ||
| const std::map<HTTPStatusCode, std::string> HTTPReason{ |
There was a problem hiding this comment.
Maybe we have few enough values that we could follow the pattern of RequestMethodString() and skip the struct?
inline std::string_view HTTPStatusReasonString(HTTPStatusCode code)
{
switch (code) {
case HTTP_OK: return "OK";
case HTTP_NO_CONTENT: return "No Content";
case HTTP_BAD_REQUEST: return "Bad Request";
case HTTP_UNAUTHORIZED: return "Unauthorized";
case HTTP_FORBIDDEN: return "Forbidden";
case HTTP_NOT_FOUND: return "Not Found";
case HTTP_BAD_METHOD: return "Method Not Allowed";
case HTTP_INTERNAL_SERVER_ERROR: return "Internal Server Error";
case HTTP_SERVICE_UNAVAILABLE: return "Service Unavailable";
default:
// Reason phrases are optional and may be replaced by local variants.
// https://httpwg.org/specs/rfc9110.html#rfc.section.15.1
return "";
}
}| #ifdef WIN32 | ||
| evthread_use_windows_threads(); | ||
| #else | ||
| evthread_use_pthreads(); | ||
| #endif |
There was a problem hiding this comment.
Does removing this screw with other use of libevent outside of HTTP?
https://libevent.org/doc/thread_8h.html#details says:
Most programs will either be using Windows threads or Posix threads. You can configure Libevent to use one of these evthread_use_windows_threads() or evthread_use_pthreads() respectively. If you're using another threading library, you'll need to configure threading functions manually using evthread_set_lock_callbacks() and evthread_set_condition_callbacks().
Maybe could be extracted to the same temporary code which could do the libevent log hookup, if we go that way.
There was a problem hiding this comment.
I think this is safe to remove. Besides the HTTP server, libevent is used in TorControl, where these same lines are repeated in StartTorControl(). In bitcoin-cli we don't do this because the client binary uses a single thread to make a single HTTP request.
There was a problem hiding this comment.
Thanks for changing the commit message!
Seems fine to break libevent logging if this PR lands in the same release as we remove the libevent dependency globally, but otherwise it would be good to extract the logging logic which gets removed in fe18c0b.
HTTP Response message: https://datatracker.ietf.org/doc/html/rfc1945#section-6 Status line (first line of response): https://datatracker.ietf.org/doc/html/rfc1945#section-6.1 Status code definitions: https://datatracker.ietf.org/doc/html/rfc1945#section-9
HTTP Request message: https://datatracker.ietf.org/doc/html/rfc1945#section-5 Request Line aka Control Line aka first line: https://datatracker.ietf.org/doc/html/rfc1945#section-5.1 See message_read_status() in libevent http.c for how `MORE_DATA_EXPECTED` is handled there
…ocket Introduce a new low-level socket managing class `HTTPServer`. BindAndStartListening() was copied from CConnMan's BindListenPort() in net.cpp and modernized. Unit-test it with a new class `SocketTestingSetup` which mocks `CreateSock()` and will enable mock client I/O in future commits. Co-authored-by: Vasil Dimov <[email protected]>
AcceptConnection() is mostly copied from CConmann in net.cpp and then modernized. Co-authored-by: Vasil Dimov <[email protected]>
Co-authored-by: Vasil Dimov <[email protected]>
Socket handling methods are copied from CConnMan: `CConnman::GenerateWaitSockets()` `CConnman::SocketHandlerListening()` `CConnman::ThreadSocketHandler()` and `CConnman::SocketHandler()` are combined into ThreadSocketHandler()`. Co-authored-by: Vasil Dimov <[email protected]>
`SocketHandlerConnected()` adapted from CConnman Testing this requires adding a new feature to the SocketTestingSetup, inserting a "request" payload into the mock client that connects to us. This commit also moves IOErrorIsPermanent() from sock.cpp to sock.h so it can be called from the socket handler in httpserver.cpp Co-authored-by: Vasil Dimov <[email protected]>
Sockets-touching bits copied and adapted from `CConnman::SocketSendData()` Testing this requires adding a new feature to the SocketTestingSetup, returning the DynSock I/O pipes from the mock socket so the received data can be checked. Co-authored-by: Vasil Dimov <[email protected]>
See https://www.rfc-editor.org/rfc/rfc7230#section-6.3.2 > A server MAY process a sequence of pipelined requests in parallel if they all have safe methods (Section 4.2.1 of [RFC7231]), but it MUST send the corresponding responses in the same order that the requests were received. We choose NOT to process requests in parallel. They are executed in the order recevied as well as responded to in the order received. This prevents race conditions where old state may get sent in response to requests that are very quick to process but were requested later on in the queue.
This is a refactor to prepare for matching the API of HTTPRequest definitions in both namespaces http_bitcoin and http_libevent. In particular, to provide a consistent return type for GetRequestMethod() in both classes.
These methods are called by http_request_cb() and are present in the original http_libevent::HTTPRequest.
The original function is passed to libevent as a callback when HTTP requests are received and processed. It wrapped the libevent request object in a http_libevent::HTTPRequest and then handed that off to bitcoin for basic checks and finally dispatch to worker threads. In this commit we split the function after the http_libevent::HTTPRequest is created, and pass that object to a new function that maintains the logic of checking and dispatching. This will be the merge point for http_libevent and http_bitcoin, where HTTPRequest objects from either namespace have the same downstream lifecycle.
The original function was already naturally split into two chunks: First, we parse and validate the users' RPC configuration for IPs and ports. Next we bind libevent's http server to the appropriate endpoints. This commit splits these chunks into two separate functions, leaving the argument parsing in the common space of the module and moving the libevent-specific binding into the http_libevent namespace. A future commit will implement http_bitcoin::HTTPBindAddresses to bind the validate list of endpoints by the new HTTP server.
pinheadmz
left a comment
There was a problem hiding this comment.
push to 0aa7362
Address more great feedback from @hodlinator mostly code style nits, and catching some things missed on last rebase. Also replied to one or two things, not sure how to proceed (I might have missed your point)
src/httpserver.h
Outdated
| /** | ||
| * @param[in] key The field-name of the header to search for and delete | ||
| */ | ||
| void RemoveAll(const std::string_view& key); |
There was a problem hiding this comment.
thanks, removed the const reference
src/test/httpserver_tests.cpp
Outdated
| res.m_version_minor = 1; | ||
| res.m_status = HTTP_OK; | ||
| res.m_reason = HTTPReason::FromHTTPStatusCode(res.m_status); | ||
| std::span<const std::byte>result{StringToBytes(R"({"result":865793,"error":null,"id":null})")}; |
src/test/httpserver_tests.cpp
Outdated
| // Given that the mock client is itself a mock socket | ||
| // with hard-coded data it should only take a fraction of that. | ||
| attempts = 6000; | ||
| do { |
There was a problem hiding this comment.
Thanks, front-loaded here, can't remember why this ended up looking like that.
src/test/util/setup_common.h
Outdated
|
|
||
| /** | ||
| * Connect to the socket with a mock client (a DynSock) and send pre-loaded data. | ||
| * Returns the I/O pipes from the mock client so we can read response datasent to it. |
| /** | ||
| * Create a new mocked sock that represents a connected socket. It has pipes | ||
| * for data transport but there is no queue because connected sockets do | ||
| * not introduce new connected sockets. | ||
| * @param[in] pipes Send/recv pipes used by the Send() and Recv() methods. | ||
| */ | ||
| explicit DynSock(std::shared_ptr<Pipes> pipes); | ||
|
|
There was a problem hiding this comment.
Responding to: #32061 (comment)
Hm.. you want to add the extra constructor to document the two ways in which to use DynSock?
I believe this is covered already
src/httpserver.cpp
Outdated
| { | ||
| // Create socket for listening for incoming connections | ||
| sockaddr_storage storage; | ||
| auto sa = static_cast<sockaddr*>(static_cast<void*>(&storage)); |
There was a problem hiding this comment.
Ah thanks sorry I missed that on last rebase.
src/rpc/protocol.h
Outdated
| @@ -20,6 +20,35 @@ enum HTTPStatusCode | |||
| HTTP_SERVICE_UNAVAILABLE = 503, | |||
| }; | |||
|
|
|||
| //! Mapping of HTTP status codes to short string explanation. | |||
| //! Copied from libevent http.c success_phrases[] and client_error_phrases[] | |||
| struct HTTPReason | |||
There was a problem hiding this comment.
responding to #32061 (comment)
Maybe we have few enough values that we could follow the pattern of RequestMethodString() and skip the struct?
Sure, using inline method here instead of struct.
There was a problem hiding this comment.
Also for example see that bitcoin-cli calls this on its own:
Line 1334 in ff7cdf6
| #ifdef WIN32 | ||
| evthread_use_windows_threads(); | ||
| #else | ||
| evthread_use_pthreads(); | ||
| #endif |
There was a problem hiding this comment.
I think this is safe to remove. Besides the HTTP server, libevent is used in TorControl, where these same lines are repeated in StartTorControl(). In bitcoin-cli we don't do this because the client binary uses a single thread to make a single HTTP request.
theStack
left a comment
There was a problem hiding this comment.
Reviewed up to c44ffcc (~half-way through in terms of commit count), left some nits only
Meta note: this PR is getting LOOOooooong and github is struggling in some cases to find comments. So I hope I didn't miss anything. Might be time to open a fresh pull request soon?
Sounds reasonable yes 👍
| "334e7a49354d544d334e54526d4e54686b4e6a63324f574d775a5459785a6a677a" | ||
| "4e5467794e7a4577595459314f47526b596a566d5a4751330d0a436f6e74656e74" | ||
| "2d4c656e6774683a2034360d0a0d0a"_hex}; | ||
| util::LineReader reader(buffer, /*max_line_length=*/1028); |
There was a problem hiding this comment.
curiosity nit: any specific reason for this max line length magic number, here and below? (my best guess would be that this was derived from a reasonably high power of two (1024) plus the overhead for ": " and "\r\n" 🤔 ); could introduce a constant, or maybe just use MAX_HEADERS_SIZE
| std::vector<std::byte> StringToBuffer(const std::string& str) | ||
| { | ||
| auto span = std::as_bytes(std::span(str)); | ||
| return {span.begin(), span.end()}; | ||
| } | ||
|
|
| std::span<const std::byte> result{StringToBytes(R"({"result":865793,"error":null,"id":null})")}; | ||
| res.m_body.assign(result.begin(), result.end()); | ||
| res.m_headers = std::move(headers); | ||
| // Only one header means we don't need to worry about unordered_map, |
There was a problem hiding this comment.
in 476d3cc: nit: the headers are not in an unordered_map anymore, so could e.g. just say "worry about the order" (or remove the comment)
This is a major component of removing libevent as a dependency of the project, by replacing the HTTP server used for RPC and REST with one implemented entirely within the Bitcoin Core codebase. The new
HTTPServerclass runs its own I/O thread, handling socket connections with code based on #30988, but tailored specifically for HTTP.Some functional tests were added in #32408 to cover libevent behaviors that remain consistent with this branch.
Commit strategy:
http_libeventhttp_bitcoin(classes likeHTTPRequest,HTTPClient, etc...)http_libevent)I am currently seeing about a 10% speed up in the functional tests on my own arm/macos machine.
Integration testing:
I am testing the new HTTP server by forking projects that integrate with bitcoin via HTTP and running their integration tests with bitcoind built from this branch (on Github actions). I will continue adding integrations over time, and re-running these CI tests as this branch gets rebased:
rpc-bitcoin