Skip to content

Add functional test for IPC interface#33201

Merged
glozow merged 6 commits intobitcoin:masterfrom
sipa:202508_ipc_test
Sep 5, 2025
Merged

Add functional test for IPC interface#33201
glozow merged 6 commits intobitcoin:masterfrom
sipa:202508_ipc_test

Conversation

@sipa
Copy link
Member

@sipa sipa commented Aug 15, 2025

This adds support to the functional test framework to run the multiprocess bitcoin-node binary, and then tests it in a new interface_ipc.py functional test through the pycapnp module.

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 15, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33201.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK ryanofsky, TheCharlatan, Sjors
Stale ACK josibake, janb84, ismaelsadeeq

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #33142 (test: Run bench sanity checks in parallel with functional tests by maflcko)
  • #32297 (bitcoin-cli: Add -ipcconnect option by ryanofsky)
  • #31723 (qa debug: Add --debug_runs/-waitfordebugger [DRAFT] by hodlinator)
  • #31349 (ci: detect outbound internet traffic generated while running tests by vasild)
  • #30437 (ipc: add bitcoin-mine test program by ryanofsky)
  • #29838 (Feature: Use different datadirs for different signets by BrandonOdiwuor)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task lint: https://github.com/bitcoin/bitcoin/runs/48194727681
LLM reason (✨ experimental): The CI failure is caused by errors detected during the linting process, specifically from ruff and mypy, indicating code issues and missing module imports.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@sipa sipa force-pushed the 202508_ipc_test branch from 5050855 to 27bd678 Compare August 15, 2025 20:24
@ryanofsky
Copy link
Contributor

Concept ACK. 27bd67857dd0e145d7539b949df2b7aaa53c6b1b is really nice and simpler than I would have expected.

In case it's useful #30437 adds an interface_ipc_mining.py test, and #32297 adds an interface_ipc_cli.py test which are similar to this test and have similar test framework updates. Code from those PRs might be useful here or vice versa.

@sipa sipa force-pushed the 202508_ipc_test branch 4 times, most recently from b4cb8f6 to b87de46 Compare August 16, 2025 00:08
@Sjors
Copy link
Member

Sjors commented Aug 16, 2025

Concept ACK

So far I achieved test coverage for the Mining interface with a combination of unit tests and by having RPC methods such as getblocktemplate use the interface internally. But the latter has side-effects (see #32547) and could be avoided once we have enough direct coverage.

CentOS CI fails with capnp.lib.capnp.KjException: mp/proxy.capnp:6: failed: Import failed: /capnp/c++.capn https://cirrus-ci.com/task/4824947408764928?logs=ci#L3320

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On macOS 15.6 with capnp 1.2.0 installed via Homebrew and Python 3.10.14 installed via PyEnv I get:

pip3 install pycapnp
...
Successfully installed pycapnp-2.0.0

cmake -B build
cmake --build build
...

% build/test/functional/interface_ipc.py
2025-08-17T07:05:49.321522Z TestFramework (INFO): PRNG seed is: 6039316184654231320
2025-08-17T07:05:49.322566Z TestFramework (INFO): Initializing test directory /var/folders/8p/qqdl2fsj7rd6q096jtmsfydc0000gn/T/bitcoin_func_test_z3jtr5u1
2025-08-17T07:05:50.595292Z TestFramework (ERROR): Unexpected exception
Traceback (most recent call last):
  File "/Users/sjors/dev/bitcoin/test/functional/test_framework/test_framework.py", line 200, in main
    self.run_test()
  File "/Users/sjors/dev/bitcoin/build/test/functional/interface_ipc.py", line 82, in run_test
    self.run_echo_test()
  File "/Users/sjors/dev/bitcoin/build/test/functional/interface_ipc.py", line 63, in run_echo_test
    asyncio.run(capnp.run(async_routine()))
  File "/Users/sjors/.pyenv/versions/3.10.14/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/sjors/.pyenv/versions/3.10.14/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "capnp/lib/capnp.pyx", line 1944, in run
  File "capnp/lib/capnp.pyx", line 1945, in capnp.lib.capnp.run
  File "/Users/sjors/dev/bitcoin/build/test/functional/interface_ipc.py", line 57, in async_routine
    init, ctx = await self.make_capnp_init_ctx()
  File "/Users/sjors/dev/bitcoin/build/test/functional/interface_ipc.py", line 45, in make_capnp_init_ctx
    modules = self.capnp_modules()
  File "/Users/sjors/dev/bitcoin/build/test/functional/interface_ipc.py", line 38, in capnp_modules
    "proxy": capnp.load(str(mp_dir / "mp" / "proxy.capnp"), imports=imports),
  File "capnp/lib/capnp.pyx", line 4385, in capnp.lib.capnp.load
  File "capnp/lib/capnp.pyx", line 3574, in capnp.lib.capnp.SchemaParser.load
capnp.lib.capnp.KjException: mp/proxy.capnp:6: failed: Import failed: /capnp/c++.capnp

Maybe related: capnproto/pycapnp#377 (though pycapnp-1.3.0 doesn't work at all, since it doesn't have the run method)

Suggested test doc improvement:

diff --git a/test/README.md b/test/README.md
index 37f2c07217..3b2979cdf6 100644
--- a/test/README.md
+++ b/test/README.md
@@ -34,6 +34,9 @@ The ZMQ functional test requires a python ZMQ library. To install it:
 - on Unix, run `sudo apt-get install python3-zmq`
 - on mac OS, run `pip3 install pyzmq`

+The IPC functional test requires a python IPC library. To install it:
+
+- `pip3 install pycapnp`

 On Windows the `PYTHONUTF8` environment variable must be set to 1:

Expose ENABLE_IPC build option to functional tests so new tests can test
IPC-only features.
Set new `BitcoinTestFramework.binary_paths.bitcoin_bin` property with path to
the `bitcoin` wrapper binary. This allows new tests for `bitcoin-mine` in
bitcoin#30437 and `bitcoin-cli` in bitcoin#32297 to find the `bitcoin` binary and call
`bitcoin -m` to start nodes with IPC support. This way the new tests can run
whenever the ENABLE_IPC build option is enabled, instead of only running when
the `BITCOIN_CMD` environment variable is set to `bitcoin -m`
With this change, tests can specify `self.extra_init = [{ipcbind: True}]` to
start a node listening on an IPC socket, instead of needing to choose which
node binary to invoke and what `self.extra_args=[["-ipcbind=..."]]` value to
pass to it.

The eliminates boilerplate code bitcoin#30437 (interface_ipc_mining.py), bitcoin#32297
(interface_ipc_cli.py), and bitcoin#33201 (interface_ipc.py) previously needed in
their test setup.
@sipa
Copy link
Member Author

sipa commented Sep 5, 2025

Strange error, https://github.com/bitcoin/bitcoin/actions/runs/17491932432/job/49683489249?pr=33201:

node0 stderr bitcoin-node: ipc/libmultiprocess/include/mp/proxy-io.h:279: void mp::Waiter::post(Fn &&) [Fn = (lambda at 
                 ./ipc/libmultiprocess/include/mp/type-context.h:67:19)]: Assertion `!m_fn' failed.

in line

template4 = await template2.result.waitNext(ctx, waitoptions)

Any idea, @ryanofsky?

@sipa
Copy link
Member Author

sipa commented Sep 5, 2025

Debugged on IRC:

07:50:56 < sipa> ryanofsky: any idea about this error in https://github.com/bitcoin/bitcoin/actions/runs/17491932432/job/49683489249?pr=33201 :
07:51:08 < sipa>  node0 stderr bitcoin-node: ipc/libmultiprocess/include/mp/proxy-io.h:279: void mp::Waiter::post(Fn &&) [Fn = (lambda at 
                 ./ipc/libmultiprocess/include/mp/type-context.h:67:19)]: Assertion `!m_fn' failed. 
07:55:12 < kevkevin> looks like the test got hung up on this line in the ipc test: template4 = File "/home/admin/actions-runner/_work/_temp/build/test/functional/interface_ipc.py", line 
                     156, in async_routine: await template2.result.waitNext(ctx, waitoptions)
07:56:46 < sipa> yeah, bitcoin-node crashed during that invocation
08:01:34 < sipa> i can reproduce it locally, though it's pretty rare (once in 60 runs, so far)
08:04:00 < kevkevin> hmm interesting, I'll try to reproduce it locally myself
08:05:03 < ryanofsky> that is interesting, i've never seen that assert hit before. i think it could happen if two ipc clients try to execute a different IPC calls on the same thread at the 
                      same time
08:06:01 < sipa> huh!
08:06:15 < ryanofsky> libmultiprocess assumes a 1:1 threading model where a there is one server thread for every client thread, emulating a traditional c++ call stack
08:06:26 < ryanofsky> allowing recursive mutexes, synchronous callbacks, etc
08:06:37 < sipa> should the python side do anything specific to arrange that?
08:07:01 < sipa> i would assume not, because there is only one thread on the python side that does all IPC calls
08:07:11 < ryanofsky> it should just make two IPC calls at the same time, and they should be calls that do not return instantly like waitNext
08:08:18 < ryanofsky> yeah, it is unexpected that current python test would cause that
08:08:35 < dodo> c
08:10:03 < ryanofsky> maybe there is a race condition where c++ code sends a response but the server thread doesn't return to being idle, and python code sends the next request quickly 
                      enough to hit the assert
08:11:00 < sipa> oh, i may have messed something up
08:11:09 < sipa> waitnext = template2.result.waitNext(ctx, waitoptions)
08:11:14 < sipa> template4 = await template2.result.waitNext(ctx, waitoptions)
08:11:31 < sipa> might create two asynchronous calls simultaneously?
08:12:27 < ryanofsky> it seems like that would just chain the calls so the server runs them back to back, and make the bug more likely to happen
08:13:09 < ryanofsky> but if you added an await on the first line it might prevent this
08:13:40 < sipa> the first one is just unnecessary - i inlined it into the second one, but forgot to remove the original
08:14:06 < sipa> you say "make the bug more likely", so you think that the bug isn't in this code itself, but elsewhere?
08:14:18 < ryanofsky> actually i misread. yeah i assumed you were using waitnext variable second line and the calls would be chained. but actually this does look like simultaneous calls
08:14:44 < sipa> alright, let me run it 1000 times to see if it reappears, but if not, i'm going to assume that was the culprit
08:14:48 < ryanofsky> no i think this code does explain the bug
08:15:15 < sipa> thanks!
08:15:23 < ryanofsky> i just first misread it as chained calls that would be pipelined when actually those are parallel calls
08:15:29 < sipa> right
08:16:14 < ryanofsky> thanks for the test, and finding this. server could definitely do better here, it could just refuse the request instead of asserting
08:18:03 < sipa> or at least a more helpful error message
08:18:23 < sipa> i could imagine third parties trying to use the interface might hit something like this too
08:20:08 < ryanofsky> yeah the error message is the most important thing in practice

So the culprit was:

waitnext = template2.result.waitNext(ctx, waitoptions)
template4 = await template2.result.waitNext(ctx, waitoptions)

Which resulted in attempting two parallel IPC calls being executed in the same thread on the server.

sipa and others added 3 commits September 5, 2025 09:24
Co-Authored-By: ismaelsadeeq <[email protected]>
Co-Authored-By: ryanofsky <[email protected]>
Co-Authored-By: TheCharlatan <[email protected]>
Co-Authored-By: Sjors Provoost <[email protected]>
Install pycapnp on all (active) CI hosts which have IPC enabled and
run the functional tests.

Except for previous_releases, which uses an older version of pip
that doesn't support --break-system-packages.
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK a341e11. Changes since last review: rebasing, switching to miniwallet and expanding wallet test, improving pycapnp install steps in instructions and CI.

Looking at CI changes here, it's interesting that low-level way we configure CI jobs allows dependencies to be installed that might not be used, and things to be built that are not tested. I think a more ideal config / script boundary might make the config only responsible for specifying what should be tested, and let the script be responsible for figuring out what dependencies need to be installed. That's a wider issue though and CI changes here all look right from what I can tell.

re: libmultiprocess assert failure #33201 (comment), I opened bitcoin-core/libmultiprocess#206 to improve this

Copy link
Contributor

@sedited sedited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK a341e11

@Sjors
Copy link
Member

Sjors commented Sep 5, 2025

ACK a341e11

As if by magic, interface_ipc.py now does run on the test each commit job.

@sipa
Copy link
Member Author

sipa commented Sep 6, 2025

@Sjors Not magic, it needed a --break-system-packages instead of --user somewhere.

@Sjors
Copy link
Member

Sjors commented Oct 7, 2025

Reviewers may find #33566 interesting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.