DAOS-18086 control: Fix data race in drpc unit test #17979

Merged
daltonbohning merged 1 commit into master from kjacque/drpc-utest-data-race
Apr 14, 2026
@kjacque kjacque commented Apr 10, 2026

This issue was only affecting unit tests.

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

Signed-off-by: Kris Jacque <[email protected]>
@kjacque kjacque requested review from mjmac and tanabarr April 10, 2026 17:59
@kjacque kjacque self-assigned this Apr 10, 2026
@kjacque kjacque requested review from a team as code owners April 10, 2026 18:00

kjacque commented Apr 10, 2026

I scripted 100 consecutive "go test -race" runs of this test. On master, the original issue reproduced within those 100 runs; with this patch, it did not reproduce.

@github-actions

Ticket title is 'Unit Tests / Unit Test on EL 8.8 / UTEST_control.drpc.TestServer_Listen_AcceptConnection: data race'
Status is 'In Review'
https://daosio.atlassian.net/browse/DAOS-18086

@kjacque kjacque added the forced-landing label Apr 14, 2026

kjacque commented Apr 14, 2026

Functional hardware tests didn't run due to an existing issue in CI. However, this fix changes only a Go unit test.

@kjacque kjacque requested a review from a team April 14, 2026 17:34
@daltonbohning daltonbohning merged commit 620b590 into master Apr 14, 2026
43 of 45 checks passed
@daltonbohning daltonbohning deleted the kjacque/drpc-utest-data-race branch April 14, 2026 18:19
daltonbohning pushed a commit that referenced this pull request Apr 23, 2026
This issue was only affecting unit tests.

Signed-off-by: Kris Jacque <[email protected]>
mchaarawi pushed a commit that referenced this pull request Apr 27, 2026
The mock for this test wasn't updated when we switched to dRPC
message chunking, so the mocked sessions were failing and being
removed. Whether they were removed early enough for the test to
detect was intermittent.

- Populate valid data in the mockConn used for each mock session.
  This ensures the sessions remain "open" until the context is
  canceled (i.e. the test ends).
- Add a brief sleep in the test to allow for the goroutines to
  either die or remain stable. This allowed the test to reproduce
  the bug 100% of the time.
- Update to a table-based test format for the affected tests.

Signed-off-by: Kris Jacque <[email protected]>
Labels

forced-landing The PR has known failures or has intentionally reduced testing, but should still be landed.

4 participants