Skip to content

DAOS-18304 ddb: Introduce DdbAPI interface for Go unit testing#17967

Open
knard38 wants to merge 1 commit intomasterfrom
ckochhof/fix/master/daos-18304-patch-001
Open

DAOS-18304 ddb: Introduce DdbAPI interface for Go unit testing#17967
knard38 wants to merge 1 commit intomasterfrom
ckochhof/fix/master/daos-18304-patch-001

Conversation

@knard38
Copy link
Copy Markdown
Contributor

@knard38 knard38 commented Apr 10, 2026

Description

Overview

Preliminary refactoring to enable Go unit test coverage for the ddb command layer.
The actual tests are added in the follow-up PR #18086.

A DdbAPI Go interface is introduced to decouple the command layer from the C VOS
implementation, so that tests can use a stub without requiring a live VOS environment.

This patch is also solving the ticket DAOS-18581

Go Refactoring

  • New DdbAPI interface: all ddbXxx() free functions are converted to
    (ctx *DdbContext) Xxx() methods; DdbAPI is propagated through
    addAppCommands(), createGrumbleApp() and parseOpts().
  • DdbContext lifecycle: Init() becomes a method on a caller-allocated
    DdbContext, giving the caller (or a test stub) full control over the object.
  • Remove dc_pool_path / dc_db_path from ddb_ctx: paths are now
    passed as explicit function arguments, removing the need for the SetCString()
    helper.
  • parseOpts() testability: os.Exit() calls are removed; all errors are
    returned up to main(), making parseOpts() fully unit-testable.
  • Auto-open command list: replace the HasPrefix loop with a noAutoOpen
    map for exact command-name matching.

C API Simplification

  • dv_pool_open / dv_pool_destroy: signatures simplified from
    (path, path_parts, poh, flags, write_mode) to (path, db_path, ctx, flags);
    vos_file_parts allocation is moved into a new internal
    create_vos_file_parts() helper.
  • VOS/DB path size: DB_PATH_SIZE and VOS_PATH_SIZE raised from 256 to
    PATH_MAX; the original VOS path is now stored in vf_vos_file_path and its
    length is validated in parse_vos_file_parts().

Bug Fixes

  • ddb_run_feature(): dc_poh / dc_write_mode state reset is now guarded
    by the close flag, preventing corruption when the pool was already open;
    add missing error message on dv_pool_open() failure.
  • Non-interactive mode: errors from runCmdStr() and runFileCmds() are now
    propagated instead of silently discarded.
  • --version: prints directly instead of synthesising a fake command.
  • --db_path without --vos_path: returns an explicit error instead of
    being silently ignored.
  • Pool close: consolidated into a closePoolIfOpen() helper with defer.
  • Feature(): fix memory leak on enable/disable C strings.
  • DtxStat(): fix defer freeString() ordering.
  • Unknown command errors: typed unknownCmdError struct for type-safe
    detection; exitWithError() now prints the command list on unknown commands
    and fixes the duplicated ERROR: prefix in fault resolution messages.
  • rm_pool: make --vos_path mandatory; pass db_path explicitly.
  • Test fixes: correct VOS tree path indices in ls, value_dump and
    ilog_dump tests; update dtx_stat message; restore pool handle in
    helper_stat_open_modify_close_stat() after close; fix assert_memory_equal
    in read_only_vs_write_mode_test() to compare PRE vs POST digest.

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

@knard38 knard38 self-assigned this Apr 10, 2026
@github-actions
Copy link
Copy Markdown

Ticket title is 'Add unit test to ddb go code'
Status is 'In Progress'
https://daosio.atlassian.net/browse/DAOS-18304

@daosbuild3
Copy link
Copy Markdown
Collaborator

@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch 2 times, most recently from aadfb16 to 92d3255 Compare April 10, 2026 10:24
@daosbuild3
Copy link
Copy Markdown
Collaborator

@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch from 92d3255 to b2a2966 Compare April 10, 2026 15:57
@daosbuild3
Copy link
Copy Markdown
Collaborator

@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch 3 times, most recently from 1c2251d to 2b2bad2 Compare April 13, 2026 07:45
@daosbuild3
Copy link
Copy Markdown
Collaborator

Test stage Unit Test bdev with memcheck completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17967/7/display/redirect

@daosbuild3
Copy link
Copy Markdown
Collaborator

Test stage Unit Test with memcheck completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17967/7/display/redirect

@knard38 knard38 changed the title DAOS-18304 ddb: Add unit test to ddb code DAOS-18304 ddb: Add unit test infrastructure to ddb Go code Apr 13, 2026
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch 3 times, most recently from 111eb0e to 44d2003 Compare April 13, 2026 13:06
@daosbuild3
Copy link
Copy Markdown
Collaborator

@daosbuild3
Copy link
Copy Markdown
Collaborator

@daosbuild3
Copy link
Copy Markdown
Collaborator

@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch 2 times, most recently from f6bdb6e to 2ebb6be Compare April 14, 2026 19:03
@daosbuild3
Copy link
Copy Markdown
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17967/14/execution/node/1335/log

@daosbuild3
Copy link
Copy Markdown
Collaborator

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17967/14/execution/node/1356/log

Base automatically changed from ckochhof/fix/master/daos-18291 to master April 20, 2026 15:29
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch 2 times, most recently from 91c32a6 to 3c4faba Compare April 22, 2026 13:43
@knard38 knard38 changed the title DAOS-18304 ddb: Add unit test infrastructure to ddb Go code DAOS-18304 ddb: Introduce DdbAPI interface for Go unit testing Apr 22, 2026
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch from 3c4faba to 24d96e6 Compare April 22, 2026 14:30
Preliminary refactoring to enable Go unit test coverage for the ddb
command layer in a follow-up patch. A DdbAPI interface is introduced to
decouple the command layer from the C VOS implementation, so that future
tests can use stubs without requiring a live VOS environment.

- Convert all ddbXxx() free functions to (ctx *DdbContext) Xxx() methods
  and propagate DdbAPI through addAppCommands(), createGrumbleApp() and
  parseOpts().
- Remove dc_pool_path/dc_db_path from ddb_ctx; pass db_path explicitly
  at every call site.
- Simplify dv_pool_open/dv_pool_destroy to (path, db_path, ctx, flags);
  move vos_file_parts allocation into a new create_vos_file_parts() helper.
- Fix ddb_run_feature(): guard dc_poh/dc_write_mode reset with the close
  flag; add missing error message on open failure.
- Fix memory leaks in Feature() (enable/disable C strings) and DtxStat()
  (freeString defer ordering).
- Propagate errors from runCmdStr() and runFileCmds() in non-interactive mode.
- Add typed unknownCmdError; fix duplicated ERROR: prefix in exitWithError().
- Remove os.Exit() from parseOpts() to make it fully testable.
- Raise DB_PATH_SIZE and VOS_PATH_SIZE to PATH_MAX; store the original
  VOS path in vf_vos_file_path.
- Fix test VOS tree path indices and dtx_stat message.
- Fix helper_stat_open_modify_close_stat(): restore the pool handle
  saved from dv_test_setup after close instead of leaving a stale one.
- Fix read_only_vs_write_mode_test(): compare PRE vs POST digest instead
  of PRE vs PRE.

Features: recovery
Signed-off-by: Cedric Koch-Hofer <[email protected]>
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18304-patch-001 branch from 24d96e6 to f81f3be Compare April 23, 2026 05:02
@knard38 knard38 marked this pull request as ready for review April 23, 2026 05:51
@knard38 knard38 requested review from a team as code owners April 23, 2026 05:51
func SetCString(out **C.char, s string) func() {
cstr := C.CString(s)
*out = cstr
type DdbAPI interface {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://go-proverbs.github.io/ "the bigger the interface, the weaker the abstraction", we definitely haven't always followed this advice in DAOS but it's worth noting

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with Tom, but seeing how it's used, this seems like an acceptable intermediate step to enable better unit testing.

Copy link
Copy Markdown
Contributor Author

@knard38 knard38 Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on the fact thata large interface weakens the abstraction.
However, since each sub-command maps to one method, the interface cannot easily be split further without losing the ability to stub the entire VOS layer in tests.
As kjacque noted, this is an intentional intermediate step: the follow-up PR #18086 shows exactly how the interface is used in unit tests.

Copy link
Copy Markdown
Contributor

@tanabarr tanabarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, not sure if we really need the large interface and there might be a better way of achieving the goal but that's not a blocking comment

@kjacque kjacque requested a review from mjmac April 23, 2026 23:17
Copy link
Copy Markdown
Contributor

@kjacque kjacque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big patch doing several different things. I would like to see this split into 3 different PRs, one for each of the categories referenced by the description. For a refactor I would not expect to see any changes in tests, for example--we want to ensure the refactor did NOT change behavior. And I would like the C API improvements to be separated from the bugfixes if possible.

func SetCString(out **C.char, s string) func() {
cstr := C.CString(s)
*out = cstr
type DdbAPI interface {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with Tom, but seeing how it's used, this seems like an acceptable intermediate step to enable better unit testing.

Comment thread src/control/cmd/ddb/main.go
Comment thread src/control/cmd/ddb/main.go
Comment thread src/control/cmd/ddb/main.go
/* printing a recx works */
dvt_fake_print_called = 0;
opt.path = "/[0]/[0]/[0]/[0]/[0]";
opt.path = "/[0]/[0]/[0]/[1]/[0]";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we changing behavior here? IMO it is best to do the refactor and any desired behavior changes separately to ensure the refactor is transparent.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you. While writing this PR I also found some bugs and fixed them along the way, which is why they ended up in the same patch.

However, the following changes are tightly coupled to the Go refactoring and cannot be split without leaving the code or tests in a broken intermediate state.
Removing dc_pool_path/dc_db_path from ddb_ctx is required by the Go layer which now passes paths explicitly; this forces the dv_pool_open / dv_pool_destroy signature change.
The signature change in turn requires adding vf_vos_file_path and raising path size constants in ddb_parse, which then cascades into:

  • ddb_commands.c simplifications.
  • The dtx_stat message change (the old message printed ctx->dc_pool_path which no longer exists).
  • All test call-site updates, including the path index corrections (caused by moving dvt_vos_insert_2_records_with_dtx to dcv_suit_setup as part of the test setup rewrite).

The following parts are standalone bug fixes that could go in a separate PR:

  • ddb_run_feature out-label guard (ddb_commands.c): if (close) braces preventing dc_poh/dc_write_mode from being reset when the pool was not opened by this call.
  • assert_memory_equal comparing PRE vs PRE (ddb_vos_tests.c): read_only_vs_write_mode_test was comparing the pre-digest against itself instead of the post-digest.
  • tctx->dvt_poh not restored (ddb_vos_tests.c): helper_stat_open_modify_close_stat was leaving the pool handle in an invalid state for the rest of the test suite.

Does it sounds reasonable to you ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If some bugfixes must be left in, could you separate them out into a different commit in the same PR? That would make it easier to review. Thanks.

@kjacque
Copy link
Copy Markdown
Contributor

kjacque commented Apr 23, 2026

Additionally, any bugfixes touching Go code post-refactor provide you with a great opportunity to write new Go unit tests.

@knard38
Copy link
Copy Markdown
Contributor Author

knard38 commented Apr 24, 2026

Additionally, any bugfixes touching Go code post-refactor provide you with a great opportunity to write new Go unit tests.

A first set of Go unit tests could be found in the follow-up PR #18086.
It covers, among others, the unknownCmdError handling and the feature command no-auto-open fix, taking advantage of the new DdbAPI stub.

The Feature() memory leak and DtxStat() defer ordering fixes are not specifically covered:

  • the memory leak involves C string deallocation which cannot be easily observed from Go unit tests,
  • the defer ordering fix is a code structure correction without easy way to assert on.

@kjacque
Copy link
Copy Markdown
Contributor

kjacque commented Apr 24, 2026

Additionally, any bugfixes touching Go code post-refactor provide you with a great opportunity to write new Go unit tests.

A first set of Go unit tests could be found in the follow-up PR #18086. It covers, among others, the unknownCmdError handling and the feature command no-auto-open fix, taking advantage of the new DdbAPI stub.

The Feature() memory leak and DtxStat() defer ordering fixes are not specifically covered:

  • the memory leak involves C string deallocation which cannot be easily observed from Go unit tests,
  • the defer ordering fix is a code structure correction without easy way to assert on.

Cool, good to know there's another PR in the works with test coverage. I'd still like this PR split to separate the transparent refactoring from the behavior changes though.

@knard38
Copy link
Copy Markdown
Contributor Author

knard38 commented Apr 24, 2026

Additionally, any bugfixes touching Go code post-refactor provide you with a great opportunity to write new Go unit tests.

A first set of Go unit tests could be found in the follow-up PR #18086. It covers, among others, the unknownCmdError handling and the feature command no-auto-open fix, taking advantage of the new DdbAPI stub.
The Feature() memory leak and DtxStat() defer ordering fixes are not specifically covered:

  • the memory leak involves C string deallocation which cannot be easily observed from Go unit tests,
  • the defer ordering fix is a code structure correction without easy way to assert on.

Cool, good to know there's another PR in the works with test coverage. I'd still like this PR split to separate the transparent refactoring from the behavior changes though.

I will do my best to make split this PR in smaller ones as small as possible.

/* Test with root path */
rc = parse_vos_file_parts("/" MOCKED_POOL_UUID_STR "/vos-0", NULL, &parts);
assert_rc_equal(rc, DER_SUCCESS);
assert_string_equal("/" MOCKED_POOL_UUID_STR "/vos-0", parts.vf_vos_file_path);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please apply everywhere Ctrl+F will find a match.

Suggested change
assert_string_equal("/" MOCKED_POOL_UUID_STR "/vos-0", parts.vf_vos_file_path);
assert_string_equal(MOCKED_VOS_PATH_STR, parts.vf_vos_file_path);

Comment thread src/utils/ddb/ddb_parse.c
D_ERROR("VOS path '%s' too long: got=%zu, max=%d\n", vos_path, vos_path_len,
VOS_PATH_SIZE - 1);
rc = -DER_EXCEEDS_PATH_LEN;
goto out_preg;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are using out_preg label for both vos_path and db_path now it seems you have a memory leek when vos_path passes the test and get copied into the vfp_tmp.

helper_stat_open_modify_close_stat(tctx, fs, false /** read-only */);
assert_int_equal(fs[FILE_STATE_PRE].stat.st_mtime, fs[FILE_STATE_POST].stat.st_mtime);
assert_memory_equal(fs[FILE_STATE_PRE].digest, fs[FILE_STATE_PRE].digest,
assert_memory_equal(fs[FILE_STATE_PRE].digest, fs[FILE_STATE_POST].digest,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. I assume it changed after #17533 has landed.
How come this test has not failed previously?

Copy link
Copy Markdown
Contributor Author

@knard38 knard38 Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard to fail when comparing the same memory segments ;-)
Hopefully it is not failing with POST.

Comment on lines +321 to +331
noAutoOpen := map[string]bool{
"feature": true,
"open": true,
"close": true,
"prov_mem": true,
"smd_sync": true,
"rm_pool": true,
"dev_list": true,
"dev_replace": true,
}
if !noAutoOpen[opts.Args.RunCmd] {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no record with false value and it looks like not all commands actually have records in the noAutoOpen map, it seems a list would be a better solution. What do you think?

Suggested change
noAutoOpen := map[string]bool{
"feature": true,
"open": true,
"close": true,
"prov_mem": true,
"smd_sync": true,
"rm_pool": true,
"dev_list": true,
"dev_replace": true,
}
if !noAutoOpen[opts.Args.RunCmd] {
noAutoOpen := []string{
"feature",
"open",
"close",
"prov_mem",
"smd_sync",
"rm_pool",
"dev_list",
"dev_replace",
}
if !slices.Contains(noAutoOpen, opts.Args.RunCmd) {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with you that using a dictionary is overkill in this situation and probably less efficient with this number of elements.

  • Use slice instead of dictionary

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

6 participants