Add missing mean_stat_per_model method to StatObject_SB #3029

Merged
pplantinga merged 2 commits into speechbrain:develop from Mr-Neutr0n:fix-mean-stat-per-model
Feb 10, 2026

Conversation

@Mr-Neutr0n
Contributor

Summary

  • Add mean_stat_per_model method to StatObject_SB in speechbrain/processing/PLDA_LDA.py
  • The PLDA scoring functions fast_PLDA_scoring and fast_PLDA_scoring_with_uncertainty call enroll_ctr.mean_stat_per_model() when enrollment models are not unique, but only sum_stat_per_model was defined
  • This causes an AttributeError when duplicate model IDs are present in enrollment data

The new method reuses sum_stat_per_model internally and divides by the session count to compute per-model averages.
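
The averaging step described above could look roughly like the following. This is a minimal standalone sketch, not the merged code: plain Python lists stand in for the arrays held by StatObject_SB, and the two free functions and their signatures are assumptions made for illustration only.

```python
from collections import defaultdict


def sum_stat_per_model(modelset, stat1):
    """Sum first-order statistics over all sessions that share a model ID.

    Hypothetical standalone helper: `modelset` is a list of model IDs (one per
    session) and `stat1` is a list of equal-length rows (one per session).
    Returns (model_ids, summed_rows, session_counts), ordered by sorted ID.
    """
    sums = {}
    counts = defaultdict(int)
    for model, row in zip(modelset, stat1):
        if model not in sums:
            sums[model] = [0.0] * len(row)
        # Accumulate this session's row into the running sum for its model.
        sums[model] = [s + x for s, x in zip(sums[model], row)]
        counts[model] += 1
    model_ids = sorted(sums)
    return (
        model_ids,
        [sums[m] for m in model_ids],
        [counts[m] for m in model_ids],
    )


def mean_stat_per_model(modelset, stat1):
    """Average the per-model sums by dividing each by its session count."""
    model_ids, summed, counts = sum_stat_per_model(modelset, stat1)
    means = [[x / n for x in row] for row, n in zip(summed, counts)]
    return model_ids, means


# Two sessions for "spk1", one for "spk2".
ids, means = mean_stat_per_model(
    ["spk1", "spk2", "spk1"],
    [[1.0, 2.0], [10.0, 20.0], [3.0, 4.0]],
)
```

In this sketch the two "spk1" sessions collapse into a single averaged row, while "spk2" is left unchanged because it has only one session.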

Fixes #3026

Test plan

  • Verify PLDA scoring works when enrollment contains duplicate model IDs
  • Run existing PLDA/LDA tests

The PLDA scoring functions fast_PLDA_scoring and fast_PLDA_scoring_with_uncertainty
call enroll_ctr.mean_stat_per_model(), but only sum_stat_per_model was defined.
This raises an AttributeError when enrollment models are not unique.

Add mean_stat_per_model that computes the per-model average of zero-
and first-order statistics using the existing sum_stat_per_model.
Collaborator

@pplantinga left a comment

There must not be any recipes with multiple models per stat, because this code clearly was never able to handle that. For example, later in the file (where mean_stat_per_model gets called) the same code is repeated, which also looks like a mistake:

    # If models are not unique, compute the mean per model, display a warning
    if not numpy.unique(enroll_ctr.modelset).shape == enroll_ctr.modelset.shape:
        # logging.warning("Enrollment models are not unique, average i-vectors")
        enroll_ctr = enroll_ctr.mean_stat_per_model()

Do we have a way of testing this? Because if not, maybe it would be better to just add a note that the code doesn't work for more than one model per stat.

@Mr-Neutr0n
Contributor Author

Yeah you're right, I dug through the recipes and couldn't find anything that actually hits the multi-model path. The code's been broken for a while and nobody noticed, so probably nobody uses it.

That said, since mean_stat_per_model() is already being called in a few places, I figured it's better to have it not crash if someone does happen to hit it. But I can totally see the argument for just documenting the limitation instead — less risk of introducing something untested.

Want me to switch it to a NotImplementedError with a clear message, or just add a comment noting it's unsupported? Happy to go either way.

@pplantinga
Collaborator

Given that there's no easy way to test this, here's the solution that seems best to me:

Keep the mean_stat_per_model() implementation, but make users call it themselves if they have multiple models. Then, in the fast_PLDA_scoring() method, add a check for multiple models (as before), but instead of calling mean_stat_per_model() automatically, throw an error that helpfully explains how to call mean_stat_per_model().

@Mr-Neutr0n
Contributor Author

That makes sense — keeps the method available but makes the multi-model case explicit rather than silently averaging. I'll update the PR with that approach.

Per reviewer feedback, fast_PLDA_scoring() now raises a ValueError
when enrollment models are not unique, directing users to call
mean_stat_per_model() explicitly. Removed the redundant second
uniqueness check after centering, since the first check already guards against it.
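
The guard described in this commit can be sketched as follows. The helper name check_unique_models and the error wording are hypothetical; in the PR the check sits inside fast_PLDA_scoring() and inspects enroll_ctr.modelset, whereas this sketch takes a plain list of model IDs so that it runs on its own.

```python
from collections import Counter


def check_unique_models(modelset):
    """Raise ValueError if any model ID appears more than once.

    Hypothetical standalone helper; `modelset` is a list of model IDs,
    one per enrollment session.
    """
    duplicates = sorted(m for m, n in Counter(modelset).items() if n > 1)
    if duplicates:
        raise ValueError(
            f"Enrollment models are not unique (duplicated IDs: {duplicates}). "
            "Call mean_stat_per_model() on the enrollment statistics "
            "before scoring."
        )


# Unique IDs pass silently.
check_unique_models(["spk1", "spk2"])

# Duplicate IDs are reported instead of being averaged behind the user's back.
try:
    check_unique_models(["spk1", "spk2", "spk1"])
except ValueError as err:
    message = str(err)
```

Raising instead of averaging automatically keeps the multi-model case explicit: the caller decides whether averaging sessions per model is what they want.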
Collaborator

@pplantinga left a comment

LGTM

@pplantinga merged commit a601051 into speechbrain:develop on Feb 10, 2026
5 checks passed

Development

Successfully merging this pull request may close these issues.

AttributeError: 'StatObject_SB' object has no attribute 'mean_stat_per_model'

2 participants