FocalCodec [NeurIPS 2025] #3000

Merged: pplantinga merged 8 commits into speechbrain:develop from lucadellalib:focalcodec on Nov 24, 2025
@lucadellalib
Collaborator

Add FocalCodec training recipe.

Collaborator

@mravanelli left a comment

Great job @lucadellalib! I did a quick code inspection and shared some comments about the docstrings.
Other comments:

  • Regarding the extra dependency, please take a look at our policy here. This code should be compliant with that. @pplantinga can advise.
  • I'm not sure about having "metrics" as a local folder. I think we might need to reuse the same metrics in other recipes, for instance the streamable FocalCodec and the extension of FocalCodec to LibriLight. Maybe we can put it in SpeechBrain/metrics? Any advice, @pplantinga?



class Generation(sb.Brain):
    def fit_batch(self, batch):

For every method, we typically add a short description of its functionality for better clarity (see, for instance, this). It is even more important here, as some methods are not standard.

        return super()._fit_valid(valid_set, epoch, enable)

    @torch.no_grad()
    def evaluate_batch(self, batch, stage):

Make sure every method has a short description.
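
One way to find the methods that still lack a description is a small introspection check. The sketch below is an assumption for illustration only: `missing_docstrings` is a hypothetical helper, not a SpeechBrain utility, and `Generation` here is a plain stand-in class rather than the recipe's `sb.Brain` subclass.

```python
import inspect


def missing_docstrings(cls):
    """Return the sorted names of functions defined on ``cls`` without a docstring.

    Hypothetical helper for illustration; it is not part of SpeechBrain.
    """
    return sorted(
        name
        for name, member in vars(cls).items()
        if inspect.isfunction(member) and not (member.__doc__ or "").strip()
    )


class Generation:  # stand-in for the recipe's sb.Brain subclass
    def fit_batch(self, batch):
        """Run one training step on ``batch``."""

    def evaluate_batch(self, batch, stage):
        pass


print(missing_docstrings(Generation))  # ['evaluate_batch']
```

The helper reads `__doc__` directly rather than calling `inspect.getdoc`, because the latter can fall back to a docstring inherited from a base class and so hide a missing one on the override.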



def prepare_recipe(hparams, run_opts):
    # Dataset preparation

Add docstring

    audio_backend="soundfile",
    **kwargs,
):
    """This function prepares the datasets to be used in the brain class.

Improve the docstring by explaining all the parameters. There are a lot of them in this case, but documenting them improves clarity and usability.
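
A minimal sketch of what a fully documented signature could look like, using the `Arguments` / `Returns` section headings found in SpeechBrain docstrings. Only `audio_backend="soundfile"` and `**kwargs` come from the snippet above; the function name `dataio_prepare_sketch`, the other parameters, and the returned dictionary are hypothetical placeholders.

```python
import inspect


def dataio_prepare_sketch(
    data_folder,
    sample_rate=16000,
    audio_backend="soundfile",
    **kwargs,
):
    """Prepare the datasets to be used in the brain class.

    Arguments
    ---------
    data_folder : str
        Path to the dataset root folder (hypothetical parameter).
    sample_rate : int
        Target sample rate in Hz (hypothetical parameter).
    audio_backend : str
        Name of the audio I/O backend used to read files.
    **kwargs : dict
        Extra options; only collected in this sketch.

    Returns
    -------
    dict
        The resolved options (a placeholder for the real dataset objects).
    """
    return {
        "data_folder": data_folder,
        "sample_rate": sample_rate,
        "audio_backend": audio_backend,
        "extra": kwargs,
    }


# Quick check that every parameter name is mentioned in the docstring.
params = inspect.signature(dataio_prepare_sketch).parameters
undocumented = [name for name in params if name not in dataio_prepare_sketch.__doc__]
print(undocumented)  # []
```

The name check at the end is only a rough substring match, but it is enough to catch a parameter that was added to the signature and forgotten in the docstring.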

provides = ["sig"]

def audio_pipeline_train(wav):
    original_sample_rate = sb.dataio.dataio.read_audio_info(wav).sample_rate

Make sure all the functions have a short docstring



class DWER(MetricStats):
    def __init__(

Add docstring with a working example



class SpkSimWavLM(MetricStats):
    def __init__(

Add a docstring with a working example (so that we can test everything with our doctests).
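
As an illustration of a docstring example that a doctest runner can execute, here is a dependency-free sketch. `MeanScore` is a hypothetical stand-in, not one of the recipe's metrics; its `append` / `summarize` methods only loosely mirror the `MetricStats` interface, and the real classes would need a model and audio tensors in their examples.

```python
import doctest


class MeanScore:
    """Accumulate per-utterance scores and report their mean.

    Hypothetical stand-in for a metric class, used only to show the
    docstring layout.

    Example
    -------
    >>> metric = MeanScore()
    >>> metric.append(["utt1", "utt2"], [0.5, 1.0])
    >>> metric.summarize()
    0.75
    """

    def __init__(self):
        self.ids = []
        self.scores = []

    def append(self, ids, scores):
        """Store one batch of utterance IDs and their scores."""
        self.ids.extend(ids)
        self.scores.extend(scores)

    def summarize(self):
        """Return the mean of all scores appended so far."""
        return sum(self.scores) / len(self.scores)


# Parse and run the docstring example, roughly as a doctest runner would.
test = doctest.DocTestParser().get_doctest(
    MeanScore.__doc__, {"MeanScore": MeanScore}, "MeanScore", None, 0
)
result = doctest.DocTestRunner(verbose=False).run(test)
print(result.failed, result.attempted)  # 0 3
```

Keeping the example small and deterministic is what makes it usable as a test: every `>>>` line runs, and the printed value must match exactly.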



class UTMOS(MetricStats):
    def __init__(self, sample_rate, model=None):

Add docstring with example



class HingeGLoss(nn.Module):
    """Hinge Generator Loss

Add the example to the docstring
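
For reference, a sketch of what such an example could check, assuming the common GAN formulation of the hinge generator loss, i.e. the negated mean discriminator score on generated samples. The recipe's `HingeGLoss` is an `nn.Module` operating on tensors and may use a different variant (some codec recipes use the mean of max(0, 1 - D(G(z))) instead); `hinge_g_loss` below is a hypothetical plain-Python function.

```python
def hinge_g_loss(fake_scores):
    """Hinge generator loss on a list of discriminator scores for generated samples.

    Assumed formulation: ``-mean(D(G(z)))``.

    Example
    -------
    >>> hinge_g_loss([0.5, -1.5])
    0.5
    """
    return -sum(fake_scores) / len(fake_scores)


print(hinge_g_loss([0.5, -1.5]))  # 0.5
```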



class HingeDLoss(nn.Module):
    """Hinge Discriminator Loss

Add example
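
A comparable sketch for the discriminator side, assuming the standard hinge formulation: the mean of max(0, 1 - D(x)) over real samples plus the mean of max(0, 1 + D(G(z))) over generated samples. `hinge_d_loss` is a hypothetical plain-Python function; the recipe's `HingeDLoss` works on tensors and its exact reduction is not shown in this conversation.

```python
def hinge_d_loss(real_scores, fake_scores):
    """Hinge discriminator loss on lists of discriminator scores.

    Assumed formulation:
    ``mean(max(0, 1 - D(x))) + mean(max(0, 1 + D(G(z))))``.

    Example
    -------
    >>> hinge_d_loss([2.0, 0.5], [-2.0, 0.0])
    0.75
    """
    # Real samples are penalized when their score falls below +1.
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    # Generated samples are penalized when their score rises above -1.
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


print(hinge_d_loss([2.0, 0.5], [-2.0, 0.0]))  # 0.75
```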

@pplantinga
Collaborator

  • Regarding the extra dependency, please take a look at our policy here. This code should be compliant with that. @pplantinga can advise.

I think this PR follows the policy: for recipes, the only requirement is an extra-requirements.txt file with all non-speechbrain dependencies. The one thing I might suggest is adding transformers to the extra requirements, as it has (mostly) moved to integrations now and is no longer a core dependency.

  • I'm not sure about having "metrics" as a local folder. I think we might need to reuse the same metrics in other recipes, for instance the streamable FocalCodec and the extension of FocalCodec to LibriLight. Maybe we can put it in SpeechBrain/metrics? Any advice, @pplantinga?

Perhaps we can leave it here for now, and move it if we do end up using it for other recipes. The principle of YAGNI (you ain't gonna need it) might apply here: let's keep it as straightforward as possible and not plan too far ahead.

@mravanelli
Collaborator

Thank you for your comments @pplantinga! Do you have other comments or suggestions?

@mravanelli
Collaborator

@Adel-Moumen, do you also have some comments and suggestions here?

@mravanelli added the "enhancement" (New feature or request) label on Nov 22, 2025
@mravanelli
Collaborator

I tested the recipe and the recipe tests. All seems to work properly.
A couple of small points:

  • We need to upload the logs to Dropbox. I will follow up on that privately.
  • In both YAML files, train-clean-360 and train-other-500 are commented out. I would suggest uncommenting them by default.

@mravanelli
Collaborator

mravanelli commented Nov 24, 2025

This PR LGTM now. I think we can go ahead and merge it, unless @pplantinga or @Adel-Moumen have further comments. Great job @lucadellalib!

Collaborator

@pplantinga left a comment

LGTM

@pplantinga added this to the v1.1.0 milestone on Nov 24, 2025
@pplantinga added the "recipes" (Changes to recipes only (add/edit)) label on Nov 24, 2025
@pplantinga merged commit 637f0a5 into speechbrain:develop on Nov 24, 2025
5 checks passed
Labels

enhancement: New feature or request
recipes: Changes to recipes only (add/edit)

3 participants