Solves #2846 Refactor run_opts into a @dataclass #2855

Merged: pplantinga merged 32 commits into speechbrain:develop from nouranali:develop on Jul 9, 2025
Conversation

@nouranali (Contributor) commented Mar 9, 2025

What does this PR do?

Fixes #2846

@nouranali (Contributor, Author):

@pplantinga
@pplantinga (Collaborator):

Hi @nouranali - thanks for your contribution. This is a step in the right direction, but I think more is needed here. Just to warn you: core.py and interfaces.py are important parts of the codebase, and changes to them will need a lot of testing before this sort of PR can be merged.

So to start with, I think Brain and Pretrained should accept a RunOpts object as an argument rather than a dict. In addition, the parse_arguments() function is closely tied to RunOpts; it should probably be a method of the class, or at least live in the same file, and should return a RunOpts object (e.g. a RunOpts.from_command_line_args() @classmethod or some such thing). Finally, some argument validation currently done in Brain could maybe move to the RunOpts class. Use your best judgement -- this is probably the trickiest part of the PR: we should focus on making the Brain class an easy-to-understand basic training-loop implementation, while hiding no surprising details if possible. Good luck!
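A minimal sketch of the shape suggested above: a `RunOpts` dataclass that validates itself, and a `Brain` that accepts it instead of a dict. All field names here are illustrative assumptions, not the actual SpeechBrain API.

```python
from dataclasses import dataclass


@dataclass
class RunOpts:
    # Illustrative fields only; the real SpeechBrain run options differ.
    device: str = "cpu"
    precision: str = "fp32"
    max_grad_norm: float = 5.0

    def __post_init__(self):
        # Argument validation that previously lived in Brain can move here.
        if self.precision not in ("fp32", "fp16", "bf16"):
            raise ValueError(f"Unsupported precision: {self.precision}")


class Brain:
    def __init__(self, modules=None, run_opts=None):
        # Accept a RunOpts object (falling back to defaults), not a dict.
        self.run_opts = run_opts if run_opts is not None else RunOpts()
```

One benefit of this shape is that invalid options fail loudly at construction time, before any training starts.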

@nouranali (Contributor, Author):

> Hi @nouranali - thanks for your contribution. This is a step in the right direction, but I think there's more that's needed here. [...]

Great, I'll start working on your comments. How can I test along the way?

@nouranali (Contributor, Author):

@pplantinga Can you check the new updates?

@pplantinga (Collaborator):

I don't have a lot of time to review this right now, but you can check it yourself by running a template, e.g. templates/speech_recognition/ASR

@pplantinga (Collaborator):

My only comments right now are: the from_command_line_args() method should be a @classmethod. And ideally the hparams file and yaml_overrides should become members of the RunOpts. I think the name RunOptDefaults should be changed to something like RunOptions.

https://docs.python.org/3/library/functions.html#classmethod
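For illustration, a @classmethod constructor on the renamed dataclass could look roughly like this. It is a sketch under assumptions: the field names, the flags, and holding hparams_file as a member are hypothetical, not the merged SpeechBrain signature.

```python
import argparse
from dataclasses import dataclass


@dataclass
class RunOptions:
    # hparams_file held as a member, per the review comment above;
    # all names here are illustrative.
    hparams_file: str = ""
    device: str = "cpu"
    debug: bool = False

    @classmethod
    def from_command_line_args(cls, arg_list=None):
        # Bound to the class, so it can construct and return an instance.
        parser = argparse.ArgumentParser()
        parser.add_argument("hparams_file")
        parser.add_argument("--device", default="cpu")
        parser.add_argument("--debug", action="store_true")
        args = parser.parse_args(arg_list)
        return cls(**vars(args))
```

Passing an explicit `arg_list` (instead of reading `sys.argv`) keeps the constructor easy to unit-test.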

@nouranali (Contributor, Author):

> My only comments right now are: the from_command_line_args() method should be a @classmethod. [...]

Done

@nouranali (Contributor, Author):

@mravanelli


@pplantinga (Collaborator) left a review:

At this point I'm unsure if this PR will need to change every recipe or not -- ideally we could get away without having to do this, but we'll see. Have you tried running a template yet?

@pplantinga (Collaborator):

I ran the checks and the linter failed. You can run it yourself, even automatically on every commit, following instructions here:
https://github.com/speechbrain/speechbrain/blob/develop/docs/devtools.md

@nouranali (Contributor, Author):

> At this point I'm unsure if this PR will need to change every recipe or not -- ideally we could get away without having to do this, but we'll see. Have you tried running a template yet?

Yes, I ran this one:
https://github.com/speechbrain/speechbrain/tree/develop/templates/speech_recognition/ASR

@pplantinga (Collaborator):

This used to touch only a few files but now touches over 500... did you run a different version of black on the whole repo?

@nouranali (Contributor, Author):

> This used to touch only a few files but now touches over 500... did you run a different version of black on the whole repo?

What version do you use, and which files do you run it on?

@pplantinga (Collaborator):

If you install the pre-commit hook, the linters will run on all the files changed as part of the commit. The versions are pinned in https://github.com/speechbrain/speechbrain/blob/develop/lint-requirements.txt

@pplantinga pplantinga requested a review from Adel-Moumen June 4, 2025 17:11
@Adel-Moumen (Collaborator) left a review:

Overall LGTM! I think it's a great improvement on the core.py side.

Comment on lines +180 to +189:

    modules : dict[str, torch.nn.Module]
        These modules are passed to the optimizer by default if they have
        trainable parameters, and will have ``train()``/``eval()`` called on them.
    -opt_class : torch.optim class
    +opt_class : Optional[Type[torch.optim]]
        A torch optimizer constructor that takes only the list of
        parameters (e.g. a lambda or partial function definition). By default,
        this will be passed all modules in ``modules`` at the
        beginning of the ``fit()`` method. This behavior can be changed
        by overriding the ``configure_optimizers()`` method.
    -hparams : dict
    +hparams : Optional[dict]

I am wondering if we should maybe log this to the user, i.e. say that no opt_class / hparams were provided, to make sure everything is transparent to the user.
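One way to make this transparent would be a log message at construction time. The sketch below is hypothetical, not the merged code; the real Brain.__init__ takes more arguments than shown here.

```python
import logging

logger = logging.getLogger(__name__)


class Brain:
    # Hypothetical sketch of the logging suggested above.
    def __init__(self, modules=None, opt_class=None, hparams=None):
        if opt_class is None:
            logger.info("No opt_class provided; no optimizer will be created.")
        if hparams is None:
            logger.info("No hparams provided; self.hparams will be empty.")
        self.modules = modules or {}
        self.opt_class = opt_class
        self.hparams = hparams or {}
```

Using `logger.info` rather than a warning keeps the message visible without implying that omitting these arguments is an error.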

@TParcollet (Collaborator):

@pplantinga / @Adel-Moumen if agreement from both of you, please merge!

@pplantinga pplantinga merged commit 900f02c into speechbrain:develop Jul 9, 2025
5 checks passed