Fix issue 'GlobalNorm with DDP' #2934

Merged
TParcollet merged 1 commit into speechbrain:develop from svecjan:norm_ddp
Jun 10, 2025
@svecjan (Contributor) commented Jun 5, 2025

What does this PR do?

This pull request fixes an issue with training in DDP mode when using GlobalNorm.

Error:

  File "sb1/recipes/LibriSpeech/self-supervised-learning/BEST-RQ/train.py", line 355, in <module>
    main()
  File "sb1/recipes/LibriSpeech/self-supervised-learning/BEST-RQ/train.py", line 345, in main
    brain.fit(
  File "sb1/speechbrain/core.py", line 1585, in fit
    self._fit_train(train_set=train_set, epoch=epoch, enable=enable)
  File "sb1/speechbrain/core.py", line 1410, in _fit_train
    loss = self.fit_batch(batch)
           ^^^^^^^^^^^^^^^^^^^^^
  File "sb1/speechbrain/core.py", line 1209, in fit_batch
    outputs = self.compute_forward(batch, sb.Stage.TRAIN)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sb1/recipes/LibriSpeech/self-supervised-learning/BEST-RQ/train.py", line 54, in compute_forward
    feats = self.modules.normalize(feats, wav_lens, epoch=current_epoch)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/matylda3/isvecjan/miniconda3/envs/sb1/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/matylda3/isvecjan/miniconda3/envs/sb1/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sb1/speechbrain/processing/features.py", line 1430, in forward
    self._update_global_stats(x, mask)
  File "sb1/speechbrain/processing/features.py", line 1468, in _update_global_stats
    self.count, self.glob_mean, self.glob_std = mean_std_update(
                                                ^^^^^^^^^^^^^^^^
  File "sb1/speechbrain/processing/features.py", line 1250, in mean_std_update
    new_statistics = combine_gaussian_statistics_distributed(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sb1/speechbrain/processing/features.py", line 1176, in combine_gaussian_statistics_distributed
    global_count = ddp_all_reduce(torch.tensor(local_count), ReduceOp.SUM)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sb1/speechbrain/utils/distributed.py", line 254, in ddp_all_reduce
    torch.distributed.all_reduce(communication_object, op=reduce_op)
  File "/mnt/matylda3/isvecjan/miniconda3/envs/sb1/lib/python3.12/site-packages/torch/distributed/c10d_logger.py", line 81, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/matylda3/isvecjan/miniconda3/envs/sb1/lib/python3.12/site-packages/torch/distributed/distributed_c10d.py", line 2810, in all_reduce
    work = group.allreduce([tensor], opts)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: No backend type associated with device type cpu

I encountered this issue with the vanilla recipes:

recipes/LibriSpeech/self-supervised-learning/BEST-RQ/hparams/BEST-RQ.yaml
(with normalize.norm_type changed from "sentence" to "global")

recipes/LibriSpeech/ASR/CTC/hparams/conformer_large.yaml
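
The traceback points at `torch.tensor(local_count)`, which creates a CPU tensor by default. If the process group was initialized with a GPU-only backend such as NCCL, `all_reduce` on a CPU tensor fails with the error above. The following is a minimal sketch of one way to avoid this, assuming the count can be placed on the same device as the features. It is not necessarily the change merged in this PR, and `all_reduce_count` is a hypothetical helper, not a SpeechBrain function.

```python
import torch
import torch.distributed as dist


def all_reduce_count(local_count, device):
    """Sum a scalar count across ranks (hypothetical helper, not SpeechBrain API)."""
    # Build the tensor on the device the backend can communicate on,
    # instead of relying on the CPU default of torch.tensor(...).
    count = torch.tensor(local_count, device=device)
    # Only communicate when a process group is actually initialized.
    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(count, op=dist.ReduceOp.SUM)
    return count


# Outside DDP (no process group) the count is returned unchanged.
total = all_reduce_count(5, torch.device("cpu"))
```

In a DDP run, `device` would typically be the device of the feature tensor (for example `x.device`), so the reduced tensor lives on the GPU that the backend expects.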

@Adel-Moumen (Collaborator) left a comment

LGTM! Thanks for catching this bug.

@TParcollet merged commit c75ab54 into speechbrain:develop on Jun 10, 2025
6 of 9 checks passed
@svecjan deleted the norm_ddp branch on June 13, 2025 09:23

3 participants