I'm attempting to reproduce the LibriSpeech ASR Transformer recipe (https://github.com/speechbrain/speechbrain/tree/develop/recipes/LibriSpeech/ASR/transformer) to validate the conclusions of the paper "HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition".
Per Table 2 of the paper, HyperConformer outperforms Conformer on the test-other split of 100-hour LibriSpeech (in terms of WER) and demonstrates better data efficiency. However, my training results show:
- Overall WER values are higher than expected;
- HyperConformer does not outperform Conformer (contrary to the paper's conclusion);
- Conformer_22M yields abnormally high WER (over 95%) on the training, test-clean, and test-other splits.
GPU: RTX 3090 (24GB)
Execution Commands:

```shell
python train.py hparams/hyperconformer_8M.yaml
python train.py hparams/conformer_8M.yaml  # modified from hyperconformer_8M.yaml
```

I also ran the corresponding experiments for hyperconformer_22M.yaml and conformer_22M.yaml.
Key Config Modifications:
For conformer_8M.yaml, I only modified attention_type to RelPosMHAXL (from HyperConformer’s config). To use only the 100-hour LibriSpeech subset, I added/modified these lines in the YAML files:
The dataset path was specified, and the splits were set for 100h LibriSpeech:

```yaml
train_splits: ["train-clean-100"]
dev_splits: ["dev-clean"]
test_splits: ["test-clean", "test-other"]
skip_prep: False
train_csv: !ref <output_folder>/train.csv
valid_csv: !ref <output_folder>/dev-clean.csv
test_csv:
    - !ref <output_folder>/test-clean.csv
    - !ref <output_folder>/test-other.csv
```
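For reference, the single attention-related change I made for the Conformer runs looked like the fragment below. This is a sketch based on my description above, not a verbatim excerpt from my file; the key name follows the SpeechBrain transformer recipe, and I believe the HyperConformer config uses a hypermixing-style value for the same key.

```yaml
# conformer_8M.yaml — otherwise identical to hyperconformer_8M.yaml
attention_type: RelPosMHAXL
```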
Training Log Snippets
Conformer_8M (100h LibriSpeech)
```
epoch: 106, lr: 6.05e-04, steps: 68370, optimizer: Adam - train loss: 24.91 - valid loss: 32.76, valid ACC: 8.75e-01
epoch: 107, lr: 6.02e-04, steps: 69015, optimizer: Adam - train loss: 24.79 - valid loss: 31.85, valid ACC: 8.76e-01
epoch: 108, lr: 5.99e-04, steps: 69660, optimizer: Adam - train loss: 24.76 - valid loss: 34.30, valid ACC: 8.75e-01
epoch: 109, lr: 5.96e-04, steps: 70305, optimizer: Adam - train loss: 24.53 - valid loss: 32.79, valid ACC: 8.75e-01
epoch: 110, lr: 5.94e-04, steps: 70950, optimizer: Adam - train loss: 24.51 - valid loss: 33.23, valid ACC: 8.75e-01, valid WER: 11.38
Epoch loaded: 110 - test loss: 17.27, test ACC: 8.88e-01, test WER: 6.20 // test-clean
Epoch loaded: 110 - test loss: 10.42, test ACC: 7.96e-01, test WER: 15.57 // test-other
```
HyperConformer_8M (100h LibriSpeech)
```
epoch: 105, lr: 6.08e-04, steps: 67725, optimizer: Adam - train loss: 28.67 - valid loss: 30.93, valid ACC: 8.78e-01
epoch: 106, lr: 6.05e-04, steps: 68370, optimizer: Adam - train loss: 28.52 - valid loss: 31.04, valid ACC: 8.78e-01
epoch: 107, lr: 6.02e-04, steps: 69015, optimizer: Adam - train loss: 28.54 - valid loss: 31.11, valid ACC: 8.77e-01
epoch: 108, lr: 5.99e-04, steps: 69660, optimizer: Adam - train loss: 28.40 - valid loss: 30.46, valid ACC: 8.77e-01
epoch: 109, lr: 5.96e-04, steps: 70305, optimizer: Adam - train loss: 28.34 - valid loss: 30.46, valid ACC: 8.79e-01
epoch: 110, lr: 5.94e-04, steps: 70950, optimizer: Adam - train loss: 28.24 - valid loss: 30.49, valid ACC: 8.79e-01, valid WER: 11.79
Epoch loaded: 110 - test loss: 17.17, test ACC: 8.90e-01, test WER: 6.29 // test-clean
Epoch loaded: 110 - test loss: 10.44, test ACC: 7.94e-01, test WER: 16.47 // test-other
```
HyperConformer_22M (100h LibriSpeech)
```
epoch: 106, lr: 6.05e-04, steps: 68370, optimizer: Adam - train loss: 18.79 - valid loss: 34.62, valid ACC: 8.65e-01
epoch: 107, lr: 6.02e-04, steps: 69015, optimizer: Adam - train loss: 18.84 - valid loss: 33.89, valid ACC: 8.66e-01
epoch: 108, lr: 5.99e-04, steps: 69660, optimizer: Adam - train loss: 18.60 - valid loss: 34.30, valid ACC: 8.64e-01
epoch: 109, lr: 5.96e-04, steps: 70305, optimizer: Adam - train loss: 18.56 - valid loss: 34.02, valid ACC: 8.67e-01
epoch: 110, lr: 5.94e-04, steps: 70950, optimizer: Adam - train loss: 18.52 - valid loss: 33.75, valid ACC: 8.66e-01, valid WER: 11.73
Epoch loaded: 110 - test loss: 18.75, test ACC: 8.84e-01, test WER: 6.65 // test-clean
Epoch loaded: 110 - test loss: 11.50, test ACC: 7.91e-01, test WER: 16.75 // test-other
```
Conformer_22M
WER values are abnormally high (over 95%) across training, test-clean, and test-other splits (no detailed logs provided for this run).
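One sanity check worth running, given results this far off: verify that each config actually instantiates a model of the advertised size. The helper below is a rough sketch (not from the recipe) that estimates a Transformer encoder's parameter count from generic hyperparameters; it counts only the attention projections and feed-forward matrices, ignoring biases, norms, convolution modules, and embeddings, and the example hyperparameter values are placeholders rather than the recipe's actual ones.

```python
def approx_encoder_params(d_model: int, d_ffn: int, num_layers: int) -> int:
    """Rough per-layer parameter estimate for a vanilla Transformer encoder:
    4 * d_model^2 for the Q/K/V/output projections, plus
    2 * d_model * d_ffn for the two feed-forward matrices."""
    attention = 4 * d_model * d_model
    feed_forward = 2 * d_model * d_ffn
    return num_layers * (attention + feed_forward)

# Placeholder hyperparameters, purely for illustration:
small = approx_encoder_params(d_model=144, d_ffn=1024, num_layers=12)
large = approx_encoder_params(d_model=256, d_ffn=2048, num_layers=12)
print(f"small ≈ {small / 1e6:.1f}M, large ≈ {large / 1e6:.1f}M")
```

In practice, the direct check is `sum(p.numel() for p in model.parameters())` on the instantiated PyTorch model before training starts; if the "22M" Conformer config reports a very different count, the config is not building the intended model.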
Questions
1. Have there been any recent changes to the code or configuration files for these recipes?
2. Could you share the exact steps and full configuration files required to reproduce the paper's results (100h LibriSpeech, HyperConformer vs. Conformer)?