This repository contains the dataset, code, and models for our paper:
CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR, accepted at Interspeech 2025.
CHSER/
├── code/                 # Source code for dataset generation, model training, and evaluation
│   ├── analysis/         # Scripts for baseline WER generation
│   ├── dataset_gen/      # Scripts for creating the CHSER dataset from raw hypotheses
│   └── gensec/           # Core modules for generative speech error correction (GenSEC), T5- and Llama-based
├── dataset/              # CHSER dataset splits
│   ├── dev/
│   ├── test/
│   └── train/
├── models/               # Pretrained and fine-tuned model checkpoints
│   ├── 3gram/            # n-gram baseline (for comparison or decoding)
│   ├── llama2/           # Adapter weights for Llama2 model fine-tuned on CHSER
│   ├── t5/               # Adapter weights for T5 model fine-tuned on CHSER
│   ├── t5_myst/          # Adapter weights for T5 model fine-tuned on MyST data
│   └── transformer/      # Transformer LM baseline model (non-pretrained)
The CHSER dataset consists of child ASR hypotheses paired with human-verified reference transcripts. Hypotheses were generated using Whisper-base.en in a zero-shot beam search setting.
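Since each example pairs an ASR hypothesis with a reference transcript, a baseline word error rate (WER) can be computed directly from those pairs. The following is a minimal sketch of corpus-level WER, not the implementation in `code/analysis/`. The function names, the text normalization (lowercasing and whitespace tokenization), and the example sentences are all assumptions made for illustration; the repository's scripts may normalize text differently.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing in hypothesis)
                curr[j - 1] + 1,     # insertion (extra word in hypothesis)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """Corpus-level WER: total word errors divided by total reference words.

    `pairs` is an iterable of (reference, hypothesis) strings.
    Normalization here (lowercase + whitespace split) is an assumption.
    """
    total_errors = 0
    total_ref_words = 0
    for reference, hypothesis in pairs:
        ref_words = reference.lower().split()
        hyp_words = hypothesis.lower().split()
        total_errors += edit_distance(ref_words, hyp_words)
        total_ref_words += len(ref_words)
    if total_ref_words == 0:
        raise ValueError("references contain no words")
    return total_errors / total_ref_words


# Made-up example pairs, not taken from CHSER.
example_pairs = [
    ("the cat sat on the mat", "the cat sat on mat"),  # one deletion
    ("i like ice cream", "i like nice cream"),         # one substitution
]
wer = corpus_wer(example_pairs)
print(f"WER: {wer:.2%}")
```

Running the same function on the hypotheses before and after correction gives a simple way to compare a GenSEC model's output against the uncorrected baseline.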
We provide checkpoints of GenSEC models trained on adult speech (HyPoradise) and fine-tuned on CHSER. Models include:
- Llama2-based correction model (adapter weights in `models/llama2/`)
- T5-based correction models (adapter weights in `models/t5/` and `models/t5_myst/`)
If you find this work useful in your research, please cite:
@misc{shankar2025chser,
title={CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR},
author={Natarajan Balaji Shankar and Zilai Wang and Kaiyuan Zhang and Mohan Shi and Abeer Alwan},
year={2025},
eprint={2505.18463},
archivePrefix={arXiv},
primaryClass={eess.AS},
url={https://arxiv.org/abs/2505.18463},
}