egs2 (Examples of ESPnet2)

How to use?

See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2
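Each row of the overview table names a directory under egs2. As a minimal sketch, assuming a recipe lives at `egs2/<directory name>/<recipe>` (for example `an4/asr1`) and is started through a `run.sh` script in that directory, the launch command could be assembled as follows. The helper names `recipe_command` and `launch`, and the checkout path `espnet/egs2`, are illustrative and not part of ESPnet; the linked tutorial remains the reference for the actual workflow.

```python
from pathlib import Path
import subprocess


def recipe_command(egs2_root, directory, recipe):
    """Return (working directory, argv) for launching one recipe.

    Assumes the layout egs2/<directory>/<recipe>/run.sh. This helper is
    hypothetical: it only builds the command and does not run anything.
    """
    workdir = Path(egs2_root) / directory / recipe
    return workdir, ["./run.sh"]


def launch(egs2_root, directory, recipe):
    """Run the recipe from its own directory and raise if it fails."""
    workdir, argv = recipe_command(egs2_root, directory, recipe)
    if not (workdir / "run.sh").is_file():
        raise FileNotFoundError(f"no run.sh under {workdir}")
    # cwd is set so that the relative ./run.sh resolves inside the recipe.
    return subprocess.run(argv, cwd=workdir, check=True)


# Building the command is side-effect free; launch() is what executes it.
workdir, argv = recipe_command("espnet/egs2", "an4", "asr1")
print(workdir, argv)
```

Keeping command construction separate from execution makes it possible to inspect or log what would run before starting a long training job.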

Overview of example information

Directory name	Corpus name	Task	Language	URL	Note
accentdb	A Database of Non-Native English Accents	Accent Recognition	ENG	https://accentdb.org/
accented_french_openslr57	African Accented French Corpus	ASR	FRA	https://www.openslr.org/57/
acesinger	ACESinger Singing Corpus	SVS	CMN	WIP
aesrc2020	Accented English Speech Recognition Challenge 2020	ASR	ENG	https://arxiv.org/abs/2102.10233
aidatatang_200zh	Aidatatang_200zh A free Chinese Mandarin speech corpus	ASR	CMN	http://www.openslr.org/resources/62
aishell	AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus	ASR	CMN	http://www.aishelltech.com/kysjcp
aishell2	AISHELL-2 Open Source Mandarin Speech Corpus	ASR	CMN	https://www.aishelltech.com/aishell_2
aishell3	AISHELL3 Mandarin multi-speaker text-to-speech	TTS	CMN	https://www.openslr.org/93/
aishell4	AISHELL4 Open Source Mandarin Speech Corpus in Conference Scenario	ASR/SE	CMN	https://www.openslr.org/111/
ameboshi	Ameboshi Ciphyer's singing voice database	SVS	JPN	https://parapluie2c56m.wixsite.com/mysite
americasnlp22	The Second AmericasNLP Competition	ASR	BZD, GUG, GVC, QWE, TAV	http://turing.iimas.unam.mx/americasnlp/st.html
ami	The AMI Meeting Corpus	ASR	ENG	http://groups.inf.ed.ac.uk/ami/corpus/
an4	CMU AN4 database	ASR/TTS	ENG	http://www.speech.cs.cmu.edu/databases/an4/
aphasiabank	AphasiaBank database (English)	ASR	ENG	https://aphasia.talkbank.org/
arabic_sc	Database for Arabic Speech Commands Recognition	SLU	ARA	https://github.com/ltkbenamer/AR_Speech_Database
asvspoof	The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database	Fake Speech Detection	ENG	https://datashare.ed.ac.uk/handle/10283/3336
babel	IARPA Babel corpus	ASR	~20 languages	https://www.iarpa.gov/index.php/research-programs/babel
bibletts	Bible TTS corpus	TTS	6 Sub-Saharan Africa languages	https://masakhane-io.github.io/bibleTTS/
bn_openslr53	Large Bengali ASR training dataset	ASR	BEN	https://openslr.org/53/
bur_openslr80	Burmese ASR training dataset	ASR	BUR	https://openslr.org/80/
catslu	CATSLU-MAPS	SLU	CMN	https://sites.google.com/view/catslu/home
catslu_entity	CATSLU	SLU/Entity Classifi.	CMN	https://sites.google.com/view/catslu/home
chime1	The 1st CHiME Speech Separation and Recognition Challenge	ASR/Multichannel ASR	ENG	https://spandh.dcs.shef.ac.uk/chime_challenge/chime2011/
chime2	The 2nd CHiME Speech Separation and Recognition Challenge	ASR/Multichannel ASR	ENG	https://spandh.dcs.shef.ac.uk/chime_challenge/chime2013/
chime4	The 4th CHiME Speech Separation and Recognition Challenge	ASR/Multichannel ASR	ENG	http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/
chime6	The 6th CHiME Speech Separation and Recognition Challenge	ASR	ENG	https://chimechallenge.github.io/chime6/
clarity21	The First Clarity Enhancement Challenge CEC1	SE	ENG	https://claritychallenge.github.io/clarity_CEC1_doc/
cmu_arctic	CMU ARCTIC	TTS	ENG	http://www.festvox.org/cmu_arctic/
cmu_indic	CMU INDIC	TTS	7 languages	http://festvox.org/cmu_indic/
commonvoice	The Mozilla Common Voice	ASR	13 languages	https://voice.mozilla.org/datasets
conferencingspeech21	Far-field Multi-channel Speech Enhancement Challenge for Video Conferencing (ConferencingSpeech 2021)	SE	ENG, CMN	https://tea-lab.qq.com/conferencingspeech-2021
covost2	Multilingual speech-to-text translation corpus from Common Voice	ST	lang pairs from 22	https://github.com/facebookresearch/covost
csj	Corpus of Spontaneous Japanese	ASR	JPN	https://pj.ninjal.ac.jp/corpus_center/csj/en/
csmsc	Chinese Standard Mandarin Speech Corpus	TTS	CMN	https://www.data-baker.com/open_source.html
css10	CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages	TTS	10 languages	https://github.com/Kyubyong/css10
dcase22_task1	DCASE Task1 2022 Dataset	SLU	ENG	https://dcase.community/challenge2022/task-low-complexity-acoustic-scene-classification
dirha_wsj	Distant-speech Interaction for Robust Home Applications	Multichannel ASR	ENG	https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj
dns_ins20	Deep Noise Suppression Challenge – INTERSPEECH 2020	SE	7 languages + singing	https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2020/
dns_icassp21	Deep Noise Suppression Challenge – ICASSP 2021	SE	11 languages + singing	https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-icassp-2021/
dns_icassp22	Deep Noise Suppression Challenge – ICASSP 2022	SE	11 languages + singing	https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-icassp-2022/
dns_ins21	Deep Noise Suppression Challenge – INTERSPEECH 2021	SE	11 languages + singing	https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2021/
dsing	Automatic Lyric Transcription from Karaoke Vocal Tracks (From DAMP Sing300x30x2)	ASR (ALT)	ENG singing	https://github.com/groadabike/Kaldi-Dsing-task
easycom	An Augmented Reality Dataset to Support Algorithms for Easy Communication in Noisy Environments	ASR	ENG	https://github.com/facebookresearch/EasyComDataset
esc50	Dataset for Environmental Sound Classification	Audio Classification		https://github.com/karolpiczak/ESC-50
fisher_callhome_spanish	Fisher and CALLHOME Spanish--English Speech Translation	ASR/ST	SPA->ENG	https://catalog.ldc.upenn.edu/LDC2014T23
fleurs	Few-shot Learning Evaluation of Universal Representations of Speech	ASR/Multilingual	102 languages	https://huggingface.co/datasets/google/fleurs
freesound	Speech Command & Freesound for VAD	VAD	ENG	https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/speech_classification/datasets.html#speech-command-freesound-for-vad
fsc	Fluent Speech Commands Dataset	SLU	ENG	https://fluent.ai/fluent-speech-commands-a-dataset-for-spoken-language-understanding-research/
fsc_challenge	Fluent Speech Commands Dataset MASE Eval Challenge splits	SLU	ENG	https://github.com/maseEval/mase
fsc_unseen	Fluent Speech Commands Dataset MASE Eval Unseen splits	SLU	ENG	https://github.com/maseEval/mase
gigaspeech	GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio	ASR	ENG	https://github.com/SpeechColab/GigaSpeech
googlei18n_lowresource	Googlei18n crowdsource project	TTS	ENG	https://github.com/mirumee/google-i18n-address (most in openslr as separate entries)
grabo	Grabo dataset	SLU	ENG + NLD	https://www.esat.kuleuven.be/psi/spraak/downloads/
gramvaani	GramVaani ASR Challenge 2022	ASR	HIN	https://sites.google.com/view/gramvaaniasrchallenge/dataset
harpervalley	HarperValleyBank: A Domain-Specific Spoken Dialog Corpus	SLU	ENG	https://github.com/cricketclub/gridspace-stanford-harper-valley
hkust	HKUST/MTS: A very large scale Mandarin telephone speech corpus	ASR	CMN	https://catalog.ldc.upenn.edu/LDC2005S15
how2	How2: A Large-scale Dataset for Multimodal Language Understanding	ASR/MT/ST	ENG->POR	https://github.com/srvk/how2-dataset
how2_2000h	How2_2000h fbank features	ASR/SUM	ENG->POR	https://arxiv.org/pdf/2110.06263.pdf
hub4_spanish	1997 Spanish Broadcast News Speech	ASR	SPA	https://catalog.ldc.upenn.edu/LDC98S74
hui_acg	HUI-audio-corpus-german	TTS	DEU	https://opendata.iisys.de/datasets.html#hui-audio-corpus-german
iam	IAM Handwriting Database 3.0	OCR	ENG	https://fki.tic.heia-fr.ch/databases/iam-handwriting-database
iemocap	IEMOCAP database: The Interactive Emotional Dyadic Motion Capture database	SLU	ENG	https://sail.usc.edu/iemocap/
indic_speech	IndicSpeech: Text-to-Speech Corpus for Indian Languages	TTS	3 Indic languages	http://cvit.iiit.ac.in/research/projects/cvit-projects/text-to-speech-dataset-for-indian-languages
interspeech2024_dsu_challenge	Interspeech2024 speech processing using discrete speech unit challenge (ASR track)	ASR/Multilingual ASR	145 languages	https://www.wavlab.org/activities/2024/Interspeech2024-Discrete-Speech-Unit-Challenge/
itako	Itako Singing voice synthesis corpus	SVS	JPN	https://zunko.jp/itadev/login.php
iwslt14	IWSLT14 MT shared task	MT	DEU->ENG	http://dl.fbaipublicfiles.com/fairseq/data/iwslt14/de-en.tgz
iwslt21_low_resource	ALFFA, IARPA Babel, Gamayun, IWSLT 2021	ASR	SWA	http://www.openslr.org/25/ https://catalog.ldc.upenn.edu/LDC2017S05 https://gamayun.translatorswb.org/data/ https://iwslt.org/2021/low-resource
iwslt22_dialect	IWSLT2022 dialectal speech translation shared task	ASR/ST	ARA->Tunisian ARA	https://github.com/kevinduh/iwslt22-dialect.git
iwslt22_low_resource	IWSLT2022 Low-resource speech translation track task	ST	Tamasheq->French	https://github.com/mzboito/IWSLT2022_Tamasheq_data.git
jdcinal	Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags	SLU	JPN	http://www.lrec-conf.org/proceedings/lrec2018/pdf/464.pdf http://tts.speech.cs.cmu.edu/awb/infomation_navigation_and_attentive_listening_0.2.zip
jkac	J-KAC: Japanese Kamishibai and audiobook corpus	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/j-kac_corpus
jmd	JMD: Japanese multi-dialect corpus for speech synthesis	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/jmd_corpus
jsss	JSSS: Japanese speech corpus for summarization and simplification	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/jsss_corpus
jsut	Japanese speech corpus of Saruwatari-lab., University of Tokyo	ASR/TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/publication/jsut
jsut_song	JSUT-song corpus	SVS	JPN	https://sites.google.com/site/shinnosuketakamichi/publication/jsut-song
jtubespeech	Japanese YouTube Speech corpus	ASR/TTS	JPN
jv_openslr35	Javanese	ASR	JAV	http://www.openslr.org/35
jvs	JVS (Japanese versatile speech) corpus	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus
kathbath	Kathbath dataset	ASR	12 Indian languages	https://ai4bharat.iitm.ac.in/indic-superb
kising	KiSing-v2 Corpus (ACESinger augmented)	SVS	CMN	WIP
ksponspeech	KsponSpeech (Korean spontaneous speech) corpus	ASR	KOR	https://aihub.or.kr/aidata/105
ksc	Kazakh speech corpus	ASR	KAZ
kss	Korean single speaker corpus	TTS	KOR	https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset
l3das22	L3DAS22: Machine Learning for 3D Audio Signal Processing - ICASSP 2022	SE	ENG	https://www.l3das.com/icassp2022/
laborotv	LaboroTVSpeech (A large-scale Japanese speech corpus on TV recordings)	ASR	JPN	https://laboro.ai/column/eg-laboro-tv-corpus-jp
libriheavy_medium	Libriheavy medium subset	ASR	ENG	https://github.com/k2-fsa/libriheavy
libriheavy_small	Libriheavy small subset	ASR	ENG	https://github.com/k2-fsa/libriheavy
librilight_limited	Librilight-limited subset	ASR	ENG	https://dl.fbaipublicfiles.com/librilight/data/librispeech_finetuning.tgz
librimix	LibriMix: An Open-Source Dataset for Generalizable Speech Separation	SE/DIAR	ENG	https://github.com/JorisCos/LibriMix
librispeech	LibriSpeech ASR corpus	ASR	ENG	http://www.openslr.org/12
librispeech_100	LibriSpeech ASR corpus 100h subset	ASR	ENG	http://www.openslr.org/12
libritts	LibriTTS corpus	TTS	ENG	http://www.openslr.org/60
libritts_r	LibriTTS-R corpus	TTS	ENG	http://www.openslr.org/141
ljspeech	The LJ Speech Dataset	TTS	ENG	https://keithito.com/LJ-Speech-Dataset/
lrs2	The Oxford-BBC Lip Reading Sentences 2 (LRS2) Dataset	Lipreading/ASR	ENG	https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html
lrs3	The Oxford-BBC Lip Reading Sentences 3 (LRS3) Dataset	ASR	ENG	https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs3.html
lt_slurp_spatialized	Spatialized Libri-Trans and Spatialized SLURP (LT-S and SLURP-S), Enhancement for Translation and Understanding Dataset	SE/ST/SLU	ENG
lt_speech_commands	Lithuanian Speech Commands dataset		LIT	https://github.com/kolesov93/lt_speech_commands
m4singer	Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus	SVS	CMN	https://drive.google.com/file/d/1xC37E59EWRRFFLdG3aJkVqwtLDgtFNqW/view?usp=share_link
magicdata	MAGICDATA Mandarin Chinese Read Speech Corpus	ASR	CMN	https://www.openslr.org/68/
media	MEDIA speech database for French	SLU/Entity Classifi.	FRA	https://catalogue.elra.info/en-us/repository/browse/ELRA-S0272/
mediaspeech	MediaSpeech: Multilanguage ASR Benchmark and Dataset	ASR	FRA	https://www.openslr.org/108/
meld	MELD: Multimodal EmotionLines Dataset	SLU	ENG	https://affective-meld.github.io/
microsoft_speech	Microsoft Speech Corpus (Indian languages)	ASR	3 languages	https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e
mini_an4	Mini version of CMU AN4 database for the integration test	ASR/TTS/SE	ENG	http://www.speech.cs.cmu.edu/databases/an4/
mini_librispeech	Mini version of Librispeech corpus	DIAR	ENG	https://openslr.org/31/
misp2021	Multimodal Information Based Speech Processing (MISP) Challenge 2021	ASR/AVSR	CMN	https://mispchallenge.github.io/
ml_openslr63	Crowdsourced high-quality Malayalam multi-speaker speech data	ASR	MAL	https://openslr.org/63/
mls	MLS (A large multilingual corpus derived from LibriVox audiobooks)	ASR	8 languages	http://www.openslr.org/94/
mr_openslr64	OpenSLR Marathi Corpus	ASR	MAR	http://www.openslr.org/64/
ms_indic_is18	Microsoft Speech Corpus (Indian languages)	ASR	3 langs: TEL TAM GUJ	https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e
ml_superb	Multilingual SUPERB benchmark	ASR	145 languages	Not Released
mucs21_subtask1	MUltilingual and Code-Switching ASR Challenges for Low Resource Indian Languages	ASR	6 Indian languages	https://navana-tech.github.io/MUCS2021/challenge_details.html
mucs21_subtask2	MUltilingual and Code-Switching ASR Challenges for Low Resource Indian Languages	ASR	2 codeswitching data	https://navana-tech.github.io/MUCS2021/challenge_details.html
musdb18	Music source separation corpus	ENH	ENG	https://sigsep.github.io/datasets/musdb.html
must_c	MuST-C: a Multilingual Speech Translation Corpus	ASR/MT/ST	ENG->14langs	https://ict.fbk.eu/must-c/
must_c_v2	MuST-C: a Multilingual Speech Translation Corpus (v2)	ASR/MT/ST	ENG->DEU	https://ict.fbk.eu/must-c/
mustard	MUStARD: Multimodal Sarcasm Detection Dataset	SLU	ENG	https://github.com/soujanyaporia/MUStARD/
mustard_plus_plus	A Multimodal Corpus for Emotion Recognition in Sarcasm	SLU	ENG	https://github.com/cfiltnlp/MUStARD_Plus_Plus/
nit_song070	The NITech Japanese speech database	SVS	JPN	http://hts.sp.nitech.ac.jp/archives/2.3/HTS-demo_NIT-SONG070-F001.tar.bz2
nsc	National Speech Corpus	ASR	ENG-SG	https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus
ofuton_p_utagoe_db	Ofuton_p_utagoe Singing voice synthesis corpus	SVS	JPN	https://sites.google.com/view/oftn-utagoedb/%E3%83%9B%E3%83%BC%E3%83%A0
oniku_kurumi_utagoe_db	Oniku Singing voice synthesis corpus	SVS	JPN	http://onikuru.info/db-download/
open_li110	Corpus combination with 110 languages	Multilingual ASR	100+ languages
open_li52	Corpus combination with 52 languages (Common Voice + VoxForge)	Multilingual ASR	52 languages
opencpop	Opencpop: Mandarin singing voice synthesis corpus	SVS	CMN	https://wenet.org.cn/opencpop/
pjs	Phoneme-balanced Japanese Singing-voice corpus	SVS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/pjs_corpus
polyphone_swiss_french	Swiss French Polyphone corpus	ASR	FRA	http://catalog.elra.info/en-us/repository/browse/ELRA-S0030_02
portmedia_dom	PortMedia French corpus	SLU/Entity Classifi.	FRA	https://catalogue.elra.info/en-us/repository/browse/ELRA-S0371/
portmedia_lang	PortMedia Italian corpus	SLU/Entity Classifi.	ITA	https://catalogue.elra.info/en-us/repository/browse/ELRA-S0371/
primewords_chinese	Primewords Chinese Corpus Set 1	ASR	CMN	https://www.openslr.org/47/
puebla_nahuatl	Highland Puebla Nahuatl corpus (endangered language in central Mexico)	ASR/ST	HPN	https://www.openslr.org/92/
qasr_tts	TTS character based system using semi-supervised data selection	TTS	ARA	https://arabicspeech.org/qasr_tts
reasonspeech	ReazonSpeech: Japanese Corpus collected from TV Programs	ASR	JPN	https://research.reazon.jp/projects/ReazonSpeech/
reverb	REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge	ASR	ENG	https://reverb2014.dereverberation.com/
ru_open_stt	Russian Open Speech To Text (STT/ASR) Dataset	ASR	RUS	https://github.com/snakers4/open_stt
ruslan	RUSLAN: Russian Spoken Language Corpus For Speech Synthesis	TTS	RUS	https://ruslan-corpus.github.io/
sdsv21	SdSV 2021: Short-duration Speaker Verification (SdSV) Challenge 2021	SPK	10+ Languages	https://sdsvc.github.io/
seame	SEAME: a Mandarin-English Code-switching Speech Corpus in South-East Asia	ASR	ENG + CMN	https://catalog.ldc.upenn.edu/LDC2015S04
sinhala	Sinhala speech recognition corpus	ASR	SIN	https://drive.google.com/file/d/17_e0JhMW4_FPxfh93foplnxb4OQp8zh3/view?usp=sharing
siwis	SIWIS: Spoken Interaction with Interpretation in Switzerland	TTS	FRA	https://datashare.ed.ac.uk/handle/10283/2353
slue-voxceleb	SLUE: Spoken Language Understanding Evaluation	SLU	ENG	https://github.com/asappresearch/slue-toolkit
slue-voxpopuli	SLUE: Spoken Language Understanding Evaluation	SLU	ENG	https://github.com/asappresearch/slue-toolkit
slurp	SLURP: A Spoken Language Understanding Resource Package	SLU	ENG	https://github.com/pswietojanski/slurp
slurp_entity	SLURP: A Spoken Language Understanding Resource Package	SLU/Entity Classifi.	ENG	https://github.com/pswietojanski/slurp
slurp_spatialized	Spatialized SLURP (SLURP-S), Noisy Reverberant Spoken Language Understanding Dataset	SLU	ENG
sms_wsj	SMS-WSJ: A database for in-depth analysis of multi-channel source separation algorithms	SE	ENG	https://github.com/fgnt/sms_wsj
snips	SNIPS: A dataset for spoken language understanding	SLU	ENG	https://github.com/sonos/spoken-language-understanding-research-datasets
speechcommands	Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition	SLU	ENG	https://www.tensorflow.org/datasets/catalog/speech_commands
spgispeech	SPGISpeech 5k corpus	ASR	ENG	https://datasets.kensho.com/datasets/scribe
spring_speech	SPRING-INX: Data for Indian Languages	ASR	ENG	https://asr.iitm.ac.in/dataset
stop	STOP: Spoken Task Oriented Parsing	SLU	ENG	https://facebookresearch.github.io/spoken_task_oriented_parsing/
su_openslr36	Sundanese	ASR	SUN	http://www.openslr.org/36
swbd	Switchboard Corpus for 2-channel Conversational Telephone Speech (300h)	ASR	ENG	https://catalog.ldc.upenn.edu/LDC97S62
swbd_da	NXT Switchboard Annotations	SLU	ENG	https://catalog.ldc.upenn.edu/LDC2009T26
swbd_sentiment	Speech Sentiment Annotations	SLU	ENG	https://catalog.ldc.upenn.edu/LDC2020T14
talromur	Talromur: A large Icelandic TTS corpus	TTS	ISL	https://repository.clarin.is/repository/xmlui/handle/20.500.12537/104, https://aclanthology.org/2021.nodalida-main.50.pdf
talromur2	Talromur 2: Icelandic multi-speaker TTS corpus	TTS	ISL	https://repository.clarin.is/repository/xmlui/handle/20.500.12537/167
tedlium2	TED-LIUM corpus release 2	ASR	ENG	https://www.openslr.org/19/, http://www.lrec-conf.org/proceedings/lrec2014/pdf/1104_Paper.pdf
tedlium3	TED-LIUM corpus release 3	ASR	ENG	https://www.openslr.org/51/
tedx_spanish_openslr67	TEDx Spanish Corpus	ASR	SPA	https://www.openslr.org/67/
thchs30	A Free Chinese Speech Corpus Released by CSLT@Tsinghua University	ASR/TTS	CMN	https://www.openslr.org/18/
timit	TIMIT Acoustic-Phonetic Continuous Speech Corpus	ASR/UASR	ENG	https://catalog.ldc.upenn.edu/LDC93S1
totonac	Highland Totonac corpus (endangered language in central Mexico)	ASR	TOS	http://www.openslr.org/107/
tsukuyomi	Tsukuyomi-chan Corpus (つくよみちゃんコーパス)	TTS	JPN	https://tyc.rei-yumesaki.net/material/corpus
universal_se_v1	Combination of Multi-condition English Corpora (vctk_noisy, dns_ins20, chime4, reverb, whamr)	SE	ENG
vctk	English Multi-speaker Corpus for CSTR Voice Cloning Toolkit	ASR/TTS	ENG	http://www.udialogue.org/download/cstr-vctk-corpus.html
vctk_reverb	Reverberant speech database (48kHz)	SE	ENG	https://datashare.ed.ac.uk/handle/10283/2826
vctk_noisyreverb	Noisy reverberant speech database (48kHz)	SE	ENG	https://datashare.ed.ac.uk/handle/10283/2826
vivos	VIVOS (Vietnamese corpus for ASR)	ASR	VIE	https://doi.org/10.5281/zenodo.7068130
voices	VOiCES	ASR/SPK	ENG	https://iqtlabs.github.io/voices/
voxceleb	VoxCeleb	SPK	10+ languages	https://mm.kaist.ac.kr/datasets/voxceleb/
voxforge	VoxForge	ASR	7 languages	http://www.voxforge.org/
wenetspeech	WenetSpeech: A 10000+ Hours Multi-domain Chinese Corpus for Speech Recognition	ASR	CMN	https://wenet-e2e.github.io/WenetSpeech/
wham	The WSJ0 Hipster Ambient Mixtures (WHAM!) dataset	SE	ENG	https://wham.whisper.ai/
whamr	WHAMR!: Noisy and Reverberant Single-Channel Speech Separation	SE	ENG	https://wham.whisper.ai/
wsj	CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete	ASR	ENG	https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A
wsj0_2mix	MERL WSJ0-mix multi-speaker dataset	ASR/SE	ENG	http://www.merl.com/demos/deep-clustering
wsj0_2mix_spatialized	MERL WSJ0-mix multi-speaker dataset (Spatialized version)	ASR/Multichannel ASR/SE	ENG	http://www.merl.com/demos/deep-clustering
yesno	The "yesno" corpus	ASR	HEB	http://www.openslr.org/1
yoloxochitl_mixtec	Yoloxochitl-Mixtec corpus (endangered language in central Mexico)	ASR	XTY	http://www.openslr.org/89
zeroth_korean	Zeroth-Korean	ASR	KOR	http://www.openslr.org/40
zh_openslr38	ST-CMDS-20170001_1, Free ST Chinese Mandarin Corpus	ASR	CMN	http://www.openslr.org/38
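
The table above can also be queried programmatically. The following is a minimal sketch, assuming the rows are available as tab-separated text with the header shown above; the sample rows are copied from the table, while the function names `load_rows` and `recipes_by_task` are hypothetical helpers, not part of ESPnet. It groups directory names by task, splitting combined entries such as `ASR/TTS`.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the overview table, tab-separated like the original.
SAMPLE = (
    "Directory name\tCorpus name\tTask\tLanguage\tURL\tNote\n"
    "an4\tCMU AN4 database\tASR/TTS\tENG\thttp://www.speech.cs.cmu.edu/databases/an4/\n"
    "csj\tCorpus of Spontaneous Japanese\tASR\tJPN\thttps://pj.ninjal.ac.jp/corpus_center/csj/en/\n"
    "ljspeech\tThe LJ Speech Dataset\tTTS\tENG\thttps://keithito.com/LJ-Speech-Dataset/\n"
    "jtubespeech\tJapanese YouTube Speech corpus\tASR/TTS\tJPN\n"
)


def load_rows(text):
    """Parse the table into dicts; missing trailing cells become ''."""
    reader = csv.DictReader(
        io.StringIO(text),
        delimiter="\t",
        restval="",
        quoting=csv.QUOTE_NONE,  # corpus names may contain literal quotes
    )
    return list(reader)


def recipes_by_task(rows):
    """Map each task label to the directories that list it."""
    index = defaultdict(list)
    for row in rows:
        for task in row["Task"].split("/"):
            task = task.strip()
            if task:  # skip rows whose Task cell is empty
                index[task].append(row["Directory name"])
    return dict(index)


rows = load_rows(SAMPLE)
by_task = recipes_by_task(rows)
print(by_task["TTS"])
```

Task labels are kept exactly as written in the table, so entries such as `ASR (ALT)` or `Entity Classifi.` form their own groups unless they are normalized first.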