egs2 (Examples of ESPnet2)

How to use?

See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2

Overview of example information

Directory name	Corpus name	Task	Language	URL
aidatatang_200zh	Aidatatang_200zh A free Chinese Mandarin speech corpus	ASR	CMN	http://www.openslr.org/resources/62
aishell	AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus	ASR	CMN	http://www.aishelltech.com/kysjcp
aishell3	AISHELL3 Mandarin multi-speaker text-to-speech	TTS	CMN	https://www.openslr.org/93/
ami	The AMI Meeting Corpus	ASR	ENG	http://groups.inf.ed.ac.uk/ami/corpus/
an4	CMU AN4 database	ASR/TTS	ENG	http://www.speech.cs.cmu.edu/databases/an4/
babel	IARPA Babel corpus	ASR	~20 languages	https://www.iarpa.gov/index.php/research-programs/babel
bn_openslr53	Large Bengali ASR training dataset	ASR	BEN	https://openslr.org/53/
catslu	CATSLU-MAPS	SLU	CMN	https://sites.google.com/view/catslu/home
chime4	The 4th CHiME Speech Separation and Recognition Challenge	ASR/Multichannel ASR	ENG	http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/
cmu_indic	CMU INDIC	TTS	7 languages	http://festvox.org/cmu_indic/
commonvoice	The Mozilla Common Voice	ASR	13 languages	https://voice.mozilla.org/datasets
csj	Corpus of Spontaneous Japanese	ASR	JPN	https://pj.ninjal.ac.jp/corpus_center/csj/en/
csmsc	Chinese Standard Mandarin Speech Corpus	TTS	CMN	https://www.data-baker.com/open_source.html
css10	CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages	TTS	10 languages	https://github.com/Kyubyong/css10
dirha_wsj	Distant-speech Interaction for Robust Home Applications	Multichannel ASR	ENG	https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj
dns_ins20	Deep Noise Suppression Challenge – INTERSPEECH 2020	SE	7 languages + singing	https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2020/
dsing	Automatic Lyric Transcription from Karaoke Vocal Tracks (From DAMP Sing300x30x2)	ASR (ALT)	ENG singing	https://github.com/groadabike/Kaldi-Dsing-task
fisher_callhome_spanish	Fisher and CALLHOME Spanish--English Speech Translation	ASR/ST	SPA->ENG	https://catalog.ldc.upenn.edu/LDC2014T23
fsc	Fluent Speech Commands Dataset	SLU	ENG	https://fluent.ai/fluent-speech-commands-a-dataset-for-spoken-language-understanding-research/
fsc_unseen	Fluent Speech Commands Dataset MASE Eval Unseen splits	SLU	ENG	https://github.com/maseEval/mase
fsc_challenge	Fluent Speech Commands Dataset MASE Eval Challenge splits	SLU	ENG	https://github.com/maseEval/mase
gigaspeech	GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio	ASR	ENG	https://github.com/SpeechColab/GigaSpeech
grabo	Grabo dataset	SLU	ENG + NLD	https://www.esat.kuleuven.be/psi/spraak/downloads/
hkust	HKUST/MTS: A very large scale Mandarin telephone speech corpus	ASR	CMN	https://catalog.ldc.upenn.edu/LDC2005S15
hui_acg	HUI-audio-corpus-german	TTS	DEU	https://opendata.iisys.de/datasets.html#hui-audio-corpus-german
how2	How2: A Large-scale Dataset for Multimodal Language Understanding	ASR/MT/ST	ENG->POR	https://github.com/srvk/how2-dataset
iemocap	IEMOCAP database: The Interactive Emotional Dyadic Motion Capture database	SLU	ENG	https://sail.usc.edu/iemocap/
iwslt21_low_resource	ALFFA, IARPA Babel, Gamayun, IWSLT 2021	ASR	SWA	http://www.openslr.org/25/ https://catalog.ldc.upenn.edu/LDC2017S05 https://gamayun.translatorswb.org/data/ https://iwslt.org/2021/low-resource
jdcinal	Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags	SLU	JPN	http://www.lrec-conf.org/proceedings/lrec2018/pdf/464.pdf http://tts.speech.cs.cmu.edu/awb/infomation_navigation_and_attentive_listening_0.2.zip
jkac	J-KAC: Japanese Kamishibai and audiobook corpus	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/j-kac_corpus
jmd	JMD: Japanese multi-dialect corpus for speech synthesis	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/jmd_corpus
jsss	JSSS: Japanese speech corpus for summarization and simplification	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/jsss_corpus
jsut	Japanese speech corpus of Saruwatari-lab., University of Tokyo	ASR/TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/publication/jsut
jtubespeech	Japanese YouTube Speech corpus	ASR/TTS	JPN
jv_openslr35	Javanese	ASR	JAV	http://www.openslr.org/35
jvs	JVS (Japanese versatile speech) corpus	TTS	JPN	https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus
ksponspeech	KsponSpeech (Korean spontaneous speech) corpus	ASR	KOR	https://aihub.or.kr/aidata/105
kss	Korean single speaker corpus	TTS	KOR	https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset
laborotv	LaboroTVSpeech (A large-scale Japanese speech corpus on TV recordings)	ASR	JPN	https://laboro.ai/column/eg-laboro-tv-corpus-jp
librimix	LibriMix: An Open-Source Dataset for Generalizable Speech Separation	SE	ENG	https://github.com/JorisCos/LibriMix
librispeech	LibriSpeech ASR corpus	ASR	ENG	http://www.openslr.org/12
librispeech_100	LibriSpeech ASR corpus 100h subset	ASR	ENG	http://www.openslr.org/12
libritts	LibriTTS corpus	TTS	ENG	http://www.openslr.org/60
ljspeech	The LJ Speech Dataset	TTS	ENG	https://keithito.com/LJ-Speech-Dataset/
lrs3	The Oxford-BBC Lip Reading Sentences 3 (LRS3) Dataset	ASR	ENG	https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs3.html
lrs2	The Oxford-BBC Lip Reading Sentences 2 (LRS2) Dataset	Lipreading/ASR	ENG	https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html
mini_an4	Mini version of CMU AN4 database for the integration test	ASR/TTS/SE	ENG	http://www.speech.cs.cmu.edu/databases/an4/
mini_librispeech	Mini version of Librispeech corpus	DIAR	ENG	https://openslr.org/31/
mls	MLS (A large multilingual corpus derived from LibriVox audiobooks)	ASR	8 languages	http://www.openslr.org/94/
mr_openslr64	OpenSLR Marathi Corpus	ASR	MAR	http://www.openslr.org/64/
ms_indic_is18	Microsoft Speech Corpus (Indian languages)	ASR	3 langs: TEL TAM GUJ	https://msropendata.com/datasets/7230b4b1-912d-400e-be58-f84e0512985e
nsc	National Speech Corpus	ASR	ENG-SG	https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus
open_li52	Corpus combination with 52 languages (Common Voice + VoxForge)	Multilingual ASR	52 languages
polyphone_swiss_french	Swiss French Polyphone corpus	ASR	FRA	http://catalog.elra.info/en-us/repository/browse/ELRA-S0030_02
primewords_chinese	Primewords Chinese Corpus Set 1	ASR	CMN	https://www.openslr.org/47/
puebla_nahuatl	Highland Puebla Nahuatl corpus (endangered language in central Mexico)	ASR	HPN	https://www.openslr.org/92/
reverb	REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge	ASR	ENG	https://reverb2014.dereverberation.com/
ru_open_stt	Russian Open Speech To Text (STT/ASR) Dataset	ASR	RUS	https://github.com/snakers4/open_stt
ruslan	RUSLAN: Russian Spoken Language Corpus For Speech Synthesis	TTS	RUS	https://ruslan-corpus.github.io/
snips	SNIPS: A dataset for spoken language understanding	SLU	ENG	https://github.com/sonos/spoken-language-understanding-research-datasets
seame	SEAME: a Mandarin-English Code-switching Speech Corpus in South-East Asia	ASR	ENG + CMN	https://catalog.ldc.upenn.edu/LDC2015S04
siwis	SIWIS: Spoken Interaction with Interpretation in Switzerland	TTS	FRA	https://datashare.ed.ac.uk/handle/10283/2353
slue-voxceleb	SLUE: Spoken Language Understanding Evaluation	SLU	ENG	https://github.com/asappresearch/slue-toolkit
slurp	SLURP: A Spoken Language Understanding Resource Package	SLU	ENG	https://github.com/pswietojanski/slurp
slurp_entity	SLURP: A Spoken Language Understanding Resource Package	SLU/Entity Classifi.	ENG	https://github.com/pswietojanski/slurp
sms_wsj	SMS-WSJ: A database for in-depth analysis of multi-channel source separation algorithms	SE	ENG	https://github.com/fgnt/sms_wsj
speechcommands	Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition	SLU	ENG	https://www.tensorflow.org/datasets/catalog/speech_commands
spgispeech	SPGISpeech 5k corpus	ASR	ENG	https://datasets.kensho.com/datasets/scribe
su_openslr36	Sundanese	ASR	SUN	http://www.openslr.org/36
swbd	Switchboard Corpus for 2-channel Conversational Telephone Speech (300h)	ASR	ENG	https://catalog.ldc.upenn.edu/LDC97S62
swbd_da	NXT Switchboard Annotations	SLU	ENG	https://catalog.ldc.upenn.edu/LDC2009T26
swbd_sentiment	Speech Sentiment Annotations	SLU	ENG	https://catalog.ldc.upenn.edu/LDC2020T14
tedlium2	TED-LIUM corpus release 2	ASR	ENG	https://www.openslr.org/19/, http://www.lrec-conf.org/proceedings/lrec2014/pdf/1104_Paper.pdf
thchs30	A Free Chinese Speech Corpus Released by CSLT@Tsinghua University	TTS	CMN	https://www.openslr.org/18/
timit	TIMIT Acoustic-Phonetic Continuous Speech Corpus	ASR	ENG	https://catalog.ldc.upenn.edu/LDC93S1
totonac	Highland Totonac corpus (endangered language in central Mexico)	ASR	TOS	http://www.openslr.org/107/
tsukuyomi	つくよみちゃんコーパス	TTS	JPN	https://tyc.rei-yumesaki.net/material/corpus
vctk	English Multi-speaker Corpus for CSTR Voice Cloning Toolkit	ASR/TTS	ENG	http://www.udialogue.org/download/cstr-vctk-corpus.html
vctk_noisyreverb	Noisy reverberant speech database (48kHz)	SE	ENG	https://datashare.ed.ac.uk/handle/10283/2826
vivos	VIVOS (Vietnamese corpus for ASR)	ASR	VIE	https://ailab.hcmus.edu.vn/vivos/
voxforge	VoxForge	ASR	7 languages	http://www.voxforge.org/
wenetspeech	WenetSpeech: A 10000+ Hours Multi-domain Chinese Corpus for Speech Recognition	ASR	CMN	https://wenet-e2e.github.io/WenetSpeech/
wham	The WSJ0 Hipster Ambient Mixtures (WHAM!) dataset	SE	ENG	https://wham.whisper.ai/
whamr	WHAMR!: Noisy and Reverberant Single-Channel Speech Separation	SE	ENG	https://wham.whisper.ai/
wsj	CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete	ASR	ENG	https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A
wsj0_2mix	MERL WSJ0-mix multi-speaker dataset	ASR/SE	ENG	http://www.merl.com/demos/deep-clustering
wsj0_2mix_spatialized	MERL WSJ0-mix multi-speaker dataset (Spatialized version)	ASR/Multichannel ASR/SE	ENG	http://www.merl.com/demos/deep-clustering
yesno	The "yesno" corpus	ASR	HEB	http://www.openslr.org/1
yoloxochitl_mixtec	Yoloxochitl-Mixtec corpus (endangered language in central Mexico)	ASR	XTY	http://www.openslr.org/89
zeroth_korean	Zeroth-Korean	ASR	KOR	http://www.openslr.org/40
zh_openslr38	ST-CMDS-20170001_1, Free ST Chinese Mandarin Corpus	ASR	CMN	http://www.openslr.org/38
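
The table above is tab-separated, so it can be filtered by task with a few lines of Python. The following is a minimal sketch, not part of ESPnet: the helper names `load_recipes` and `recipes_for_task` are hypothetical, and `TABLE` holds only a handful of rows copied from the table for illustration. It assumes that the Task column separates multiple tasks with `/` and that the URL column may be missing.

```python
import csv
import io

# A few rows copied from the table above (tab-separated; URL may be absent).
TABLE = (
    "Directory name\tCorpus name\tTask\tLanguage\tURL\n"
    "an4\tCMU AN4 database\tASR/TTS\tENG\thttp://www.speech.cs.cmu.edu/databases/an4/\n"
    "jtubespeech\tJapanese YouTube Speech corpus\tASR/TTS\tJPN\n"
    "librispeech\tLibriSpeech ASR corpus\tASR\tENG\thttp://www.openslr.org/12\n"
    "ljspeech\tThe LJ Speech Dataset\tTTS\tENG\thttps://keithito.com/LJ-Speech-Dataset/\n"
)


def load_recipes(text):
    """Parse the tab-separated table into a list of dicts keyed by column name."""
    reader = csv.DictReader(
        io.StringIO(text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # corpus names may contain literal quotes
        restval="",              # rows without a URL get an empty string
    )
    return list(reader)


def recipes_for_task(recipes, task):
    """Return directory names whose Task column lists `task` exactly."""
    return [
        row["Directory name"]
        for row in recipes
        if task in row["Task"].split("/")
    ]


recipes = load_recipes(TABLE)
print(recipes_for_task(recipes, "TTS"))
print(recipes_for_task(recipes, "ASR"))
```

Matching is exact per `/`-separated entry, so a Task value such as `ASR (ALT)` would not match `ASR`; loosen the comparison if such rows should be included.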
