egs2 (Examples of ESPnet2)

How to use?

See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2

Overview of example information

Directory name	Corpus name	Task	Language	URL
aishell	AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus	ASR	ZH	http://www.aishelltech.com/kysjcp
ami	The AMI Meeting Corpus	ASR	EN	http://groups.inf.ed.ac.uk/ami/corpus/
an4	CMU AN4 database	ASR/TTS	EN	http://www.speech.cs.cmu.edu/databases/an4/
babel	IARPA Babel corpus	ASR	~20 languages	https://www.iarpa.gov/index.php/research-programs/babel
chime4	The 4th CHiME Speech Separation and Recognition Challenge	ASR/Multichannel ASR	EN	http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/
cmu_indic	CMU INDIC	TTS	7 languages	http://festvox.org/cmu_indic/
commonvoice	The Mozilla Common Voice	ASR	13 languages	https://voice.mozilla.org/datasets
csj	Corpus of Spontaneous Japanese	ASR	JP	https://pj.ninjal.ac.jp/corpus_center/csj/en/
csmsc	Chinese Standard Mandarin Speech Corpus	TTS	ZH	https://www.data-baker.com/open_source.html
css10	CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages	TTS	10 languages	https://github.com/Kyubyong/css10
dirha_wsj	Distant-speech Interaction for Robust Home Applications	Multichannel ASR	EN	https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj
dns_ins20	Deep Noise Suppression Challenge – INTERSPEECH 2020	SE	7 languages + singing	https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2020/
fsc	Fluent Speech Commands Dataset	SLU	EN	https://fluent.ai/fluent-speech-commands-a-dataset-for-spoken-language-understanding-research/
gigaspeech	GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio	ASR	EN	https://github.com/SpeechColab/GigaSpeech
hkust	HKUST/MTS: A very large scale Mandarin telephone speech corpus	ASR	ZH	https://catalog.ldc.upenn.edu/LDC2005S15
hui_acg	HUI-audio-corpus-german	TTS	DE	https://opendata.iisys.de/datasets.html#hui-audio-corpus-german
how2	How2: A Large-scale Dataset for Multimodal Language Understanding	ASR/MT/ST	EN->PT	https://github.com/srvk/how2-dataset
iwslt21_low_resource	ALFFA, IARPA Babel, Gamayun, IWSLT 2021	ASR	SW	http://www.openslr.org/25/ https://catalog.ldc.upenn.edu/LDC2017S05 https://gamayun.translatorswb.org/data/ https://iwslt.org/2021/low-resource
jkac	J-KAC: Japanese Kamishibai and audiobook corpus	TTS	JP	https://sites.google.com/site/shinnosuketakamichi/research-topics/j-kac_corpus
jmd	JMD: Japanese multi-dialect corpus for speech synthesis	TTS	JP	https://sites.google.com/site/shinnosuketakamichi/research-topics/jmd_corpus
jsss	JSSS: Japanese speech corpus for summarization and simplification	TTS	JP	https://sites.google.com/site/shinnosuketakamichi/research-topics/jsss_corpus
jsut	Japanese speech corpus of Saruwatari-lab., University of Tokyo	ASR/TTS	JP	https://sites.google.com/site/shinnosuketakamichi/publication/jsut
jtubespeech	Japanese YouTube Speech corpus	ASR/TTS	JP
jv_openslr35	Javanese	ASR	JV	http://www.openslr.org/35
jvs	JVS (Japanese versatile speech) corpus	TTS	JP	https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus
ksponspeech	KsponSpeech (Korean spontaneous speech) corpus	ASR	KR	https://aihub.or.kr/aidata/105
kss	Korean single speaker corpus	TTS	KO	https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset
laborotv	LaboroTVSpeech (A large-scale Japanese speech corpus on TV recordings)	ASR	JP	https://laboro.ai/column/eg-laboro-tv-corpus-jp
librimix	LibriMix: An Open-Source Dataset for Generalizable Speech Separation	SE	EN	https://github.com/JorisCos/LibriMix
librispeech	LibriSpeech ASR corpus	ASR	EN	http://www.openslr.org/12
libritts	LibriTTS corpus	TTS	EN	http://www.openslr.org/60
ljspeech	The LJ Speech Dataset	TTS	EN	https://keithito.com/LJ-Speech-Dataset/
lrs2	The Oxford-BBC Lip Reading Sentences 2 (LRS2) Dataset	Lipreading/ASR	EN	https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html
mini_an4	Mini version of CMU AN4 database for the integration test	ASR/TTS/SE	EN	http://www.speech.cs.cmu.edu/databases/an4/
mini_librispeech	Mini version of LibriSpeech corpus	DIAR	EN	https://openslr.org/31/
mls	MLS (A large multilingual corpus derived from LibriVox audiobooks)	ASR	8 languages	http://www.openslr.org/94/
nsc	National Speech Corpus	ASR	EN-SG	https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus
open_li52	Corpus combination with 52 languages (Commonvoice + VoxForge)	Multilingual ASR	52 languages
polyphone_swiss_french	Swiss French Polyphone corpus	ASR	FR	http://catalog.elra.info/en-us/repository/browse/ELRA-S0030_02
puebla_nahuatl	Highland Puebla Nahuatl corpus	ASR	HPN	https://www.openslr.org/92/
reverb	REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge	ASR	EN	https://reverb2014.dereverberation.com/
ru_open_stt	Russian Open Speech To Text (STT/ASR) Dataset	ASR	RU	https://github.com/snakers4/open_stt
ruslan	RUSLAN: Russian Spoken Language Corpus For Speech Synthesis	TTS	RU	https://ruslan-corpus.github.io/
snips	SNIPS: A dataset for spoken language understanding	SLU	EN	https://github.com/sonos/spoken-language-understanding-research-datasets
siwis	SIWIS: Spoken Interaction with Interpretation in Switzerland	TTS	FR	https://datashare.ed.ac.uk/handle/10283/2353
sms_wsj	SMS-WSJ: A database for in-depth analysis of multi-channel source separation algorithms	SE	EN	https://github.com/fgnt/sms_wsj
spgispeech	SPGISpeech 5k corpus	ASR	EN	https://datasets.kensho.com/datasets/scribe
su_openslr36	Sundanese	ASR	SU	http://www.openslr.org/36
swbd	Switchboard Corpus for 2-channel Conversational Telephone Speech (300h)	ASR	EN	https://catalog.ldc.upenn.edu/LDC97S62
swbd_da	NXT Switchboard Annotations	SLU	EN	https://catalog.ldc.upenn.edu/LDC2009T26
timit	TIMIT Acoustic-Phonetic Continuous Speech Corpus	ASR	EN	https://catalog.ldc.upenn.edu/LDC93S1
tsukuyomi	つくよみちゃんコーパス (Tsukuyomi-chan Corpus)	TTS	JP	https://tyc.rei-yumesaki.net/material/corpus
vctk	English Multi-speaker Corpus for CSTR Voice Cloning Toolkit	TTS	EN	http://www.udialogue.org/download/cstr-vctk-corpus.html
vctk_noisyreverb	Noisy reverberant speech database (48kHz)	SE	EN	https://datashare.ed.ac.uk/handle/10283/2826
vivos	VIVOS (Vietnamese corpus for ASR)	ASR	VI	https://ailab.hcmus.edu.vn/vivos/
voxforge	VoxForge	ASR	7 languages	http://www.voxforge.org/
wham	The WSJ0 Hipster Ambient Mixtures (WHAM!) dataset	SE	EN	https://wham.whisper.ai/
whamr	WHAMR!: Noisy and Reverberant Single-Channel Speech Separation	SE	EN	https://wham.whisper.ai/
wsj	CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete	ASR	EN	https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A
wsj0_2mix	MERL WSJ0-mix multi-speaker dataset	ASR/SE	EN	http://www.merl.com/demos/deep-clustering
wsj0_2mix_spatialized	MERL WSJ0-mix multi-speaker dataset (Spatialized version)	ASR/Multichannel ASR/SE	EN	http://www.merl.com/demos/deep-clustering
yesno	The "yesno" corpus	ASR	HE	http://www.openslr.org/1
zeroth_korean	Zeroth-Korean	ASR	KR	http://www.openslr.org/40
