egs2 (Examples of ESPnet2)

How to use?

See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2
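As a minimal sketch of how a recipe is typically located, the helper below builds a recipe path from a "Directory name" entry in the table and a task directory. The `recipe_dir` function is hypothetical, and the `egs2/<directory name>/<task>` layout with a `run.sh` entry script is an assumption based on the usual ESPnet2 recipe convention; check the linked tutorial for the authoritative steps.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESPnet): print the assumed path of a recipe,
# given a corpus directory name from the table and a task directory
# such as asr1, tts1, or enh1.
recipe_dir() {
    corpus="$1"
    task="$2"
    printf 'egs2/%s/%s\n' "${corpus}" "${task}"
}

# Assumed usage from the ESPnet repository root (left as a comment so that
# nothing is downloaded or trained by this sketch):
#   cd "$(recipe_dir mini_an4 asr1)" && ./run.sh
recipe_dir mini_an4 asr1
```

The sketch only prints the path; running the recipe itself requires an ESPnet installation and, for most corpora, access to the data listed under URL.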

Overview of example information

Directory name	Corpus name	Task	Language	URL
aishell	AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus	ASR	ZH	http://www.aishelltech.com/kysjcp
ami	The AMI Meeting Corpus	ASR	EN	http://groups.inf.ed.ac.uk/ami/corpus/
an4	CMU AN4 database	ASR/TTS	EN	http://www.speech.cs.cmu.edu/databases/an4/
babel	IARPA Babel corpus	ASR	~20 Languages	https://www.iarpa.gov/index.php/research-programs/babel
chime4	The 4th CHiME Speech Separation and Recognition Challenge	ASR/Multichannel ASR	EN	http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/
commonvoice	The Mozilla Common Voice	ASR	13 Languages	https://voice.mozilla.org/datasets
csj	Corpus of Spontaneous Japanese	ASR	JP	https://pj.ninjal.ac.jp/corpus_center/csj/en/
csmsc	Chinese Standard Mandarin Speech Corpus	TTS	ZH	https://www.data-baker.com/open_source.html
dirha_wsj	Distant-speech Interaction for Robust Home Applications	Multi-Array ASR	EN	https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj
dns_ins20	Deep Noise Suppression Challenge – INTERSPEECH 2020	SE	7 Languages + singing	https://www.microsoft.com/en-us/research/academic-program/deep-noise-suppression-challenge-interspeech-2020/
gigaspeech	GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio	ASR	EN	https://github.com/SpeechColab/GigaSpeech
hkust	HKUST/MTS: A very large scale Mandarin telephone speech corpus	ASR	ZH	https://catalog.ldc.upenn.edu/LDC2005S15
how2	How2: A Large-scale Dataset for Multimodal Language Understanding	ASR/Machine Translation/Speech Translation	EN->PT	https://github.com/srvk/how2-dataset
jsss	JSSS: Japanese speech corpus for summarization and simplification	TTS	JP	https://sites.google.com/site/shinnosuketakamichi/research-topics/jsss_corpus
jsut	Japanese speech corpus of Saruwatari-lab., University of Tokyo	ASR/TTS	JP	https://sites.google.com/site/shinnosuketakamichi/publication/jsut
jv_openslr35	Javanese	ASR	JV	http://www.openslr.org/35
jvs	JVS (Japanese versatile speech) corpus	TTS	JP	https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus
laborotv	LaboroTVSpeech (A large-scale Japanese speech corpus on TV recordings)	ASR	JP	https://laboro.ai/column/eg-laboro-tv-corpus-jp
librimix	LibriMix: An Open-Source Dataset for Generalizable Speech Separation	SE	EN	https://github.com/JorisCos/LibriMix
librispeech	LibriSpeech ASR corpus	ASR	EN	http://www.openslr.org/12
libritts	LibriTTS corpus	TTS	EN	http://www.openslr.org/60
ljspeech	The LJ Speech Dataset	TTS	EN	https://keithito.com/LJ-Speech-Dataset/
lrs2	The Oxford-BBC Lip Reading Sentences 2 (LRS2) Dataset	Lipreading/ASR	EN	https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html
mini_an4	Mini version of CMU AN4 database for the integration test	ASR/TTS/SE	EN	http://www.speech.cs.cmu.edu/databases/an4/
mini_librispeech	Mini version of Librispeech corpus	DIAR	EN	https://openslr.org/31/
mls	MLS (A large multilingual corpus derived from LibriVox audiobooks)	ASR	8 languages	http://www.openslr.org/94/
nsc	National Speech Corpus	ASR	EN-SG	https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus
open_li52	Corpus combination with 52 languages (Common Voice + VoxForge)	Multilingual ASR	52 languages
polyphone_swiss_french	Swiss French Polyphone corpus	ASR	FR	http://catalog.elra.info/en-us/repository/browse/ELRA-S0030_02
puebla_nahuatl	Highland Puebla Nahuatl corpus	ASR	HPN	https://www.openslr.org/92/
reverb	REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge	ASR	EN	https://reverb2014.dereverberation.com/
ru_open_stt	Russian Open Speech To Text (STT/ASR) Dataset	ASR	RU	https://github.com/snakers4/open_stt
sms_wsj	SMS-WSJ: A database for in-depth analysis of multi-channel source separation algorithms	SE	EN	https://github.com/fgnt/sms_wsj
spgispeech	SPGISpeech 5k corpus	ASR	EN	https://datasets.kensho.com/datasets/scribe
su_openslr36	Sundanese	ASR	SU	http://www.openslr.org/36
swbd	Switchboard Corpus for 2-channel Conversational Telephone Speech (300h)	ASR	EN	https://catalog.ldc.upenn.edu/LDC97S62
timit	TIMIT Acoustic-Phonetic Continuous Speech Corpus	ASR	EN	https://catalog.ldc.upenn.edu/LDC93S1
vctk	English Multi-speaker Corpus for CSTR Voice Cloning Toolkit	TTS	EN	http://www.udialogue.org/download/cstr-vctk-corpus.html
vctk_noisyreverb	Noisy reverberant speech database (48kHz)	SE	EN	https://datashare.ed.ac.uk/handle/10283/2826
vivos	VIVOS (Vietnamese corpus for ASR)	ASR	VI	https://ailab.hcmus.edu.vn/vivos/
voxforge	VoxForge	ASR	7 languages	http://www.voxforge.org/
wham	The WSJ0 Hipster Ambient Mixtures (WHAM!) dataset	SE	EN	https://wham.whisper.ai/
whamr	WHAMR!: Noisy and Reverberant Single-Channel Speech Separation	SE	EN	https://wham.whisper.ai/
wsj	CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete	ASR	EN	https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A
wsj0_2mix	MERL WSJ0-mix multi-speaker dataset	ASR/SE	EN	http://www.merl.com/demos/deep-clustering
wsj0_2mix_spatialized	MERL WSJ0-mix multi-speaker dataset (Spatialized version)	ASR/Multichannel ASR/SE	EN	http://www.merl.com/demos/deep-clustering
yesno	The "yesno" corpus	ASR	HE	http://www.openslr.org/1
zeroth_korean	Zeroth-Korean	ASR	KR	http://www.openslr.org/40
