Korean Online Speech Recognition

This repository implements a Transformer Transducer and provides end-to-end training on the 1,000-hour KsponSpeech dataset. The KsponSpeech dataset was processed by referring to here.

Preparation

You can download the dataset from AI-Hub. Before getting started, prepare the directory structure as shown below. Text normalization follows the KsponSpeech recipe from ESPnet, and the normalized transcripts are provided as .trn files.

root
└─ KsponSpeech_01
└─ KsponSpeech_02
└─ KsponSpeech_03
└─ KsponSpeech_04
└─ KsponSpeech_05
└─ KsponSpeech_eval
└─ scripts
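
The layout above can be checked before training with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and the directory names are copied from the tree above.

```python
from pathlib import Path

# Directory names copied from the layout shown above.
EXPECTED_DIRS = [
    "KsponSpeech_01",
    "KsponSpeech_02",
    "KsponSpeech_03",
    "KsponSpeech_04",
    "KsponSpeech_05",
    "KsponSpeech_eval",
    "scripts",
]


def missing_dirs(root):
    """Return the expected sub-directories that are absent under `root`.

    Hypothetical helper for illustration; the repository may validate
    the dataset differently, or not at all.
    """
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_dirs("/path/to/root")` returns an empty list when every directory is present, and otherwise the names that still need to be downloaded or extracted.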

Environment

Warp-transducer requires gcc-5 and g++-5, and the CUDA environment variables must be exported before building it.

CUDA_HOME settings

export CUDA_HOME=$HOME/tools/cuda-9.0 # change to your path
export CUDA_TOOLKIT_ROOT_DIR=$CUDA_HOME
export LD_LIBRARY_PATH="$CUDA_HOME/extras/CUPTI/lib64:$LD_LIBRARY_PATH"
export LIBRARY_PATH=$CUDA_HOME/lib64:$LIBRARY_PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
export CFLAGS="-I$CUDA_HOME/include $CFLAGS"
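
Because a missing export usually shows up only as a build failure, it can help to verify the environment first. The following is a rough sketch under the assumption that checking `CUDA_HOME` and the `lib64` entry in `LD_LIBRARY_PATH` is enough; the function name `cuda_env_problems` is hypothetical and not part of warp-transducer or this repository.

```python
import os


def cuda_env_problems(env=None):
    """Return a list of human-readable problems with the CUDA variables.

    Only mirrors the exports shown above: CUDA_HOME must be set and
    $CUDA_HOME/lib64 must appear in LD_LIBRARY_PATH.
    """
    env = os.environ if env is None else env
    problems = []

    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
        return problems

    # Same string the export line builds: $CUDA_HOME/lib64
    lib64 = f"{cuda_home}/lib64"
    ld_paths = env.get("LD_LIBRARY_PATH", "").split(":")
    if lib64 not in ld_paths:
        problems.append(f"{lib64} is missing from LD_LIBRARY_PATH")

    return problems
```

An empty list means the two checked variables look consistent; it does not prove that the CUDA toolkit itself is installed at that path.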

Install gcc-5 and g++-5, then update alternatives

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-5 g++-5
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 1

Python >= 3.6, PyTorch >= 1.7.0, torchaudio >= 0.7.0

pip install torch==1.7.0+cu101 torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

Usage

Before training, you should already have the AI-Hub dataset. Check the configuration in the conf directory and set the batch size to fit your GPU environment. To use a custom configuration, pass the conf option (default: config/ksponspeech_transducer_base.yaml).

python train.py [--conf config-path]

The checkpoint directory is created automatically after training, and the saved models can be found there. To resume training from a saved model, use the continue_from option.

python train.py --conf model-configuration --continue_from saved-model-path
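
To make the two options concrete, here is a minimal sketch of how such a command line could be parsed. It is not the repository's actual train.py: only the flag names `--conf` and `--continue_from` and the default path come from this README, while `build_parser` and `describe_run` are hypothetical names used for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the two options described in this README."""
    parser = argparse.ArgumentParser(
        description="Sketch of the training command line (illustrative only)"
    )
    parser.add_argument(
        "--conf",
        default="config/ksponspeech_transducer_base.yaml",
        help="path to the model configuration file",
    )
    parser.add_argument(
        "--continue_from",
        default=None,
        help="path to a saved model to resume training from",
    )
    return parser


def describe_run(argv):
    """Summarize what a run with the given arguments would do."""
    args = build_parser().parse_args(argv)
    if args.continue_from:
        return f"resume from {args.continue_from} using {args.conf}"
    return f"train from scratch using {args.conf}"
```

With no arguments the sketch falls back to the default configuration; passing `--continue_from` switches it to the resume path.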

Results

Train Epoch	Model	CER	WER	Preprocessing
3	Transformer	22%	38%	Filter Bank + SpecAugment
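
For readers unfamiliar with the two metrics in the table, CER and WER are edit distances normalized by the reference length, counted over characters and words respectively. The sketch below shows one common way to compute them; it is an assumption that spaces are dropped before computing CER, and the repository's own scoring may differ. All function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using two DP rows."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate; spaces are removed first (an assumption)."""
    ref_chars = list(ref.replace(" ", ""))
    hyp_chars = list(hyp.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / max(len(ref_chars), 1)


def wer(ref, hyp):
    """Word error rate over whitespace-separated tokens."""
    ref_words = ref.split()
    hyp_words = hyp.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)
```

Both functions return a fraction, so a value of 0.22 corresponds to the 22% format used in the table.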

Author

Email: [email protected]
