GitHub - gruly/spraakherkenning

These scripts may be used to convert speech contained in audio files into text using the Kaldi open-source speech recognition system. Running them requires installation of Kaldi (http://kaldi-asr.org/) and SoX (http://sox.sourceforge.net/).

Before running the decoder for the first time, set KALDI_ROOT to the proper value in path.sh and set model_root in configure.sh to the desired location of the models, then run configure.sh. The configure.sh script sets up the decoder, including creating FSTs from the automatically downloaded acoustic and language models.
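
The prerequisites above can be checked before running configure.sh. The following is a minimal sketch, not part of the repository: the function name check_prereqs and the example Kaldi path are hypothetical, and it only verifies that a directory exists and that a tool (sox by default) is on the PATH.

```shell
# check_prereqs KALDI_DIR [TOOL]
# Hypothetical helper: succeeds only if KALDI_DIR is an existing directory
# and TOOL (default: sox) can be found on the PATH.
check_prereqs() {
  kaldi_dir="$1"
  tool="${2:-sox}"
  if [ ! -d "$kaldi_dir" ]; then
    echo "Kaldi directory not found: $kaldi_dir" >&2
    return 1
  fi
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "Required tool not found on PATH: $tool" >&2
    return 1
  fi
  return 0
}

# Example use (the path is an assumption, adjust it to your installation):
#   check_prereqs /opt/kaldi && ./configure.sh
```

Chaining the check with && means configure.sh only starts when both prerequisites are present.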

The decode script is called with:

./decode.sh [options] ||

All parameters before the last one are automatically interpreted as one of the three types listed above. When the process is done, the main results are written to the file 1Best.ctm in the output directory. This file contains a list of all words that were recognised in the audio, with one word per line. The lines follow the standard .ctm format:

<source file> 1 <start time> <duration> <word> <confidence>
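
As an illustration of working with the results, the sketch below reads .ctm lines and extracts the words. It assumes the common CTM field order (source file, channel, start time, duration, word, and an optional confidence score), separated by whitespace; the function names are hypothetical and not part of this repository.

```shell
# ctm_to_text: read CTM lines on stdin and print all words on a single line.
# Assumes the word is the fifth whitespace-separated field.
ctm_to_text() {
  awk 'NF >= 5 { printf "%s%s", sep, $5; sep = " " }
       END { if (sep != "") print "" }'
}

# ctm_low_confidence THRESHOLD: read CTM lines on stdin and print
# "start word confidence" for every word scored below THRESHOLD.
# Assumes the confidence is the sixth field; lines without one are skipped.
ctm_low_confidence() {
  awk -v t="$1" 'NF >= 6 && $6 + 0 < t + 0 { print $3, $5, $6 }'
}

# Example use (the output directory name is an assumption):
#   ctm_to_text < output/1Best.ctm
#   ctm_low_confidence 0.5 < output/1Best.ctm
```

The first function gives a plain transcript, the second a quick list of words that may be worth checking by hand.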

As part of the transcription process, the LIUM speech diarization toolkit is used. This produces a directory liumlog in the output directory, which contains .seg files that provide information about the speaker diarization. For more information on the content of these files, please visit http://www-lium.univ-lemans.fr/diarization/.
