gruly/spraakherkenning


These scripts convert speech in audio files into text using the Kaldi open-source speech recognition system. Running them requires Kaldi (http://kaldi-asr.org/) and SoX (http://sox.sourceforge.net/) to be installed.

Before running the decoder for the first time, set KALDI_ROOT to the proper value in path.sh and set model_root in configure.sh to the (desired) location of the models. Then run configure.sh. The configure.sh script sets up the decoder, including creating FSTs from the (automatically downloaded) acoustic and language models.
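A quick check before running configure.sh can catch a missing prerequisite early. The following is a minimal sketch and not part of this repository: the function name check_prereqs is hypothetical, and it assumes that KALDI_ROOT is exported into the current shell (for example after sourcing path.sh) and that SoX is installed as the `sox` command.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight check; not one of the repository's scripts.
# Assumption: KALDI_ROOT is exported in this shell, e.g. after `. ./path.sh`.

check_prereqs() {
    local status=0

    # KALDI_ROOT must be set and point at an existing directory.
    if [ -z "${KALDI_ROOT:-}" ] || [ ! -d "$KALDI_ROOT" ]; then
        echo "KALDI_ROOT is unset or not a directory: '${KALDI_ROOT:-}'" >&2
        status=1
    fi

    # SoX must be available on the PATH (assumed command name: sox).
    if ! command -v sox >/dev/null 2>&1; then
        echo "sox not found on PATH" >&2
        status=1
    fi

    return "$status"
}
```

Calling `check_prereqs && ./configure.sh` would then run the configuration step only when both checks pass.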

The decode script is called with:

./decode.sh [options] ||

All parameters before the last one are automatically interpreted as one of the three types listed above. When the process finishes, the main results are written to /1Best.ctm. This file lists every word that was recognised in the audio, one word per line. The lines follow the standard .ctm format:

1
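Because the result file has one word per line, it is easy to post-process with standard tools. The sketch below is illustrative only and assumes the field order commonly used for CTM files: file id, channel, start time in seconds, duration in seconds, word, and an optional confidence score. The function names ctm_to_text and ctm_duration are hypothetical and do not exist in this repository.

```shell
#!/usr/bin/env bash
# Illustrative helpers for reading a CTM file; not part of the repository.
# Assumed field order: file-id channel start duration word [confidence]

# Print the recognised words (field 5) as a single space-separated line.
# Lines starting with ";;" are treated as comments and skipped.
ctm_to_text() {
    awk 'NF >= 5 && $1 !~ /^;;/ { printf "%s%s", sep, $5; sep = " " }
         END { print "" }' "$1"
}

# Print the latest word end time (start + duration) in seconds.
ctm_duration() {
    awk 'NF >= 5 && $1 !~ /^;;/ { end = $3 + $4; if (end > max) max = end }
         END { printf "%.2f\n", max }' "$1"
}
```

For example, `ctm_to_text 1Best.ctm` would print the transcript as running text, and `ctm_duration 1Best.ctm` the time at which the last recognised word ends, provided the file follows the assumed layout.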

The transcription process uses the LIUM speech diarization toolkit. This produces a directory /liumlog, which contains .seg files with information about the speaker diarization. For more information on the content of these files, see http://www-lium.univ-lemans.fr/diarization/.
