labspire/NLTM-Spire

The Signal Processing Interpretation and REpresentation (SPIRE) Lab is led by Prof. Prasanta in the Department of Electrical Engineering at the Indian Institute of Science (IISc), Bangalore.
Research at SPIRE Lab is in the area of human-centered data processing, particularly speech, audio, and language processing in Indian languages, with applications in education and healthcare.
SPIRE Lab's contribution to NLTM includes the development of resources and models for speech recognition in three low-resource languages: Maithili, Konkani, and Santali.

Demo

Our research at SPIRE Lab centers on developing ASR techniques for languages with limited resources.
Typically, these languages lack the annotated text, speech data, and other linguistic resources needed to train effective models, and their speaker populations are small.
Our work focuses primarily on three low-resource languages: Maithili, Santali, and Konkani.
We use techniques such as self-supervised learning (SSL) fine-tuning, adapters, and transfer learning.
Check out our demos:
Automatic Speech Recognition for Low-Resource Indic Languages
Automatic Speech Recognition for low-Resource Indic Languages
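The adapter technique mentioned above inserts a small trainable bottleneck module into a frozen pretrained network, so only a tiny fraction of parameters is updated per language. The following is a minimal NumPy sketch of the idea, not the lab's actual implementation; the dimensions and zero-initialization convention are assumptions for illustration.

```python
import numpy as np

def adapter(x, w_down, w_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add.

    Only w_down and w_up are trained; the surrounding pretrained
    transformer weights stay frozen.
    """
    h = np.maximum(x @ w_down, 0.0)  # down-projection + ReLU
    return x + h @ w_up              # up-projection + residual connection

# Hypothetical sizes: model dimension 768, bottleneck dimension 64
rng = np.random.default_rng(0)
d_model, d_bottleneck = 768, 64
w_down = rng.normal(scale=0.02, size=(d_model, d_bottleneck))
w_up = np.zeros((d_bottleneck, d_model))  # zero-init: adapter starts as identity

x = rng.normal(size=(10, d_model))  # a batch of 10 frame embeddings
y = adapter(x, w_down, w_up)
```

With `w_up` initialized to zero, the adapter is an exact identity map before training, so inserting it does not perturb the pretrained model's behavior.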

Models

The WERs reported below are computed without the use of any language model.

| Model | Pre-training data | Fine-tuning data | Model Links | WER (test-RESPIN) |
| --- | --- | --- | --- | --- |
| IndicWav2Vec Base | --- | Maithili | fairseq | 18.95 |
  • fine-tuning procedures can be found here.
  • Inference procedures can be found here.
  • Single-file inference procedures can be found here.
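The WER reported in the table above is the word-level edit distance between the reference and hypothesis transcripts, divided by the number of reference words. A minimal sketch of that computation (the repo's metrics.py may compute it differently, e.g. with batching or normalization):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat", "the cat sat"))  # 0.0
print(wer("the cat sat", "the cat"))      # one deletion out of 3 words
```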

Directory Structure

NLTM-SPIRE
├── configs
│   ├── indic.yaml
│   └── spring.yaml
├── data
│   ├── examples
│   └── mt
├── models
│   ├── finetuned
│   │   └── indic_finetuned
│   └── pretrained
├── recipes
│   ├── Training
│   │   └── train.sh
│   ├── Inference
│   │   └── infer.sh
│   ├── Single_File_infer
│   │   ├── infer.py
│   │   └── config.yaml
│   └── fairseq_preprocessing
│       ├── data_prep.py
│       ├── metrics.py
│       └── run_data_prep.sh
├── requirements.txt
└── README.md

Requirements and Installation

  • Create a new conda environment:
conda create -n env_name python=3.8
conda activate env_name
  • Python version >= 3.8
  • PyTorch version >= 2.0.0
  • Fairseq version >= 0.12.2
  • CUDA >= 11.8
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • To install requirements:
pip install -r requirements.txt
  • To install fairseq and develop locally:
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
  • For faster training install NVIDIA's apex library:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
pip install flashlight-text

git clone https://github.com/flashlight/sequence && cd sequence
pip install .
  • To install parse_options:
wget https://raw.githubusercontent.com/kaldi-asr/kaldi/master/egs/wsj/s5/utils/parse_options.sh && sudo mv parse_options.sh /usr/local/bin/

Reference Code

  1. Facebook AI Research Sequence-to-Sequence Toolkit written in Python. fairseq
  2. AI4Bharat IndicWav2Vec
