The Signal Processing Interpretation and REpresentation (SPIRE) Lab, led by Prof. Prasanta, is part of the Department of Electrical Engineering at the Indian Institute of Science (IISc), Bangalore.
Research at SPIRE Lab is in the area of human-centered data processing, particularly speech, audio, and language processing in Indian languages, with applications in education and healthcare.
SPIRE Lab's contributions to NLTM include the development of resources and models for speech recognition in three low-resource languages: Maithili, Konkani, and Santali.
Our research at SPIRE Lab centers on developing ASR techniques for languages with limited resources.
These languages typically lack the annotated text, speech data, and other linguistic resources needed to train effective models, and their pools of speakers are comparatively small.
Our work focuses primarily on three low-resource languages: Maithili, Santali, and Konkani.
We use techniques such as self-supervised learning (SSL) fine-tuning, adapters, and transfer learning.
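To illustrate the adapter idea, the sketch below inserts a small bottleneck module with a residual connection after a frozen pretrained encoder layer, so that only the adapter's parameters are trained. This is an illustrative sketch only, not the lab's exact recipe: the stand-in encoder layer, dimensions, and placement are assumptions.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Residual bottleneck adapter: down-project, nonlinearity, up-project.

    The pretrained SSL encoder stays frozen; only these few parameters are
    updated for the new language. (Illustrative; not the lab's exact setup.)
    """
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the frozen layer's output
        return x + self.up(self.act(self.down(x)))

# A stand-in for one frozen layer of a pretrained wav2vec2-style encoder
encoder = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
for p in encoder.parameters():
    p.requires_grad = False

adapter = BottleneckAdapter(dim=768)
x = torch.randn(2, 50, 768)                 # (batch, frames, features)
out = adapter(encoder(x))
trainable = sum(p.numel() for p in adapter.parameters())
```

The appeal for low-resource settings is the parameter count: here only ~99k adapter weights are trained per layer, versus ~95M for a full wav2vec2 Base fine-tune.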
Check out our demos:
Automatic Speech Recognition for Low-Resource Indic Languages
The WERs specified are without the use of any language model.
| Model | Pre-training data | Fine-tuning data | Model Links | WER (test-RESPIN) |
|---|---|---|---|---|
| IndicWav2Vec Base | --- | Maithili | fairseq | 18.95 |
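The WER column above can be reproduced with a plain word-level Levenshtein distance (substitutions + deletions + insertions over reference words). Below is a minimal, dependency-free sketch of the metric; it is not the repo's `metrics.py`.

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / len(ref)."""
    r, h = ref.split(), hyp.split()
    # Edit-distance DP table over words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(r)][len(h)] / len(r)

# One inserted word against a 3-word reference -> 33.33% WER
print(round(100 * wer("the cat sat", "the cat sat down"), 2))  # 33.33
```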
- Fine-tuning procedures can be found here.
- Inference procedures can be found here.
- Single-file inference procedures can be found here.
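Single-file inference with a CTC-trained acoustic model ends with a decoding step: take the per-frame argmax token IDs, collapse consecutive repeats, and drop blanks. The sketch below shows only that greedy-decoding step (the toy vocabulary and frame sequence are assumptions, not outputs of the released models).

```python
def ctc_greedy_decode(frame_ids, blank=0, id2char=None):
    """CTC greedy decoding: collapse repeated frame predictions, drop blanks.

    `frame_ids` is the per-frame argmax of the acoustic model's logits.
    """
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank:
            out.append(t)
        prev = t
    if id2char is None:
        return out
    return "".join(id2char[i] for i in out)

# Toy vocabulary for illustration; frames read: c c <blank> a a <blank> <blank> t
vocab = {1: "c", 2: "a", 3: "t"}
print(ctc_greedy_decode([1, 1, 0, 2, 2, 0, 0, 3], id2char=vocab))  # cat
```

Note that the blank token is what lets CTC emit genuinely doubled letters: "cat" with a repeated frame collapses to one "c", whereas a blank between two identical tokens keeps both.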
```
NLTM-SPIRE
├── configs
│   ├── indic.yaml
│   └── spring.yaml
├── data
│   ├── examples
│   └── mt
├── models
│   ├── finetuned
│   │   └── indic_finetuned
│   └── pretrained
├── recipes
│   ├── Training
│   │   └── train.sh
│   ├── Inference
│   │   └── infer.sh
│   ├── Single_File_infer
│   │   ├── infer.py
│   │   └── config.yaml
│   └── fairseq_preprocessing
│       ├── data_prep.py
│       ├── metrics.py
│       └── run_data_prep.sh
├── requirements.txt
└── README.md
```
- Create a new conda environment:

```bash
conda create -n env_name python=3.8
conda activate env_name
```

- Python version >= 3.8
- PyTorch version >= 2.0.0
- Fairseq version >= 0.12.2
- CUDA >= 11.8
- For training new models, you'll also need an NVIDIA GPU and NCCL
- To install requirements:

```bash
pip install -r requirements.txt
```

- To install fairseq and develop locally:

```bash
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
```

- For faster training, install NVIDIA's apex library:

```bash
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
    --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
    --global-option="--fast_multihead_attn" ./
```

- Flashlight version >= 0.0.7
- To install flashlight-text and flashlight-sequence:

```bash
pip install flashlight-text
git clone https://github.com/flashlight/sequence && cd sequence
pip install .
```

- To install parse_options:

```bash
wget https://raw.githubusercontent.com/kaldi-asr/kaldi/master/egs/wsj/s5/utils/parse_options.sh && sudo mv parse_options.sh /usr/local/bin/
```

- Facebook AI Research Sequence-to-Sequence Toolkit written in Python: fairseq
- AI4Bharat IndicWav2Vec