PaddleAudio: The audio library for PaddlePaddle

Introduction

PaddleAudio is an audio toolkit that speeds up your audio research and development loop in PaddlePaddle. It currently provides a collection of audio datasets, feature-extraction functions, audio transforms, and state-of-the-art pre-trained models for sound tagging/classification and anomalous sound detection. More models and features are on the roadmap.

Features

Spectrogram and related features are compatible with librosa.
State-of-the-art models for sound tagging on AudioSet and sound classification on ESC-50, with more to come.
Ready-to-use audio embeddings in a single line of code, including sound embeddings, with more on the roadmap.
Data loading support for common open-source audio datasets in multiple languages, including English and Mandarin.

Install

git clone https://github.com/PaddlePaddle/models
cd models/PaddleAudio
pip install .

Quick start

Audio loading and feature extraction

import paddleaudio

audio_file = 'test.flac'
wav, sr = paddleaudio.load(audio_file, sr=16000)
mel_feature = paddleaudio.melspectrogram(wav,
                                       sr=sr,
                                       window_size=320,
                                       hop_length=160,
                                       n_mels=80)
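To make the framing parameters above concrete, here is a minimal sketch that converts them to milliseconds and estimates the number of frames. It does not call PaddleAudio; the helper `frame_params` is hypothetical, and the frame count assumes librosa-style centered framing with an FFT size equal to the window size, which may not match PaddleAudio's defaults.

```python
def frame_params(sr, window_size, hop_length, n_samples):
    """Convert framing parameters from samples to milliseconds and count frames."""
    window_ms = 1000.0 * window_size / sr
    hop_ms = 1000.0 * hop_length / sr
    # Assumption: centered framing (signal padded by window_size // 2 on each
    # side), which gives 1 + n_samples // hop_length frames for an even window.
    n_frames = 1 + n_samples // hop_length
    return window_ms, hop_ms, n_frames


# One second of audio at 16 kHz with the settings used in the example above.
window_ms, hop_ms, n_frames = frame_params(
    sr=16000, window_size=320, hop_length=160, n_samples=16000)
print(window_ms, hop_ms, n_frames)
```

Under these assumptions, the settings correspond to a 20 ms window with a 10 ms hop, so one second of audio would yield a mel feature with 80 bands and roughly 101 frames.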

Speech recognition using wav2vec 2.0

import paddle
import paddleaudio
from paddleaudio.models.wav2vec2 import Wav2Vec2ForCTC, Wav2Vec2Tokenizer

model = Wav2Vec2ForCTC('wav2vec2-base-960h', pretrained=True)
tokenizer = Wav2Vec2Tokenizer()
# Load audio and normalize
wav, _ = paddleaudio.load('your_audio.wav', sr=16000, normal=True, norm_type='gaussian')

with paddle.no_grad():
    x = paddle.to_tensor(wav)
    logits = model(x.unsqueeze(0))
    # Get the token index prediction
    idx = paddle.argmax(logits, -1)
    # Decode prediction to text
    text = tokenizer.decode(idx[0])
    print(text)
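The `argmax` followed by `tokenizer.decode` in the snippet above amounts to greedy CTC decoding. As a rough sketch of what that step typically involves (not PaddleAudio's actual implementation), the following pure-Python function collapses repeated indices and drops the blank token. The function name, the toy vocabulary, and the choice of index 0 as the blank are all assumptions made for illustration.

```python
def ctc_greedy_decode(indices, vocab, blank_id=0):
    """Collapse consecutive repeats, then drop blanks (greedy CTC decoding)."""
    chars = []
    prev = None
    for i in indices:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank token.
        if i != prev and i != blank_id:
            chars.append(vocab[i])
        prev = i
    return "".join(chars)


# Hypothetical toy vocabulary: index 0 is the CTC blank, "|" marks a word boundary.
vocab = ["<pad>", "|", "H", "E", "L", "O"]
# Per-frame argmax indices, as produced by something like paddle.argmax(logits, -1).
frame_ids = [2, 2, 0, 3, 4, 4, 0, 4, 5, 5, 1]
text = ctc_greedy_decode(frame_ids, vocab).replace("|", " ").strip()
print(text)
```

Note that the blank between the two runs of `L` is what allows a doubled letter to survive the repeat-collapsing step.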

Examples

We provide a set of examples to help you get started with PaddleAudio quickly.

Please refer to the examples directory for more details.
