audio-super-resolution

Modified and unofficial implementation of the audio super-res network proposed by Kim and Sathe.

Summary

This repo is modified and unofficial implementation of the audio super-res network proposed by Kim and Sathe [1]. The original discriminator network has FC, dropout, and leakyReLU layer on top of 8 downsampling blocks. To reduce the network size, I replaced that part with a global maxpooling layer.

Dependencies

python 3.7.9
pytorch 1.5.0
torchaudio 0.5.0
resampy 0.2.2

Quick start

Download pretrained model (the model is trained for Speaker1 of VCTK dataset (p225) for 200 epochs, and performs upsample 8kHz to 16kHz).
Run python 04.extend.py --checkpoint mugan-generator-200.tar --gpu input.wav output.wav to perform superresolution.

Usage

Setting hyperparameters

Edit model_settings.py to match desired experimental setup.

supersampling_rate()
conv_kernel_size(): a list of filter sizes for multiscale conv. layers
conv_out_channels(): a list of output feature sizes for the corresponding downsampling/upsampling blocks
out_conv_in_channel(): feature size of the conv. layer on the top of the autoencoders and generator networks.
out_conv_kernel(): kernel size of the conv. layer on the top of the autoencoders and generator networks.
the first constant in sample_segment_length(): duration of training/validation sample size in seconds.

Training

Run 01.train-autoencoder.py to train the autoencoder then run 02.train-extender.py to train the GAN.

Prediction and evaluation

Simple evaluation and prediction program are also provided. Run 03.eval-extender.py for evaluation and 04.extend.py for prediction (make super resolution of any .wav file)

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
01.train-autoencoder.py		01.train-autoencoder.py
02.train-extender.py		02.train-extender.py
03.eval-extender.py		03.eval-extender.py
04.extend.py		04.extend.py
LICENSE		LICENSE
README.md		README.md
datasets.py		datasets.py
model_settings.py		model_settings.py
models.py		models.py
p225_366.png		p225_366.png
test.py		test.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

audio-super-resolution

Summary

Dependencies

Quick start

Usage

Setting hyperparameters

Training

Prediction and evaluation

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

audio-super-resolution

Summary

Dependencies

Quick start

Usage

Setting hyperparameters

Training

Prediction and evaluation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages