This repository contains the final project for the CS677: Deep Learning class at NJIT, titled "A study in audio-visual scene classification".
The model is trained on the TAU Audio-Visual Urban Scenes 2021 dataset as per the DCASE 2021 Task 1B instructions.
To set up the environment on a macOS machine, run the following commands in order:
conda install -c conda-forge ffmpeg
conda install pytorch torchvision torchaudio -c pytorch
pip install pandas tqdm h5py scikit-learn seaborn tabulate soundfile opencv-python
pip install mir_eval
pip install pyyaml
Installing in this order helps conda resolve dependency conflicts introduced by the ffmpeg installation.
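To sanity-check the installation, a one-liner like the following (an illustration, not part of the repo's own instructions) should print the installed PyTorch version without errors:

python -c "import torch, torchvision, torchaudio; print(torch.__version__)"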
For a Linux or Windows machine, install the PyTorch libraries matching your CUDA version from the official PyTorch installation page.
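For example, at the time of writing, the PyTorch page suggests a command of the following form for a conda install against CUDA 11.8 (treat this as an illustration and copy the exact command for your CUDA version from the page):

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia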
To train a model, run
srun python main.py --features_dir <path_to_features_directory> --config <path_to_config_file>
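For example, with hypothetical paths (substitute your own feature and config locations):

srun python main.py --features_dir data/features --config config.yaml

Note that srun is SLURM's job launcher; if you are running locally rather than on a cluster, drop the srun prefix.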
Alternatively, you can modify the gpu-train.sh script to run the program on a cluster.
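As a rough sketch, a SLURM batch script along the lines of gpu-train.sh might look like the following (the job name and resource requests are assumptions; adapt them to your cluster):

#!/bin/bash
#SBATCH --job-name=avsc-train   # hypothetical job name
#SBATCH --gres=gpu:1            # request one GPU
#SBATCH --mem=32G               # assumed memory request
#SBATCH --time=24:00:00         # assumed wall-clock limit

srun python main.py --features_dir <path_to_features_directory> --config <path_to_config_file>

Submit it with sbatch gpu-train.sh.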
The program expects the features directory to contain audio_features and video_features sub-directories, as sketched below; this layout is produced automatically if you download the features from this link.
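Concretely, the expected layout is:

<path_to_features_directory>/
├── audio_features/
└── video_features/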
By default, the config file specifies an audio+video model.
To train an audio-only or video-only model, set MODE to audio or video in the config file.
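As a hypothetical illustration, assuming the config file is YAML (pyyaml is among the dependencies; the actual key names and allowed values may differ), the relevant line might look like:

# hypothetical config fragment; actual keys may differ
MODE: audio   # set to audio or video for a single-modality model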
For evaluation, run
srun python evaluate.py --features_path <path_to_features_directory> --model_type audio_video
Change the --model_type argument to audio or video to evaluate an audio-only or video-only model.
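For example, to evaluate the audio-only model:

srun python evaluate.py --features_path <path_to_features_directory> --model_type audio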
The evaluate script expects the trained model weights to be in the models/ directory.
The confusion matrices for the audio+video, audio-only, and video-only models are in the outs/ directory.