A minimal app for visualizing and comparing phonetic audio samples using waveforms and spectrograms. A user can compare two recordings and examine differences in their waveforms and spectrograms to analyze their pronunciation of words when learning a language.
Built using C++ and Qt.
This project was developed as a final project for Middlebury College's CS318: OOP & GUI Development. The original concept of the Phonetics Visualizer was to aid language learners' pronunciation by algorithmically identifying differences in pronunciation between an uploaded sentence and a recorded one. These could be differences in phoneme length, pitch, relative emphasis, and so on. We believe that identifying these differences for the listener, then giving them the ability to isolate, listen to, and visualize these shortcomings in pronunciation, would significantly enhance the language learning process.
Although we were not able to implement an algorithmic framework to recognize these differences across recordings over the short course of the semester, our team developed visualization and playback capabilities that let users carry out this analysis themselves.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
What you need to install before building the software:
- An instance of Qt and Qt Creator. This is for the development version.
  - This requires Xcode if you are on a Mac.
- Right now the software depends on the FFTW library. We have been running it from within our Qt development instance, which dynamically links against the FFTW library in our `/usr/local/lib` directory. Hopefully, for the production version, we will have this code fully linked and included in a standalone executable.
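To give a sense of what the FFTW dependency is for, the sketch below computes the magnitude spectrum of a single frame of samples, which is the building block of one spectrogram column. This is an illustration under stated assumptions, not the project's code: `magnitudeSpectrum` is a hypothetical name, and it uses a direct O(N^2) DFT from the standard library rather than any FFTW call, so it needs no extra linking. FFTW would produce the same values much faster.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the project): magnitude spectrum of one
// frame of real samples, computed with a direct O(N^2) DFT.
// Returns bins 0..N/2, which is all a real-valued signal needs.
std::vector<double> magnitudeSpectrum(const std::vector<double>& frame) {
    assert(!frame.empty());
    const std::size_t n = frame.size();
    const double pi = std::acos(-1.0);

    std::vector<double> magnitudes(n / 2 + 1);
    for (std::size_t k = 0; k < magnitudes.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            // Correlate the frame with a complex sinusoid at bin k.
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += frame[t] * std::polar(1.0, angle);
        }
        magnitudes[k] = std::abs(sum);
    }
    return magnitudes;
}
```

A spectrogram is then a sequence of such spectra taken over successive (usually overlapping and windowed) frames of the recording. Frame size, overlap, and windowing are left out here because the source does not state which values the project uses.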
A step-by-step series of examples that tell you how to get a development environment running.
- Install Xcode, Qt, and Qt Creator. There are plenty of better guides for this online than anything we could write.
- Git clone the project into your desired directory.

      git clone https://github.com/jorredahl/LinguisticCS.git

- Install FFTW. We did this using Homebrew; then, to get the FFTW package to where our `.pro` file was looking, we used

      sudo cp /opt/homebrew/Cellar/fftw/3.x.x/lib/* /usr/local/lib/
Upon starting the Phonetics Visualizer, the user will be greeted with some controls and a blank waveform visualizer. Almost every control is initially disabled; you first have to upload an audio file.
The first step is to load an audio file to compare.
- Click the top "Upload" button to upload a source file to compare your audio against.
- A pop-up file explorer will appear. Navigate to the desired audio file and click "Open" in the bottom right to open it.
Note: only .wav files are accepted at the moment. A bit depth of 24 or 32 bits is encouraged for the file.
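If you are unsure whether a file meets the note above, its header can be inspected before uploading. The sketch below reads the channel count, sample rate, and bits per sample from the start of a .wav file held in memory. It is an illustration only: `WavFormat`, `readLE`, and `parseWavHeader` are hypothetical names, not part of the Phonetics Visualizer, and the code assumes the common canonical layout in which the `fmt ` chunk immediately follows the `RIFF`/`WAVE` header. Files with extra chunks before `fmt ` would be rejected by this simple check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical struct for illustration; not taken from the project.
struct WavFormat {
    std::uint16_t channels;
    std::uint32_t sampleRate;
    std::uint16_t bitsPerSample;
};

// Read `count` bytes (at most 4) as a little-endian unsigned integer.
static std::uint32_t readLE(const std::vector<std::uint8_t>& bytes,
                            std::size_t offset, std::size_t count) {
    assert(count <= 4 && offset + count <= bytes.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        value |= static_cast<std::uint32_t>(bytes[offset + i]) << (8 * i);
    }
    return value;
}

// Returns true and fills `out` if `bytes` starts with a canonical WAV header.
// Assumption: "RIFF" at 0, "WAVE" at 8, "fmt " at 12 (fixed offsets).
bool parseWavHeader(const std::vector<std::uint8_t>& bytes, WavFormat& out) {
    if (bytes.size() < 36) return false;
    if (std::memcmp(bytes.data(), "RIFF", 4) != 0) return false;
    if (std::memcmp(bytes.data() + 8, "WAVE", 4) != 0) return false;
    if (std::memcmp(bytes.data() + 12, "fmt ", 4) != 0) return false;

    out.channels = static_cast<std::uint16_t>(readLE(bytes, 22, 2));
    out.sampleRate = readLE(bytes, 24, 4);
    out.bitsPerSample = static_cast<std::uint16_t>(readLE(bytes, 34, 2));
    return true;
}
```

A caller would read the first 36 or more bytes of the file into a `std::vector<std::uint8_t>`, call `parseWavHeader`, and compare `bitsPerSample` against 24 or 32.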
- @jkwarren
- @mbamaca
- @terryluongo
- @jorredahl
- Abraham Merino
See also the list of contributors who participated in this project.
- Thank you to Professor Swenton for the plethora of help throughout the semester!
- Thank you to Professor Baird for the linguistics resources and Professor Abe for the Japanese resources!