Inspiration

Smart pianos already give players fine-grained control over parameters like pitch and amplitude. However, they are expensive and don't allow for easy collaboration: linking multiple devices requires costly wiring. Group-Synth establishes a framework that solves both problems, offering accessible fine-grained control and real-time group playing.

What it does

Musicians join an online website and create music together in real time. After a session starts, each local client captures webcam frames and uses a JavaScript build of OpenCV to draw a convex hull around the user's hand, extract its center, and convert that position into frequency and amplitude values. Each client pushes these values to Firebase, using a global timing mechanism to keep the sends synchronized. Every client then reads the latest synchronized values from Firebase and feeds them to Gibber, a JavaScript music library. Each user is assigned a specific instrument, and the group's music is rendered locally on every machine so that everyone gets real-time feedback.
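The hand-to-sound step can be sketched as a simple mapping from the hand's center point to a frequency and amplitude. The function name, frequency bounds, and axis assignments below are illustrative assumptions, not the project's actual values:

```javascript
// Hypothetical sketch: map a hand's center point (from the convex-hull
// step) to frequency and amplitude. Ranges and axis choices are assumed.
const MIN_FREQ = 220; // assumed lower bound (A3)
const MAX_FREQ = 880; // assumed upper bound (A5)

function handToParams(center, frameWidth, frameHeight) {
  // Normalize the pixel coordinates to [0, 1], clamping at the edges.
  const fx = Math.min(Math.max(center.x / frameWidth, 0), 1);
  const fy = Math.min(Math.max(center.y / frameHeight, 0), 1);
  return {
    // Horizontal position controls pitch.
    frequency: MIN_FREQ + fx * (MAX_FREQ - MIN_FREQ),
    // Invert y so raising the hand increases volume.
    amplitude: 1 - fy,
  };
}

// Example: hand at the center of a 640x480 frame.
console.log(handToParams({ x: 320, y: 240 }, 640, 480));
// → { frequency: 550, amplitude: 0.5 }
```

A linear mapping like this keeps the computation negligible next to the vision step, which matters when both run per-frame in the browser.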

How we built it

  1. HTML to build the website that each user logs into.
  2. OpenCV in JS to extract the center of each local user's hand, entirely within the browser.
  3. Firebase to synchronize the frequencies extracted from each user.
  4. Gibber to map each user's frequencies onto a different synthesized instrument and play the music in real time.
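Steps 3 and 4 meet at the read side: each client needs only the newest value per user to drive that user's instrument. A minimal sketch, assuming a record shape of `{ user, tick, frequency, amplitude }` (hypothetical; the project's actual Firebase schema may differ):

```javascript
// Sketch: given the records all clients pushed, keep only the latest
// record per user, so each instrument plays that user's newest note.
function latestPerUser(records) {
  const latest = {};
  for (const r of records) {
    // A higher tick means a more recent synchronized push.
    if (!(r.user in latest) || r.tick > latest[r.user].tick) {
      latest[r.user] = r;
    }
  }
  return latest;
}

const records = [
  { user: "alice", tick: 1, frequency: 440, amplitude: 0.8 },
  { user: "bob",   tick: 1, frequency: 330, amplitude: 0.5 },
  { user: "alice", tick: 2, frequency: 550, amplitude: 0.6 },
];
console.log(latestPerUser(records).alice.frequency); // → 550
```

In the real app, each entry of the resulting map would be handed to that user's predefined Gibber instrument for local rendering.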

Challenges we ran into

  1. Synchronization was a major challenge: we needed a way to decide which notes from different users occur at the same time. To solve this, we use a shared global start time and have each user push to Firebase at fixed time intervals measured from that start.
  2. OpenCV was another issue. Hand extraction had to run at a low computational cost, so we spent a long time searching for algorithms online and adapting them to run in the browser in JS. Eventually we built a computer-vision pipeline that locates the hand reliably across a variety of positions.
  3. Keeping audio generation smooth while multiple instruments update many times per second over a real-time database. We solved this by patching Gibber.
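The timing scheme from challenge 1 amounts to every client quantizing its clock to the same tick grid anchored at the shared start time. A sketch, where `TICK_MS` is an assumed push interval rather than the project's actual value:

```javascript
// Sketch of the global-timing idea: all clients round their clocks to the
// same tick grid anchored at a shared start time, so values pushed by
// different users line up. TICK_MS is an assumed interval.
const TICK_MS = 250;

function tickIndex(nowMs, globalStartMs) {
  // Which tick slot does this moment fall into?
  return Math.floor((nowMs - globalStartMs) / TICK_MS);
}

function nextPushTime(nowMs, globalStartMs) {
  // Absolute time of the next tick boundary, when the client should push.
  return globalStartMs + (tickIndex(nowMs, globalStartMs) + 1) * TICK_MS;
}

const start = 1000;
console.log(tickIndex(1620, start));    // → 2
console.log(nextPushTime(1620, start)); // → 1750
```

Because every client computes the same tick index for the same wall-clock moment, records from different users can be grouped by tick on the read side.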

Accomplishments that we're proud of

We solved all of the problems above and integrated several third-party frameworks (OpenCV in JS, Firebase, and Gibber) into one working project.

What we learned

How to use music libraries, synchronize multiple users in real time, and run computer vision in the browser; see the challenges above for the rest.

What's next for Group-Synth

Supporting more instruments, reducing lag even further, improving the sound quality, and building a lobby system.
