Inspiration

Since a young age, I've been involved with musical instruments such as the electric and acoustic guitar, drums, and the flute. However, one of the most well-known instruments—the piano—was something I had never been able to play. It was simply inaccessible to me, and I often questioned whether I wanted to spend $80 or more just to try it.

I was inspired by Imogen Heap's Mi.Mu gloves, which allow users to create beautiful, customized electronic music simply by moving their hands in the air. After researching further, I realized that using a Raspberry Pi and a few other hardware components, I could potentially replicate her glove-based interface.

Still, I felt the concept could be simplified and made more scalable. If I was going to build something for myself, I wanted it to be accessible to others too. That led to the creation of meemomi—a prototype program that lets users play a virtual piano (currently with five keys) using nothing more than a flat surface and printed A4 template sheets.

What it does

meemomi allows users to access and play musical instruments—currently focused on the piano—using standard A4 sheets printed with piano key markings. By anchoring a camera above the sheet, the program records user interaction in real time and uses a Convolutional Neural Network (CNN) to detect finger movements.

When a user touches the paper within predefined boundary boxes (based on a custom dataset I trained specifically for this prototype template), the system identifies the contact and plays the corresponding piano sound. This approach makes the instrument not only accessible but also scalable—removing the need for expensive physical equipment.
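Conceptually, the contact-detection step described above reduces to a point-in-rectangle test: once the detector reports a fingertip position and the bounding boxes of the five printed keys, the touched key is whichever box contains the fingertip. A minimal sketch of that lookup, with hypothetical key names and pixel coordinates (the real boxes would come from the trained model on each frame):

```python
# Hypothetical bounding boxes for the five printed keys, in pixel
# coordinates (x_min, y_min, x_max, y_max). In the real system these
# would be produced by the trained detector, not hard-coded.
KEY_BOXES = {
    "C": (0, 0, 100, 300),
    "D": (100, 0, 200, 300),
    "E": (200, 0, 300, 300),
    "F": (300, 0, 400, 300),
    "G": (400, 0, 500, 300),
}

def key_at(x, y, boxes=KEY_BOXES):
    """Return the name of the key whose bounding box contains (x, y),
    or None if the fingertip is outside every key."""
    for name, (x0, y0, x1, y1) in boxes.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None

print(key_at(150, 120))  # fingertip over the second key -> D
print(key_at(650, 120))  # off the template -> None
```

Once the touched key is identified, the program only needs a mapping from key name to audio sample to produce the sound.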

How I built it

The program was built by first creating a custom dataset for my piano sheet template using Roboflow, which I used to draw bounding boxes and split the dataset into training, validation, and testing sets. This dataset was then trained using YOLOv8 and iteratively tested to refine detection accuracy.
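The train/validation/test split that Roboflow performs can be approximated in a few lines. A sketch assuming a 70/20/10 split (the actual ratios used in the project may differ):

```python
import random

def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle labeled images and split them into
    train / validation / test lists by the given ratios."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for reproducibility
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

images = [f"key_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # 70 20 10
```

A fixed random seed keeps the split reproducible across runs, which matters when iteratively retraining and comparing detection accuracy.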

For finger tracking, I opted to use an existing large-scale dataset to ensure reliability and save time during prototyping. This allowed for faster development and minimized potential errors, enabling a smoother integration of gesture detection with the custom piano key recognition system.

Challenges I ran into & methodology

I encountered multiple challenges during development. Initially, I was unsure about the template design—at first considering the use of ArUco markers to track finger positions. However, I realized that leveraging machine learning would offer more flexibility and accuracy. So, I pivoted to a custom-designed piano sheet template, originally consisting of 14 keys.

Creating a dataset for all 14 keys proved extremely time-consuming and conflicted with the limited time I had for the hackathon, so to stay within the deadline I simplified the prototype to just 5 keys. I had initially planned to train a small YOLOv8 model on 10 manually labeled images, then use that model to auto-detect piano keys in over 90 additional images, manually correcting any inaccuracies. That approach failed, so I ended up manually labeling all 100 piano keys in Roboflow. Training brought its own difficulties: I had planned to use Roboflow's automatic YOLOv8 training, but the API-key-based interaction was extremely slow, so I trained the model locally instead. For this I used Google Colab and exported the best weights, which were then used in the final program.
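For reference, local YOLOv8 training of this kind is typically driven by a small dataset YAML like the one below. The paths and class name here are illustrative placeholders, not the project's actual files:

```yaml
# data.yaml -- illustrative dataset config for YOLOv8 training
path: /content/meemomi-dataset   # dataset root (e.g., inside Google Colab)
train: images/train
val: images/valid
test: images/test

names:
  0: piano_key
```

After a run finishes, Ultralytics typically writes the best checkpoint to `runs/detect/train/weights/best.pt`, which corresponds to the "best weights" exported from Colab.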

Once the model was trained, I integrated it with a finger detection system, plus a finger movement detector that treats a downward press as a confirmed key strike. I then wrote the logic that connects key touches to their corresponding sounds and sourced piano audio samples online [1]. The result was a functional system that translates real-world interactions on paper into digital music.
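The downward-press check can be as simple as thresholding the fingertip's vertical velocity between frames: because image coordinates grow downward, a sharp increase in the fingertip's y value reads as a press. A minimal sketch (the threshold value is a made-up placeholder, not the one used in the project):

```python
PRESS_THRESHOLD = 12  # pixels per frame; illustrative value only

def detect_press(y_positions, threshold=PRESS_THRESHOLD):
    """Return the frame indices where the fingertip moved downward
    faster than `threshold` pixels per frame (a press event).
    Image y grows downward, so a press is a positive delta."""
    presses = []
    for i in range(1, len(y_positions)):
        if y_positions[i] - y_positions[i - 1] > threshold:
            presses.append(i)
    return presses

# Fingertip y over 8 frames: hovering, then a sharp dip at frame 4.
trace = [200, 201, 200, 202, 230, 231, 230, 229]
print(detect_press(trace))  # [4]
```

Combining this press event with the key whose bounding box contains the fingertip gives the trigger for playing the matching audio sample.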

What's next for meemomi

To evolve meemomi from a prototype into a polished, scalable product, the following development steps are planned—prioritized for impact and feasibility:

1. Expand and Improve the Model: Train a more accurate and reliable model using a larger, higher-quality dataset to improve piano key detection across varied lighting, angles, and setups.

2. Support Full-Range Standard Piano Keys: Transition from the current 5-key prototype to a full 88-key (or at least 24-key) virtual layout to allow for more complex and expressive musical performance.

3. Build a User-Friendly GUI: Develop an intuitive graphical user interface for setup, interaction, and customization, which is critical for non-technical users and wider adoption.

4. Add Advanced Musical Features: Incorporate effects like reverb, bass, and treble, plus features like record and playback, bringing meemomi closer to the experience of professional digital pianos.

5. Deploy to Public Platforms (App Store / Play Store): Package and optimize meemomi for cross-platform release on iOS and Android, and build out a proper GitHub page for the project, enabling easy, free access for the general public.

Alongside these steps, I also plan to look into optimizing the code and to pool funding for the project.

Bibliography

[1] University of Iowa Electronic Music Studios. Musical Instrument Samples: Piano. University of Iowa, https://theremin.music.uiowa.edu/mispiano.html. Accessed 22 June 2025.
