MoodSong
Inspiration
We live in a world saturated with stimuli, leading to constant low-level stress and an often overwhelmed nervous system. We were inspired by the power of sound to influence mood and physiology but noticed a gap in truly responsive wellness tools. We wanted to create something that didn't just offer static calming tracks, but could actively adapt to the user's current emotional state, providing personalized sonic support precisely when needed to help regulate and soothe.
What it does
MoodSong is a mobile sanctuary designed to help users calm their nervous system and find emotional balance through sound. It offers two primary modes:
- User-Driven: Browse and play from a curated library of calming, focusing, or elevating soundscapes and music.
- Adaptive Mode: With user permission, MoodSong accesses the front camera. On-device AI analyzes the user's facial expression in real time and intelligently selects and plays appropriate sounds: it shifts toward calming audio if stress is detected, or toward more uplifting tracks if a positive expression is seen, creating a responsive biofeedback loop through sound.
How we built it
- Frontend: Built using React Native for cross-platform access (iOS & Android).
- UI/UX: Focused on a clean, minimalist, and calming interface design.
- Audio Playback: Integrated `react-native-track-player` for robust background audio playback, controls, and playlist management.
- Camera & AI: Utilized `react-native-vision-camera` for accessing the camera feed. Implemented on-device facial expression recognition (FER) using TensorFlow Lite / Google ML Kit wrappers to ensure real-time processing, low latency, and user privacy (facial data stays on the device).
- Adaptive Logic: Developed algorithms to map detected expressions (e.g., neutral, happy, and stress indicators like sad/angry/fearful) to specific sound categories or tracks within the app.
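The expression-to-sound mapping might look something like the following sketch. The type names, category boundaries, and the agreement window are illustrative assumptions, not the app's actual tuning:

```typescript
// Hypothetical types; the real app's expression labels may differ.
type Expression = "neutral" | "happy" | "sad" | "angry" | "fearful";
type SoundCategory = "calming" | "focusing" | "elevating";

// Stress indicators map to calming audio; positive expressions to elevating.
const EXPRESSION_TO_CATEGORY: Record<Expression, SoundCategory> = {
  happy: "elevating",
  neutral: "focusing",
  sad: "calming",
  angry: "calming",
  fearful: "calming",
};

// Require the same category across several consecutive frames before
// switching, so a single misclassified frame doesn't cause a jarring change.
function pickCategory(
  recentFrames: Expression[],
  current: SoundCategory,
  minAgreement = 3
): SoundCategory {
  const last = recentFrames.slice(-minAgreement);
  if (last.length < minAgreement) return current;
  const candidate = EXPRESSION_TO_CATEGORY[last[0]];
  return last.every((e) => EXPRESSION_TO_CATEGORY[e] === candidate)
    ? candidate
    : current;
}
```

Debouncing over a few frames like this is one way to trade a little responsiveness for stability in the adaptive loop.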
- State Management: Used Zustand (or React Context API) for managing application state (current track, playback status, camera mode, detected expression).
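Since a Zustand store is essentially a single state object with merging setters and subscribers, the state shape can be sketched framework-free. The field and track names here are assumptions for illustration:

```typescript
// Illustrative app state; the real store's fields may differ.
type PlaybackStatus = "playing" | "paused" | "stopped";

interface AppState {
  currentTrack: string | null;
  playbackStatus: PlaybackStatus;
  cameraMode: boolean;
  detectedExpression: string | null;
}

// Zustand-style store: one state object, a setState that merges partial
// updates, and subscribers notified on every change.
function createStore(initial: AppState) {
  let state = initial;
  const listeners = new Set<(s: AppState) => void>();
  return {
    getState: () => state,
    setState: (partial: Partial<AppState>) => {
      state = { ...state, ...partial };
      listeners.forEach((l) => l(state));
    },
    subscribe: (l: (s: AppState) => void) => {
      listeners.add(l);
      return () => listeners.delete(l);
    },
  };
}
```

Keeping playback, camera mode, and the latest detected expression in one store lets the adaptive logic subscribe to expression changes and drive track selection from a single place.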
- AI Integration (Gemini): Leveraged the Gemini API for suggesting sounds based on user text input (e.g., "I feel anxious") or potentially analyzing user journal entries in future iterations. (Initially, Gemini is NOT used for the real-time facial analysis).
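A mood-suggestion request to Gemini could be assembled with a small prompt builder along these lines. The wording and category list are assumptions, not the production prompt:

```typescript
// Hypothetical sound categories; the app's real taxonomy may differ.
const CATEGORIES = ["calming", "focusing", "elevating"] as const;

// Builds a constrained prompt so the model's reply can be matched
// directly against the known category names.
function buildMoodPrompt(userText: string): string {
  return [
    "You are a sound-therapy assistant for a wellness app.",
    `The user says: "${userText.trim()}"`,
    `Reply with exactly one word from: ${CATEGORIES.join(", ")}.`,
    "Choose the sound category most likely to help this mood.",
  ].join("\n");
}
```

The returned string would be sent to the Gemini API, with the response matched against `CATEGORIES` and a safe fallback (e.g., calming) if it doesn't match.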
- Authentication: Integrated Clerk for user sign-up and sign-in.
Challenges we ran into
- FER Accuracy & Realism: Getting reliable expression detection across different lighting conditions, face angles, and diverse users was challenging. Mapping discrete detected emotions (like "happy", "sad") into nuanced "calm" or "elevated" states required careful tuning.
- Performance Optimization: Real-time camera feed processing and ML model inference are resource-intensive. We had to optimize frame processing rates to balance responsiveness with battery consumption and device heat.
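The frame-rate throttling described above can be sketched as a simple gate; the interval value is an assumption, and in practice this would run inside `react-native-vision-camera`'s frame processor:

```typescript
// Run FER inference only every Nth camera frame, trading a little
// responsiveness for lower battery drain and device heat.
function createFrameGate(runEveryN: number) {
  let frameCount = 0;
  return function shouldProcess(): boolean {
    frameCount += 1;
    return frameCount % runEveryN === 0;
  };
}
```

For example, with a 30 fps camera feed, `createFrameGate(10)` limits inference to roughly 3 runs per second, which is usually plenty for mood tracking.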
- Audio Format Compatibility: Encountered issues with specific audio formats not being universally supported by the native playback engines (like the `AVFoundation` error `-11828` on iOS), requiring careful selection or transcoding of audio assets.
- Seamless Transitions: Ensuring smooth, non-jarring transitions between different sounds in the adaptive mode was crucial for a positive user experience.
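One common way to make transitions non-jarring is an equal-power crossfade; this is a sketch of the gain curve, with the idea of stepping `t` from 0 to 1 over the fade duration left as an assumption about the implementation:

```typescript
// Equal-power crossfade: at progress t in [0, 1], the outgoing track's
// gain falls along a cosine curve while the incoming track's rises along
// a sine curve, keeping perceived loudness roughly constant throughout.
function crossfadeGains(t: number): { out: number; in: number } {
  const clamped = Math.min(1, Math.max(0, t));
  return {
    out: Math.cos((clamped * Math.PI) / 2), // 1 -> 0
    in: Math.sin((clamped * Math.PI) / 2), // 0 -> 1
  };
}
```

At every step, `out^2 + in^2 = 1`, which is why this curve avoids the mid-fade volume dip a linear crossfade produces.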
- Privacy Implementation: Clearly communicating camera usage and ensuring all sensitive facial analysis happened strictly on-device was a top priority.
Accomplishments that we're proud of
- Successfully implementing the core adaptive audio feature – the real-time link between facial expression and sound playback.
- Integrating camera, on-device ML, and audio playback into a cohesive React Native application.
- Creating a functional proof-of-concept demonstrating how responsive technology can aid mental wellness.
- Building a system that prioritizes user privacy by performing facial analysis locally.
- Overcoming the initial technical hurdles related to audio playback and ML performance.
What we learned
- The complexities and nuances of implementing real-time, on-device machine learning on mobile platforms.
- Deepened knowledge of React Native's capabilities and limitations, especially regarding native module integration (camera, audio, ML).
- The critical importance of performance optimization for background tasks and real-time processing.
- Best practices for handling user permissions and privacy concerns, particularly with camera access.
- The need for robust error handling, especially around native APIs like audio playback.
- How to iterate on UX design for a potentially novel interaction model (adaptive audio).
What's next for MoodSong
- Expand Sound Library: Add a wider variety of sounds, including nature soundscapes, binaural beats, isochronic tones, and potentially short guided meditations triggered by mood.
- Refine AI Model: Improve the accuracy and nuance of the expression detection and the mapping logic to sound selection. Explore detecting expression intensity.
- User Customization: Allow users to customize the sensitivity of the adaptive mode or map expressions to their preferred sound categories.
- Introduce Text Input: Integrate Gemini more deeply to allow users to type how they feel and receive sound or brief mindfulness exercise suggestions.
- Analytics (Privacy-Preserving): Implement anonymized analytics to understand which sounds are most effective and how users interact with the adaptive mode, to guide future improvements.
- Wearable Integration: Explore potential integration with smartwatches (e.g., heart rate variability) as an additional input for the adaptive mode.
Built With
- nextjs
- react-native
- tailwind
- typescript

