Inspiration:

Mental health challenges are often invisible.

Many people struggle quietly because of stigma, fear of judgment, or difficulty expressing how they feel. This is especially true for individuals who withdraw instead of seeking help.

We were inspired to create a system that lowers the pressure of traditional face-to-face clinical interactions and helps people engage earlier in a more comfortable and honest way.

What it does:

Dr. Lume is an adaptive mental health support platform that responds to how a patient is feeling in real time. By observing emotional cues through the device camera, the system dynamically adjusts its questions, tone, and pacing to better match the patient’s emotional state.

Patients can interact with Dr. Lume through voice or text, allowing for a more natural and low-pressure experience. The platform offers two modes: a fully private mode for patients who want personal support without involving a clinician, and a clinician-connected mode for patients who are waiting for professional care. In the clinician-connected mode, emotional context gathered during sessions can be shared to help providers better understand the patient before an appointment.

Throughout a session, Dr. Lume tracks changes in emotional patterns over time, helping surface behaviors that may otherwise go unnoticed in traditional interactions. The goal is not to diagnose, but to reduce friction, encourage engagement, and provide clearer context that supports more informed and meaningful care.


Together, these modes reduce barriers to care while improving continuity once a patient enters the healthcare system.

How we built it:

We built Dr. Lume with Next.js, React, and TypeScript for a fast, reliable application with a clean session flow, and used TailwindCSS and PostCSS to keep the interface simple, calm, and responsive. The conversational layer is powered by Featherless AI, using text and multimodal models with a custom system prompt focused on emotional awareness and concise responses.
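To make the idea concrete, the kind of emotionally aware system prompt we mean might look like the sketch below. The wording, the `buildMessages` helper, and the emotion-hint format are all illustrative, not our production prompt or API shape:

```typescript
// Illustrative system prompt for an emotionally aware conversational layer.
// The wording and the emotion-hint convention are hypothetical examples,
// not the production prompt.
const SYSTEM_PROMPT = `
You are Dr. Lume, a supportive mental health companion.
- Keep responses concise (two to three sentences).
- Mirror the user's pacing; slow down if they seem distressed.
- Each message carries an "emotion" hint (e.g. "calm", "anxious").
  Use it only to adjust tone and pacing, never to label the user.
- Never offer a diagnosis; encourage professional care when appropriate.
`.trim();

// A chat request then pairs the prompt with the latest emotion hint.
function buildMessages(userText: string, emotionHint: string) {
  return [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: `[emotion: ${emotionHint}] ${userText}` },
  ];
}
```

Keeping the emotion signal as a hint inside the message, rather than hard-switching prompts, lets the model adapt tone without abrupt changes in persona.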

To understand how users are feeling, we used a combination of Google Gemini 2.0 Flash Lite for real-time emotion classification from webcam frames and MediaPipe FaceLandmarker, which runs locally in the browser to extract facial blendshapes and stabilize emotion signals with lightweight smoothing. For voice output, we integrated ElevenLabs as the primary text-to-speech engine, with Google Cloud Text-to-Speech as a fallback to improve reliability.
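The stabilization step can be sketched as an exponential moving average over per-frame emotion scores. The emotion labels and smoothing factor below are illustrative; in the real pipeline the scores are derived from MediaPipe blendshapes before smoothing:

```typescript
// Exponential moving average over per-frame emotion scores.
// Labels and alpha are illustrative; scores would come from
// MediaPipe FaceLandmarker blendshapes in the real pipeline.
type EmotionScores = Record<string, number>;

class EmotionSmoother {
  private state: EmotionScores | null = null;
  constructor(private alpha = 0.2) {} // lower alpha = heavier smoothing

  update(frame: EmotionScores): EmotionScores {
    if (this.state === null) {
      this.state = { ...frame };
    } else {
      for (const k of Object.keys(frame)) {
        const prev = this.state[k] ?? 0;
        this.state[k] = prev + this.alpha * (frame[k] - prev);
      }
    }
    return { ...this.state };
  }

  // The label reported onward is the highest smoothed score.
  dominant(): string | null {
    if (!this.state) return null;
    return Object.entries(this.state).sort((a, b) => b[1] - a[1])[0][0];
  }
}
```

With this in place, a single noisy frame (a brief grimace, a lighting flicker) shifts the reported state only gradually instead of flipping it outright.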

To keep the experience responsive, we used Socket.IO for real-time messaging and low-latency back-and-forth between the client and the server. We also leveraged key browser APIs, including getUserMedia, Web Speech, and Web Audio, to capture camera and microphone input, enable speech-to-text, and manage audio playback. On the backend, a custom Node.js server hosts the application, coordinates vision processing and session logic, and supports real-time communication.
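The primary-plus-fallback TTS arrangement described above reduces to a small provider-agnostic helper. The two provider functions in the usage example are hypothetical stand-ins for the real ElevenLabs and Google Cloud TTS calls:

```typescript
// Generic primary/fallback wrapper for speech synthesis.
// Provider implementations are hypothetical stand-ins for the
// real ElevenLabs and Google Cloud Text-to-Speech clients.
type Synthesizer = (text: string) => Promise<ArrayBuffer>;

async function synthesize(
  text: string,
  primary: Synthesizer,
  fallback: Synthesizer,
): Promise<ArrayBuffer> {
  try {
    return await primary(text);
  } catch (err) {
    // Log and degrade gracefully instead of dropping the voice reply.
    console.warn("primary TTS failed, using fallback:", err);
    return fallback(text);
  }
}
```

Because the wrapper only sees the `Synthesizer` signature, swapping or reordering engines later doesn't touch the session logic.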

On the backend, session data is organized into clear summaries that can be shared with clinicians when appropriate. The system is designed to be modular, allowing future integration of secure live video calls between clinicians and patients as part of the care workflow.

Challenges we ran into:

One of the biggest challenges was accurately interpreting emotional signals from facial data in a reliable and responsible way. Emotions are complex, and we had to carefully tune our system to avoid overconfidence or misinterpretation.

Another challenge was handling variability in real-world conditions. Differences in lighting, camera quality, and user movement can significantly affect facial analysis, so we focused on stabilizing signals and preventing sudden or misleading changes in emotional state.

We also had to balance responsiveness with comfort. Adapting questions too aggressively based on emotion risked making the experience feel intrusive or unnatural, while adapting too slowly reduced usefulness. Finding the right pacing required repeated testing and iteration.

Finally, designing around ethical and privacy considerations was critical. We needed to ensure emotional cues were used strictly to guide the interaction rather than label or diagnose patients, while also making the system feel safe enough for patients who may already be hesitant to engage.


Accomplishments that we're proud of:

We built a real-time adaptive session experience that adjusts questions, tone, and pacing based on a patient’s emotional cues instead of relying on static questionnaires.

We integrated camera-based emotion sensing with stabilization so the system responds consistently rather than reacting to noisy moment-to-moment changes.

We also designed two clear paths for patients: a fully private mode for low-pressure support, and a clinician-connected mode that can share meaningful emotional context when care is available.

Throughout the build, we maintained strong ethical boundaries by using emotional signals to guide the interaction and provide context, not to diagnose or label patients. Finally, we delivered a cohesive end-to-end workflow that brings emotion detection, conversation flow, and voice interaction together into one smooth experience.

What we learned:

We learned just how challenging it is to build technology around human emotion.

Even with strong tools and models, emotions are subtle, fluid, and highly context-dependent, which makes them difficult to interpret and respond to in a consistent way.

We also learned that building something meaningful in this space requires constant iteration, testing, and refinement, since small design or logic changes can significantly alter how the system is perceived.

Beyond the technical difficulty, we gained a deeper understanding of how careful you have to be when working in healthcare, where reliability, trust, and user comfort matter just as much as functionality.

What's next for Dr. Lume:

Next, we plan to introduce secure live communication between clinicians and patients so care can continue seamlessly when professional support becomes available.

We also want to further improve emotion detection accuracy by expanding signal sources and refining how emotional changes are interpreted over time. In addition, we aim to deepen personalization so sessions adapt not just moment to moment, but across multiple interactions.

Longer term, we plan to integrate Dr. Lume more closely into clinical workflows to support continuity of care and help providers better track progress over time.
