Inspiration
Our team was inspired by the unique power of voice. Voice carries emotion, identity, and memory, but preserving and reusing voices dynamically has always been difficult. We wanted to build something that could capture a person’s voice and give it new life, enabling applications in accessibility, education, storytelling, and entertainment.
For two of our three teammates, this was our first-ever hackathon, so a big part of the inspiration was to dive into unfamiliar tools and push ourselves to build something ambitious in just a weekend.
What it does
VoiceKeep allows users to:
- Record or upload any audio clip.
- Transcribe the speech into clean text automatically.
- Clone the speaker’s voice and generate new audio from any input text.
The result is a personalized AI narrator: anyone's voice can be turned into a dynamic tool for narration, communication, or creativity.
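The three steps above chain together into a single flow. The sketch below shows that flow as typed async stages; the function names and stub bodies here are illustrative stand-ins (the real app delegates each stage to ElevenLabs), not our actual module API.

```typescript
// Illustrative sketch of the record → transcribe → clone → synthesize flow.
// Stage bodies are placeholders; only the chaining is the point.

type AudioClip = { data: Uint8Array; mimeType: string };

async function transcribe(clip: AudioClip): Promise<string> {
  // Real version: speech-to-text on the recorded or uploaded clip.
  return "placeholder transcript";
}

async function cloneVoice(clip: AudioClip): Promise<string> {
  // Real version: create a cloned voice from the clip, return its ID.
  return "voice-id-placeholder";
}

async function synthesize(voiceId: string, text: string): Promise<AudioClip> {
  // Real version: text-to-speech using the cloned voice.
  return { data: new Uint8Array(), mimeType: "audio/mpeg" };
}

// Turn one clip of someone's voice into a narrator for arbitrary text.
async function narrate(clip: AudioClip, newText: string): Promise<AudioClip> {
  const voiceId = await cloneVoice(clip); // clone the speaker once
  return synthesize(voiceId, newText);    // then speak any text in that voice
}
```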
How we built it
We combined frontend, backend, and AI voice synthesis technologies to bring VoiceKeep to life:
- A React + TypeScript frontend built with Vite and Tailwind CSS for recording, uploading, and generating AI voices.
- Supabase Functions (Deno runtime) as the backend for handling audio processing, user sessions, and database storage.
- Integration with the ElevenLabs API for advanced voice cloning, speech-to-text, and text-to-speech capabilities.
- A proprietary AI processing engine that analyzes voice characteristics and ensures consistent tone, clarity, and timbre in generated speech.
- Real-time audio visualization and playback, allowing users to preview both recorded and generated voices with minimal latency.
We built the system collaboratively during the hackathon, experimenting with new frameworks, APIs, and real-time audio technologies throughout the development process.
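As an illustration of the ElevenLabs integration, the core text-to-speech call our Edge Function makes can be sketched as a plain request builder. The endpoint path, `xi-api-key` header, and `model_id` field follow the public ElevenLabs REST docs; the voice ID and model name below are example values, and the real function also handles sessions and storage.

```typescript
// Sketch: build the text-to-speech request sent to the ElevenLabs API.
// Endpoint and field names per the ElevenLabs REST documentation;
// voiceId and model_id values here are placeholders.

interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildTtsRequest(apiKey: string, voiceId: string, text: string): TtsRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey, // ElevenLabs authenticates with this header
      "Content-Type": "application/json",
    },
    // model_id selects the speech model; multilingual v2 is a common choice
    body: JSON.stringify({ text, model_id: "eleven_multilingual_v2" }),
  };
}
```

The actual call is then a single `fetch(req.url, req)`, with the binary audio response streamed back to the client for playback.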
Challenges we ran into
- First-time hackathon experience: Two team members had to quickly adapt to hackathon workflows, Git collaboration, and rapid prototyping.
- Audio format inconsistencies: Different file types, noise levels, and recording lengths caused unexpected bugs.
- Voice cloning realism: Achieving natural-sounding output required experimentation with audio quality, timing, and model parameters.
- System integration: Connecting transcription, cloning, and generation into one smooth pipeline required extensive debugging.
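The format-inconsistency bugs above pushed us toward validating clips before they enter the pipeline. A minimal guard might look like the following; the accepted MIME types and the duration threshold are example values, not our exact production limits.

```typescript
// Illustrative up-front check for the audio-format issues described above:
// reject unsupported MIME types and clips too short to transcribe.
// Limits and type list are example values.

const SUPPORTED_TYPES = new Set(["audio/webm", "audio/mpeg", "audio/wav", "audio/mp4"]);

interface ClipCheck {
  ok: boolean;
  reason?: string;
}

function validateClip(mimeType: string, durationSeconds: number): ClipCheck {
  if (!SUPPORTED_TYPES.has(mimeType)) {
    return { ok: false, reason: `unsupported format: ${mimeType}` };
  }
  if (durationSeconds < 1) {
    return { ok: false, reason: "clip too short to transcribe" };
  }
  return { ok: true };
}

console.log(validateClip("audio/webm", 12)); // { ok: true }
```

Failing fast like this keeps malformed uploads from surfacing as confusing errors deep inside the transcription or cloning stages.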
Accomplishments that we're proud of
- Fully functional prototype: We built a working end-to-end voice recording, transcription, and cloning pipeline in under 36 hours.
- Beginner team achievement: Two first-time hackers contributed to a complex AI system.
- Impressively realistic output: The cloned voices sound natural, making our demos genuinely exciting to show.
What we learned
- How to collaborate efficiently under time pressure as a mixed-experience team.
- How speech-to-text and voice synthesis technologies connect to form a complete audio pipeline.
- How to debug real-world audio processing issues and quickly adapt to new tools.
- That even as beginners, we can tackle complex AI challenges and build something meaningful.
What's next for VoiceKeep
- Refined web app: Build a polished, user-friendly UI for wider use.
- Multilingual support: Expand to support multiple languages and accents.
- Voice preservation libraries: Allow users to save and manage cloned voices for storytelling or family archiving.
- Advanced editing tools: Add customization features for creative voice applications.
