Inspiration

We've all got Notes apps full of half-written lyrics and 2 a.m. voice memos that we never actually finish. The ideas are there – we can hear the hooks and the beats clearly in our heads – but there's usually a big gap between having a spark of an idea and actually sitting down to produce it.

At the same time, we love the process of writing music. We didn't want a "generate" button that does everything for us; we wanted to stay in the driver's seat – choosing the words, tweaking the melodies, and building the track piece by piece.

We built ProduceThing to help bridge that gap. It's a studio that helps you finish what you start, acting as a collaborator that handles the technical friction while you keep the creative control.

What it does

ProduceThing is an AI music studio built around writing.

Grammarly for lyrics: A real-time lyric editor that highlights clichés, filler words, repetition, and weak phrasing as you type – with one-click tighten, de-cliché, punch-up, and hook suggestions.

Cursor for songwriting: Ghost-text autocomplete that understands your rhyme scheme, cadence, and full-song context. Press Tab to accept.

AI co-writer + agent: Chat with GPT/Claude to iterate on lines, fix flow, or generate ideas. In autonomous mode, Claude can compose, layer, and refine an entire session from a single prompt.

Layer-based production: Tracks are split into stems so you can add, remove, regenerate, mute/solo, A/B compare, and export – like a lightweight DAW.

Music video generation: Generate a lip-synced video from a selfie with AI backgrounds and share-ready output.

How we built it

The Frontend: Built with Next.js 16, React 19, and TypeScript. We used a transparent textarea layered over a highlighted overlay – the same pattern used in modern code editors – to provide real-time lyric analysis without breaking the writing flow.
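The core of that overlay pattern is a pure function that turns the raw lyric text into markup, rendered in a div positioned exactly behind the transparent textarea. A minimal sketch (names and the filler-word list are illustrative, not ProduceThing's actual code):

```typescript
// Words flagged here are examples only; the real engine tracks many more.
const FILLER_WORDS = new Set(["just", "really", "very", "kinda"]);

function highlightLyrics(text: string): string {
  // Escape HTML first so user input can't inject markup into the overlay.
  const escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  // Wrap filler words in <mark> spans. The textarea's own text stays
  // untouched; only the overlay behind it is re-rendered on each keystroke.
  return escaped.replace(/\b[\w']+\b/g, (word) =>
    FILLER_WORDS.has(word.toLowerCase())
      ? `<mark class="filler">${word}</mark>`
      : word
  );
}
```

Because the textarea is transparent and the overlay sits beneath it, the highlights appear "under" the user's text without ever stealing focus or caret position.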

The Audio Engine: We integrated the Suno API for music generation and 12-stem separation. To speed things up, we deployed htdemucs on Modal using T4 GPUs for parallel 3-stem separation. We ran both pipelines in a race and served whichever finished first – usually Modal, at ~20s versus Suno's ~60s.
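The race itself can be sketched with `Promise.race` (function names here are illustrative, not the actual ProduceThing code). The one subtlety is that the losing pipeline's promise may still reject later, so it needs a no-op handler to avoid unhandled-rejection noise:

```typescript
type Stems = Record<string, string>; // stem name -> audio URL

async function separateStems(
  runModal: () => Promise<Stems>,
  runSuno: () => Promise<Stems>
): Promise<Stems> {
  // Attach a no-op catch to each promise so a late failure from the
  // losing pipeline doesn't surface after the race is already decided.
  const guard = (p: Promise<Stems>) => {
    p.catch(() => {});
    return p;
  };
  // Both backends start immediately; the first to settle wins.
  return Promise.race([guard(runModal()), guard(runSuno())]);
}
```

Both requests fire in parallel either way, so the worst case is no slower than the faster backend alone.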

AI Orchestration: Powered by Vercel AI SDK v6 with dual-model routing:

  • GPT-5 Nano handles fast chat and lyric rewrites.
  • Claude Opus 4.6 manages the autonomous "Agent Mode."
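The routing decision itself is simple enough to sketch as a pure function (model ID strings and task names are illustrative placeholders, not the exact identifiers we pass to the SDK):

```typescript
type Task = "chat" | "lyric-rewrite" | "agent";

// Latency-sensitive interactive work goes to the small fast model;
// the autonomous multi-step agent loop goes to the large one.
function pickModel(task: Task): string {
  switch (task) {
    case "chat":
    case "lyric-rewrite":
      return "gpt-5-nano"; // fast path: interactive edits
    case "agent":
      return "claude-opus-4-6"; // slow path: session planning
  }
}
```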

Lyric Analysis Engine: A custom client-side engine that tracks 40+ cliché phrases and 15 filler words, and applies suffix-normalized rhyme clustering and syllable counting.
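The syllable counter is a good example of the rule-based side of this engine. A vowel-group heuristic of the kind such editors typically use (this is an illustrative sketch, not ProduceThing's exact rules):

```typescript
function countSyllables(word: string): number {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length === 0) return 0;
  // Each run of consecutive vowels is counted as one syllable nucleus...
  let count = (w.match(/[aeiouy]+/g) ?? []).length;
  // ...then a common silent trailing "e" is dropped ("note" -> 1, not 2),
  // except for "-le" endings like "table", where the "e" is voiced.
  if (w.endsWith("e") && !w.endsWith("le") && count > 1) count--;
  return Math.max(count, 1);
}
```

It's wrong on edge cases ("poem", "fire"), but it's deterministic and instant, which matters more in a real-time editor than dictionary-perfect accuracy.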

UI/UX: Styled with Tailwind CSS 4 and Radix UI, with multi-track mixing handled via the Web Audio API and waveform-playlist.
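Mute/solo in the mixer follows standard DAW semantics: if any track is soloed, only soloed tracks are audible; otherwise every unmuted track plays. A sketch of that rule as a pure function (illustrative, not tied to waveform-playlist's API) whose 0/1 results would be applied to per-track GainNodes in the Web Audio graph:

```typescript
interface Track {
  muted: boolean;
  soloed: boolean;
}

// Returns one gain value per track: 0 = silent, 1 = audible.
function effectiveGains(tracks: Track[]): number[] {
  const anySolo = tracks.some((t) => t.soloed);
  return tracks.map((t) => {
    if (t.muted) return 0; // mute always wins
    return anySolo ? (t.soloed ? 1 : 0) : 1;
  });
}
```

Keeping this logic out of the audio graph makes A/B comparisons trivial: toggling a flag recomputes the gain array, and only the changed GainNodes need updating.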

Challenges we ran into

Defining the "unexplored" space: AI tools already offer massive flexibility in generating sounds, but we realized the lyric space was still largely a "black box" of generic text. We spent hours riffing with mentors and sponsors, trying to figure out where we could actually contribute.

Designing the agent: This wasn't just engineering – it was musicology. We had to teach it the nuances of bridge transitions, tension and release, and song architecture so it could act as a peer, not a script.

Balancing vibes and rigid songwriting constraints: Lyrics are a messy mix of math and soul. We had to build an editor that tracks rigid constraints – like syllable counts and rhyme density – without making the writer feel like they were filling out a spreadsheet. The challenge was keeping the creative flow alive while the engine calculated the structure in the background.

Accomplishments that we're proud of

We successfully built a parallelized audio pipeline that races Modal (T4 GPUs) against Suno's native processing. Dropping stem-separation time from ~60s to ~20s wasn't just a performance win; it preserved the creative flow of the entire app.

The Lyric Coach actually coaches: Our client-side analysis engine doesn't just find rhymes; it genuinely improves writing. Seeing it catch clichés and filler words in real-time makes the AI feel like a rigorous editor rather than just a ghostwriter.

Getting Claude Opus 4.6 to autonomously plan and execute a multi-track session – orchestrating lyrics, structure, and stems – was our biggest technical breakthrough. Watching it "think" through a song structure is a true "wow" moment.

What we learned

Agency > Automation: Most people don't want an AI to make a song for them; they want the AI to help them make a song better. Control is the most important feature we built.

Hybrid Logic wins: LLMs are great for brainstorming, but they're bad at counting. Combining hard-coded rule-based analysis (syllable counting) with LLM rewrites gave us much better results than an LLM alone.

When building for artists, you have to engineer for flow. A 60-second wait for stem separation is enough to break a writer's concentration. We learned that engineering for speed – like our parallelized Modal + Suno pipeline – isn't a technical "nice-to-have"; it's a requirement for keeping the artist in a flow state.

What's next for ProduceThing

Live vocal coaching. We want to move beyond lyrics and stems by adding real-time recording capabilities. Using the same logic as our lyric analyzer, we'll build a vocal coach that provides instant feedback on pitch, rhythm, and delivery as you record.

Translation into other languages. This is a fun challenge because translating a song isn't just swapping words – it's about preserving the rhyme and the rhythm. We want to build a tool that helps artists "port" their tracks into new languages without losing the syllable count or the "vibe."

Built With

  • gpt5
  • modal
  • next.js
  • opus4.6
  • radix
  • react
  • suno
  • tailwind
  • typescript
  • vercel
  • waveform-playlist