Inspiration

We've all got Notes apps full of half-written lyrics and 2 a.m. voice memos that we never actually finish. The ideas are there – we can hear the hooks and the beats clearly in our heads – but there's usually a big gap between having a spark of an idea and actually sitting down to produce it.

At the same time, we love the process of writing music. We didn't want a "generate" button that does everything for us; we wanted to stay in the driver's seat – choosing the words, tweaking the melodies, and building the track piece by piece.

We built ProduceThing to help bridge that gap. It's a studio that helps you finish what you start, acting as a collaborator that handles the technical friction while you keep the creative control.

What it does

ProduceThing is an AI music studio built around writing.

Grammarly for lyrics: A real-time lyric editor that highlights clichés, filler words, repetition, and weak phrasing as you type – with one-click tighten, de-cliché, punch-up, and hook suggestions.

Cursor for songwriting: Ghost-text autocomplete that understands your rhyme scheme, cadence, and full-song context. Press Tab to accept.

AI co-writer + agent: Chat with GPT/Claude to iterate on lines, fix flow, or generate ideas. In autonomous mode, Claude can compose, layer, and refine an entire session from a single prompt.

Layer-based production: Tracks are split into stems so you can add, remove, regenerate, mute/solo, A/B compare, and export – like a lightweight DAW.

Music video generation: Generate a lip-synced video from a selfie with AI backgrounds and share-ready output.

How we built it

The Frontend: Built with Next.js 16, React 19, and TypeScript. We used a transparent textarea layered over a highlighted overlay – the same pattern used in modern code editors – to provide real-time lyric analysis without breaking the writing flow.
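The core of that overlay pattern is a pure function that turns the raw lyric text into markup, rendered in a div positioned exactly behind the transparent textarea. A minimal sketch (names and the filler-word list are illustrative, not ProduceThing's actual code):

```typescript
// Words flagged here are examples only; the real engine tracks many more.
const FILLER_WORDS = new Set(["just", "really", "very", "kinda"]);

function highlightLyrics(text: string): string {
  // Escape HTML first so user input can't inject markup into the overlay.
  const escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  // Wrap filler words in <mark> spans. The textarea's own text stays
  // untouched; only the overlay behind it is re-rendered on each keystroke.
  return escaped.replace(/\b[\w']+\b/g, (word) =>
    FILLER_WORDS.has(word.toLowerCase())
      ? `<mark class="filler">${word}</mark>`
      : word
  );
}
```

Because the textarea is transparent and the overlay sits beneath it, the highlights appear "under" the user's text without ever stealing focus or caret position.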

The Audio Engine: We integrated the Suno API for music generation and 12-stem separation. To speed things up, we deployed htdemucs on Modal using T4 GPUs for parallel 3-stem separation. We ran both pipelines in a race and served whichever finished first – usually Modal, at ~20s versus Suno's ~60s.
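The race itself can be sketched with `Promise.race` (function names here are illustrative, not the actual ProduceThing code). The one subtlety is that the losing pipeline's promise may still reject later, so it needs a no-op handler to avoid unhandled-rejection noise:

```typescript
type Stems = Record<string, string>; // stem name -> audio URL

async function separateStems(
  runModal: () => Promise<Stems>,
  runSuno: () => Promise<Stems>
): Promise<Stems> {
  // Attach a no-op catch to each promise so a late failure from the
  // losing pipeline doesn't surface after the race is already decided.
  const guard = (p: Promise<Stems>) => {
    p.catch(() => {});
    return p;
  };
  // Both backends start immediately; the first to settle wins.
  return Promise.race([guard(runModal()), guard(runSuno())]);
}
```

Both requests fire in parallel either way, so the worst case is no slower than the faster backend alone.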

AI Orchestration: Powered by Vercel AI SDK v6 with dual-model routing:

  • GPT-5 Nano handles fast chat and lyric rewrites.
  • Claude Opus 4.6 manages the autonomous "Agent Mode."
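The routing decision itself is simple enough to sketch as a pure function (model ID strings and task names are illustrative placeholders, not the exact identifiers we pass to the SDK):

```typescript
type Task = "chat" | "lyric-rewrite" | "agent";

// Latency-sensitive interactive work goes to the small fast model;
// the autonomous multi-step agent loop goes to the large one.
function pickModel(task: Task): string {
  switch (task) {
    case "chat":
    case "lyric-rewrite":
      return "gpt-5-nano"; // fast path: interactive edits
    case "agent":
      return "claude-opus-4-6"; // slow path: session planning
  }
}
```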

Lyric Analysis Engine: A custom client-side engine that tracks 40+ cliché phrases and 15 filler words, and applies suffix-normalized rhyme clustering and syllable counting.
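The syllable counter is a good example of the rule-based side of this engine. A vowel-group heuristic of the kind such editors typically use (this is an illustrative sketch, not ProduceThing's exact rules):

```typescript
function countSyllables(word: string): number {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length === 0) return 0;
  // Each run of consecutive vowels is counted as one syllable nucleus...
  let count = (w.match(/[aeiouy]+/g) ?? []).length;
  // ...then a common silent trailing "e" is dropped ("note" -> 1, not 2),
  // except for "-le" endings like "table", where the "e" is voiced.
  if (w.endsWith("e") && !w.endsWith("le") && count > 1) count--;
  return Math.max(count, 1);
}
```

It's wrong on edge cases ("poem", "fire"), but it's deterministic and instant, which matters more in a real-time editor than dictionary-perfect accuracy.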

UI/UX: Styled with Tailwind CSS 4 and Radix UI, with multi-track mixing handled via the Web Audio API and waveform-playlist.
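Mute/solo in the mixer follows standard DAW semantics: if any track is soloed, only soloed tracks are audible; otherwise every unmuted track plays. A sketch of that rule as a pure function (illustrative, not tied to waveform-playlist's API) whose 0/1 results would be applied to per-track GainNodes in the Web Audio graph:

```typescript
interface Track {
  muted: boolean;
  soloed: boolean;
}

// Returns one gain value per track: 0 = silent, 1 = audible.
function effectiveGains(tracks: Track[]): number[] {
  const anySolo = tracks.some((t) => t.soloed);
  return tracks.map((t) => {
    if (t.muted) return 0; // mute always wins
    return anySolo ? (t.soloed ? 1 : 0) : 1;
  });
}
```

Keeping this logic out of the audio graph makes A/B comparisons trivial: toggling a flag recomputes the gain array, and only the changed GainNodes need updating.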

Challenges we ran into

Defining the "unexplored" space: AI tools already offer massive flexibility in generating sounds, but we realized the lyric space was still largely a "black box" of generic text. We spent hours riffing with mentors and sponsors, trying to figure out where we could actually contribute.

Designing the agent: This wasn't just engineering – it was musicology. We had to teach it the nuances of bridge transitions, tension and release, and song architecture so it could act as a peer, not a script.

Balancing vibes and rigid songwriting constraints: Lyrics are a messy mix of math and soul. We had to build an editor that tracks rigid constraints – like syllable counts and rhyme density – without making the writer feel like they were filling out a spreadsheet. The challenge was keeping the creative flow alive while the engine calculated the structure in the background.

Accomplishments that we're proud of

We successfully built a parallelized audio pipeline that races Modal (T4 GPUs) against Suno's native processing. Dropping stem-separation time from ~60s to ~20s wasn't just a performance win; it preserved the creative flow of the entire app.

The Lyric Coach actually coaches: Our client-side analysis engine doesn't just find rhymes; it genuinely improves writing. Seeing it catch clichés and filler words in real-time makes the AI feel like a rigorous editor rather than just a ghostwriter.

Getting Claude Opus 4.6 to autonomously plan and execute a multi-track session – orchestrating lyrics, structure, and stems – was our biggest technical breakthrough. Watching it "think" through a song structure is a true "wow" moment.

What we learned

Agency > Automation: Most people don't want an AI to make a song for them; they want the AI to help them make a song better. Control is the most important feature we built.

Hybrid Logic wins: LLMs are great for brainstorming, but they're bad at counting. Combining hard-coded rule-based analysis (syllable counting) with LLM rewrites gave us much better results than an LLM alone.

When building for artists, you have to engineer for flow. A 60-second wait for stem separation is enough to break a writer's concentration. We learned that engineering for speed – like our parallelized Modal + Suno pipeline – isn't a technical "nice-to-have"; it's a requirement for keeping the artist in a flow state.

What's next for ProduceThing

Live vocal coaching. We want to move beyond lyrics and stems by adding real-time recording capabilities. Using the same logic as our lyric analyzer, we'll build a vocal coach that provides instant feedback on pitch, rhythm, and delivery as you record.

Translation into other languages. This is a fun challenge because translating a song isn't just swapping words – it's about preserving the rhyme and the rhythm. We want to build a tool that helps artists "port" their tracks into new languages without losing the syllable count or the "vibe."

Built With

  • gpt5
  • modal
  • next.js
  • opus4.6
  • radix
  • react
  • suno
  • tailwind
  • typescript
  • vercel
  • waveform-playlist