Inspiration

After a dozen hackathons, Vishnu and Alex remember always struggling to record and edit demos at the last minute. Even when things weren't down to the wire, meticulous editing with no time to spare is not a great way to impress. We thought about how we could improve our workflow and decided to fully automate the editing phase and embed it directly in the recording software. That is how we came up with FlowStudio.

We introduced Isabelle and Richard to the idea, and they were keen to jump in. Isabelle was fascinated by the chance to build software with a fluid, intentional frontend design philosophy, guiding users through a deliberate journey. Richard was drawn to the business opportunity: with FlowStudio we could target professionals making corporate demos, creators producing YouTube tutorials, and individuals who just need this problem solved. There is a huge market for it, and since we are essentially our own ideal customers, we decided to embark on the development of FlowStudio.

What it does

FlowStudio is recording software that outputs a ready-to-share, fully edited video by tracking your cursor movements, mouse clicks, and keyboard input and applying context-aware autonomous editing. Instead of just capturing pixels, it captures intent. Users still have full control over the effects applied in the post-production phase in case they want manual control, and they can manage all of their recording projects directly in the cloud. It is essentially a powerful video editor built right into a recording tool, complete with its own creative file system.

How we built it

We built the app with Next.js 16, TypeScript, and React. The frontend employs beautiful typography and imagery sourced from Pexels, aiming for an "Amber on Charcoal" aesthetic inspired by the iOS game Dune. We used GSAP for fluid animations that reinforce the design philosophy of our application, making sure that every piece of motion carries meaning instead of being purely decorative.

Our backend is quite extensive. FlowStudio is a strict-mode TypeScript monorepo with 254 files and around 39,000 lines of code across 19 packages, all managed with pnpm workspaces. The backbone of our data layer is SpacetimeDB v2, a real-time WASM database that replaces a traditional Postgres, Redis, and Celery stack. It runs on a Google Compute Engine VM and handles task coordination, state management, and automatic task chaining entirely over WebSockets with instant push subscriptions.
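The automatic task chaining can be pictured roughly like this. This is a self-contained TypeScript sketch; the `TaskChain` class and its methods are our illustration of the idea, not the real SpacetimeDB API:

```typescript
// Sketch of automatic task chaining: completing one task immediately
// triggers its successor, the way a database reducer pushes the next
// unit of work to subscribers. All names here are illustrative.
type TaskName = string;

class TaskChain {
  private next = new Map<TaskName, TaskName>();
  log: TaskName[] = [];

  chain(from: TaskName, to: TaskName): void {
    this.next.set(from, to);
  }

  // Acts like a reducer: marking a task done enqueues the next one.
  complete(task: TaskName): void {
    this.log.push(task);
    const successor = this.next.get(task);
    if (successor !== undefined) this.complete(successor);
  }
}

const chain = new TaskChain();
chain.chain("extract-audio", "transcribe");
chain.chain("transcribe", "plan-edits");
chain.complete("extract-audio");
console.log(chain.log); // ["extract-audio", "transcribe", "plan-edits"]
```

In the real system this coordination lives inside SpacetimeDB and is pushed to clients over WebSockets rather than driven by a local method call.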

Our video processing pipeline consists of 13 specialized workers running as serverless Google Cloud Run services. We organized this into a directed acyclic graph. When a user uploads a screen recording, the pipeline runs automatically. First, signal extraction happens in parallel. FFmpeg extracts audio, video frames are sampled, cursor and keyboard interactions are tracked, Deepgram Nova-2 transcribes speech, Gemini analyzes video frames multimodally, and UI changes are detected through frame differencing. Next, an interaction clustering stage merges cursor and typing signals into meaningful patterns. From there, three sequential AI planning stages powered by Claude Sonnet 4 build an intent graph, create a narrative plan, and produce specific edit decisions. Finally, FFmpeg assembles the timeline into a rendered output video.
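The pipeline shape described above can be sketched as a small DAG runner. The stage names follow the description; the runner itself, and the exact dependency edges, are our simplification:

```typescript
// Simplified sketch of the worker DAG: each stage lists its
// dependencies, and stages execute in "waves" so that independent
// signal-extraction stages run in parallel while the planning
// stages run sequentially. Illustrative only.
const deps: Record<string, string[]> = {
  "extract-audio": [],
  "sample-frames": [],
  "track-input": [],
  "transcribe": ["extract-audio"],
  "analyze-frames": ["sample-frames"],
  "diff-ui": ["sample-frames"],
  "cluster-interactions": ["track-input"],
  "intent-graph": ["transcribe", "analyze-frames", "diff-ui", "cluster-interactions"],
  "narrative-plan": ["intent-graph"],
  "edit-decisions": ["narrative-plan"],
  "render": ["edit-decisions"],
};

function executionWaves(graph: Record<string, string[]>): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  while (done.size < Object.keys(graph).length) {
    // A stage is ready once every dependency has completed.
    const wave = Object.keys(graph).filter(
      (s) => !done.has(s) && graph[s].every((d) => done.has(d))
    );
    if (wave.length === 0) throw new Error("cycle detected");
    wave.forEach((s) => done.add(s));
    waves.push(wave);
  }
  return waves;
}

// First wave: the independent extraction stages run in parallel.
console.log(executionWaves(deps)[0]); // ["extract-audio", "sample-frames", "track-input"]
```

In production each stage is a Cloud Run worker and SpacetimeDB plays the role of the coordinator deciding which stages are ready.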

On top of this, we built the Railtracks Gateway, a Python FastAPI agentic orchestration layer. Here, users can reprompt the AI in natural language to refine their edits, much like prompting an AI code editor. Three agents run sequentially to handle intent, narrative, and edit planning with full observability.
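The sequential hand-off between the three agents looks roughly like this. The gateway itself is Python/FastAPI; this TypeScript sketch, with made-up types, only shows the pipeline shape:

```typescript
// Hedged sketch of a three-stage agent chain: each agent receives
// the previous agent's output plus the user's prompt, and the plan
// accumulates intent -> narrative -> concrete edits. Illustrative
// names only; not the real Railtracks Gateway API.
type Plan = { intent: string; narrative?: string; edits?: string[] };

const agents: Array<(prompt: string, plan: Plan) => Plan> = [
  (prompt, plan) => ({ ...plan, intent: `intent for: ${prompt}` }),
  (_prompt, plan) => ({ ...plan, narrative: `narrative from ${plan.intent}` }),
  (_prompt, plan) => ({ ...plan, edits: [`edit derived from ${plan.narrative}`] }),
];

function refine(prompt: string): Plan {
  // Sequential pipeline: each agent builds on the last one's output.
  return agents.reduce((plan, agent) => agent(prompt, plan), { intent: "" });
}

console.log(refine("trim the intro").edits?.length); // 1
```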

All of our infrastructure is defined as code with Terraform, covering VPC networking, 15 Cloud Run services, a GCE VM, Google Cloud Storage, Secret Manager, and Artifact Registry. We ran 10 code review sweeps across the entire codebase, identifying around 130 potential issues and fixing 38 confirmed bugs.

Challenges we ran into

The hardest bug Richard faced had no error message. Drag-and-drop on the editor would silently fail: the animation would fire, but the result would not stick. Without errors, it was hard to trace where the code went wrong. After extensive digging, he discovered that defining React sub-components inside a parent component causes React to destroy and recreate their DOM nodes on every state update, which kills the browser's active drag session. The fix required rearchitecting the timeline into module-level components connected by React Context. It was an unorthodox way of solving the issue, but highly enlightening.
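The root cause can be demonstrated without a browser: a component defined inside the parent gets a new function identity on every render, so React treats it as a brand-new component type and remounts its DOM subtree, cancelling the active drag. A plain TypeScript sketch of the identity problem (the component bodies are stand-ins, not our real timeline code):

```typescript
// Anti-pattern: a component type defined inside the parent's render
// is a *new* function object on every render, so React unmounts and
// remounts its subtree each time -- killing the drag session.
function renderTimelineBad() {
  const Clip = () => "clip"; // redefined on every render
  return Clip;
}

// Fix: define the component once at module level (shared state moves
// into React Context instead of closures), so its identity is stable
// and React preserves the existing DOM nodes across renders.
const StableClip = () => "clip";
function renderTimelineGood() {
  return StableClip;
}

console.log(renderTimelineBad() === renderTimelineBad());   // false: new type each render
console.log(renderTimelineGood() === renderTimelineGood()); // true: stable type
```

React compares component types by reference, so the `false` case is exactly what makes it tear down and rebuild the DOM on each state update.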

Isabelle faced the challenge of balancing the technical logic of the product with the emotional experience it delivers. Translating feelings like "this should feel like relief" into actionable design guidance and conveying design reasoning clearly to the team was difficult but highly rewarding.

For our DevOps and backend deployment, we had a grueling post-mortem after deploying our 13 Cloud Run workers. Initially, every worker crashed at startup with a syntax error because SpacetimeDB uses ES2024 declarations; we had to upgrade our Node environments from Node 20 to Node 24 before the runtime understood the syntax. Immediately after fixing that, we hit a protocol error that completely broke the modules. We were ready to assume the architecture was fundamentally limited until we traced the issue to a single SDK import line that was mistakenly pulling in the WASM runtime instead of the vanilla Node runtime. It was a valuable lesson in digging deeper before blaming the software package.

Accomplishments that we're proud of

Alex is really proud to ship another frontend design that is truly clean and intentional. He meticulously brainstormed the frontend design philosophy with Isabelle to target the exact pain points we were solving, resulting in an app that emotionally appeals to the user with smooth animations and empowering typography.

Vishnu is incredibly proud of his backend engineering. He began by drawing out the full architecture and infrastructure on Excalidraw, and by following a strict test-driven development paradigm he kept development precise. We are proud to have shipped such a strong backend while working closely with all teammates.

Our team is especially proud of our communication. We regularly checked in with each other to stay synced on progress so that no parts were left behind across the frontend, backend, or graphic design. We also passed all 10 of our strict verification conditions, maintaining type safety, zero hardcoded secrets, input validation on all reducers, and error handling on all asynchronous operations.

What we learned

Alex learned that meticulously working through the frontend by starting with intentionality, design, and philosophy helps immensely. Drawing the designs out by hand before any coding began gave us a completely clear picture of the final product.

Isabelle learned that good product design is not just about appearance. It is about curating a rewarding emotional journey and ensuring every visual and interactive choice serves both function and story. Every screen needs to communicate progress and purpose.

Vishnu learned the value of documenting the expected behavior of everything as he developed the backend. Even with years of coding experience, adopting a test-driven development approach made his prompt-driven work far more efficient and effective.

Richard learned a lot about bleeding-edge React lifecycle idiosyncrasies by solving the timeline drag-and-drop bug. Learning about this technical niche allows him to create stronger, more reliable frontend designs moving forward.

We all learned a valuable operational lesson: run the container, not just the build. A clean TypeScript compilation only tells you the types check out, not whether the runtime will survive first contact with an environment import. We also learned to always verify problems empirically before assuming it is a fault of the upstream software.

What's next for FlowStudio

We intend to grow FlowStudio into a genuine product for the creator and product-demo market, serving sales, marketing, and communications teams that need to create professional corporate videos seamlessly. We plan to launch FlowStudio as a Tauri desktop app, making it available for download and commercialization.

We plan to optimize the process flow of the app to bring more functionality without killing the minimalism. We will be adding browser screen capture integration to connect the MediaRecorder API directly to the pipeline for richer signal extraction. We also aim to build a real-time preview system using Canvas and WebCodecs so users can see AI-generated edits instantly in a non-destructive view before the final FFmpeg render. Finally, we want to expand our collaborative editing features to leverage SpacetimeDB's real-time push architecture to let teams work on the same video project simultaneously.
