Videmo

A video messaging platform with AI-powered summarization.

Howdy!

Modern async communication relies heavily on video, so I wanted to understand what goes into building a platform like Loom from the ground up. This meant tackling video capture, upload pipelines, real-time processing, AI integration, and multi-platform deployment, all while maintaining a clean user experience.

What I Built

Cross-Platform Recording

Built a cross-platform desktop app with Electron that captures screen and camera footage, then streams uploads directly to the backend. The recorder runs on macOS, Windows, and Linux, and handles video encoding and chunked uploads.
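
To make the chunked-upload idea concrete, here is a minimal sketch of how a recording could be split into byte ranges before sending. The `planChunks` helper, the `ChunkRange` shape, and the 5 MiB chunk size are assumptions for illustration, not the recorder's actual code.

```typescript
// Hypothetical shape describing one slice of a recording to upload.
interface ChunkRange {
  index: number; // zero-based position of the chunk
  start: number; // inclusive byte offset
  end: number;   // exclusive byte offset
}

// Split a file of `totalBytes` into ranges of at most `chunkSize` bytes.
function planChunks(totalBytes: number, chunkSize: number): ChunkRange[] {
  if (Math.floor(totalBytes) !== totalBytes || totalBytes < 0) {
    throw new RangeError("totalBytes must be a non-negative integer");
  }
  if (Math.floor(chunkSize) !== chunkSize || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const ranges: ChunkRange[] = [];
  let index = 0;
  for (let start = 0; start < totalBytes; start += chunkSize) {
    // The last chunk is clamped so it never runs past the end of the file.
    ranges.push({ index, start, end: Math.min(start + chunkSize, totalBytes) });
    index += 1;
  }
  return ranges;
}

// A 12 MiB recording with an assumed 5 MiB chunk size.
const MIB = 1024 * 1024;
const plan = planChunks(12 * MIB, 5 * MIB);
console.log(plan.length); // 3
```

Each range maps directly onto a `Blob.slice(start, end)` call on the recorded file, so every chunk can be uploaded and retried independently.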

Web Dashboard

Created a Next.js web application where users can view, manage, and share their recordings. The interface displays video metadata, auto-generated subtitles, and AI summaries inline, making it easy to scan through content without watching entire videos.

Key Features:

  • Clean, modern UI built with Tailwind and Radix components
  • Secure authentication via Clerk
  • Real-time upload progress and processing status
  • Share links for easy distribution

Processing Pipeline

Designed a Node.js backend that orchestrates the entire video lifecycle:

  • Receives uploads from desktop and web clients
  • Stores videos in S3-compatible storage (MinIO)
  • Triggers AI processing workflows
  • Broadcasts real-time updates via Socket.io
  • Manages metadata and user permissions
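
One way to picture the lifecycle above is as a small state machine that notifies listeners on every transition. The sketch below is an assumption-heavy illustration: the status names, the `VideoLifecycle` class, and the `notify` callback are all hypothetical. In the real backend the callback's role would be played by a Socket.io broadcast.

```typescript
// Hypothetical processing stages for a single recording.
type VideoStatus =
  | "uploading"
  | "stored"
  | "transcribing"
  | "summarizing"
  | "ready"
  | "failed";

// Which stages may follow each stage; "ready" and "failed" are terminal.
const ALLOWED: Record<VideoStatus, VideoStatus[]> = {
  uploading: ["stored", "failed"],
  stored: ["transcribing", "failed"],
  transcribing: ["summarizing", "failed"],
  summarizing: ["ready", "failed"],
  ready: [],
  failed: [],
};

interface StatusUpdate {
  videoId: string;
  status: VideoStatus;
}

class VideoLifecycle {
  private status: VideoStatus = "uploading";
  private readonly videoId: string;
  private readonly notify: (update: StatusUpdate) => void;

  constructor(videoId: string, notify: (update: StatusUpdate) => void) {
    this.videoId = videoId;
    this.notify = notify;
  }

  // Move to the next stage, rejecting transitions the table does not allow.
  advance(next: VideoStatus): void {
    if (ALLOWED[this.status].indexOf(next) === -1) {
      throw new Error(`Invalid transition: ${this.status} -> ${next}`);
    }
    this.status = next;
    this.notify({ videoId: this.videoId, status: next });
  }
}

// Collect updates in memory; a server would broadcast them to clients instead.
const updates: StatusUpdate[] = [];
const lifecycle = new VideoLifecycle("video-123", (update) => {
  updates.push(update);
});
lifecycle.advance("stored");
lifecycle.advance("transcribing");
```

Keeping the allowed transitions in one table means a client never sees a recording jump from "uploading" straight to "ready", which keeps progress indicators consistent.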

AI Integration

Integrated Whisper for accurate speech-to-text transcription and Mistral for generating concise summaries. This automation removes the manual work of documenting video content and makes recordings searchable.
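
The chaining itself can be sketched as two injected steps run in sequence. Everything named here (`processRecording`, the `Transcribe` and `Summarize` signatures, and the stub functions) is hypothetical; the real steps would call Whisper and Mistral, whose client APIs are deliberately left out.

```typescript
// Hypothetical step signatures; real implementations would call Whisper and Mistral.
type Transcribe = (audio: Uint8Array) => Promise<string>;
type Summarize = (transcript: string) => Promise<string>;

interface ProcessedRecording {
  transcript: string;
  summary: string | null; // null when there was no speech to summarize
}

// Run transcription first, then summarize only if there is text to work with.
async function processRecording(
  audio: Uint8Array,
  transcribe: Transcribe,
  summarize: Summarize
): Promise<ProcessedRecording> {
  const transcript = (await transcribe(audio)).trim();
  if (transcript.length === 0) {
    // Skip the summarization call entirely for silent recordings.
    return { transcript, summary: null };
  }
  const summary = (await summarize(transcript)).trim();
  return { transcript, summary };
}

// Stub steps for local experimentation, standing in for the real models.
const fakeTranscribe: Transcribe = async () =>
  "  Welcome to the weekly product update.  ";
const fakeSummarize: Summarize = async (text) =>
  `Summary (${text.split(/\s+/).length} words)`;

processRecording(new Uint8Array(0), fakeTranscribe, fakeSummarize).then(
  (result) => console.log(result.summary)
);
```

Passing the two steps in as functions keeps the orchestration testable without network access, and makes it straightforward to swap either model later.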

Technical Architecture

Frontend Stack:

  • Next.js 15 with React 19 for the web app
  • Electron + Vite for the desktop recorder
  • TypeScript throughout for type safety
  • Radix UI primitives for accessible components

Backend Infrastructure:

  • Express.js API server
  • Socket.io for real-time communication
  • Prisma ORM for database operations
  • AWS SDK for S3-compatible storage

AI & Processing:

  • Whisper API for transcription
  • Mistral for summarization
  • Automated subtitle generation
  • Background job processing
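
Subtitle generation mostly comes down to formatting timed transcript segments. The sketch below assumes the transcription step yields segments with `start` and `end` times in seconds and emits WebVTT; the `Segment` type and both function names are illustrative rather than taken from the codebase.

```typescript
// Assumed shape of one transcribed segment, with times in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a number with zeros to a fixed width.
function pad(value: number, width: number): string {
  let text = String(value);
  while (text.length < width) {
    text = "0" + text;
  }
  return text;
}

// Format seconds as a WebVTT timestamp: hh:mm:ss.ttt
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Build a WebVTT document: a header line, then one numbered cue per segment.
function toWebVtt(segments: Segment[]): string {
  const cues = segments.map(
    (seg, i) =>
      `${i + 1}\n${formatTimestamp(seg.start)} --> ${formatTimestamp(seg.end)}\n${seg.text.trim()}`
  );
  return ["WEBVTT"].concat(cues).join("\n\n") + "\n";
}

console.log(formatTimestamp(3661.5)); // 01:01:01.500
console.log(toWebVtt([{ start: 0, end: 2.5, text: " Hello and welcome. " }]));
```

The resulting text can be served as a `.vtt` file and attached to an HTML `<video>` element through a `<track>` element.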

What I Learned

Building Videmo end-to-end taught me about:

  • Video processing complexity: Handling different formats, codecs, and streaming uploads
  • Multi-platform development: Shipping a consistent experience across web and desktop
  • Real-time architecture: Using WebSockets to keep clients synchronized during long-running operations
  • AI integration: Chaining transcription and summarization models into a smooth workflow
  • Performance optimization: Managing large file uploads and video streaming efficiently

Project Goals

This was never meant to be a commercial product. It’s a learning project where I could experiment with modern tooling and ship something complete. The goal was to refresh my full-stack skills while diving deep into domains I hadn’t explored much before: video processing, desktop apps, and AI pipelines.

Tech Stack

Frontend:    Next.js 15 - React 19 - TypeScript - Tailwind CSS - Radix UI
Desktop:     Electron - Vite - TypeScript
Backend:     Node.js - Express - Socket.io - Prisma
Storage:     MinIO (S3-compatible)
AI:          Whisper - Mistral
Auth:        Clerk
Database:    PostgreSQL (via Prisma)

Current Status

Videmo is functional but intentionally kept as a side project. It successfully demonstrates the core concepts I wanted to explore: multi-platform development, video processing, and AI integration. The codebase serves as both a portfolio piece and a reference for future projects.


This project represents my approach to learning: identify an interesting problem, build a complete solution, and share what I learned along the way.