Inspiration

I have run into many difficulties with hackathon submissions myself: last-minute demos, multiple requirements, a missing README at the end, no transcription available, and more. Now consider the same situation for people with disabilities, whose efforts may go to waste or not be fully recognised without a proper submission.

What it does

AccessSubmit is an AI-powered video-to-submission pipeline built with universal accessibility at its core. It transforms your demo video into a complete hackathon submission with comprehensive documentation, designed for people with disabilities.

How we built it

  • Frontend: Next.js (app router), TypeScript, Tailwind CSS
  • Backend: FastAPI, Python 3.12, runs with Uvicorn
  • Database: MongoDB (GridFS for media blobs)
  • Third-party integrations: Google Gemini (vision analysis and content generation), Cohere (content generation), ElevenLabs (TTS/STT/Agent), OpenAI Whisper (video transcription)

Challenges we ran into

  • Quota-exceeded errors from LLM/TTS providers: check billing and usage limits for Gemini, Cohere, and ElevenLabs.
  • MongoDB connection failures: verify mongodb_uri and network access (Atlas IP whitelist/VPC).
  • Large uploads failing: ensure max_file_size_mb is configured and that reverse-proxy limits (nginx) accept large request bodies.
  • Deployment on Vultr failed because of the many constraints around credits and credit card usage.
  • We could not make the auth flow accessibility-friendly, so we had to remove it (that looks like a whole project in itself 😜).
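
The configuration checks behind the MongoDB and upload-size items can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the project's actual code: the environment variable names (MONGODB_URI, MAX_FILE_SIZE_MB), the default of 100 MB, and the helper names load_settings and upload_allowed are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlparse


@dataclass(frozen=True)
class Settings:
    mongodb_uri: str
    max_file_size_mb: int

    @property
    def max_file_size_bytes(self) -> int:
        return self.max_file_size_mb * 1024 * 1024


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from an environment-like mapping (hypothetical variable names)."""
    uri = env.get("MONGODB_URI", "")
    # MongoDB connection strings use the mongodb:// or mongodb+srv:// scheme.
    if urlparse(uri).scheme not in ("mongodb", "mongodb+srv"):
        raise ValueError("MONGODB_URI must start with mongodb:// or mongodb+srv://")

    # The 100 MB default is an assumption for this sketch.
    size_mb = int(env.get("MAX_FILE_SIZE_MB", "100"))
    if size_mb <= 0:
        raise ValueError("MAX_FILE_SIZE_MB must be a positive integer")

    return Settings(mongodb_uri=uri, max_file_size_mb=size_mb)


def upload_allowed(content_length: int, settings: Settings) -> bool:
    """Reject empty or oversized uploads before reading the request body."""
    return 0 < content_length <= settings.max_file_size_bytes


settings = load_settings(
    {"MONGODB_URI": "mongodb://localhost:27017", "MAX_FILE_SIZE_MB": "50"}
)
print(upload_allowed(10 * 1024 * 1024, settings))
```

Failing fast at startup on a malformed URI or size limit keeps these problems out of request handling. The reverse proxy still needs its own matching body-size limit, since it can reject a large upload before the application sees it.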

Accomplishments that we're proud of

  • The platform is mostly accessible (we cannot give a percentage without testing it with people who have disabilities).
  • MongoDB GridFS for media storage (audio, video, screenshots, transcriptions), alongside other storage use cases.
  • ElevenLabs audio support across the entire site navigation; the sky is the limit here.
  • Gemini Vision Analyzer, which extracts metadata from video (Video -> Screenshot -> Image -> Bytes -> Content Extraction -> Generation).
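
The Video -> Screenshot -> Image -> Bytes -> Content Extraction -> Generation flow can be sketched as a small pipeline with pluggable stages. This is only an illustration of the data flow under stated assumptions: the names sample_timestamps, analyse_video, and FrameNote are hypothetical, the five-second sampling interval is an assumption, and the frame grabber, image describer, and text generator are passed in as plain callables instead of real video or Gemini calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FrameNote:
    timestamp_s: float
    description: str


def sample_timestamps(duration_s: float, every_s: float) -> List[float]:
    """Evenly spaced screenshot timestamps, starting at 0 and never past the duration."""
    if duration_s <= 0 or every_s <= 0:
        raise ValueError("duration_s and every_s must be positive")
    count = int(duration_s // every_s) + 1
    return [round(float(i * every_s), 3) for i in range(count)]


def analyse_video(
    duration_s: float,
    grab_frame: Callable[[float], bytes],          # Video -> Screenshot -> Image -> Bytes
    describe: Callable[[bytes], str],              # Bytes -> Content Extraction
    generate: Callable[[List[FrameNote]], str],    # Extracted content -> Generation
    every_s: float = 5.0,
) -> str:
    notes: List[FrameNote] = []
    for ts in sample_timestamps(duration_s, every_s):
        image_bytes = grab_frame(ts)
        notes.append(FrameNote(ts, describe(image_bytes)))
    return generate(notes)


# Stand-in stages for demonstration; real ones would call a video library and a vision model.
summary = analyse_video(
    10,
    grab_frame=lambda ts: f"frame@{ts}".encode(),
    describe=lambda data: data.decode().upper(),
    generate=lambda notes: "; ".join(
        f"{n.timestamp_s:.0f}s {n.description}" for n in notes
    ),
)
print(summary)
```

Passing the stages in as callables keeps the flow testable without network access or provider quota, and lets a different vision or generation provider be swapped in without touching the loop.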

What we learned

What's next for Access Submit

I have a lot of useful metadata that is not rendered yet, such as the transcription file, screenshots gathered from the video, generated audio for videos from non-speaking users, and more.

Built With
