- Home - List of templates
- Create Template - Chat interface
- Create Template - Form Builder
- Create Template - Dictation
- Create Template - First Message
- Create Template - AI Response
- Create Template - AI generated form
- Create Template - Manually editing form
- Create Template - Drag and drop form fields
- Create Template - Edit field details
- Template Page - Empty list of entries (new template)
- Create Entry - Respond to AI
- Create Entry - AI generated entry (page 1)
- Create Entry - AI generated entry (page 2)
- Create Entry - AI generated entry (page 3)
- Entry Details - Print off a PDF
- Template Page - List of entries with options to add new ones
Inspiration
As a Fractional CTO, my brain is my best tool and my biggest bottleneck. I can architect complex technical solutions and run through thousands of optimisations in my head in minutes. The problem? Explaining those solutions to others. Documentation is a mandatory part of my job, but getting my thoughts onto paper in a succinct way takes 20x to 100x longer than it takes to actually solve the problem.
When LLMs like Claude first introduced dictation, I thought I’d found the holy grail. I started "brain dumping" my ideas and asking the AI to write them up. But reality quickly set in. My perfectionism meant I was spending hours restructuring sections, rewording paragraphs, and correcting hallucinations. The workflow was also a mess, constantly copying Markdown between chat windows and Notion. Even worse, the AI would radically change the structure every time I made an update. I realised I didn't just need a transcriber; I needed a tool where I could pre-define a structure and "pour" my thoughts into it.
Combining this with my past experience building software for surgeons, who constantly asked for voice-driven operation notes, and for fire safety inspectors, who were buried in paper forms and WhatsApp messages, I realised this wasn't just my problem. It was a universal professional pain point. Dicto was born from the need to make speaking the fastest route to professional-grade documentation.
What it does
Dicto is a voice-first platform that transforms unstructured "brain dumps" into polished, structured records.
- Structure First: I describe a template once, and Dicto generates a Google Forms-style form with specific fields and rules.
- Natural Dictation: I speak naturally and freely. I don't have to worry about the order; I just talk.
- Smart Clarification: Dicto extracts my data into the predefined template. If I’ve missed a required field, it doesn't guess; it asks me a targeted clarifying question in a feedback loop until the gaps are filled.
- Consistency: Because the structure is locked in a schema, I can reuse it 100 times and know exactly where every piece of information will land.
- Professional Output: It renders the final Markdown into a branded, formatted PDF directly on my device. I can restyle the output into any format (email, professional record, bullet list, etc.) simply by asking.
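The Smart Clarification step above can be sketched roughly as follows. This is an illustration in Python, not Dicto's actual Dart code: the `Field` class, the `SITE_INSPECTION` template, and both function names are hypothetical, and the real extraction is done by the LLM rather than passed in as a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    required: bool = False


# Hypothetical template; in Dicto this would be generated from a description.
SITE_INSPECTION = [
    Field("site_name", required=True),
    Field("inspector", required=True),
    Field("hazards_found", required=True),
    Field("notes"),
]


def missing_required(template, extracted):
    """Return the names of required fields that are absent or blank."""
    missing = []
    for field in template:
        value = extracted.get(field.name)
        if field.required and (value is None or str(value).strip() == ""):
            missing.append(field.name)
    return missing


def clarifying_question(missing):
    """Build one targeted question, or None when nothing is missing."""
    if not missing:
        return None
    readable = ", ".join(name.replace("_", " ") for name in missing)
    return f"I still need the following: {readable}. Could you tell me?"
```

The point of the sketch is that a gap is detected against the template rather than filled in by a guess, and the question names only the fields that are actually missing.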
How I built it
I built Dicto using a high-performance stack optimised for speed and accuracy:
- Frontend: Built with Flutter, allowing for a smooth, reactive experience across all platforms (with potential for smart watches, Apple CarPlay, Android Auto, etc. in the future).
- Transcription: While I initially explored on-device options, I found they couldn't match the precision I needed. I moved to Whisper running on Groq Cloud for near-instant, highly accurate speech-to-text.
- Intelligence: I integrated Google’s Gemini 3 models via API to handle the heavy lifting of template generation, data extraction, and document synthesis.
- Backend: I used Serverpod (a Dart-native framework) to manage the orchestration layer, secure storage, and user authentication.
- PDF Engine: I built a local rendering engine that takes the Markdown and applies custom styling to produce professional PDFs using the Flutter PDF package.
Challenges I ran into
The biggest technical hurdle was schema enforcement. I needed the LLM to adhere strictly to my response schema. If the AI hallucinated a field that wasn't in the template, the whole document would break. I had to implement rigorous validation logic that checks the AI’s output against the template schema and forces a regeneration if it doesn't match exactly.
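A minimal sketch of that validate-and-regenerate loop, written in Python for illustration rather than taken from Dicto's Dart backend. The `generate` callable stands in for the model call and is an assumption, as are the flat name-to-type schema, the retry limit, and the idea of passing the validation problems back as feedback.

```python
import json


def validate(output, schema):
    """Return a list of problems; an empty list means the output matches."""
    problems = []
    for key in output:
        if key not in schema:
            problems.append(f"unexpected field: {key}")
    for key, expected_type in schema.items():
        if key not in output:
            problems.append(f"missing field: {key}")
        elif not isinstance(output[key], expected_type):
            problems.append(f"wrong type for field: {key}")
    return problems


def generate_validated(generate, schema, max_attempts=3):
    """Call generate(feedback) until its JSON output matches the schema.

    `generate` is a hypothetical stand-in for the LLM call. It receives the
    problems found in the previous attempt (an empty list on the first try).
    """
    feedback = []
    for _ in range(max_attempts):
        raw = generate(feedback)
        try:
            output = json.loads(raw)
        except json.JSONDecodeError:
            feedback = ["response was not valid JSON"]
            continue
        if not isinstance(output, dict):
            feedback = ["response was not a JSON object"]
            continue
        feedback = validate(output, schema)
        if not feedback:
            return output
    raise ValueError(f"no valid output after {max_attempts} attempts: {feedback}")
```

Rejecting unexpected fields as well as missing ones is what keeps a hallucinated field from ever reaching the document renderer.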
Another challenge was the "perfectionist's paradox": making the AI's clarifying questions feel helpful rather than annoying. I solved this by moving to section-level prompting, which lets the AI capture several data points from one long narrative burst rather than interrupting me for every single field.
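One way to picture section-level prompting is to group unfilled fields by section and ask a single question per section. The sketch below is in Python and is only an assumption about how such grouping might look; the `TEMPLATE` layout, the field names, and `questions_by_section` are all hypothetical.

```python
# Hypothetical template: each section maps to the fields it contains.
TEMPLATE = {
    "patient": ["name", "date_of_birth"],
    "procedure": ["procedure_name", "findings", "complications"],
}


def questions_by_section(template, extracted):
    """Return one question per section that still has unfilled fields."""
    questions = {}
    for section, fields in template.items():
        # A field counts as unfilled when it is absent or empty.
        missing = [f for f in fields if not extracted.get(f)]
        if missing:
            readable = ", ".join(f.replace("_", " ") for f in missing)
            questions[section] = (
                f"For the {section} section, please describe: {readable}."
            )
    return questions
```

With five fields across two sections, the speaker is interrupted at most twice instead of five times, and each answer can cover several fields in one narrative reply.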
Accomplishments that I'm proud of
I’m incredibly proud of the custom generative UI layer I built on top of Google’s Gemini models. It handles everything from creating the template to the final export. I’m also proud of the speed; by using Groq for transcription and optimising the Gemini prompts, the turnaround time from "finished speaking" to "structured document" feels almost magical. I’ve essentially built the tool I’ve been wishing for throughout my entire career as a CTO.
What we learned
This project taught me that structure is the key to making AI useful for professionals. Raw transcripts are a liability; structured data is an asset. I also learned that the "last mile" of documentation is where most of the friction lies. By solving the formatting problem through a hidden schema, I can give users total flexibility in how they speak while maintaining total control over how the document looks.
What's next for Dicto
This isn't just a hackathon project for me anymore; it’s become a commercial venture. My next steps include:
- Commercial Launch: I am moving Dicto into a full commercial product immediately following the hackathon.
- Targeted Demos: I already have interest from professionals in highly regulated industries (Healthcare, Legal, and Site Inspection) and will be performing direct demos for them next month.
- Advanced Logic: Adding conditional branching, where the questions Dicto asks change based on the answers you’ve already given.
- API Integration: Imagine entering data into every app your business uses from one place, simply by speaking. With my structured approach, I can take API schemas and turn them into Dicto templates that users fill in using natural language.

