Inspiration
My friend said writing LaTeX in Google Docs was too annoying, and he wished he could just draw the math and have it interpreted automatically.
What it does
Uses Gemini 2.5 Flash Lite to interpret drawings into LaTeX, handling messy handwriting far better than traditional OCR.
How I built it
Most of the front end was made with VS Code Copilot using Claude 4.5 Sonnet, which is why it looks so AI-y. The backend, with response parsing and dynamic context injection based on what the user wants, was mostly me, though I had Claude stub out a lot of it. It uses the Gemini API to send context-engineered prompts with "tools", which are essentially just specific text markers that trigger something in the application. For example, "------- SPEECH" sends a request to ElevenLabs to generate audio of the block's contents and play it to the user. These markers are parsed with regular expressions, so the default system prompt is very explicit about keeping the response in a specific structure.
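A minimal sketch of that tool-marker parsing, assuming a block shape like `------- SPEECH ... ------- END` (the marker format and function names here are illustrative, not the project's actual code):

```javascript
// The model is prompted to emit tool blocks such as:
//   ------- SPEECH
//   text to speak aloud
//   ------- END
// A multiline regex pulls each block out of the raw response.
const TOOL_BLOCK = /^-{7} (\w+)\n([\s\S]*?)\n-{7} END$/gm;

function parseTools(response) {
  const tools = [];
  let plain = response;
  for (const match of response.matchAll(TOOL_BLOCK)) {
    // match[1] is the tool name, match[2] is the block body.
    tools.push({ name: match[1], body: match[2] });
    plain = plain.replace(match[0], "");
  }
  // `plain` is what remains for the chat window; `tools` get dispatched
  // (e.g. a SPEECH block would be routed to ElevenLabs for audio).
  return { plain: plain.trim(), tools };
}

const { plain, tools } = parseTools(
  "Here is your answer.\n------- SPEECH\nx squared plus one\n------- END"
);
```

This is why the system prompt has to be so strict about structure: if the model drifts from the exact marker format, the regex silently matches nothing.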
Challenges we ran into
Literally everything. A lot of prompt engineering was needed to get consistent responses from Gemini. I don't think live speech is working yet.
Accomplishments that we're proud of
Working text editor, working chat, and most tools are probably working; I am running out of time just writing this.
What we learned
Getting consistent results is hard, and you need to watch for edge cases, like the model generating LaTeX with or without dollar signs, or using tag-based formatting instead of normal Markdown-style formatting with asterisks and such.
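One way to handle the dollar-sign edge case is a small normalizer that strips optional `$...$` or `$$...$$` delimiters before rendering, so downstream code always sees bare LaTeX. This is a hypothetical helper sketched from the problem described above, not the project's actual code:

```javascript
// Strip optional $...$ or $$...$$ math delimiters from a model response,
// returning bare LaTeX either way.
function normalizeLatex(s) {
  const trimmed = s.trim();
  const m = trimmed.match(/^\${1,2}([\s\S]+?)\${1,2}$/);
  return m ? m[1].trim() : trimmed;
}

normalizeLatex("$$\\frac{a}{b}$$"); // → "\\frac{a}{b}"
normalizeLatex("x^2 + 1");          // → "x^2 + 1" (already bare)
```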
What's next for LaTeXIsHard
Implementing real accounts with something like Firebase, getting more storage by moving to a paid MongoDB cluster, and probably a subscription service via Stripe or similar.
Built With
- elevenlabs
- gemini
- javascript
- mongodb
- node.js
- ts