Inspiration

My friend said writing LaTeX in Google Docs was too annoying, and he wished he could just draw it and have it interpreted easily.

What it does

Uses Gemini 2.5 Flash Lite to interpret the drawing into LaTeX better than any OCR could.

How I built it

Most of the front end was made using VS Code Copilot with Claude 4.5 Sonnet, which is why it looks so AI-y. The backend, with response parsing and dynamic context injection based on what the user wants, was mostly me, though I had Claude stub out a lot of it. It uses the Gemini API to send context-engineered prompts with "tools" that are essentially just specific text markers that trigger something in the application. For example, "------- SPEECH" sends a request to ElevenLabs to generate audio of the contents and play it to the user. These markers are parsed via regular expressions, so the default system prompt is very explicit about the response following a specific structure.
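As a rough sketch of what that marker-based parsing could look like (the exact marker format and tool names here are assumptions, not the actual implementation), a regex can split the model's response into (tool, contents) pairs:

```python
import re

# Hypothetical sketch: assumes each tool section starts with a line like
# "------- SPEECH" and runs until the next marker or the end of the response.
TOOL_PATTERN = re.compile(
    r"^-{7} (?P<tool>[A-Z]+)\n(?P<body>.*?)(?=^-{7} [A-Z]+\n|\Z)",
    re.MULTILINE | re.DOTALL,
)

def parse_tools(response: str) -> list[tuple[str, str]]:
    """Split a structured model response into (tool, contents) pairs."""
    return [(m.group("tool"), m.group("body").strip())
            for m in TOOL_PATTERN.finditer(response)]

response = "------- SPEECH\nHello there!\n------- LATEX\nE = mc^2\n"
print(parse_tools(response))
# → [('SPEECH', 'Hello there!'), ('LATEX', 'E = mc^2')]
```

Each pair can then be dispatched to the matching handler (e.g. a SPEECH body goes to the ElevenLabs request, assuming that tool name).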

Challenges we ran into

Literally everything. A lot of prompt engineering was needed to get consistent responses from Gemini. I don't think live speech is working yet.

Accomplishments that we're proud of

A working text editor, working chat, and most tools are probably working; I am running out of time just writing this.

What we learned

Getting consistent results is hard, and you need to watch out for edge cases, like the model generating LaTeX with or without dollar signs, or formatting in tags instead of normal Markdown-style formatting with stars and such.
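One of those edge cases can be handled with a small normalization step. This is only an illustrative sketch, assuming the goal is to strip optional $ or $$ delimiters so downstream code always sees bare LaTeX:

```python
# Hypothetical sketch: the model sometimes wraps LaTeX in $...$ or $$...$$
# and sometimes omits the delimiters, so normalize before rendering/storing.
def normalize_latex(snippet: str) -> str:
    """Strip surrounding $ or $$ delimiters from a LaTeX snippet, if present."""
    s = snippet.strip()
    for delim in ("$$", "$"):
        if s.startswith(delim) and s.endswith(delim) and len(s) > 2 * len(delim):
            return s[len(delim):-len(delim)].strip()
    return s

print(normalize_latex("$$\\frac{a}{b}$$"))  # → \frac{a}{b}
print(normalize_latex("x^2 + 1"))           # → x^2 + 1
```

Similar guards would be needed for the tag-vs-Markdown formatting mismatch, which is harder to normalize mechanically.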

What's next for LaTeXIsHard

Implementing actual accounts with something like Firebase, getting more storage by moving to a paid MongoDB cluster, and probably a subscription service via Stripe or something.
