Inspiration

The inspiration for this project came from the countless hours spent transcribing handwritten notes into LaTeX. Whether it was lecture notes, assignments, or research ideas, the manual process of converting handwritten text to structured, formatted LaTeX documents felt tedious and time-consuming. We wanted to create a solution that could streamline this process and make it accessible for students, researchers, and academics alike.

What it does

VisionTex takes a picture of handwritten text and transforms it into a polished LaTeX document. Users can upload an image, and our application processes it to generate accurate LaTeX code, complete with formatting for both text and equations. The output is ready for direct use in academic papers, professional documents and all sorts of use cases.

How we built it

  1. Text Extraction: We used Google's Vision API to extract handwritten text from images with high accuracy.
  2. Equation Identification: OpenAI's API was utilized to identify mathematical equations and expressions within the extracted text.
  3. LaTeX Code Generation: OpenAI's API was also leveraged to generate LaTeX code for the recognized text, including proper formatting for equations and expressions.
  4. Frontend Development: The frontend was built using React, providing a simple and user-friendly interface for users to upload images and download LaTeX files.

Challenges we ran into

  1. Recognizing Mathematical Symbols: One major challenge was getting the OCR software to accurately recognize mathematical symbols like ε. These symbols often blend with handwritten text and are misrecognized, making detection difficult.
  2. Classifying Mathematical Content: Distinguishing between regular text and portions that needed to be formatted as mathematical equations or expressions was tricky. Ensuring accurate classification required additional fine-tuning.
  3. Complex LaTeX Formatting: Generating precise LaTeX code for complex structures proved challenging. Handling these cases without introducing errors required significant effort and testing.

Accomplishments that we're proud of

  • Successfully developing a system that can accurately recognize and convert diverse handwritten content into LaTeX.
  • Overcoming significant challenges, such as improving OCR accuracy for symbols like ε (epsilon) and implementing robust classification for distinguishing mathematical expressions from regular text.
  • Creating logic to handle complex LaTeX formatting, ensuring the output is error-free and meets professional standards.
  • Building an intuitive and accessible user interface that simplifies the entire process for users.
  • Seamlessly integrating multiple APIs to deliver a fast, reliable, and user-friendly application.

What's next for VisionTex

  • Improved Accuracy: Further refining the OCR models to handle even more diverse handwriting styles.
  • Advanced Features: Adding support for diagrams, tables, flowcharts, and a bulk-upload feature that converts multiple images uploaded into a single LaTeX pdf.
  • Interactive Whiteboard: Introducing a feature where users can write directly on a virtual whiteboard on the website, with real-time LaTeX generation for their input.
  • Mobile Application: Developing a mobile version for on-the-go accessibility.

Built With

+ 2 more
Share this project:

Updates