Inspiration
Exams. A word that every student, good or bad, fears at least to some degree. Not always because they are hard, but sometimes because non-academic commitments such as clubs, internships, or jobs leave us with limited opportunities to join study sessions, seek assistance and guidance, and fully prepare to perform at our best.
With ReVision, you can scan and practice a variety of problems across different topics from your device, anywhere you want! Using an integrated AI Tutor, you receive a constant loop of instant feedback, detailed breakdowns, and personalized suggestions on how to solve or approach each problem.
Whether you're tackling algebra, calculus, or even some written code late at night, ReVision actively reviews your solution process and helps you practice correctly and refine your logic. Using the power of OCR and recursive validation, ReVision transforms passive review into active learning, teaching students not just how to get the right answer but also why it's right, and building their confidence along the way.
What it does
ReVision is a web application that takes in a document or image from the user, extracts the questions from it, and builds a module for the user to solve. If the user is struggling or makes a mistake, the app guides them back to the right path, much like a tutor providing real-time assistance. On the canvas, the user is free to write whatever they think is correct. Based on how close they are to the answer, hints are displayed at the bottom of the canvas! If a response is irrelevant, the app steers the student back in the right direction in a humorous way. The app is not limited to English: it can reply in other languages, which makes it accessible to a wide range of students. Its use cases range from helping students to supporting developers who are new to a technology and want active practice.
How we built it
Our React-based web application takes in an image (jpg, png, etc.) of problems that you wish to work on, learn, or practice solving on your own with some possible assistance. Using a Python backend built with Flask and Google Cloud Vision's Optical Character Recognition (OCR) features, we extract strings of characters and symbols from the image.
The app then passes this data to Google Gemini Pro to create a JSON object of the questions recognized in the Google Vision output. This JSON is sent to the frontend, where each question is displayed as a prompt alongside the other components (canvas, pencil, eraser, next buttons, and a dedicated area for feedback/hints). We built the blank canvas with Next.js, and it sends requests through Flask to the Python script that calls the Gemini API (Gemini 2.5 Flash Lite).
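To illustrate the step where the model's reply becomes question prompts, here is a minimal sketch of parsing that reply on the Python side. The schema (a top-level "questions" list whose items have a "text" field), the `Question` class, and the `parse_questions` helper are all assumptions made for this example and are not taken from our actual code. The sketch also strips a Markdown code fence, since models sometimes wrap JSON output in one.

```python
import json
from dataclasses import dataclass

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap JSON in


@dataclass
class Question:
    number: int  # 1-based position shown to the user
    text: str    # question text recognized from the OCR output


def parse_questions(raw: str) -> list[Question]:
    """Turn the model's reply into Question objects (hypothetical schema)."""
    cleaned = raw.strip()
    # Drop a surrounding Markdown fence (optionally tagged "json"), if present.
    if cleaned.startswith(FENCE) and cleaned.endswith(FENCE):
        cleaned = cleaned[len(FENCE):-len(FENCE)].strip()
        if cleaned.lower().startswith("json"):
            cleaned = cleaned[4:].strip()
    data = json.loads(cleaned)  # raises json.JSONDecodeError on malformed output
    questions = []
    for item in data.get("questions", []):
        text = str(item.get("text", "")).strip()
        if text:  # skip empty entries rather than showing blank prompts
            questions.append(Question(number=len(questions) + 1, text=text))
    return questions
```

Letting `json.loads` raise on malformed output keeps the failure visible, so the caller can decide whether to retry the request or show an error to the user.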
The app captures changes on the whiteboard in a loop and sends the updates to the LLM every 2-3 seconds. The LLM evaluates the user's solution process and returns suggestions or feedback to guide the user towards the correct answer. The feedback is shown as a pop-up message, color-coded by how close the user is to the solution (red for not close, green for correct). When an input does not make sense or is irrelevant, the LLM guides the user back towards the correct topic and methods.
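The loop described above can be sketched as a small change detector with a minimum interval between sends. This is an illustrative sketch only: the `CanvasWatcher` class, the 2.5-second default, the hashing approach, the numeric closeness score, and the intermediate "yellow" band are all assumptions for the example (we only describe red and green above), and the same logic could equally live in the frontend.

```python
import hashlib
import time


class CanvasWatcher:
    """Decides when a canvas snapshot is worth sending to the LLM."""

    def __init__(self, min_interval: float = 2.5, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between sends (assumed value)
        self.clock = clock                # injectable so the logic can be tested
        self._last_digest = None          # hash of the last snapshot that was sent
        self._last_sent = None            # clock reading at the last send

    def should_send(self, snapshot: bytes) -> bool:
        digest = hashlib.sha256(snapshot).hexdigest()
        if digest == self._last_digest:
            return False  # nothing new drawn since the last evaluation
        now = self.clock()
        if self._last_sent is not None and now - self._last_sent < self.min_interval:
            return False  # changed, but too soon after the previous send
        self._last_digest = digest
        self._last_sent = now
        return True


def feedback_color(closeness: float) -> str:
    """Map a 0.0-1.0 closeness score to a pop-up color (thresholds are assumed)."""
    if closeness >= 1.0:
        return "green"   # correct
    if closeness >= 0.5:
        return "yellow"  # hypothetical in-between band
    return "red"         # not close
```

Skipping unchanged snapshots avoids paying for an LLM call when the user has paused, which matters given the latency and cost constraints of the model.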
Challenges we ran into
Some challenges that we ran into along the way are as follows:
- Diagrams and flow charts could not be handled because of limitations in the OCR API model
- The LLM's limited capacity for large inputs (questions/prompts) caused delays and long runtimes
- OCR limitations heavily restricted our ability to accept PDFs as input
- The app's effectiveness depends on stronger models, but cost limited us to affordable alternatives
- Integrating the front end and back end via routing required caution and ample testing
Accomplishments that we're proud of
Some accomplishments that we are proud of are as follows:
- Deploying a functional real-time looping algorithm featuring an LLM
- Applying Google Vision OCR parsing to address a real-world problem
- Simulating a whiteboard-style canvas that can be edited with stylus and eraser tools
- Achieving support for multilingual tutoring
What we learned
Through this experience, we learned a lot about what it takes to develop a versatile, multi-purpose application that uses real-time data updates and operates with recursive thought processes. Over our development cycle, we deepened our appreciation and understanding of the power, versatility, and applicability of Google's modern AI tools such as Gemini and Cloud Vision. We also expanded our skill set in using databases like Supabase, committing app updates through Git and GitHub, and deploying applications to cloud hosting environments such as Vercel.
What's next for ReVision
We are really proud of this project and excited about committing future updates to it (maybe even turning it into a start-up)! Despite the struggles and frustrations, we had a great time collaborating and learning from each other as we worked towards our goal of helping other people, and students like us! We look forward to coming up with new features and updates that keep this project in step with the modern world of education and AI applications!
Built With
- flask
- gemini
- javascript
- next.js
- postgresql
- python
- react
- sql
- supabase
- typescript





