Inspiration
In the fast-paced academic world, students are constantly juggling assignments, deadlines, and notes scattered across various formats. We were inspired to create a tool that streamlines academic organization by bridging the gap between visual information—like a screenshot of a syllabus or a photo of a whiteboard—and a structured, actionable study plan. NoteVision was born from the desire to turn cluttered images into a clear, manageable task list with a single click.
What It Does
NoteVision is a smart task management application designed for students. Users can upload an image of their study materials, and our application leverages the Google Gemini AI to intelligently parse the image, identify tasks, subjects, and due dates, and automatically populate them into a personal study planner.
Key features include:
- AI Task Extraction: Upload any image (PNG, JPG, etc.), and the AI extracts relevant tasks, automatically creating entries with subjects, descriptions, and due dates.
- Full Task Management: Manually add, view, and mark tasks as complete through a clean and intuitive web interface.
- Smart Reminders: The system automatically identifies tasks due on the current day and displays them on a dedicated reminders page.
- Persistent, Cloud-Based Storage: All tasks are securely stored in a MongoDB Atlas cloud database, ensuring data is saved and accessible.
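The reminders feature above comes down to comparing each task's due date with the current date. A minimal sketch of that check follows, assuming tasks are dictionaries with a `due_date` string in ISO `YYYY-MM-DD` form; the field name and the `tasks_due_today` helper are hypothetical and not taken from the NoteVision codebase.

```python
from datetime import date


def tasks_due_today(tasks, today=None):
    """Return the tasks whose due date matches `today` (defaults to the current date)."""
    today = today or date.today()
    due = []
    for task in tasks:
        raw = task.get("due_date")  # assumed field name, ISO "YYYY-MM-DD"
        if not raw:
            continue  # tasks without a due date never trigger a reminder
        try:
            due_on = date.fromisoformat(raw)
        except (TypeError, ValueError):
            continue  # skip dates that were not extracted in ISO form
        if due_on == today:
            due.append(task)
    return due
```

Passing `today` explicitly keeps the function easy to test; a reminders route would normally call it with the default.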
How We Built It
NoteVision is a full-stack web application with a Flask backend, Gemini-based image parsing, a MongoDB Atlas database, and a vanilla JavaScript frontend.
Backend: We developed a RESTful API using Python and the Flask framework. The backend handles all business logic, including file uploads, user requests, and database operations. We used werkzeug for secure file handling and python-dotenv to manage environment variables securely.
AI Integration: The core of our project is the integration with the Google Gemini 1.5 Flash API. We engineered a prompt that instructs the model to analyze an image and return a structured JSON object containing a list of tasks. This required careful data handling to parse the model's response reliably.
Database: We migrated from a simple local JSON file to a scalable and persistent MongoDB Atlas cluster. Our Flask application uses the pymongo driver to perform all create, read, and update operations, storing tasks in a cloud-hosted NoSQL database.
Frontend: The user interface was built with vanilla HTML5, CSS3, and JavaScript. We designed a responsive and interactive experience, featuring a drag-and-drop file uploader with real-time progress indicators and status updates, all handled by custom client-side JavaScript.
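As a rough illustration of the structured-response handling described in the AI Integration step above, the sketch below turns an already-parsed model response into task documents ready for storage. The response shape (`{"tasks": [...]}`), the field names, the `"General"` fallback subject, and the `normalize_tasks` helper are all assumptions made for this example; the actual prompt and schema are not shown in this write-up.

```python
def normalize_tasks(payload):
    """Turn a parsed model response into a list of task documents.

    Assumes a hypothetical shape: {"tasks": [{"subject", "description", "due_date"}]}.
    """
    if not isinstance(payload, dict):
        return []
    raw_tasks = payload.get("tasks")
    if not isinstance(raw_tasks, list):
        return []

    documents = []
    for item in raw_tasks:
        if not isinstance(item, dict):
            continue  # ignore entries that are not task objects
        description = str(item.get("description") or "").strip()
        if not description:
            continue  # a task without a description is not actionable
        documents.append({
            "subject": str(item.get("subject") or "General").strip(),
            "description": description,
            "due_date": item.get("due_date") or None,
            "completed": False,  # newly extracted tasks start as incomplete
        })
    return documents
```

Validating the shape before writing to the database means a malformed model response yields an empty list instead of a partially inserted batch.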
Challenges We Ran Into
Integrating the AI was a significant challenge. The Gemini API's response, while structured, was often wrapped in Markdown code fences (an opening line of three backticks tagged json and a closing line of three backticks). To handle this, we implemented a data sanitization step in our Python backend to strip these extraneous characters, ensuring the string could be reliably parsed into a JSON object with json.loads(). This made our AI pipeline far more robust.
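The sanitization step described above could look roughly like the following. This is a sketch rather than our exact implementation: the `strip_code_fence` and `parse_model_json` names are hypothetical, and the code assumes the response contains at most one fenced block wrapping the whole payload.

```python
import json

FENCE = "`" * 3  # three backticks, the Markdown code-fence marker


def strip_code_fence(text):
    """Remove a surrounding Markdown code fence, if present, and return the inner text."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the whole opening line, including an optional language tag such as "json".
        first_newline = cleaned.find("\n")
        if first_newline != -1:
            cleaned = cleaned[first_newline + 1:]
        else:
            cleaned = cleaned[len(FENCE):]
    cleaned = cleaned.rstrip()
    if cleaned.endswith(FENCE):
        cleaned = cleaned[:-len(FENCE)]
    return cleaned.strip()


def parse_model_json(text):
    """Parse model output as JSON, tolerating a Markdown code fence around it."""
    return json.loads(strip_code_fence(text))
```

Unfenced responses pass through unchanged, so the same function can be used whether or not the model adds the fence. `json.loads()` still raises `json.JSONDecodeError` on invalid output, which the caller should handle.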
Another challenge was architecting the application to move from a local file-based store to a persistent cloud database. This required refactoring our entire data layer to use MongoDB, replacing simple file reads/writes with database queries and learning to work with BSON ObjectIDs for identifying and manipulating specific tasks.
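One practical detail when working with ObjectIDs is that they arrive from the browser as strings. An ObjectId is a 12-byte value conventionally written as 24 hexadecimal characters, so a task ID taken from a URL can be checked before it is used in a query. The sketch below uses only the standard library, and the `is_object_id_string` helper is hypothetical. In a pymongo-based app the `bson` package's `ObjectId.is_valid()` serves a similar purpose, and the validated string would then be wrapped as `ObjectId(task_id)` in a filter on `_id`.

```python
import re

# An ObjectId's string form is exactly 24 hexadecimal characters.
_OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")


def is_object_id_string(value):
    """Return True if `value` looks like the hex string form of an ObjectId."""
    return isinstance(value, str) and _OBJECT_ID_HEX.fullmatch(value) is not None
```

Rejecting malformed IDs up front lets a route return a clear client error instead of failing during the database call.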
Accomplishments That We're Proud Of
We are proud of building a complete, end-to-end application that solves a real-world problem. Architecting a system that seamlessly integrates a powerful AI model with a web frontend and a cloud database was a major accomplishment. We are particularly proud of the sophisticated AI prompt engineering required to get consistent, structured data from an image, and the dynamic, user-friendly file upload interface we built from scratch.
What We Learned
This project was a significant learning experience. We deepened our understanding of full-stack development, from building RESTful routes in Flask to manipulating the DOM with JavaScript. We gained practical, hands-on experience with:
- API Integration: Effectively calling and processing data from a third-party AI service.
- Database Management: Implementing and interacting with a cloud-based NoSQL database (MongoDB Atlas).
- Modern Development Practices: Using version control (Git), managing dependencies, and securing credentials with environment variables.
What's Next for NoteVision
We believe NoteVision has the potential to grow even further. Our future roadmap includes:
- Task Editing and Deletion: Implementing full CRUD functionality to allow users to edit or delete existing tasks.
- Native Mobile Apps: Developing native iOS and Android applications for on-the-go task management.
- Collaborative Features: Allowing users to share study plans and collaborate on tasks with classmates.