CodeNotes

Inspiration 🌎

This past week, one of our team members took a paper quiz for their introduction to data structures and algorithms class. The quiz, which covered linked lists and pointers, was 3 pages long. That didn't seem harmful until we remembered that the class has 300 students and 4 quizzes per quarter. The instructors would therefore use 900 sheets of paper for the first quiz alone and 3,600 sheets over the quarter. If the class is offered in all three quarters of the academic year, that comes to 10,800 sheets for quizzes alone. These calculations ignore any additional assignments or finals that students might take. More importantly, this is only one class; if more classes adopted this model, even more paper would be wasted.

What it does 💻

CodeNotes strives to save the world one assignment at a time by providing a virtual environment where students can handwrite their code and then run it. Students write C++ code on a canvas, and a custom Optical Character Recognition (OCR) model translates the handwriting into text. Students can then execute the code to confirm that it behaves as expected.

How we built it 🔨

The frontend was implemented with Next.js, TailwindCSS, and Aceternity UI to create visuals that draw students in and to provide a user-friendly workspace. We embedded Excalidraw, a popular online whiteboarding platform, so that students get a familiar and robust diagramming tool inside CodeNotes.

The backend consists of 2 core elements: a code executor and a custom OCR model. The code executor is built on Judge0, an open-source code execution system, which we self-host in Docker containers and tailor to our needs. The OCR model is a CNN developed with TensorFlow and trained on a custom dataset built to handle various characters. Its goal is to convert images of handwriting into text that the computer can understand and later execute. The Keras model is served through a Flask endpoint.
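As a rough illustration of the code executor side, the sketch below builds a C++ submission for a self-hosted Judge0 instance using only the Python standard library. It is not the project's actual code: the base URL (Judge0's default port 2358), the language id 54 (C++ with GCC 9.2.0 on a stock Judge0 CE install), and the helper names `build_submission` and `run_cpp` are all assumptions made for this example.

```python
import json
import urllib.request

# Assumptions (not from the write-up): Judge0 listens on its default port 2358,
# and language id 54 maps to "C++ (GCC 9.2.0)" as on a stock Judge0 CE install.
# A self-hosted instance may differ; GET /languages lists the available ids.
JUDGE0_URL = "http://localhost:2358"
CPP_LANGUAGE_ID = 54


def build_submission(source_code: str, stdin: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST request for Judge0's /submissions endpoint."""
    payload = {
        "source_code": source_code,
        "language_id": CPP_LANGUAGE_ID,
        "stdin": stdin,
    }
    return urllib.request.Request(
        # wait=true asks Judge0 to reply only after the program has finished.
        url=f"{JUDGE0_URL}/submissions?base64_encoded=false&wait=true",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_cpp(source_code: str, stdin: str = "") -> dict:
    """Send the submission and return Judge0's JSON response as a dict."""
    request = build_submission(source_code, stdin)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Judge0 instance running, `run_cpp(recognized_text)` would return a dictionary that includes fields such as `stdout`, `stderr`, and `compile_output`, which the frontend could show next to the canvas.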

Challenges we ran into 😥

Our biggest challenge was creating an OCR model that would work on real-life data. When we trained on well-known datasets such as MNIST or EMNIST, the models reached high accuracy during training but failed immediately on our own handwriting. Tweaking the architecture, changing the dataset, and switching the algorithm from a CNN to KNN and back to a CNN all proved to be tough obstacles. Eventually we got a functional, if naive, CNN OCR model.

Other challenges were smaller debugging issues. For example, we spent 2 hours debugging our API calls when the real problem was a typo that mixed up localhost:2358 and localhost:2385. Integrating Next.js with Aceternity proved more cumbersome than expected, as compatibility issues forced us to use other components. Integrating Excalidraw was challenging as well, because its new API offered limited functionality and forced us to work around what we could and could not do.
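A transposed port number usually shows up as a confusing failure deep inside an API call. One possible guard, sketched here under assumptions, is to define the host and port in a single place and check at startup that something is actually listening there. The names `JUDGE0_HOST`, `JUDGE0_PORT`, `is_port_open`, and `require_service` are hypothetical, and 2358 is assumed because it is Judge0's default port.

```python
import socket

# Hypothetical single source of truth for the executor's address.
JUDGE0_HOST = "localhost"
JUDGE0_PORT = 2358  # assumed: Judge0's default port


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def require_service(name: str, host: str, port: int) -> None:
    """Fail fast with a readable message if nothing is listening at host:port."""
    if not is_port_open(host, port):
        raise RuntimeError(
            f"{name} is not reachable at {host}:{port}; check the port number"
        )
```

Calling `require_service("Judge0", JUDGE0_HOST, JUDGE0_PORT)` when the backend starts would turn a mistyped port into an immediate, readable error rather than a long debugging session.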

Our overall planning also caused us to shift course and remove parts of our initial designs and ideas. We originally planned to implement an assignment submission portal and had integrated Google authentication through NextAuth, with Firestore storing our information in a NoSQL database. Eventually, we scrapped those plans and that code in favor of focusing on the canvas and code executor.

Accomplishments that we're proud of 😎

We are proud of our code executor and custom OCR model. Our code executor works for C++ and makes it easy and fast to create and test code online. Although our OCR model does not have very high accuracy, we learned a lot about the process and difficulties of image processing and gained a deeper appreciation for machine learning engineers!

This was the team's first time working together, and our chemistry was great. We could easily help one another with blockers and divide up tasks, which helped us split CodeNotes into independent units.

What we learned 📚

We learned a ton about OCR and its advantages and disadvantages, especially when implementing a custom OCR model from scratch. We also gained a better understanding of authentication and database services. Although that work was scrapped, we learned a lot about how to implement these technologies and integrate them into our tech stack.

What's next for CodeNotes 💫

Developing a more robust OCR model would be our primary goal. This will also involve creating a dataset that better reflects programming languages and their mix of alphabetic, numeric, and special characters. A robust submission portal is also on the agenda, so that students can easily submit an assignment and view their old submissions.
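To make the dataset gap concrete, the sketch below compares the 62 classes of EMNIST's ByClass split (digits plus upper- and lowercase letters) with a hypothetical label set that also covers ASCII punctuation. The names `CPP_CHARSET`, `CHAR_TO_INDEX`, and `missing_characters` are illustrative assumptions rather than part of the current model, and the index order is arbitrary.

```python
import string

# EMNIST's ByClass split labels digits and letters only (62 classes).
EMNIST_BYCLASS = string.digits + string.ascii_uppercase + string.ascii_lowercase

# Hypothetical label set for handwritten C++: add the 32 ASCII punctuation marks.
CPP_CHARSET = EMNIST_BYCLASS + string.punctuation

# Class-index lookups a classifier's output layer could be decoded with.
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CPP_CHARSET)}
INDEX_TO_CHAR = {i: ch for ch, i in CHAR_TO_INDEX.items()}


def missing_characters(source_code: str, charset: str) -> list:
    """Return the sorted non-whitespace characters that charset cannot label."""
    return sorted(
        {ch for ch in source_code if not ch.isspace() and ch not in charset}
    )


snippet = "int main() { return 0; }"
print(missing_characters(snippet, EMNIST_BYCLASS))  # ['(', ')', ';', '{', '}']
print(missing_characters(snippet, CPP_CHARSET))     # []
```

Even a trivial C++ program needs braces, parentheses, and semicolons, none of which a digits-and-letters dataset can label, so a code-oriented dataset would need roughly 94 classes instead of 62.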