NoteSquire
A UCLA Hackathon (2019) project by Han Bin Lee, Benjamin Choi, Ashkan Faghihi, and Giovanni Moya
NoteSquire is a project that aims to take class note-taking to a new level. To store a handwritten page of notes, all you need to do is take a picture of the page and upload it. Characters will be detected and converted into digital text, and diagrams will be cropped and added into the digital note!
Roles:
- Han - Python module that takes an image and outputs the detected data in the image, using the GCP Vision API
- Ben - React app that takes in the Python module's output and creates an HTML file (the end product)
- Gio - frontend implementation and web design
- Ash - backend implementation
Motivation
It started from a small conversation between us in a management course here at UCLA. We had been using Google Docs and OneNote to take notes digitally. But at one point in the course, our professor began to draw a lot of diagrams and graphs, which we couldn't draw in our digital notes. It became annoying, since we had to fall back on old-fashioned pen and paper (to draw the diagrams and such) and lost the ability to save our notes digitally. Then we thought: hey, it would be nice if we could simply take a picture of a page of handwritten notes and have a program convert the characters to text and crop out the diagrams automatically using machine learning - which is exactly what this project is about.
Workflow, APIs and Libraries used
For the basic web framework we used node.js with React and Material UI, which we used to create a simple two-page website where you can upload an image and view the resulting digitized notes. For the digitization itself, we used the Google Cloud Vision API's OCR from Python scripts to extract text and its positional values from the image. The Python scripts output two JSON files (one containing the strings and positional values of the detected text, the other containing the positional values of the cropped diagram images) plus a varying number of cropped diagram image files. A React app then loads the JSON and image files and creates the resulting HTML file.
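As a rough illustration of the hand-off between the Python side and the React side, the sketch below shows how detected words and their bounding boxes could be serialized into a text JSON file. The `(text, (x, y, width, height))` tuple shape and the output field names here are simplified assumptions for illustration, not the project's exact schema; in the real pipeline the words and boxes would come from a Vision API `text_annotations` response.

```python
import json

def words_to_text_json(words):
    """Serialize detected words into a text JSON payload.

    words: list of (string, (x, y, width, height)) tuples -- a
    simplified stand-in for entries extracted from a Google Cloud
    Vision text-detection response. Returns a JSON string the
    frontend could load to place each word on the page.
    """
    entries = [
        {"text": text, "x": x, "y": y, "width": w, "height": h}
        for text, (x, y, w, h) in words
    ]
    return json.dumps(entries, indent=2)

# Example: two words detected on the same line of a note page.
sample = [("Hello", (10, 20, 40, 12)), ("world", (55, 20, 48, 12))]
print(words_to_text_json(sample))
```

The diagram JSON file would follow the same pattern, listing each cropped image's filename alongside its position so the frontend can reinsert the diagrams at the right spots.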
Built With
- css
- google-cloud
- google-vision-api
- html
- javascript
- material-ui
- node.js
- npm
- npx
- python