Inspiration

According to a Harvard psychology study, "Visual long-term memory has a massive storage capacity for object details" by Timothy F. Brady, Talia Konkle, and George A. Alvarez, participants perform remarkably well at memorizing a continuous stream of visually presented items.

What it does

We built a mobile application that takes an audio note the user wants to review, transcribes it to text, parses the text into a set of nouns, scrapes Google Images for each noun, selects the most accurate image to pair with each noun, and presents the images alongside the text.
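The pipeline above can be sketched as a simple composition of steps. This is our own illustrative sketch, not the project's actual code: each step is passed in as a callable so the data flow (audio → text → nouns → noun/image pairs) is explicit.

```python
# Hedged sketch of the end-to-end pipeline; all names are ours.
def pipeline(audio, transcribe, extract_nouns, find_image):
    """Turn an audio note into (text, [(noun, image_url), ...])."""
    text = transcribe(audio)                      # speech -> text
    nouns = extract_nouns(text)                   # text -> nouns
    pairs = [(n, find_image(n)) for n in nouns]   # noun -> best image
    return text, pairs

# Toy stand-ins for the real services:
text, pairs = pipeline(
    b"...",
    transcribe=lambda a: "the red car",
    extract_nouns=lambda t: ["car"],
    find_image=lambda n: f"https://example.com/{n}.jpg",
)
print(pairs)  # → [('car', 'https://example.com/car.jpg')]
```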

How We built the Backend

We used Android speech recognition to capture audio notes on an Android phone, and built a custom API that takes the transcribed text and returns noun and image pairs.

The API took a string and ran it through Google's Natural Language API to parse the text into a set of nouns.
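A minimal sketch of that step, assuming the `google-cloud-language` client library (credentials required for the real call; `extract_nouns` and `analyze` are our names, not the project's). The pure filtering step is shown offline on API-shaped token data:

```python
def extract_nouns(tokens):
    """Keep tokens tagged NOUN; `tokens` is a list of (word, pos) pairs
    shaped like the Natural Language API's syntax-analysis output."""
    return [word for word, pos in tokens if pos == "NOUN"]

def analyze(text):
    """Call the real API (requires Google Cloud credentials)."""
    from google.cloud import language_v1
    client = language_v1.LanguageServiceClient()
    doc = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    resp = client.analyze_syntax(request={"document": doc})
    return [
        (t.text.content, language_v1.PartOfSpeech.Tag(t.part_of_speech.tag).name)
        for t in resp.tokens
    ]

# Offline example of the filtering step:
tokens = [("the", "DET"), ("red", "ADJ"), ("car", "NOUN"), ("drives", "VERB")]
print(extract_nouns(tokens))  # → ['car']
```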

We then fed the nouns into the google-images-download API to scrape Google Images and extract image URLs.
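A hedged sketch of the scraping step using the `google_images_download` package (argument names are from its documented interface, but the exact return shape varies by version). The helper `first_candidates` is ours, shown offline on a keyword-to-URL mapping of the kind the downloader produces:

```python
def scrape_image_urls(noun, limit=5):
    """Collect candidate image URLs for a noun (network call; sketch only)."""
    from google_images_download import google_images_download
    downloader = google_images_download.googleimagesdownload()
    return downloader.download({
        "keywords": noun,
        "limit": limit,
        "no_download": True,  # collect URLs without saving files
    })

def first_candidates(results, noun, k=3):
    """Pure helper: first k candidate URLs for a noun from a
    keyword -> url-list mapping."""
    return results.get(noun, [])[:k]

# Offline example of the helper:
print(first_candidates({"car": ["u1", "u2", "u3", "u4"]}, "car"))  # → ['u1', 'u2', 'u3']
```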

To keep the noun and image pairs accurate, we used Google's Cloud Vision API to pick the most accurate representation of each noun.
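One way to implement that re-ranking is to score each candidate URL by how strongly Cloud Vision's label detection associates it with the noun. This is our hedged sketch, not the project's actual logic; the real `label_scores` call needs credentials, so the ranking helper is demonstrated with stubbed scores:

```python
def label_scores(image_uri):
    """Label an image with the Cloud Vision API (requires credentials)."""
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()
    image = vision.Image(source=vision.ImageSource(image_uri=image_uri))
    resp = client.label_detection(image=image)
    return {l.description.lower(): l.score for l in resp.label_annotations}

def best_match(noun, candidates, scores_for):
    """Pick the candidate URL whose labels score the noun highest."""
    return max(candidates, key=lambda url: scores_for(url).get(noun, 0.0))

# Offline example with stubbed label scores:
fake = {"a.jpg": {"car": 0.95}, "b.jpg": {"truck": 0.90}}
print(best_match("car", ["a.jpg", "b.jpg"], lambda u: fake[u]))  # → a.jpg
```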

The API was deployed with Google Cloud Run by configuring an OpenAPI Specification to execute a serverless Google Cloud Function.
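A configuration of that kind might look roughly like the fragment below, which routes a path to a Cloud Function backend via the `x-google-backend` extension. This is an illustrative sketch only; the title, path, operation ID, and backend URL are hypothetical placeholders, not the project's actual spec.

```yaml
swagger: "2.0"
info:
  title: dottie-api       # hypothetical name
  version: "1.0"
paths:
  /nouns:
    post:
      operationId: extractNouns            # hypothetical
      x-google-backend:
        address: https://REGION-PROJECT.cloudfunctions.net/extract_nouns
      responses:
        "200":
          description: Noun and image pairs for the submitted text
```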

How we built the Frontend

We used a vanilla React Native project after struggling with Expo's recording API.

We strove for good UX during development and plan to keep refining the app's aesthetic.

Challenges We ran into

  1. Connecting the frontend (React Native) to the backend (Python), mainly working around hotlinking restrictions

  2. Understanding how each API works, integrating them into our code, and making sense of all the business logic

  3. Preparing data in the backend in the right format for the frontend to use

  4. Deploying Google Cloud Functions and learning how to use them

  5. Setting up audio recording in React Native

Accomplishments that we are proud of

  1. We connected the dots and made it work
  2. Successful collaboration in a two-person team
  3. The app can be truly impactful to learning efficiency and productivity
  4. We learned new APIs and technologies very quickly, on the spot

What we learned

  1. Lots of Google Cloud APIs and tools
  2. API requests and responses between the front end and back end
  3. How to put different APIs together and process I/O properly
  4. Communicating with teammates and staying on the same page
  5. How to deploy a Google Cloud Function and use it

What's next for Dottie

We envision several applications that could be extremely impactful for memorization and learning. One is helping kids memorize English words: the application breaks each word into smaller words (e.g. "carson" -> "car" and "son") and generates accurate images of those words, so kids associate words with pictures (e.g. showing images of "car" and "son" for the word "carson"). Another potential application lets users input questions they have; the application then uses an ML algorithm to generate a user story and produce images based on it, helping people memorize and understand things better.
