Inspiration

I was inspired to create this project after noticing how many bad study habits I had picked up after quarantine. Although I had more time to study, I wasn't using it properly and kept getting off track by checking my phone. I also noticed I was not taking care of my health: I sat at my desk for prolonged periods of time and forgot to drink water.

What it does

FocusBot is a web application that uses a TensorFlow object detection model to detect and label the objects in view of your webcam. It gives a different voice alert depending on which object is detected. For example, if the stream recognizes that you are on your phone, it gives a voice alert telling you to stay on topic. If it detects that you picked up a water bottle, it plays a different voice alert. The user interface is very minimalist and also is dynamically

How we built it

We built the backend using a TensorFlow object detection model; the specific model we used was SSD ResNet101 V1 FPN 640x640. We integrated it with OpenCV to get the video feed from the webcam. We then defined label maps that map index numbers to category names, and we kept only two categories as output: cell phone and bottle. These outputs were stored in a JSON file and sent to the front end. We used ReactJS and three.js to build the front end. There were multiple options for connecting the backend to the front end; however, given the time constraints and for the purpose of demonstration, we opted for a log file that the model writes to and the ReactJS app reads from. As soon as there is an entry in the file, the front end plays the corresponding message, depending on whether the entry is "cell phone" or "bottle".
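The filtering and log-file handoff described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the label map excerpt assumes the COCO IDs 44 ("bottle") and 77 ("cell phone"), and the 0.5 score cutoff, the one-JSON-object-per-line log format, the `detections.log` file name, and all function names are assumptions made for the example. The detector output is stubbed with plain lists so the snippet runs without TensorFlow or a webcam, and `read_log` stands in for the polling that the ReactJS app would do.

```python
import json
import os
import tempfile

# Assumed excerpt of a COCO-style label map: index number -> category name.
LABEL_MAP = {1: "person", 44: "bottle", 77: "cell phone"}

# Only these categories should trigger a voice alert.
TARGET_LABELS = {"cell phone", "bottle"}

# Hypothetical confidence cutoff; tune for your own camera and lighting.
SCORE_THRESHOLD = 0.5


def filter_detections(class_ids, scores, threshold=SCORE_THRESHOLD):
    """Keep only confident detections whose label is a target category."""
    hits = []
    for class_id, score in zip(class_ids, scores):
        label = LABEL_MAP.get(int(class_id))
        if label in TARGET_LABELS and score >= threshold:
            hits.append({"label": label, "score": round(float(score), 3)})
    return hits


def append_to_log(hits, log_path):
    """Append each hit as one JSON object per line (assumed log format)."""
    with open(log_path, "a", encoding="utf-8") as log_file:
        for hit in hits:
            log_file.write(json.dumps(hit) + "\n")


def read_log(log_path):
    """Read all entries back, as a front end polling the file might."""
    with open(log_path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]


if __name__ == "__main__":
    # Stubbed detector output for a single frame: class IDs and scores.
    class_ids = [1, 77, 44, 77]
    scores = [0.98, 0.91, 0.32, 0.12]

    log_path = os.path.join(tempfile.mkdtemp(), "detections.log")
    append_to_log(filter_detections(class_ids, scores), log_path)
    print(read_log(log_path))
```

With the stubbed frame above, only the confident cell phone detection passes the filter; the person is not a target category, and the bottle and the second phone fall below the cutoff.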

Challenges we ran into

A big challenge we ran into when working on this project was fine-tuning the model. Even after I moved the model I was trying to train onto my PC with a GTX 1660 Super, it still took far too long to train, even with a very small dataset. I eventually opted to use a pre-trained model without fine-tuning it, and it was far more accurate than I had expected. There were also challenges with the three.js and three-fiber.js libraries. Rendering 3D graphics was something I wasn't very familiar with, but doing the project taught me a great deal about both React and three.js.

Accomplishments that we're proud of

Getting the live stream to work. Overcoming obstacles and making the collaboration work despite all the odds.

What we learned

We learned a lot about real-time vision, as the projects we built in the past did not use OpenCV. Detecting and labeling objects in real time through a camera is a lot cooler than just uploading an image. We also learned how to render 3D graphics using nothing but JS libraries, working with the Canvas, Camera, and Scene tools of the three.js library.

What's next for FocusBot

I think adding more features would be incredibly helpful. TensorFlow is an amazing library that opens up endless possibilities for machine learning. We want to add more features by giving more outputs when different things are detected while the user is working. We also want to make the front end more interactive by letting users add events to their calendar, giving them a to-do list for the day, and so on. There is big potential in adding features such as Google Calendar events and reminders, note taking, and planning tools.
