Inspiration
I had often wished I could copy and paste things from the real world the way we do on our devices, so I thought this would be cool to build.
What it does
Uses AI to copy objects and text from the real world onto a personal clipboard.
How I built it
- The mobile app is built with Expo and lets users copy objects and text and view their clipboard
- The mobile app sends requests to my FastAPI server, where objects and text are extracted and stored
- To extract objects I used U2-Net in PyTorch through the rembg wrapper, with S3 storing the extracted objects
- To extract text I used Google's Tesseract OCR through the pytesseract wrapper, with a PostgreSQL database storing the results
- The web clipboard is built with TailwindCSS and lets users view and manage their clipboard on the web, which makes it easy to transfer clipboard contents
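The server-side flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Clipboard` class and its in-memory storage are hypothetical stand-ins for the S3 bucket and the PostgreSQL table, and the two extractor functions assume that `rembg`, `pytesseract`, Pillow, and the Tesseract binary are installed.

```python
import io
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def remove_background_rembg(image_bytes: bytes) -> bytes:
    """Cut the foreground object out of an image with rembg (U2-Net)."""
    from rembg import remove  # imported lazily; requires `pip install rembg`
    return remove(image_bytes)  # bytes in, PNG bytes with transparency out


def read_text_tesseract(image_bytes: bytes) -> str:
    """Run Tesseract OCR on an image through pytesseract."""
    from PIL import Image
    import pytesseract  # requires the Tesseract binary on the PATH
    image = Image.open(io.BytesIO(image_bytes))
    return pytesseract.image_to_string(image).strip()


@dataclass
class Clipboard:
    """Hypothetical in-memory stand-in for the S3 bucket and PostgreSQL table."""
    objects: Dict[str, bytes] = field(default_factory=dict)
    texts: List[str] = field(default_factory=list)

    def copy_object(
        self,
        image_bytes: bytes,
        extract: Callable[[bytes], bytes] = remove_background_rembg,
    ) -> str:
        # Store the cut-out under a unique key, as an object store would.
        key = f"objects/{uuid.uuid4().hex}.png"
        self.objects[key] = extract(image_bytes)
        return key

    def copy_text(
        self,
        image_bytes: bytes,
        extract: Callable[[bytes], str] = read_text_tesseract,
    ) -> str:
        # Append the recognized text, as a database insert would.
        text = extract(image_bytes)
        self.texts.append(text)
        return text
```

In a FastAPI app, `copy_object` and `copy_text` would presumably sit behind two image-upload endpoints. The extractors are passed in as parameters so the flow can be exercised without the models installed.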
Challenges I ran into
Not really. I put together a plan and a design and stuck to them. I had experience with almost all the libraries I was using, so I didn't run into any major bugs that I couldn't fix fairly quickly.
Accomplishments that I'm proud of
I was able to build an AI-powered app that I find useful and fun to use.
What I learned
I had never built a computer vision app for a mobile device, so it was a great learning experience.
What's next for AR Copy Paste
Add account management, so this can be a real service people can use
Optimize storage of text results
Optimize display of images and text on web and mobile clipboards
The current U2-Net model cannot discern individual hairs; I would like to use transfer learning to train the model to handle them
Add additional pre- and post-processing to improve overall performance
Store text in a searchable PDF overlaid on the original image
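
For the searchable PDF idea, one possible starting point is pytesseract's `image_to_pdf_or_hocr`, which returns a PDF containing the original image with an invisible text layer. The sketch below is an assumption about how this could be wired up, not something the project does today; `save_searchable_pdf` and `tesseract_pdf` are hypothetical helper names.

```python
from pathlib import Path
from typing import Callable


def tesseract_pdf(image_path: str) -> bytes:
    """Render an image as a searchable PDF using Tesseract."""
    import pytesseract  # requires the Tesseract binary on the PATH
    return pytesseract.image_to_pdf_or_hocr(image_path, extension="pdf")


def save_searchable_pdf(
    image_path: str,
    pdf_path: str,
    render: Callable[[str], bytes] = tesseract_pdf,
) -> int:
    """Write the rendered PDF to disk and return its size in bytes."""
    pdf_bytes = render(image_path)
    Path(pdf_path).write_bytes(pdf_bytes)
    return len(pdf_bytes)
```

The `render` parameter exists only so the function can be tried without Tesseract installed; by default it calls pytesseract.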
Built With
- ai
- digitalocean
- expo.io
- fastapi
- html
- javascript
- machine-learning
- pil
- postgresql
- python
- pytorch
- react-native
- s3
- tailwindcss
- tesseract