Inspiration
I always thought that AI browser assistants were neat, but I was unimpressed that they didn't actually do anything. All of the big AI companies are making their own AI browsers, but Chrome still holds about two thirds of the market. So, we wanted to bring this co-pilot functionality to everyday people, in the browser they already use, to revolutionize their workflow.
What it does
You can command Supersurf, like Jarvis, to navigate the web for you. It can open new tabs to any website, just by interpreting what you say. You can also tell it to close, pin, and mute your tabs. This is also helpful for users who are injured or disabled: voice input makes browsing more accessible for those who can't easily interact with physical devices. Instead of moving a mouse or typing out commands, one keybind is all it takes to navigate the browser.
How we built it
Supersurf is made up of four parts. First, we use microphone access to record audio and send it to fish.audio. Next, the speech-to-text service transcribes the recording, and the transcript is sent as a prompt to Gemini. Gemini then interprets the prompt and calls functions using Chrome's API. Finally, a popup front-end lets the user change their keybinding and shows some example features.
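To make the last step of that pipeline concrete, here is a minimal sketch of how a function call returned by Gemini could be mapped onto Chrome's tabs API. The command names (open_tab, close_tab, pin_tab, mute_tab) and the helper functions are hypothetical and chosen only for illustration; the actual names in Supersurf may differ. The tabs API is passed in as a parameter so the mapping can be tried outside the browser.

```javascript
// Sketch only: command names and helpers are hypothetical, not Supersurf's actual code.

// Spoken site names often arrive without a scheme, so add one if it is missing.
function normalizeUrl(url) {
  if (typeof url !== "string" || url.trim() === "") {
    throw new Error("open_tab needs a url argument");
  }
  const trimmed = url.trim();
  return /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
}

// Translate a { name, args } function call into a chrome.tabs method and its parameters.
function toTabsCall(functionCall, activeTabId) {
  const { name, args = {} } = functionCall;
  switch (name) {
    case "open_tab":
      return { method: "create", params: [{ url: normalizeUrl(args.url) }] };
    case "close_tab":
      return { method: "remove", params: [activeTabId] };
    case "pin_tab":
      return { method: "update", params: [activeTabId, { pinned: true }] };
    case "mute_tab":
      return { method: "update", params: [activeTabId, { muted: true }] };
    default:
      throw new Error(`Unsupported command: ${name}`);
  }
}

// Execute the translated call against a tabs API object (chrome.tabs in the extension).
function runTabsCall(tabsApi, call) {
  return tabsApi[call.method](...call.params);
}
```

Inside the extension's service worker, this could be used as runTabsCall(chrome.tabs, toTabsCall(call, tab.id)), where tab is the currently active tab. Keeping the translation separate from the execution makes it easy to check what a voice command would do before it touches the browser.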
Challenges we ran into
Most of the work was straightforward. The primary problem came from integrating the fish.audio and Gemini APIs into the browser extension. We couldn't package any development kits, so we had to use HTTP requests to call each API directly. We also had to make content scripts and the service worker communicate with each other. Once we figured this out, the rest came smoothly.
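As an illustration of calling an API over plain HTTP instead of through a development kit, the sketch below builds a request to Gemini's REST generateContent endpoint and pulls any function calls out of the response. It is a sketch under assumptions, not Supersurf's actual code: the model name, the helper names, and the shape of the function declarations passed in are all illustrative.

```javascript
// Sketch only: helper names and the default model are assumptions for illustration.
const GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta/models";

// Build the URL and fetch options for a generateContent call with function calling enabled.
function buildGeminiRequest(apiKey, transcript, functionDeclarations, model = "gemini-2.0-flash") {
  return {
    url: `${GEMINI_BASE}/${model}:generateContent`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": apiKey,
      },
      body: JSON.stringify({
        contents: [{ role: "user", parts: [{ text: transcript }] }],
        tools: [{ functionDeclarations }],
      }),
    },
  };
}

// Collect every functionCall part from the first candidate; returns [] if there are none.
function extractFunctionCalls(responseJson) {
  const parts = responseJson?.candidates?.[0]?.content?.parts ?? [];
  return parts.filter((part) => part.functionCall).map((part) => part.functionCall);
}

// Send the transcript and return the function calls Gemini chose, if any.
async function askGemini(apiKey, transcript, functionDeclarations) {
  const { url, options } = buildGeminiRequest(apiKey, transcript, functionDeclarations);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Gemini request failed with status ${response.status}`);
  }
  return extractFunctionCalls(await response.json());
}
```

Because fetch is available in extension service workers, a request like this needs no bundled dependencies. Splitting request building and response parsing into pure functions also lets both be checked without a network connection or an API key.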
Accomplishments that we're proud of
Having your browser do what you say feels like magic. You feel like Iron Man. This is a feature that is easily accessible and works out of the box thanks to the nature of browser extensions. Even with the functionality we implemented, there is so much potential for what Supersurf can do.
What we learned
None of us had built an app in JavaScript before, so the frameworks were entirely new to us. We learned best practices for handling APIs, and made our first AI app.
What's next for Supersurf
There is so much functionality that's possible with this app. Here is what we couldn't implement in the prototype: sorting tabs, switching to tabs, and being able to access a tab just by saying what site it's on. Handling multiple actions in one prompt is another feature we'd like to add. Supersurf could also access your history as supplemental context for user prompts, greatly expanding the range of possibilities.
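One of these ideas, reaching a tab by saying what site it's on, could start from something like the sketch below. It is not part of the prototype; the function name and the matching rule (a case-insensitive substring match on the hostname) are assumptions made for illustration.

```javascript
// Sketch only: findTabBySite is a hypothetical helper, not an implemented Supersurf feature.

// Reduce text to lowercase letters and digits so "You Tube" can match "www.youtube.com".
function simplify(text) {
  return String(text).toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Return the first tab whose hostname contains the spoken site name, or undefined.
// `tabs` is an array of objects with a `url` field, like those returned by chrome.tabs.query.
function findTabBySite(tabs, spokenSite) {
  const needle = simplify(spokenSite);
  if (!needle) return undefined;
  return tabs.find((tab) => {
    try {
      return simplify(new URL(tab.url).hostname).includes(needle);
    } catch {
      // Tabs without a readable URL cannot be matched.
      return false;
    }
  });
}
```

In the extension, the tab list could come from chrome.tabs.query({}), and the match could be brought forward with chrome.tabs.update(tab.id, { active: true }). Reading tab.url requires the extension to hold the tabs permission or matching host permissions.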