Inspiration
Our experience at large companies showed us how bad documentation practices not only waste significant money and create inefficiencies, but also cause critical knowledge to disappear.
We believe that complete browser control is an impending reality that will transform workforce dynamics. We are working toward it by building helpful, copilot-style guidance.
What it does
Dokai learns skills from user demonstration and can then interactively guide others in the company through accomplishing tasks and workflows. We are optimizing for fastest time-to-documentation and time-to-answer.
Use cases: Customer-facing teams (e.g. Customer Support Agents and Sales Development Reps) can rely on Dokai to help and serve customers quickly, instead of depending on other employees or static, text-based guidance.
How we built it
We are building a browser extension that monitors user interactions in the browser. On the server, an LLM with retrieval-augmented generation (RAG) learns each company's workflows, alongside text-to-action models.
- Milvus, LlamaIndex, Postgres, FastAPI, React, Chrome extension, OpenAI
- The architecture will be shown at our booth
- Milestones:
  - Learning mode: record actions performed in the browser and implement a RAG application to ingest the relevant knowledge
  - Figuring out which components to pass to the LLM given its restricted context window
  - Text guidance based on previously taught skills
  - Visual guidance: create a new DSL shared by the Chrome extension, the main window, and the backend to manipulate DOM components
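The milestone about deciding which components to pass to the LLM can be illustrated with a small sketch. This is not Dokai's actual selection logic: the `DomComponent` shape, the `relevance` score, and the 4-characters-per-token estimate are all assumptions made for illustration. It greedily keeps the most relevant recorded elements that still fit within a token budget.

```python
from dataclasses import dataclass


@dataclass
class DomComponent:
    """A recorded page element (hypothetical shape, not Dokai's schema)."""
    selector: str
    text: str
    relevance: float  # assumed score, e.g. similarity to the user's question


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def select_components(components: list[DomComponent], budget: int) -> list[DomComponent]:
    """Greedily keep the most relevant components that fit in the token budget."""
    chosen = []
    used = 0
    for comp in sorted(components, key=lambda c: c.relevance, reverse=True):
        cost = estimate_tokens(comp.selector + " " + comp.text)
        if used + cost <= budget:
            chosen.append(comp)
            used += cost
    return chosen
```

A greedy pass is simple and predictable, but it can skip one large, relevant element in favor of several small ones; a real system would likely use the model's own tokenizer and a stronger relevance signal.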
Challenges we ran into
- Chrome extension MV3 limitations
- Prompt engineering
- Limited context window
- Creating interactive guidance
- Contextualizing different types of browser actions
- Inability to execute scripts passed as strings in a Chrome extension content script
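Since a content script cannot run scripts passed as strings, one possible approach is for the backend to send declarative commands as data, which the extension then interprets. The sketch below shows only the backend side and is an assumption throughout: the `GuidanceCommand` fields and the `ALLOWED_ACTIONS` set are hypothetical and do not describe Dokai's actual DSL.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical set of actions the extension would know how to perform.
ALLOWED_ACTIONS = {"highlight", "tooltip", "scroll_into_view"}


@dataclass
class GuidanceCommand:
    """One guidance step, expressed as data instead of executable code."""
    action: str
    selector: str
    message: str = ""


def encode_command(cmd: GuidanceCommand) -> str:
    """Validate a command and serialize it to JSON for the extension."""
    if cmd.action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {cmd.action}")
    if not cmd.selector:
        raise ValueError("selector must not be empty")
    return json.dumps(asdict(cmd), sort_keys=True)
```

For example, `encode_command(GuidanceCommand("highlight", "#submit", "Click here"))` produces a JSON object that the extension could map to a fixed, pre-bundled handler, so no code string ever needs to be evaluated.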
Accomplishments that we're proud of
We scrappily built an MVP of the learning and guidance modes that optimizes time-to-answer and time-to-documentation for users.
What we learned
Chrome extensions currently have limitations, but they have immense potential, and we are excited to be at the forefront of developing their capabilities.
What's next for Dokai
- Polish the UI to create powerful, interactive overlaid guidance
- Fine-tune the text-to-action ML model
- Prompt engineering
Built With
- llamaindex
- llm
- openai
- postgresql
- python
- vectordb