Inspiration
Incorporating large language models (LLMs) as judges in domain-specific tasks presents unique challenges. Aligning these models with the expertise of domain specialists is often tedious and time-consuming. To reduce friction and improve the accuracy of these models, a seamless and intuitive UI/UX becomes crucial. With this in mind, we aimed to streamline the process of annotating and integrating human expertise into LLMs.
What it does
The platform allows users to upload a .csv file containing a set of questions, LLM-generated answers, human expert ratings, and expert feedback. Our tool then calls DSPy to optimize the LLM’s prompts, refining them so that its responses closely align with those of the human expert. Users can also adjust various optimization parameters within the interface, gaining greater control and flexibility to further improve the model’s performance and generate even more accurate, expert-like responses.
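To make the input format concrete, here is a minimal sketch of loading such an annotated .csv with the standard library. The column names (question, llm_answer, expert_rating, expert_feedback) are illustrative assumptions, not the tool's exact schema:

```python
import csv
import io

# Hypothetical example of the kind of .csv the platform ingests.
# Column names are illustrative assumptions, not the exact schema.
SAMPLE_CSV = """question,llm_answer,expert_rating,expert_feedback
What is the boiling point of water?,100 C at 1 atm,5,Correct and concise
Which force field suits proteins?,Use AMBER ff14SB,4,Good but note its limitations
"""

def load_annotations(text):
    """Parse annotated rows into dicts ready to become optimization examples."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["expert_rating"] = int(row["expert_rating"])  # ratings as integers
    return rows

examples = load_annotations(SAMPLE_CSV)
```

Each parsed row pairs an LLM answer with the expert's rating and feedback, which is exactly the signal the optimizer needs.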
How we built it
Our development process centered around creating an intuitive user experience, especially for non-coding experts. We designed an elegant yet straightforward user interface that allows users to interact with the optimization process effortlessly. Key features include the ability to view the top-performing prompts and trace the entire workflow visually, using Weave to create a clear and understandable representation of each step. This combination of usability and transparency ensures that domain experts can focus on refining their models without needing deep technical expertise.
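In our stack, Weave records each step automatically; the underlying idea — wrapping every pipeline stage so its inputs and outputs are logged for later inspection — can be sketched with a plain decorator. The names below are hypothetical stand-ins, not Weave's API:

```python
import functools

TRACE = []  # in-memory log of every traced step

def traced(fn):
    """Record each call's name, inputs, and output, similar in spirit
    to how a tracing tool captures every workflow step."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        TRACE.append({"step": fn.__name__, "inputs": args, "output": result})
        return result
    return wrapper

@traced
def rewrite_prompt(prompt):
    # Toy prompt transformation standing in for a real optimization step.
    return prompt + " Answer like a domain expert."

@traced
def score_prompt(prompt):
    # Stand-in for a real quality metric.
    return len(prompt)

p = rewrite_prompt("Extract the simulation parameters.")
s = score_prompt(p)
```

Rendering `TRACE` in the UI is what lets a non-coding expert see each step of the optimization without reading the code.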
Real-World Example: Computational Chemistry
We used computational chemistry as a test case to demonstrate the practical application of our tool. In this domain, extracting simulation parameters from foundational models is a challenging task, as physics-based simulations often rely on scientific literature that doesn’t easily translate into structured outputs. Moreover, handling sensitive, company-specific data raises concerns when transmitting it through API calls. By curating a specialized dataset, we explored how DSPy can streamline this process, helping to extract critical simulation parameters more effectively. Looking forward, these optimized parameters could be integrated into agentic workflows, enabling chemical simulations to proceed with minimal human intervention, ultimately reducing delays in next-generation drug and material discovery.
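As a concrete illustration of the structured outputs we target, a simulation-parameter record might look like the following. The field names and values are hypothetical, chosen for illustration rather than drawn from our actual dataset:

```python
from dataclasses import dataclass

@dataclass
class SimulationParameters:
    """Structured target for extraction from literature text (illustrative fields)."""
    force_field: str       # e.g. "AMBER ff14SB"
    temperature_k: float   # simulation temperature in kelvin
    timestep_fs: float     # integration timestep in femtoseconds
    n_steps: int           # total number of integration steps

# A hand-written example of what an optimized prompt should coax the LLM to emit.
params = SimulationParameters(
    force_field="AMBER ff14SB",
    temperature_k=300.0,
    timestep_fs=2.0,
    n_steps=500_000,
)
```

Typed records like this are what a downstream agentic workflow could consume directly to launch a simulation.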
What's next for OptoPrompt
Our next goal is to develop an end-to-end solution that streamlines the entire optimization process. In this vision, domain experts will be able to easily label an initial set of LLM responses, which will serve as the starting point for DSPy’s optimization process. Following this, additional responses from the LLM will be iteratively used to fine-tune and optimize the prompt, ensuring that the output consistently aligns with expert-level insights. This automated, feedback-driven loop will further enhance prompt quality, reducing manual effort and increasing accuracy over time.
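The feedback-driven loop we envision can be sketched abstractly: generate candidate prompts, score them against expert-labeled examples, and keep the best. The scoring and mutation functions below are toy stand-ins for DSPy's optimizers, not our actual implementation:

```python
def agreement(prompt, labeled):
    """Toy metric: fraction of labeled examples whose keyword the prompt covers;
    a stand-in for comparing LLM answers against expert ratings."""
    return sum(1 for ex in labeled if ex["keyword"] in prompt) / len(labeled)

def optimize(prompt, labeled, hints, rounds=5):
    """Greedy hill-climbing over simple prompt edits; DSPy's optimizers
    search far more cleverly, but the accept-if-better loop is the same idea."""
    best, best_score = prompt, agreement(prompt, labeled)
    for _ in range(rounds):
        improved = False
        for hint in hints:
            candidate = best + " Mention the " + hint + "."
            score = agreement(candidate, labeled)
            if score > best_score:
                best, best_score = candidate, score
                improved = True
        if not improved:
            break  # no edit helped; stop early
    return best, best_score

labeled = [{"keyword": "force field"}, {"keyword": "timestep"}]
hints = ["force field", "timestep", "temperature"]
best_prompt, best_score = optimize("Extract simulation parameters.", labeled, hints)
```

In the envisioned end-to-end system, the expert's labels replace the toy metric, and each new batch of LLM responses feeds another round of this loop.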
Built With
- dspy
- next.js
- python
- typescript
- weave