Inspiration
Our inspiration was the Tree of Thoughts (ToT) paper, which shows that ToT can be more effective than Chain of Thought (CoT) prompting. The BigCode benchmark is a more challenging task than traditional coding benchmarks such as HumanEval.
What it does
Ricard.io is an open-source code generation system whose mission is to solve coding tasks and reach top performance with smaller LLMs by applying the Tree of Thoughts (ToT) prompting technique, evaluated on the BigCode benchmark. We forked the Tree of Thoughts (ToT) repository and added a BigCode class, tasks, and prompts.
Process
Tree of Thoughts (ToT) is a relatively expensive prompt engineering technique because it significantly increases both the number of API calls and the number of generated tokens. To tackle this, we used the Groq API. One of the key challenges was benchmarking our models on the BigCode dataset, which required tailoring our approach to this specific task. The biggest hurdle, however, was getting our tools and models to work together seamlessly. The project combines several technologies, including the BigCode evaluation tooling, which computes the benchmark scores, and Hugging Face, and it took significant effort to ensure hardware compatibility and a smooth workflow. Despite these challenges, we are committed to overcoming them and achieving our goal.
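To make the cost of Tree of Thoughts concrete, here is a minimal sketch of a breadth-first ToT search that counts model calls. It is an illustration only, not the code from our fork: `tot_bfs`, `fake_propose`, and `fake_score` are hypothetical names, and the two stub functions stand in for real LLM requests (for example, requests sent to the Groq API).

```python
from typing import Callable, List, Tuple


def tot_bfs(
    task: str,
    propose: Callable[[str, int], List[str]],
    score: Callable[[str], float],
    depth: int = 3,
    n_candidates: int = 3,
    beam_width: int = 2,
) -> Tuple[str, int]:
    """Breadth-first Tree of Thoughts search.

    Returns the best state found and the number of model calls made.
    """
    frontier = [task]
    calls = 0
    for _ in range(depth):
        # Expand every state kept from the previous level.
        candidates: List[str] = []
        for state in frontier:
            candidates.extend(propose(state, n_candidates))
            calls += 1  # one generation call per expanded state
        # Evaluate every candidate thought with a separate call.
        scored = []
        for candidate in candidates:
            scored.append((score(candidate), candidate))
            calls += 1  # one evaluation call per candidate
        # Keep only the highest-scoring states for the next level.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        frontier = [candidate for _, candidate in scored[:beam_width]]
    return frontier[0], calls


# Hypothetical stand-ins for LLM calls, so the sketch runs offline.
def fake_propose(state: str, n: int) -> List[str]:
    return [f"{state} > step{i}" for i in range(n)]


def fake_score(state: str) -> float:
    return float(len(state))  # placeholder heuristic, not a real evaluator


if __name__ == "__main__":
    best, calls = tot_bfs("solve task", fake_propose, fake_score)
    print(best)
    print("model calls:", calls)
```

With the default settings in this sketch (three levels, three candidates per state, two states kept per level), the search makes 20 model calls, whereas a single Chain of Thought prompt would make one. The real numbers depend on the search parameters chosen for a given task.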
What we learned
We learned the Tree of Thoughts (ToT) approach to prompting, as well as how to integrate datasets, test for errors, understand the architecture, and tweak the parameters.
What's next for Ricard.io?
Our plan for the project is to integrate with development tools such as GitHub so that the system can resolve repository issues.
