Inspiration
One of the most daunting parts of starting a research project is just that - the research. Reliably finding experts in the field is paramount to laying good foundations for months, or years, of work. From our own projects, we know this can be a long and arduous process - one that AI can speed up dramatically, provided it doesn't hallucinate! With so many academic papers out there, we decided that a transparent agent that does not fake its information would not only be highly useful for students and researchers alike, but would also encourage more people to reach out to experts - many of whom are genuinely excited to share their work.
What it does
BotOrNot finds you experts in a field from just one simple text prompt. With a single sentence, you can find the most prominent or highest-contributing people in academia, streamlining the foundations of research projects across the globe. You can also tailor the search to specific locations, or to even more specific subfields, all in natural language. Furthermore, BotOrNot's design incorporates chain-of-thought reasoning, so users can see exactly what the AI was thinking, with clarity and assurance.
How we built it
TRACK C We used prompt engineering to identify the different AI animals and models. Using their response times and their reactions to the different red-teaming datasets, we devised candidate identities for these models and paid attention to where each model made errors. We carried these lessons into Track B, linking what we learned here to our transparent expert-finding project.
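The response-time idea above can be sketched in plain Python. This is a toy illustration, not our actual Track C code: `call_model` is a hypothetical stand-in for whatever client the track provides, and the latency threshold is an arbitrary example value.

```python
import time

def timed_call(call_model, prompt):
    # Measure wall-clock latency of a single model call; call_model is a
    # hypothetical stand-in for the track's model client.
    start = time.perf_counter()
    reply = call_model(prompt)
    return reply, time.perf_counter() - start

def latency_profile(latencies, fast_threshold=1.0):
    # Toy heuristic: bucket a model by its average response time.
    avg = sum(latencies) / len(latencies)
    return "fast responder" if avg < fast_threshold else "slow responder"
```

In practice we combined signals like these with the models' reactions to red-teaming prompts, since timing alone is noisy.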
TRACK B This is a LangChain agent built on an LLM proxy and a RAG flow, powered by Valyu AI search. Using LangGraph, we construct an execution chain and wrap it in a Gradio web interface. Enforcing AI governance and bias auditing is also central to the system, as is keeping clear chain-of-thought transparency logs. These logs of the AI's "thinking" are passed to a second AI, which turns verbose LangSmith traces into a clear tabular format with easy-to-digest information: confidence level, verification sources, and the synthesis process.
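The overall shape of that pipeline can be sketched in plain Python. Every name below is a hypothetical stand-in: the real system uses a LangGraph execution chain, Valyu AI search, and LLM calls for synthesis and log summarisation.

```python
# Minimal plain-Python sketch of the BotOrNot pipeline shape
# (hypothetical stand-ins; not the real LangGraph/Valyu implementation).

def search_experts(query):
    # Stand-in for the Valyu AI search step.
    return [{"name": "Dr. Example", "source": "https://example.org/profile"}]

def synthesise(query, results):
    # Stand-in for the LLM synthesis step; also records a transparency
    # log entry of what the agent did and why.
    log = {
        "step": "synthesis",
        "query": query,
        "sources": [r["source"] for r in results],
        "confidence": "high",
    }
    answer = ", ".join(r["name"] for r in results)
    return answer, log

def summarise_logs(logs):
    # Stand-in for the second LLM pass that turns verbose traces into
    # a tabular summary: (step, confidence, number of sources).
    return [(l["step"], l["confidence"], len(l["sources"])) for l in logs]

def run(query):
    results = search_experts(query)
    answer, log = synthesise(query, results)
    return answer, summarise_logs([log])
```

The key design point is that every step emits a log entry, so the summariser can always show the user what happened and which sources back each claim.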
Challenges we ran into
Prompt engineering for Track C was initially challenging, but we devised several solutions for recognising the models, including different time-tracking approaches implemented in Python. For the transparency project, the main challenge was working out how to guarantee full transparency, as well as gaining a deeper understanding of combining several different APIs at once.
Accomplishments that we're proud of
We're incredibly proud of our tentative identification of the different animal models, as well as of building a transparent AI project for a problem we really care about - one we genuinely feel can make a difference for students and researchers across the globe.
What we learned
Building guardrails to protect against hallucination is a non-trivial task. We also learned how to incorporate specific APIs into an agent workflow. We found that combining API call results with open web browsing gave us a higher accuracy rate when citing specific researchers' and individuals' pages, as well as streamlining the process overall.
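One simple way to combine the two source types, shown here as a sketch with a hypothetical record shape (`{"url": ..., "title": ...}`), is to deduplicate by URL while letting structured API hits take priority over open-web hits:

```python
def merge_sources(api_results, web_results):
    # Combine structured API hits with open-web hits, keeping the first
    # record seen for each URL so API entries take priority.
    # (Hypothetical record shape: {"url": ..., "title": ...}.)
    seen = {}
    for record in api_results + web_results:
        seen.setdefault(record["url"], record)
    return list(seen.values())
```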
What's next for BotOrNot
First, we want to build a "Golden Set" of test questions and integrate Ragas to automate RAG evaluation - you can't improve what you don't measure. We also want to better integrate our Holistic AI governance layer to actively monitor bias, and to improve RAG with hybrid search and re-ranking for production-grade accuracy.
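A golden-set entry could look something like the sketch below. The record shape, question, and URL are all hypothetical; in practice these cases would be fed to Ragas rather than scored by hand, but even a trivial coverage check makes the "measure first" point concrete.

```python
# Hypothetical golden-set shape and a trivial scoring loop.
golden_set = [
    {"question": "Who works on graph neural networks in the UK?",
     "must_cite": {"https://example.ac.uk/profile"}},
]

def coverage(run_query, cases):
    # Fraction of golden questions whose answer cites all required sources.
    hits = 0
    for case in cases:
        _answer, sources = run_query(case["question"])
        if case["must_cite"] <= set(sources):
            hits += 1
    return hits / len(cases)
```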
Built With
- gradio
- langchain
- langgraph
- langsmith
- nextjs
- python
- typescript
- valyu