What it does

We used a fuzzing approach to create a system that iteratively generates adversarial prompts to fool LLMs. We created a dataset of hundreds of adversarial prompts and scores on Mistral LLM responses.

How we built it

  • NextJS Frontend
    • ShadCN, RadixUI, and Tailwind CSS components
  • Python Flask Backend
  • Mistral Chat API
    • Prompt Generation
    • Testing
    • LLM as Judge
  • Mistral Embedding API
  • MongoDB Atlas
    • Vector Search

Built With

Share this project:

Updates