Adversarial Taboo

A competitive platform for pitting Large Language Models against each other in a strategic word-guessing game. Watch AI models battle in real-time as attackers try to trick defenders into revealing secret words.

🎯 Overview

Adversarial Taboo is an interactive demonstration platform that showcases the strategic capabilities of different LLMs. In this game:

Attacker: Knows a secret word and must craft clever hints to trick the defender into saying it
Defender: Must infer the secret word from the hints and guess it in the required format, without saying it by accident first
Objective: Compare how different models perform in adversarial scenarios

✨ Features

Real-time LLM Battles: Watch AI models compete in live conversations
Model Comparison: Compare pre-trained vs post-trained model performance
Multiple API Support: Compatible with OpenAI, Anthropic, and custom API endpoints
Interactive Gameplay: Up to 5 turns with automatic win/tie detection
Visual Feedback: Animated messages and game state indicators
Comprehensive Rules: Clear game mechanics with instant feedback

Note about pre/post comparison

This project intentionally showcases a side-by-side comparison between a model in its original (pre-trained) state and a model after additional adaptation. Specifically, the demo is built to demonstrate the effect of applying the SPAG project (see: https://github.com/Linear95/SPAG) for post-training/adaptation. The "pre-trained" vs "post-trained" labels in the UI are intended to help you compare behavior before and after such adaptation.

Important: this is only a demonstration — in practice you may supply any model or endpoint you prefer. The app accepts OpenAI-compatible endpoints, Anthropic endpoints, or custom proxies, so feel free to plug in your own models, keys, or fine-tuning pipelines.

🚀 Quick Start

Prerequisites

Node.js 18+
npm or yarn
API keys for your preferred LLM providers

Installation

Clone the repository

git clone https://github.com/Haydenkkk/adversarial-taboo.git
cd adversarial-taboo

Install dependencies
```
npm install
```

Configure environment variables

Create a .env file in the root directory:

# Attacker Model (fixed)
VITE_ATTACKER_MODEL=gpt-5
VITE_ATTACKER_API_KEY=your_attacker_api_key
VITE_ATTACKER_BASE_URL=https://api.openai.com/v1

# Defender Models
VITE_DEFENDER_PRE_TRAINED_MODEL=gpt-3.5-turbo
VITE_DEFENDER_PRE_TRAINED_API_KEY=your_pre_trained_api_key
VITE_DEFENDER_PRE_TRAINED_BASE_URL=https://api.openai.com/v1

VITE_DEFENDER_POST_TRAINED_MODEL=gpt-4
VITE_DEFENDER_POST_TRAINED_API_KEY=your_post_trained_api_key
VITE_DEFENDER_POST_TRAINED_BASE_URL=https://api.openai.com/v1

Start the development server
```
npm run dev
```
Open your browser

Navigate to http://localhost:7404 to start playing!

🎮 How to Play

Game Rules

Setup: A secret word is randomly selected and assigned to the Attacker
Attacker's Turn: Provides hints about the secret word without saying it directly
Defender's Turn: Responds to hints and may attempt to guess the word
Guessing Format: To guess, the Defender must say: "I know the word! It is {guess}"
Win Conditions:
- Attacker Wins: If the Defender says the secret word unintentionally (outside a formatted guess) OR makes an incorrect formatted guess
- Defender Wins: If Defender correctly guesses the word in the proper format
- Tie: If maximum turns (5) are reached without a winner
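
The win conditions above can be sketched as a single judging function. This is a minimal illustration, not the project's actual code: the names `judgeDefenderMessage`, `Outcome`, and `GUESS_PATTERN` are hypothetical, and the whole-word, case-insensitive matching is an assumption about how "saying the secret word" is detected.

```typescript
// Hypothetical helper; names and matching rules are assumptions for illustration.
type Outcome = "attacker_wins" | "defender_wins" | "tie" | "continue";

// Matches the formatted guess from the rules: "I know the word! It is {guess}"
const GUESS_PATTERN = /i know the word!\s*it is\s+["']?([a-z][a-z-]*)/i;

// Escape regex metacharacters so the secret word is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function judgeDefenderMessage(
  message: string,
  secretWord: string,
  turn: number,
  maxTurns: number = 5,
): Outcome {
  const secret = secretWord.trim().toLowerCase();

  // A formatted guess ends the game: correct means the Defender wins, wrong means the Attacker wins.
  const guess = GUESS_PATTERN.exec(message);
  if (guess) {
    return guess[1].toLowerCase() === secret ? "defender_wins" : "attacker_wins";
  }

  // Saying the secret word outside a formatted guess hands the win to the Attacker.
  const mention = new RegExp(`\\b${escapeRegExp(secret)}\\b`, "i");
  if (mention.test(message)) {
    return "attacker_wins";
  }

  // No winner yet: a tie once the last turn is used up, otherwise keep playing.
  return turn >= maxTurns ? "tie" : "continue";
}
```

For example, `judgeDefenderMessage("I know the word! It is pear", "apple", 1)` yields `"attacker_wins"`, since an incorrect formatted guess loses the game for the Defender.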

Model Selection

Pre-trained: The base model, before any additional adaptation
Post-trained: The model after adaptation (for example, SPAG post-training or other fine-tuning)
The platform defaults to pre-trained model selection

🛠️ Configuration

Supported APIs

The platform supports multiple LLM providers:

OpenAI: GPT-5, GPT-4, and compatible models
Anthropic: Claude models via their API
Custom Endpoints: Any OpenAI-compatible API
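
As a rough sketch of what "OpenAI-compatible" means in practice, the helpers below build a Chat Completions request and read the reply from the response. The names `buildChatRequest`, `parseChatResponse`, and `EndpointConfig` are hypothetical and not taken from `src/services/llm.ts`. Anthropic's native API uses a different request shape, so this covers only OpenAI-compatible endpoints.

```typescript
// Hypothetical types and helpers; not the project's actual service code.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface EndpointConfig {
  model: string;
  apiKey: string;
  baseUrl: string; // e.g. https://api.openai.com/v1
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a POST to {baseUrl}/chat/completions with a Bearer token.
function buildChatRequest(config: EndpointConfig, messages: ChatMessage[]): ChatRequest {
  const base = config.baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify({ model: config.model, messages }),
    },
  };
}

// Pull the assistant text out of a Chat Completions response body.
function parseChatResponse(data: unknown): string {
  const shaped = data as { choices?: { message?: { content?: unknown } }[] } | null;
  const content = shaped?.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("Unexpected response shape: missing choices[0].message.content");
  }
  return content;
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`, and the parsed JSON body handed to `parseChatResponse`.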

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| `VITE_ATTACKER_MODEL` | Model used for the attacker role | `gpt-3.5-turbo` |
| `VITE_ATTACKER_API_KEY` | API key for attacker model | Required |
| `VITE_ATTACKER_BASE_URL` | Base URL for attacker API | `https://api.openai.com/v1` |
| `VITE_DEFENDER_PRE_TRAINED_MODEL` | Pre-trained defender model | `gpt-3.5-turbo` |
| `VITE_DEFENDER_PRE_TRAINED_API_KEY` | API key for pre-trained defender | Required |
| `VITE_DEFENDER_PRE_TRAINED_BASE_URL` | Base URL for pre-trained defender | `https://api.openai.com/v1` |
| `VITE_DEFENDER_POST_TRAINED_MODEL` | Post-trained defender model | `gpt-4` |
| `VITE_DEFENDER_POST_TRAINED_API_KEY` | API key for post-trained defender | Required |
| `VITE_DEFENDER_POST_TRAINED_BASE_URL` | Base URL for post-trained defender | `https://api.openai.com/v1` |
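
One way to apply these defaults is a small loader that fills in the model and base URL when they are unset and fails fast on a missing API key. This is a sketch under assumptions: `loadConfig`, `readEndpoint`, and the returned shape are hypothetical names, and the loader takes a plain object so it can be run outside Vite.

```typescript
// Hypothetical config loader; the function names and return shape are assumptions.
type Env = Record<string, string | undefined>;

interface EndpointConfig {
  model: string;
  apiKey: string;
  baseUrl: string;
}

const DEFAULT_BASE_URL = "https://api.openai.com/v1";

// Read one {PREFIX}_MODEL / _API_KEY / _BASE_URL triple, applying the documented defaults.
function readEndpoint(env: Env, prefix: string, defaultModel: string): EndpointConfig {
  const apiKey = env[`${prefix}_API_KEY`];
  if (!apiKey) {
    throw new Error(`${prefix}_API_KEY is required`);
  }
  return {
    model: env[`${prefix}_MODEL`] || defaultModel,
    apiKey,
    baseUrl: env[`${prefix}_BASE_URL`] || DEFAULT_BASE_URL,
  };
}

function loadConfig(env: Env) {
  return {
    attacker: readEndpoint(env, "VITE_ATTACKER", "gpt-3.5-turbo"),
    preTrained: readEndpoint(env, "VITE_DEFENDER_PRE_TRAINED", "gpt-3.5-turbo"),
    postTrained: readEndpoint(env, "VITE_DEFENDER_POST_TRAINED", "gpt-4"),
  };
}
```

In a Vite app the values would come from `import.meta.env`. Vite's documentation recommends referencing each variable by its full static name, so a real loader may need to spell out every key instead of building it from a prefix as done here.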

🏗️ Architecture

Tech Stack

Frontend: React 18 + TypeScript
Build Tool: Vite
UI Library: Shadcn/ui + Radix UI
Styling: Tailwind CSS
Icons: Lucide React
State Management: React Hooks

Project Structure

src/
├── components/
│   ├── GameArena.tsx      # Main game component
│   ├── GameMessage.tsx    # Individual message display
│   ├── GameStats.tsx      # Game statistics
│   ├── ModelSelector.tsx  # Model selection interface
│   └── ui/                # Reusable UI components
├── services/
│   └── llm.ts            # LLM API integration
├── hooks/
│   └── use-toast.ts      # Toast notifications
└── lib/
    └── utils.ts          # Utility functions

📊 Game Flow

Initialization: Secret word selected, models configured
Turn Loop: Alternating attacker/defender turns up to 5 rounds
Word Detection: Real-time monitoring for secret word usage
Win Evaluation: Automatic determination of game outcome
Result Display: Animated feedback with detailed reasoning
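
The flow above can be sketched as a turn loop with the two models injected as functions. This is an illustration only: `playGame`, `Player`, and `GameResult` are hypothetical names, the players are stand-ins for real LLM calls, and the secret-word check is a simplified case-insensitive substring match.

```typescript
// Hypothetical turn loop; not the project's GameArena implementation.
type Player = (history: string[]) => Promise<string>;

interface GameResult {
  outcome: "attacker_wins" | "defender_wins" | "tie";
  turns: number;
  history: string[];
}

// Formatted guess from the rules: "I know the word! It is {guess}"
const GUESS = /i know the word!\s*it is\s+["']?([a-z][a-z-]*)/i;

async function playGame(
  secretWord: string,
  attacker: Player,
  defender: Player,
  maxTurns: number = 5,
): Promise<GameResult> {
  const secret = secretWord.trim().toLowerCase();
  const history: string[] = [];

  for (let turn = 1; turn <= maxTurns; turn++) {
    // Attacker's turn: give a hint based on the conversation so far.
    history.push(`Attacker: ${await attacker(history)}`);

    // Defender's turn: respond, possibly with a formatted guess.
    const reply = await defender(history);
    history.push(`Defender: ${reply}`);

    // Win evaluation: a formatted guess decides the game either way.
    const guess = GUESS.exec(reply);
    if (guess) {
      const correct = guess[1].toLowerCase() === secret;
      return { outcome: correct ? "defender_wins" : "attacker_wins", turns: turn, history };
    }

    // Word detection (simplified): any mention of the secret word loses the game.
    if (reply.toLowerCase().includes(secret)) {
      return { outcome: "attacker_wins", turns: turn, history };
    }
  }

  // Maximum turns reached without a winner.
  return { outcome: "tie", turns: maxTurns, history };
}
```

Passing the players in as functions keeps the loop independent of any particular provider, so the same code can run against a pre-trained or a post-trained defender.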

Ready to watch AI models battle? Start your first game now! 🎯🤖
