⚽ Phantom Coach

Your AI Tactical Analyst — See the Game. Hear the Coach. Speak the Play.

Built for the Gemini Live Agent Challenge · Live Agents 🗣️

What is Phantom Coach?

What if you could have a Jarvis for soccer: an AI analyst that watches the match alongside you, listens to your questions, and talks you through the tactical detail in real time?

Phantom Coach is a multimodal AI coaching assistant powered by the Gemini 2.5 Flash Live API. You stream a soccer match and hold a natural voice conversation with an AI that aims to operate at the level of a UEFA Pro License tactical analyst. Ask it about what you see, interrupt it mid-sentence, or tell it to switch to the tactical board, then watch it draw corrections and simulate plays on a live 2D pitch, all without touching your mouse.

This isn't a chatbot that analyzes screenshots. It's a live, bidirectional session — Phantom Coach sees every frame, tracks every player, and speaks back to you with grounded tactical analysis. No typing. No waiting. Just talk.

Features

🎙️ Live Voice Analysis

Stream video and hold a real-time voice conversation with Gemini 2.5 Flash. The AI watches the match, identifies formations, detects pressing triggers, and narrates tactical insights, and you can interrupt or redirect it at any point.

📐 2D Tactical Board

An auto-generated, real-time tactical map powered by computer vision. Players are tracked via YOLOv8 + ByteTrack, positions are calibrated via RANSAC homography, and the board updates live as the match progresses.
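As a rough illustration of the calibration step, a 3x3 homography maps image pixels to pitch coordinates. The sketch below assumes the matrix has already been estimated (for example by a RANSAC fit). The function name `pixel_to_pitch` and the example matrix are hypothetical and are not taken from the repository.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def pixel_to_pitch(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Apply a 3x3 homography to an image point and return pitch coordinates."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get 2D pitch coordinates.
    return u / w, v / w


# Hypothetical matrix: a pure scaling from a 1920x1080 frame
# onto a 105 m x 68 m pitch (a real broadcast view needs perspective terms).
EXAMPLE_H = [
    [105 / 1920, 0.0, 0.0],
    [0.0, 68 / 1080, 0.0],
    [0.0, 0.0, 1.0],
]
```

With this example matrix, the center of the frame lands at roughly the center spot of the pitch.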

🎯 Tactical Simulations

Voice-command the AI to simulate plays directly on the 2D board — "Show me where the left winger should be" — and watch as the AI animates player movements, draws correction arrows, and highlights passing lanes.
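One simple way to animate a player toward a suggested position is to interpolate between the current and target coordinates. This is only a sketch of that idea: `interpolate_path` is a hypothetical helper, and the actual animation in the frontend may work differently.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_path(start: Point, target: Point, steps: int) -> List[Point]:
    """Return steps + 1 evenly spaced points from start to target (inclusive)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, target
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(steps + 1)
    ]
```

Each returned point could be rendered as one animation frame on the 2D board.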

🧠 Context-Aware Coaching

A state machine with 4 distinct modes (Live Analysis → Transition → Tactical Board → Waiting for Confirmation) gates which tools and prompts are active, ensuring the AI stays grounded and contextually aware at all times.
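The gating idea can be sketched as a lookup from the current mode to the set of tools the model may call. The four mode names come from the description above. The class name `ModeGate` and every tool name are hypothetical placeholders, not the project's actual declarations.

```python
from enum import Enum


class Mode(Enum):
    LIVE_ANALYSIS = "live_analysis"
    TRANSITION = "transition"
    TACTICAL_BOARD = "tactical_board"
    WAITING_FOR_CONFIRMATION = "waiting_for_confirmation"


# Hypothetical tool names, chosen only to illustrate per-mode gating.
ALLOWED_TOOLS = {
    Mode.LIVE_ANALYSIS: {"describe_play", "switch_to_board"},
    Mode.TRANSITION: set(),
    Mode.TACTICAL_BOARD: {"draw_arrow", "move_player", "switch_to_live"},
    Mode.WAITING_FOR_CONFIRMATION: {"confirm", "cancel"},
}


class ModeGate:
    """Tracks the current mode and rejects tool calls that mode does not allow."""

    def __init__(self) -> None:
        self.mode = Mode.LIVE_ANALYSIS

    def set_mode(self, mode: Mode) -> None:
        self.mode = mode

    def can_call(self, tool: str) -> bool:
        return tool in ALLOWED_TOOLS[self.mode]
```

Checking `can_call` before dispatching a tool call keeps board-only actions from firing while the user is still watching live video.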

⚡ Real-Time Tactical Alerts

The backend agent pipeline detects turnovers, pressing triggers, line-breaking passes, and formation shifts in real time, and feeds this grounded data into Gemini to reduce hallucination.
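To give a feel for one of these alerts, a turnover can be approximated as a change in which team holds the ball. The sketch below assumes a hypothetical per-frame possession label ("A", "B", or None when the ball is loose). The repository's actual detection logic is not shown in this README and is likely more involved.

```python
from typing import List, Optional, Tuple


def find_turnovers(possession: List[Optional[str]]) -> List[Tuple[int, str, str]]:
    """Return (frame_index, losing_team, winning_team) for each possession change.

    Frames labeled None (ball loose or unknown) are skipped, so a turnover is
    recorded at the first frame where the other team is seen in possession.
    """
    turnovers = []
    last_team = None
    for index, team in enumerate(possession):
        if team is None:
            continue
        if last_team is not None and team != last_team:
            turnovers.append((index, last_team, team))
        last_team = team
    return turnovers
```

Events like these, with frame indices attached, are the kind of structured fact that can be passed to the model alongside the video.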

🗣️ Barge-In & Interruption Support

Full support for natural interruption. Ask a question mid-analysis, redirect the AI to a different part of the pitch, or tell it to switch views — all via voice, handled seamlessly by the Live API.

Architecture

Layer	What It Does
User	Voice input via microphone, video feed, browser interaction
Frontend	React/Vite/TypeScript UI with VideoPlayer, TacticalBoard, GhostOverlay, CommandCenter. Zustand for state. MultimodalStreamer sends audio + JPEG frames over WebSocket
Backend	FastAPI on Cloud Run. GeminiLiveClient manages bidirectional Gemini sessions. EventBus-driven agent pipeline: VisionTrackingAgent → StandardizerEngine → TacticalAnalysisAgent
Computer Vision	YOLOv8 (detection, pose, segmentation), ByteTrack with appearance Re-ID, RANSAC + Kalman pitch calibration, team classification
Tactical Analytics	Voronoi pitch control, formation detection, Expected Threat (xT) model, moment indexing (turnovers, shots, set-pieces, PPDA)
Google Cloud	Gemini 2.5 Flash Live API, Cloud Run (hosting), Firebase Auth + Firestore + Storage
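The event-driven pipeline in the Backend row can be pictured as stages that subscribe to one topic and publish to the next. The following is a minimal publish/subscribe sketch of that pattern. `SimpleEventBus`, `wire_pipeline`, the topic names, and the payload shapes are all assumptions made for illustration and do not reflect the project's real classes.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Handler = Callable[[Any], None]


class SimpleEventBus:
    """Synchronous in-process pub/sub: handlers run in subscription order."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        for handler in list(self._handlers[topic]):
            handler(payload)


def wire_pipeline(bus: SimpleEventBus, frame_w: int, frame_h: int,
                  out: List[Dict[str, float]]) -> None:
    """Connect a standardizing stage and an analysis stage through the bus."""

    def standardize(tracks: List[Dict[str, float]]) -> None:
        # Normalize pixel coordinates to the 0..1 range, then pass them on.
        normalized = [
            {"id": t["id"], "x": t["x"] / frame_w, "y": t["y"] / frame_h}
            for t in tracks
        ]
        bus.publish("standardized", normalized)

    def analyze(players: List[Dict[str, float]]) -> None:
        # Toy analysis: record the mean horizontal position of all players.
        if not players:
            return
        mean_x = sum(p["x"] for p in players) / len(players)
        out.append({"players": len(players), "mean_x": mean_x})

    bus.subscribe("tracks", standardize)
    bus.subscribe("standardized", analyze)
```

Publishing a list of tracked players to the `tracks` topic then flows through both stages without either stage knowing about the other.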

Tech Stack

Category	Technologies
Frontend	React 19, Vite 7, TypeScript, Tailwind CSS 4, Zustand, Framer Motion, Lucide Icons
Backend	FastAPI, Python 3.11, Uvicorn
AI / ML	Gemini 2.5 Flash (Live API via `google-genai` SDK), YOLOv8 (Ultralytics), ByteTrack
Computer Vision	OpenCV, RANSAC Homography, Kalman Filtering, Voronoi Tessellation, DBSCAN Clustering
Cloud	Google Cloud Run, Firebase (Authentication, Cloud Firestore, Firebase Storage)
DevOps	Docker, automated deployment via `deploy.sh`

Google Cloud Services Used

Service	How It's Used
Gemini 2.5 Flash Live API	Bidirectional streaming — real-time voice + vision analysis via `google-genai` SDK
Google Cloud Run	Container hosting for the FastAPI backend with auto-scaling (1–5 instances, 4 vCPU, 4 GB RAM, always-on CPU)
Firebase Authentication	User sign-in and session management
Cloud Firestore	Persistent storage for coaching projects, tactical sessions, and match data
Firebase Storage	Video uploads and extracted frame storage
Cloud Build	Container image building, fully automated via `deploy.sh`

Getting Started

Prerequisites

Node.js 18+ & npm
Python 3.11+
Google Cloud project with billing enabled
Gemini API Key from Google AI Studio
Firebase project (Auth, Firestore, Storage enabled)

1. Clone the Repository

git clone https://github.com/luminousyinyang/Phantom_Coach.git
cd Phantom_Coach

2. Backend Setup

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Create backend/.env:

GEMINI_API_KEY=your_gemini_api_key
GOOGLE_APPLICATION_CREDENTIALS=./Service_Account_Key.json
FIRESTORE_DATABASE_ID=your_firestore_database_id
FIREBASE_STORAGE_BUCKET=your_firebase_storage_bucket

Place your Google Cloud Service Account JSON file as backend/Service_Account_Key.json.
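A quick check like the one below can catch a missing setting before the server starts. It is a sketch that assumes the variables above have already been loaded into the process environment. How the backend actually reads backend/.env is not stated here, and `load_settings` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping

# The four keys listed in backend/.env above.
REQUIRED_KEYS = (
    "GEMINI_API_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "FIRESTORE_DATABASE_ID",
    "FIREBASE_STORAGE_BUCKET",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any are missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Failing fast with the names of the missing keys is usually easier to debug than an authentication error deep inside a request.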

Start the backend:

uvicorn main:app --reload

3. Frontend Setup

cd frontend
npm install

Create frontend/.env:

VITE_FIREBASE_API_KEY=your_firebase_api_key
VITE_FIREBASE_AUTH_DOMAIN=your_project.firebaseapp.com
VITE_FIREBASE_PROJECT_ID=your_project_id
VITE_FIREBASE_STORAGE_BUCKET=your_project.appspot.com
VITE_FIREBASE_MESSAGING_SENDER_ID=your_sender_id
VITE_FIREBASE_APP_ID=your_app_id
VITE_GEMINI_API_KEY=your_gemini_api_key

Start the frontend:

npm run dev

The app will be available at http://localhost:5173.

Cloud Deployment

Phantom Coach deploys to Google Cloud Run as a single containerized service (the backend serves the built frontend).

Automated Deployment

A single-command deployment script is included:

bash deploy.sh

What deploy.sh does:

1. Builds the React frontend (npm run build)
2. Copies the build output into the backend directory
3. Submits the build to Cloud Build (gcloud builds submit), which builds the Docker image
4. Deploys the image to Cloud Run with the included environment variables

Once the script finishes, the frontend and backend are both deployed.

See deploy.sh for the full deployment automation script.

Project Structure

Phantom_Coach/
├── frontend/                  # React + Vite + TypeScript
│   └── src/
│       ├── components/        # UI: VideoPlayer, TacticalBoard, GhostOverlay, CommandCenter, etc.
│       ├── services/          # MultimodalStreamer (WebSocket audio/video streaming)
│       ├── store/             # Zustand state management
│       ├── context/           # Firebase Auth context
│       └── types/             # TypeScript definitions
├── backend/                   # FastAPI + Python
│   ├── api/                   # REST + WebSocket route handlers
│   ├── intelligence/          # GeminiLiveClient, AgentStateMachine, EventBus, tool declarations
│   ├── agents/                # VisionTrackingAgent, StandardizerEngine, TacticalAnalysisAgent
│   ├── services/              # CV (tracker, calibrator, classifier), Tactical (semantics, xT, indexing)
│   ├── main.py                # FastAPI application entrypoint
│   ├── Dockerfile             # Production container
│   └── requirements.txt
├── assets/                    # Architecture diagram (.mmd + .png)
├── deploy.sh                  # Automated Cloud Run deployment
└── README.md

License

MIT
