A comprehensive Pong game implementation featuring both traditional gameplay and reinforcement learning capabilities. The system is built with a modular architecture that separates game logic, UI components, and AI training.
- Overview
- Architecture
- Installation
- Usage
- Game Modes
- Reinforcement Learning
- Testing
- File Structure
- Technical Details
This project implements a complete Pong game system with the following features:
- Traditional Pong Gameplay: Human vs Human and Human vs AI modes
- Reinforcement Learning: AI agents trained using PPO (Proximal Policy Optimization)
- Modular Architecture: Clean separation of game logic, UI, and AI components
- Multiple AI Types: Scripted AI, trained RL agents, and random agents
- Comprehensive Testing: Unit tests for all components
- Visualization Tools: Real-time game visualization with training metrics
The system follows a modular design with clear separation of concerns:
- Game Logic (`game_logic.py`): Pure game mechanics without UI dependencies
- Environment (`pong_env.py`): Gymnasium-compatible RL environment
- UI Components: Multiple UI implementations for different use cases
- Training System (`train_rl_agent.py`): Complete RL training pipeline
- Testing Framework: Comprehensive unit tests
- Single Responsibility: Each module has a specific purpose
- Dependency Inversion: High-level modules don't depend on low-level details
- Testability: All components can be tested independently
- Extensibility: Easy to add new features and AI types
- Python 3.8+
- Pygame
- Stable Baselines3
- Gymnasium
- NumPy
```bash
# Install dependencies
pip install pygame stable-baselines3 gymnasium numpy --break-system-packages

# Clone or download the project
cd pong
```

```bash
# Start basic human vs human game
python3 main.py --mode human_vs_human

# Start human vs AI game
python3 main.py --mode human_vs_ai

# Start AI vs AI game
python3 main.py --mode ai_vs_ai
```

```bash
# Train a new RL agent
python3 train_rl_agent.py --mode train

# Test a trained agent
python3 train_rl_agent.py --mode test

# Visualize training with UI
python3 train_rl_agent.py --mode test-ui
```

- Classic Pong gameplay
- Two human players
- Controls: Z/S (left paddle), Arrow keys (right paddle)
- Human player vs AI opponent
- Human controls right paddle (arrow keys)
- AI controls left paddle (scripted or trained)
- Both paddles controlled by AI
- Can use different AI types for each paddle
- Useful for training and evaluation
- Shows random agent vs scripted AI
- Demonstrates RL environment without training
- Good for understanding the system
- Action Space: 3 discrete actions (0=nothing, 1=up, 2=down)
- Observation Space: 5 normalized values [ball_x, ball_y, paddle_y, ball_vel_x, ball_vel_y]
- Reward System:
- +1.0 for successful paddle hits
- +10.0 for winning a point
- -10.0 for losing a point
- -0.001 per timestep (encourages fast play)
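The reward terms above can be sketched as a small pure function. This is a minimal illustration of the scheme; the function name and event flags are assumptions, not the project's actual API:

```python
def compute_reward(hit_ball: bool, won_point: bool, lost_point: bool) -> float:
    """Combine the shaped reward terms for one timestep."""
    reward = -0.001          # small per-step penalty encourages fast play
    if hit_ball:
        reward += 1.0        # successful paddle hit
    if won_point:
        reward += 10.0       # agent wins the point
    if lost_point:
        reward -= 10.0       # agent loses the point
    return reward
```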
- Algorithm: PPO (Proximal Policy Optimization)
- Network: MLP with [64, 64] architecture
- Learning Rate: 4e-4
- Batch Size: 64
- Steps per Update: 2048
- Epochs per Update: 10
- Gamma: 0.99 (discount factor)
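These hyperparameters map directly onto the Stable-Baselines3 `PPO` constructor. A sketch, assuming an existing `PongEnv` instance named `env`; the project's actual training script may set additional options:

```python
from stable_baselines3 import PPO

model = PPO(
    "MlpPolicy", env,                       # MLP policy on a PongEnv instance
    learning_rate=4e-4,
    n_steps=2048,                           # steps collected per update
    n_epochs=10,                            # optimization epochs per update
    batch_size=64,
    gamma=0.99,                             # discount factor
    policy_kwargs=dict(net_arch=[64, 64]),  # two hidden layers of 64 units
)
model.learn(total_timesteps=1_000_000)      # step count here is illustrative
```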
- `pong_agent_final.zip`: Final trained model
- `best_model/`: Best model during training
- `checkpoints/`: Regular training checkpoints
- `logs/`: Training logs and metrics
- `tensorboard_logs/`: TensorBoard visualization data
```bash
# Run all unit tests
python3 -m pytest tests/

# Run specific test categories
python3 tests/test_game_logic.py
python3 tests/test_rl_integration.py
python3 tests/test_ui_components.py
```

- Game Logic: Physics, collision detection, scoring
- RL Environment: Observation space, reward calculation, termination conditions
- UI Components: Rendering, event handling, game modes
- Training Pipeline: Model loading, training callbacks, evaluation
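As a self-contained illustration of the kind of physics property the suite checks, here is a pytest-style test of wall bounces; the `bounce_y` helper is hypothetical, standing in for the project's real collision code:

```python
def bounce_y(y: float, vy: float, height: float) -> float:
    """Reflect vertical velocity when the ball touches the top or bottom wall."""
    if y <= 0 or y >= height:
        return -vy
    return vy

def test_ball_bounces_off_top_wall():
    # Velocity flips sign at the wall
    assert bounce_y(0, -3.0, 400) == 3.0

def test_ball_unaffected_mid_field():
    # No wall contact, velocity unchanged
    assert bounce_y(200, -3.0, 400) == -3.0
```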
```
pong/
├── game_logic.py          # Core game mechanics
├── pong_env.py            # RL environment
├── pong_ui.py             # Basic UI for human players
├── pong_rl_ui.py          # UI with RL agent support
├── demo_rl_ui.py          # Demo UI with random agent
├── train_rl_agent.py      # RL training pipeline
├── main.py                # Main entry point
├── tests/                 # Unit tests
│   ├── test_game_logic.py
│   ├── test_rl_integration.py
│   ├── test_ui_components.py
│   └── ...
├── best_model/            # Best trained models
├── checkpoints/           # Training checkpoints
├── logs/                  # Training logs
├── tensorboard_logs/      # TensorBoard data
├── README.md              # This file
└── README_RL.md           # Detailed RL documentation
```
```python
from game_logic import PongGame

# Create game instance
game = PongGame(width=600, height=400)

# Update game state
game.update()

# Control paddles
game.set_paddle1_velocity(velocity)
game.set_paddle2_velocity(velocity)

# Get game state
state = game.get_game_state()
```

```python
from pong_env import PongEnv

# Create environment
env = PongEnv(opponent_model_path="path/to/model.zip")

# Standard gym interface
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(action)
```

```python
from train_rl_agent import train_agent, test_agent

# Train agent
train_agent(opponent_model_path="path/to/opponent.zip")

# Test agent
test_agent(model_path="path/to/model.zip")
```

- Training Time: ~2-4 hours for 100M steps
- Memory Usage: ~2GB during training
- Game Performance: 60 FPS with Pygame
- Model Size: ~1-2MB per trained model
When contributing to this project:
- Follow the existing code structure and naming conventions
- Add unit tests for new features
- Update documentation for any API changes
- Use English for all code comments and documentation
- Ensure all tests pass before submitting changes
This project is open source and available under the MIT License.