🎥 YouTube Summarizer

A powerful Flask application for AI-powered YouTube video and playlist summarization

YouTube Summarizer is a Flask web application that generates AI-powered summaries of YouTube videos and playlists. The app extracts transcripts from YouTube videos, creates concise summaries using multiple AI models (Google Gemini and OpenAI GPT models), and can convert summaries to audio using Google's Text-to-Speech API.

✨ Features

📹 Video Summarization - Generate AI-powered summaries for individual YouTube videos
📋 Playlist Support - Process and summarize entire YouTube playlists
🤖 Multi-Model AI Support - Choose from Google Gemini and OpenAI GPT models for summarization
🔊 Audio Generation - Convert summaries to MP3 audio files using Text-to-Speech
💾 Smart Caching - Store summaries and audio files to minimize API calls
🎨 Clean Interface - Simple, responsive web UI for easy interaction
⚡ Batch Processing - Handle multiple videos or playlists simultaneously
🔐 Optional Authentication - Secure your application with passcode-based login

🔒 Login Authentication (Optional)

YouTube Summarizer includes an optional login system to secure access to your application. This is particularly useful when deploying the application publicly or sharing it with a limited group of users.

Security Features

Simple passcode authentication - Single user access with a configurable passcode
Brute force protection - Automatic IP-based lockout after failed attempts
Session management - Secure session handling with configurable session keys
Rate limiting - Configurable maximum attempts and lockout duration

Environment Variables

Configure login functionality using these environment variables:

LOGIN_ENABLED=true                    # Enable/disable login (default: false)
LOGIN_CODE=your_secret_passcode      # The passcode users must enter
SESSION_SECRET_KEY=your_random_key   # Secret key used to sign session cookies
MAX_LOGIN_ATTEMPTS=5                 # Failed attempts before lockout (default: 5)
LOCKOUT_DURATION=15                  # Lockout time in minutes (default: 15)
FLASK_DEBUG=false                    # Enable Flask debug mode (default: true, set false for production)
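As a rough illustration of how these variables and their documented defaults could be read at startup, here is a minimal sketch. The `LoginConfig` class, the `load_login_config` function, and the set of accepted truthy spellings are assumptions made for this example, not the application's actual code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginConfig:
    """Hypothetical container for the login-related settings."""
    enabled: bool
    code: str
    secret_key: str
    max_attempts: int
    lockout_minutes: int


def _as_bool(value: str) -> bool:
    # Assumption: accept a few common truthy spellings; anything else is False.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_login_config(env=None) -> LoginConfig:
    """Build a LoginConfig from a mapping of environment variables."""
    env = os.environ if env is None else env
    return LoginConfig(
        enabled=_as_bool(env.get("LOGIN_ENABLED", "false")),       # default: false
        code=env.get("LOGIN_CODE", ""),
        secret_key=env.get("SESSION_SECRET_KEY", ""),
        max_attempts=int(env.get("MAX_LOGIN_ATTEMPTS", "5")),      # default: 5
        lockout_minutes=int(env.get("LOCKOUT_DURATION", "15")),    # default: 15
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check in a test without touching the real environment.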

Setup Examples

Docker Compose - Add to your .env file:

GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
LOGIN_ENABLED=true
LOGIN_CODE=MySecurePasscode123
SESSION_SECRET_KEY=a-long-random-string-for-session-encryption
MAX_LOGIN_ATTEMPTS=3
LOCKOUT_DURATION=30
FLASK_DEBUG=false

Manual Setup - Export environment variables:

export LOGIN_ENABLED=true
export LOGIN_CODE="MySecurePasscode123"
export SESSION_SECRET_KEY="a-long-random-string-for-session-encryption"
export MAX_LOGIN_ATTEMPTS=3
export LOCKOUT_DURATION=30
export FLASK_DEBUG=false

User Experience

When login is enabled:

Users are redirected to /login when accessing the application
After successful authentication, users can access all features normally
Failed login attempts are tracked per IP address
After exceeding max attempts, users are temporarily locked out
Sessions persist until logout or browser closure
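The per-IP tracking and temporary lockout described above can be sketched roughly as follows. This is an in-memory illustration only; the `LoginAttemptTracker` class and its method names are hypothetical and may not match how the application implements lockouts.

```python
import time


class LoginAttemptTracker:
    """Track failed logins per IP and lock an address out after too many failures."""

    def __init__(self, max_attempts=5, lockout_minutes=15, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_minutes * 60
        self._clock = clock          # injectable so tests can control time
        self._failures = {}          # ip -> number of failed attempts
        self._locked_until = {}      # ip -> clock value at which the lockout ends

    def is_locked(self, ip):
        until = self._locked_until.get(ip)
        if until is None:
            return False
        if self._clock() >= until:
            # The lockout has expired: forget it and start counting afresh.
            del self._locked_until[ip]
            self._failures.pop(ip, None)
            return False
        return True

    def record_failure(self, ip):
        count = self._failures.get(ip, 0) + 1
        self._failures[ip] = count
        if count >= self.max_attempts:
            self._locked_until[ip] = self._clock() + self.lockout_seconds

    def record_success(self, ip):
        # A successful login clears the failure history for that address.
        self._failures.pop(ip, None)
        self._locked_until.pop(ip, None)
```

Because the state lives in process memory in this sketch, restarting the process would clear every lockout.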

Security Recommendations

Use a strong passcode - Combine letters, numbers, and symbols
Generate a random session key - Use a cryptographically secure random string
Configure appropriate lockout settings - Balance security with user experience
Use HTTPS in production - Encrypt all communication with SSL/TLS
Regularly rotate credentials - Change passcode and session key periodically
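One way to produce a suitable value for SESSION_SECRET_KEY is Python's standard `secrets` module. The helper name `generate_session_key` and the 32-byte length are choices made for this sketch, not project requirements.

```python
import secrets


def generate_session_key(num_bytes: int = 32) -> str:
    """Return a hex string built from cryptographically secure random bytes."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Copy the printed value into SESSION_SECRET_KEY in your .env file.
    print(generate_session_key())
```

Each call returns a different key, so generate it once and store it; regenerating it on every start would invalidate existing sessions.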

Testing Override

During development and testing, authentication is automatically bypassed when the TESTING environment variable is set to true. This ensures all existing tests continue to work without modification.
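The bypass rule can be pictured as a small decision function. This is a sketch under assumptions: the function name `login_required_for` and the `"authenticated"` session key are hypothetical, and the real application may structure the check differently.

```python
import os


def login_required_for(session, env=None) -> bool:
    """Return True when a request should be redirected to /login."""
    env = os.environ if env is None else env
    if env.get("TESTING", "").lower() == "true":
        return False  # test runs bypass authentication entirely
    if env.get("LOGIN_ENABLED", "false").lower() != "true":
        return False  # login is switched off (the documented default)
    # "authenticated" is an illustrative session key, not a confirmed name.
    return not session.get("authenticated", False)
```

Checking TESTING first is what lets the outcome stay the same regardless of the other login settings.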

🌐 Webshare Proxy Support

YouTube Summarizer can route transcript requests through Webshare proxies. This can help work around IP restrictions, rate limiting, or geographic blocks that prevent transcript retrieval from certain server environments.

Proxy Configuration

Configure Webshare proxy support using these environment variables:

WEBSHARE_PROXY_ENABLED=true                    # Enable/disable proxy (default: false)
WEBSHARE_PROXY_HOST=proxy.webshare.io         # Proxy server hostname
WEBSHARE_PROXY_PORT=8080                      # Proxy server port
WEBSHARE_PROXY_USERNAME=your_username         # Webshare proxy username
WEBSHARE_PROXY_PASSWORD=your_password         # Webshare proxy password
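To show how these five values fit together, the sketch below assembles them into a proxy URL and a dictionary in the `{"http": ..., "https": ...}` shape that the `requests` library accepts. The function name `build_proxy_settings` is hypothetical, and how the application actually passes proxy settings to its transcript fetcher is not specified here.

```python
import os
from urllib.parse import quote


def build_proxy_settings(env=None):
    """Return a requests-style proxies dict, or None when the proxy is disabled."""
    env = os.environ if env is None else env
    if env.get("WEBSHARE_PROXY_ENABLED", "false").lower() != "true":
        return None
    # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
    user = quote(env["WEBSHARE_PROXY_USERNAME"], safe="")
    password = quote(env["WEBSHARE_PROXY_PASSWORD"], safe="")
    host = env["WEBSHARE_PROXY_HOST"]
    port = int(env["WEBSHARE_PROXY_PORT"])
    url = f"http://{user}:{password}@{host}:{port}"
    return {"http": url, "https": url}
```

Encoding the username and password matters when a password contains URL-reserved characters; without it the resulting URL could be parsed incorrectly.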

Setup Examples

Docker Compose - Add to your .env file:

GOOGLE_API_KEY=your_google_api_key_here
WEBSHARE_PROXY_ENABLED=true
WEBSHARE_PROXY_HOST=proxy.webshare.io
WEBSHARE_PROXY_PORT=8080
WEBSHARE_PROXY_USERNAME=myusername
WEBSHARE_PROXY_PASSWORD=mypassword

Manual Setup - Export environment variables:

export WEBSHARE_PROXY_ENABLED=true
export WEBSHARE_PROXY_HOST="proxy.webshare.io"
export WEBSHARE_PROXY_PORT="8080"
export WEBSHARE_PROXY_USERNAME="myusername"
export WEBSHARE_PROXY_PASSWORD="mypassword"

When to Use Proxies

Consider enabling proxy support when:

Transcript fetching fails with "YouTube is temporarily blocking requests" errors
Running the application on cloud servers (AWS EC2, Google Cloud, etc.)
Experiencing rate limiting from YouTube's transcript API
Working with videos that may have geographic restrictions

Security Recommendations

Keep proxy credentials secure - Store them in environment variables, not in code
Use reputable proxy providers - Choose trusted services like Webshare.io
Monitor proxy usage - Track bandwidth and request costs
Test thoroughly - Verify transcript fetching works both with and without proxies

Troubleshooting

If you experience issues with proxy configuration:

Verify all proxy environment variables are set correctly
Confirm the proxy credentials in your Webshare account
Ensure the proxy service is active and has available bandwidth
Test without proxy first to isolate issues
Check application logs for proxy-related error messages
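The first check in the list above can be automated with a short script. This is only a sketch: `proxy_config_problems` is a hypothetical helper, and it verifies that the variables are present and well formed, not that the credentials are accepted by the proxy.

```python
import os

REQUIRED_PROXY_VARS = (
    "WEBSHARE_PROXY_HOST",
    "WEBSHARE_PROXY_PORT",
    "WEBSHARE_PROXY_USERNAME",
    "WEBSHARE_PROXY_PASSWORD",
)


def proxy_config_problems(env=None):
    """Return a list of human-readable problems found in the proxy settings."""
    env = os.environ if env is None else env
    problems = []
    if env.get("WEBSHARE_PROXY_ENABLED", "false").lower() != "true":
        problems.append("WEBSHARE_PROXY_ENABLED is not set to true")
    for name in REQUIRED_PROXY_VARS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is missing or empty")
    port = env.get("WEBSHARE_PROXY_PORT", "").strip()
    if port:
        try:
            valid = 1 <= int(port) <= 65535
        except ValueError:
            valid = False
        if not valid:
            problems.append("WEBSHARE_PROXY_PORT is not a valid port number")
    return problems


if __name__ == "__main__":
    for problem in proxy_config_problems():
        print(problem)
```

An empty result means the variables look complete; connection or credential failures still need to be diagnosed from the application logs.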

🔧 Prerequisites

Google API Key (required) with access to:
- YouTube Data API v3
- Google Generative AI (Gemini)
- Google Cloud Text-to-Speech API
OpenAI API Key (optional) for GPT models

🐳 Usage with Docker (Recommended)

1. Clone the Repository

git clone <repository-url>
cd youtube-summarizer

2. Set Up Environment Variables

Create a .env file in the project root:

GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here  # Optional, for GPT models

3. Initialize Data Directory

Run the initialization script to create the proper directory structure:

./init_data.sh

This creates the data directory with the correct file structure to avoid volume mounting issues.

4. Run with Docker Compose

docker-compose up -d

The application will be available at http://localhost:5001

5. Stop the Application

docker-compose down

Docker Notes

Summaries and audio files are persisted in the ./data directory on your host machine
The container automatically restarts if it crashes
Logs can be viewed with: docker-compose logs -f

💻 Usage without Docker

1. Clone the Repository

git clone <repository-url>
cd youtube-summarizer

2. Create Virtual Environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Up Environment Variables

Create a .env file or export the environment variables:

export GOOGLE_API_KEY=your_google_api_key_here
export OPENAI_API_KEY=your_openai_api_key_here  # Optional, for GPT models

On Windows:

set GOOGLE_API_KEY=your_google_api_key_here
set OPENAI_API_KEY=your_openai_api_key_here

5. Run the Application

For development:

python app.py

For production (using Gunicorn):

gunicorn --bind 0.0.0.0:5001 app:app

The application will be available at http://localhost:5001

🚀 How to Use

1. Open the Web Interface: Navigate to http://localhost:5001 in your browser
2. Login (if enabled):
   - If authentication is enabled, you'll be redirected to the login page
   - Enter the configured passcode to access the application
   - You'll be automatically redirected to the main interface
3. Enter YouTube URLs:
   - Paste one or more YouTube video URLs
   - Playlist URLs are also supported
   - Multiple URLs can be entered on separate lines
4. Generate Summaries: Click the "Summarize" button to process the videos
5. View Results:
   - Summaries appear below each video
   - Cached summaries are displayed in the sidebar
   - Click the speaker icon to generate and play audio
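For readers curious how pasted input with one URL per line might be split into videos and playlists, here is a minimal sketch using only the standard library. The function names and the handled URL shapes (`/watch?v=`, `/playlist?list=`, and `youtu.be/` short links) are assumptions for illustration; the application's own parsing may accept more forms.

```python
from urllib.parse import parse_qs, urlparse


def classify_url(url: str):
    """Return ("playlist", id), ("video", id), or None for an unrecognized URL."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    query = parse_qs(parsed.query)
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/")
        return ("video", video_id) if video_id else None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/playlist" and "list" in query:
            return ("playlist", query["list"][0])
        if parsed.path == "/watch" and "v" in query:
            return ("video", query["v"][0])
    return None


def parse_input(text: str):
    """Classify every non-empty line of pasted text, skipping unknown URLs."""
    results = []
    for line in text.splitlines():
        if line.strip():
            item = classify_url(line)
            if item is not None:
                results.append(item)
    return results
```

In this sketch a watch URL that also carries a `list` parameter is treated as a single video; whether the application expands it into the full playlist is a design decision this example does not settle.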

📁 Project Structure

youtube-summarizer/
├── app.py                 # Main Flask application
├── templates/
│   └── index.html        # Web interface
├── audio_cache/          # Generated MP3 files
├── summary_cache.json    # Cached summaries
├── requirements.txt      # Python dependencies
├── Dockerfile           # Docker image configuration
├── docker-compose.yml   # Docker Compose configuration
└── .dockerignore        # Docker ignore patterns
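The summary_cache.json entry in the tree above suggests a JSON-file cache. The sketch below shows one way such a cache could avoid repeated API calls; the `SummaryCache` class, the `get_or_create_summary` helper, and the choice of the video ID as the key are assumptions, not a description of the project's cache module.

```python
import json
import os
import tempfile


class SummaryCache:
    """Tiny JSON-file cache mapping a video ID to its summary text."""

    def __init__(self, path="summary_cache.json"):
        self.path = path
        self._data = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self._data = json.load(fh)

    def get(self, video_id):
        return self._data.get(video_id)

    def set(self, video_id, summary):
        self._data[video_id] = summary
        # Write to a temporary file first, then rename it into place, so a
        # crash cannot leave a half-written cache file behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(self._data, fh, ensure_ascii=False, indent=2)
        os.replace(tmp_path, self.path)


def get_or_create_summary(cache, video_id, summarize):
    """Return the cached summary, calling summarize(video_id) only on a miss."""
    cached = cache.get(video_id)
    if cached is not None:
        return cached
    summary = summarize(video_id)
    cache.set(video_id, summary)
    return summary
```

Here `summarize` stands in for whatever function calls the AI model; it runs only when the video has not been summarized before.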

🔑 API Requirements

This project supports two AI providers. The Google API key is required; the OpenAI API key is optional and only needed for GPT models.

Google APIs (Required for YouTube access and Gemini models)

YouTube Data API v3 - For fetching video metadata and playlist information
Generative AI API - For accessing Google's Gemini models
Cloud Text-to-Speech API - For converting summaries to audio

To set up Google APIs:

Go to Google Cloud Console
Create a new project or select an existing one
Enable the required APIs
Create an API key and add it as GOOGLE_API_KEY to your .env file

OpenAI API (Optional, for GPT models)

OpenAI API - For accessing the GPT models listed under Available AI Models

To set up OpenAI API:

Go to OpenAI Platform
Create an account and generate an API key
Add it as OPENAI_API_KEY to your .env file

Available AI Models

Google Gemini 2.5 Flash - Fast and efficient (default)
Google Gemini 2.5 Pro - More capable for complex content
OpenAI GPT-5 - Most capable of the OpenAI models listed here
OpenAI GPT-5 Mini - Faster GPT-5 variant
OpenAI GPT-4o - Advanced multimodal capabilities
OpenAI GPT-4o Mini - Fast and cost-effective
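Since each model depends on its provider's API key, a selection menu could be filtered by which keys are configured. The following is a sketch under assumptions: the display names are taken from the list above, while the provider tags, the `MODELS` mapping, and `available_models` are illustrative and do not reflect the project's actual model identifiers.

```python
import os

# Display names come from the list above; the provider tags are illustrative.
MODELS = {
    "Google Gemini 2.5 Flash": "google",
    "Google Gemini 2.5 Pro": "google",
    "OpenAI GPT-5": "openai",
    "OpenAI GPT-5 Mini": "openai",
    "OpenAI GPT-4o": "openai",
    "OpenAI GPT-4o Mini": "openai",
}
PROVIDER_KEYS = {"google": "GOOGLE_API_KEY", "openai": "OPENAI_API_KEY"}
DEFAULT_MODEL = "Google Gemini 2.5 Flash"


def available_models(env=None):
    """Return the display names of models whose provider key is configured."""
    env = os.environ if env is None else env
    return [
        name
        for name, provider in MODELS.items()
        if env.get(PROVIDER_KEYS[provider])
    ]
```

With only GOOGLE_API_KEY set, this returns the two Gemini entries, which matches the optional status of the OpenAI key.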

🛠️ Troubleshooting

Common Issues

"GOOGLE_API_KEY environment variable is not set"
- Ensure your .env file exists and contains your API key
- For Docker: Make sure the .env file is in the same directory as docker-compose.yml
"No transcripts are available for this video"
- The video doesn't have captions/transcripts available
- The video might be private or age-restricted
API Quota Errors
- Check your Google Cloud Console for API usage limits
- The app uses caching to minimize API calls
Port Already in Use
- Change the port in docker-compose.yml or when running the app
- Example: python app.py --port 5002

Login-Related Issues

Stuck on login page / Invalid passcode
- Verify LOGIN_CODE environment variable is set correctly
- Ensure LOGIN_ENABLED=true is set
- Check for typos in the passcode (case-sensitive)
"Too many failed attempts" / Account locked
- Wait for the lockout duration to expire (default: 15 minutes)
- Or restart the application to clear the lockout
- Increase MAX_LOGIN_ATTEMPTS or reduce LOCKOUT_DURATION if legitimate users are locked out too often
Session expires immediately
- Ensure SESSION_SECRET_KEY is set and consistent
- Check that cookies are enabled in your browser
- Verify the session key doesn't contain special characters that might cause issues
Login not working in tests
- Tests automatically bypass authentication when TESTING=true
- This is expected behavior - tests should always pass regardless of login settings

🧪 Testing

The project includes comprehensive unit and integration tests.

Quick Test

./quick_test.sh
# or
make test

Full Test Suite with Coverage

python run_tests.py
# or
make coverage

Test Structure

tests/test_app.py - Flask endpoint tests
tests/test_transcript_and_summary.py - Transcript and summary generation tests
tests/test_cache.py - Cache functionality tests
tests/test_integration.py - End-to-end integration tests

🔍 Code Quality

Run All Quality Checks

./run_quality_checks.sh
# or
make quality

Auto-fix Formatting Issues

./run_quality_checks.sh --fix
# or
make fix

Individual Checks

make format   # Check code formatting
make lint     # Run linting (pylint, flake8)
make test     # Run tests only

Development Commands

Use the Makefile for convenient development commands:

make help      # Show all available commands
make install   # Install all dependencies
make run       # Run Flask app locally
make clean     # Clean up cache files

Quality Tools

Black - Code formatting
isort - Import sorting
Flake8 - Style guide enforcement
Pylint - Static code analysis
Bandit - Security linting
Coverage - Test coverage reports

🤝 Contributing

Feel free to submit issues, fork the repository, and create pull requests for any improvements. Please ensure all tests pass before submitting a PR.
