rahul201722/youtube-analytics-engine

YouTube Analytics & Recommendation Engine

A scalable Rust application for fetching, analyzing, and generating insights from YouTube channel data using the YouTube Data API v3.

Features

  • Automated Data Extraction: Periodically fetch video, channel, and comment metadata
  • ETL Pipeline: Process and clean data using efficient formats (Parquet, CSV)
  • Analytics Engine: Calculate performance metrics, trends, and engagement rates
  • Performance Reports: Generate comprehensive analytics reports in JSON format
  • Scalable Storage: Store historical data for time-series analysis
  • CLI Interface: Easy-to-use command-line interface

Prerequisites

  • Rust toolchain with Cargo
  • A YouTube Data API v3 key

Installation

  1. Clone the repository:
git clone <your-repo-url>
cd youtube-analytics-engine
  2. Build the project:
cargo build --release
  3. Set up configuration:
cp config/default.toml config/config.toml
# Edit config/config.toml with your YouTube API key

Configuration

Update config/config.toml with your settings:

# Your YouTube Data API v3 key
youtube_api_key = "YOUR_API_KEY_HERE"

# Directory to store collected data
data_dir = "data"

# How often to fetch new data (in hours)
fetch_interval_hours = 6

# Maximum number of videos to fetch per channel
max_videos_per_channel = 50

[analytics]
enable_trend_detection = true
min_views_for_trending = 10000
lookback_days = 30
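
For orientation, the settings above could map onto Rust types roughly as follows. This is a minimal sketch using only the standard library: the struct names `Config` and `AnalyticsConfig` and the `validate` and `fetch_interval` helpers are hypothetical and not taken from `src/models/config.rs`. Only the field names and default values come from the TOML shown above.

```rust
use std::time::Duration;

/// Hypothetical mirror of the [analytics] table.
#[derive(Debug, Clone, PartialEq)]
struct AnalyticsConfig {
    enable_trend_detection: bool,
    min_views_for_trending: u64,
    lookback_days: u32,
}

/// Hypothetical mirror of config/config.toml; field names follow the TOML keys.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    youtube_api_key: String,
    data_dir: String,
    fetch_interval_hours: u64,
    max_videos_per_channel: u32,
    analytics: AnalyticsConfig,
}

impl Default for Config {
    /// Same values as the configuration listing above.
    fn default() -> Self {
        Config {
            youtube_api_key: "YOUR_API_KEY_HERE".to_string(),
            data_dir: "data".to_string(),
            fetch_interval_hours: 6,
            max_videos_per_channel: 50,
            analytics: AnalyticsConfig {
                enable_trend_detection: true,
                min_views_for_trending: 10_000,
                lookback_days: 30,
            },
        }
    }
}

impl Config {
    /// Reject obviously unusable settings before any API call is made.
    fn validate(&self) -> Result<(), String> {
        if self.youtube_api_key.is_empty() || self.youtube_api_key == "YOUR_API_KEY_HERE" {
            return Err("youtube_api_key is not set".to_string());
        }
        if self.fetch_interval_hours == 0 {
            return Err("fetch_interval_hours must be at least 1".to_string());
        }
        if self.max_videos_per_channel == 0 {
            return Err("max_videos_per_channel must be at least 1".to_string());
        }
        Ok(())
    }

    /// Convert the configured interval in hours into a Duration.
    fn fetch_interval(&self) -> Duration {
        Duration::from_secs(self.fetch_interval_hours * 3600)
    }
}

fn main() {
    let mut config = Config::default();
    // The placeholder key from default.toml is rejected.
    println!("placeholder key: {:?}", config.validate());

    config.youtube_api_key = "example-key".to_string();
    println!("real key: {:?}", config.validate());
    println!("fetch interval: {:?}", config.fetch_interval());
    println!("{:?}", config);
}
```

Validating up front means a forgotten API key fails immediately with a readable message rather than as an HTTP error on the first request.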

Usage

Fetch Channel Data

# Fetch videos from a specific channel
cargo run -- --fetch-data --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw

# Fetch videos based on search query
cargo run -- --fetch-data --search-query "rust programming"

Run Analytics

# Generate analytics report on stored data
cargo run -- --analyze

# Fetch data and run analytics in one command
cargo run -- --fetch-data --analyze --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw

Custom Configuration

# Use a different config file
cargo run -- --config config/production.toml --fetch-data --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw

Project Structure

src/
├── main.rs              # CLI entry point
├── api/
│   └── youtube_client.rs # YouTube Data API client
├── models/
│   ├── config.rs        # Configuration structures
│   └── youtube.rs       # YouTube data models
├── storage/
│   └── file_storage.rs  # File-based data storage
└── analytics/
    ├── engine.rs        # Analytics engine
    └── metrics.rs       # Performance metrics

config/
└── default.toml         # Default configuration

data/                    # Data storage directory
├── videos/              # Video data (Parquet format)
├── videos_csv/          # Video data (CSV format)
├── comments/            # Comments data
└── reports/             # Analytics reports

Data Flow

  1. Extraction: Fetch video metadata from YouTube Data API
  2. Storage: Save data in Parquet format for efficient analytics
  3. Processing: Load and transform data using Polars
  4. Analytics: Calculate metrics, trends, and performance indicators
  5. Reporting: Generate JSON reports with actionable insights

Analytics Features

  • Performance Metrics: Views, likes, comments, engagement rates
  • Trend Detection: Identify trending videos and viral content
  • Time Series Analysis: Track performance over time
  • Top Performers: Rank videos by various metrics
  • Keyword Analysis: Extract insights from titles and descriptions
  • Posting Optimization: Identify optimal posting times
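
As an illustration of the first few items, the sketch below computes an engagement rate, ranks videos by it, and applies a trending threshold. The README does not define these calculations, so the formula `(likes + comments) / views` is an assumption, and `VideoStats`, `engagement_rate`, `top_by_engagement`, and `trending` are hypothetical names rather than the ones used in `src/analytics/`.

```rust
/// Hypothetical per-video counters; not the project's actual data model.
#[derive(Debug, Clone)]
struct VideoStats {
    video_id: String,
    views: u64,
    likes: u64,
    comments: u64,
}

/// Assumed definition: interactions (likes + comments) divided by views.
/// Returns 0.0 for videos without views to avoid dividing by zero.
fn engagement_rate(video: &VideoStats) -> f64 {
    if video.views == 0 {
        return 0.0;
    }
    (video.likes + video.comments) as f64 / video.views as f64
}

/// Return the `n` videos with the highest engagement rate, best first.
fn top_by_engagement(videos: &[VideoStats], n: usize) -> Vec<&VideoStats> {
    let mut ranked: Vec<&VideoStats> = videos.iter().collect();
    ranked.sort_by(|a, b| engagement_rate(b).total_cmp(&engagement_rate(a)));
    ranked.truncate(n);
    ranked
}

/// Keep videos at or above a view threshold, in the spirit of
/// `min_views_for_trending` from the configuration.
fn trending(videos: &[VideoStats], min_views: u64) -> Vec<&VideoStats> {
    videos.iter().filter(|v| v.views >= min_views).collect()
}

fn main() {
    // Made-up sample data for illustration only.
    let videos = vec![
        VideoStats { video_id: "a".into(), views: 1_000, likes: 50, comments: 17 },
        VideoStats { video_id: "b".into(), views: 20_000, likes: 400, comments: 100 },
        VideoStats { video_id: "c".into(), views: 0, likes: 0, comments: 0 },
    ];

    for video in top_by_engagement(&videos, 2) {
        println!("{} -> {:.3}", video.video_id, engagement_rate(video));
    }
    for video in trending(&videos, 10_000) {
        println!("trending: {}", video.video_id);
    }
}
```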

Example Output

{
  "generated_at": "2024-01-15T10:30:00Z",
  "total_videos": 150,
  "total_views": 2500000,
  "average_views_per_video": 16666.67,
  "like_to_view_ratio": 0.045,
  "top_performing_videos": [
    {
      "video_id": "dQw4w9WgXcQ",
      "title": "Never Gonna Give You Up",
      "views": 1000000,
      "engagement_rate": 0.067,
      "views_per_day": 1234.56
    }
  ]
}
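
The aggregate fields in this report follow from simple ratios. The sketch below reproduces `average_views_per_video` from `total_views` and `total_videos`. The helper names `ratio` and `round2` are hypothetical, and the total like count of 112,500 is an assumed value chosen to match the sample `like_to_view_ratio`, since the report does not list total likes.

```rust
/// Divide two counters as floats, returning 0.0 when the denominator is zero.
fn ratio(numerator: u64, denominator: u64) -> f64 {
    if denominator == 0 {
        0.0
    } else {
        numerator as f64 / denominator as f64
    }
}

/// Round to two decimal places, as in the sample report.
fn round2(value: f64) -> f64 {
    (value * 100.0).round() / 100.0
}

fn main() {
    let total_videos = 150;
    let total_views = 2_500_000;
    let total_likes = 112_500; // assumed; not shown in the sample report

    // 2,500,000 / 150 = 16666.666..., which rounds to 16666.67.
    let average_views_per_video = round2(ratio(total_views, total_videos));
    let like_to_view_ratio = ratio(total_likes, total_views);

    println!("average_views_per_video = {average_views_per_video:.2}");
    println!("like_to_view_ratio = {like_to_view_ratio:.3}");
}
```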

Development

Building

cargo build

Testing

cargo test

Linting

cargo clippy

Formatting

cargo fmt

Roadmap

  • Web API endpoint (using Axum)
  • Database support (SQLite/PostgreSQL)
  • Real-time streaming data processing
  • Machine learning recommendations
  • Dashboard UI
  • Cloud deployment support
  • Automated scheduling

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License - see LICENSE file for details.

API Rate Limits

Be mindful of YouTube API quotas:

  • Default quota: 10,000 units per day
  • Videos.list: 1 unit per request (up to 50 video IDs per request)
  • Search.list: 100 units per request
  • PlaylistItems.list: 1 unit per request
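
These costs make it possible to budget a day's fetching in advance. The sketch below compares listing a channel's videos through PlaylistItems.list with finding them through Search.list. It assumes a page size of 50 results per request, and the function names are hypothetical helpers, not part of this project.

```rust
const DAILY_QUOTA_UNITS: u32 = 10_000;
const SEARCH_LIST_COST: u32 = 100;
const PLAYLIST_ITEMS_LIST_COST: u32 = 1;
/// Assumption: each request returns at most 50 items.
const MAX_RESULTS_PER_PAGE: u32 = 50;

/// Number of paged requests needed to retrieve `items` results.
fn pages_needed(items: u32) -> u32 {
    (items + MAX_RESULTS_PER_PAGE - 1) / MAX_RESULTS_PER_PAGE
}

/// Quota units to list `videos` uploads via PlaylistItems.list.
fn playlist_fetch_cost(videos: u32) -> u32 {
    pages_needed(videos) * PLAYLIST_ITEMS_LIST_COST
}

/// Quota units to retrieve `videos` results via Search.list.
fn search_fetch_cost(videos: u32) -> u32 {
    pages_needed(videos) * SEARCH_LIST_COST
}

/// How many runs of a given cost fit into the default daily quota.
fn max_runs_per_day(cost_per_run: u32) -> u32 {
    if cost_per_run == 0 {
        return 0;
    }
    DAILY_QUOTA_UNITS / cost_per_run
}

fn main() {
    let videos = 50; // matches max_videos_per_channel in the default config
    let playlist_cost = playlist_fetch_cost(videos);
    let search_cost = search_fetch_cost(videos);

    println!("playlist listing: {playlist_cost} unit(s) per run");
    println!("search: {search_cost} unit(s) per run");
    println!("search runs per day: {}", max_runs_per_day(search_cost));
}
```

Under these assumptions a search-based fetch costs 100 times as much as a playlist-based one, so search queries are the first place to look when the quota runs out.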

Troubleshooting

Common Issues

  1. API Key Invalid: Ensure your YouTube API key is correct and has the YouTube Data API v3 enabled
  2. Quota Exceeded: Check your API usage in Google Cloud Console
  3. Channel Not Found: Verify the channel ID is correct (starts with 'UC')
  4. Permission Denied: Some channels may have restricted access to their data
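
A quick local check can catch a malformed channel ID before any quota is spent. The sketch below is a heuristic and not an official validation rule. Beyond the 'UC' prefix mentioned above, it assumes channel IDs are 24 characters long and use only letters, digits, '-' and '_', which matches the example ID used in this README. The function name `check_channel_id` is hypothetical.

```rust
/// Heuristic format check for a channel ID.
/// Assumptions: 24 characters, "UC" prefix, URL-safe characters only.
fn check_channel_id(id: &str) -> Result<(), &'static str> {
    if !id.starts_with("UC") {
        return Err("channel ID should start with 'UC'");
    }
    if id.len() != 24 {
        return Err("channel ID should be 24 characters long");
    }
    let url_safe = id
        .bytes()
        .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_');
    if !url_safe {
        return Err("channel ID contains unexpected characters");
    }
    Ok(())
}

fn main() {
    // The second value is a video ID, a common mix-up.
    for candidate in ["UC_x5XG1OV2P6uZZ5FSM9Ttw", "dQw4w9WgXcQ", "UCtooShort"] {
        match check_channel_id(candidate) {
            Ok(()) => println!("{candidate}: looks valid"),
            Err(reason) => println!("{candidate}: {reason}"),
        }
    }
}
```

A passing check only means the ID is well formed. Whether the channel exists can only be confirmed by the API.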

Logs

Set log level for debugging:

RUST_LOG=debug cargo run -- --fetch-data --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw
