A scalable Rust application for fetching, analyzing, and generating insights from YouTube channel data using the YouTube Data API v3.
- Automated Data Extraction: Periodically fetch video, channel, and comment metadata
- ETL Pipeline: Process and clean data using efficient formats (Parquet, CSV)
- Analytics Engine: Calculate performance metrics, trends, and engagement rates
- Performance Reports: Generate comprehensive analytics reports in JSON format
- Scalable Storage: Store historical data for time-series analysis
- CLI Interface: Easy-to-use command-line interface
- Rust 1.70+
- YouTube Data API v3 key (create one in the Google Cloud Console)
- Clone the repository:
git clone <your-repo-url>
cd youtube-analytics-engine
- Build the project:
cargo build --release
- Set up configuration:
cp config/default.toml config/config.toml
# Edit config/config.toml with your YouTube API key

Update config/config.toml with your settings:
# Your YouTube Data API v3 key
youtube_api_key = "YOUR_API_KEY_HERE"
# Directory to store collected data
data_dir = "data"
# How often to fetch new data (in hours)
fetch_interval_hours = 6
# Maximum number of videos to fetch per channel
max_videos_per_channel = 50
[analytics]
enable_trend_detection = true
min_views_for_trending = 10000
lookback_days = 30

# Fetch videos from a specific channel
cargo run -- --fetch-data --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw
# Fetch videos based on search query
cargo run -- --fetch-data --search-query "rust programming"
# Generate analytics report on stored data
cargo run -- --analyze
# Fetch data and run analytics in one command
cargo run -- --fetch-data --analyze --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw
# Use a different config file
cargo run -- --config config/production.toml --fetch-data --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw

src/
├── main.rs                   # CLI entry point
├── api/
│   └── youtube_client.rs     # YouTube Data API client
├── models/
│   ├── config.rs             # Configuration structures
│   └── youtube.rs            # YouTube data models
├── storage/
│   └── file_storage.rs       # File-based data storage
└── analytics/
    ├── engine.rs             # Analytics engine
    └── metrics.rs            # Performance metrics
config/
└── default.toml              # Default configuration
data/                         # Data storage directory
├── videos/                   # Video data (Parquet format)
├── videos_csv/               # Video data (CSV format)
├── comments/                 # Comments data
└── reports/                  # Analytics reports
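
As a rough illustration of what the configuration structures in `models/config.rs` might look like, the sketch below mirrors the keys from `config/config.toml` in plain Rust. The struct names, field types, and the `fetch_interval` and `validate` helpers are assumptions made for this example, not the project's actual code, and TOML parsing (which would need an external crate) is left out.

```rust
use std::time::Duration;

/// Hypothetical mirror of the `[analytics]` table in config.toml.
#[derive(Debug, Clone, PartialEq)]
pub struct AnalyticsConfig {
    pub enable_trend_detection: bool,
    pub min_views_for_trending: u64,
    pub lookback_days: u32,
}

/// Hypothetical mirror of the top-level keys in config.toml.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub youtube_api_key: String,
    pub data_dir: String,
    pub fetch_interval_hours: u64,
    pub max_videos_per_channel: u32,
    pub analytics: AnalyticsConfig,
}

impl Default for Config {
    /// Same values as the sample configuration shown in this README.
    fn default() -> Self {
        Config {
            youtube_api_key: "YOUR_API_KEY_HERE".to_string(),
            data_dir: "data".to_string(),
            fetch_interval_hours: 6,
            max_videos_per_channel: 50,
            analytics: AnalyticsConfig {
                enable_trend_detection: true,
                min_views_for_trending: 10_000,
                lookback_days: 30,
            },
        }
    }
}

impl Config {
    /// Fetch interval as a `Duration`, convenient for a sleep loop or scheduler.
    pub fn fetch_interval(&self) -> Duration {
        Duration::from_secs(self.fetch_interval_hours * 3600)
    }

    /// Reject obviously unusable settings before any API call is made.
    pub fn validate(&self) -> Result<(), String> {
        if self.youtube_api_key.is_empty() || self.youtube_api_key == "YOUR_API_KEY_HERE" {
            return Err("youtube_api_key is not set".to_string());
        }
        if self.fetch_interval_hours == 0 {
            return Err("fetch_interval_hours must be greater than 0".to_string());
        }
        if self.max_videos_per_channel == 0 {
            return Err("max_videos_per_channel must be greater than 0".to_string());
        }
        Ok(())
    }
}
```

Checking the placeholder key in `validate` catches the common case of running with an unedited copy of `default.toml`.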
- Extraction: Fetch video metadata from YouTube Data API
- Storage: Save data in Parquet format for efficient analytics
- Processing: Load and transform data using Polars
- Analytics: Calculate metrics, trends, and performance indicators
- Reporting: Generate JSON reports with actionable insights
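
The processing step in the pipeline above could be sketched roughly as follows. The project itself uses Polars for this stage; the version here is a standard-library stand-in, and the `VideoRecord` type, the `clean` function, and the specific cleaning rules (trim titles, drop records without an ID, keep the snapshot with the highest view count per video) are assumptions chosen for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical flattened video record, as it might look after extraction.
#[derive(Debug, Clone, PartialEq)]
pub struct VideoRecord {
    pub video_id: String,
    pub title: String,
    pub views: u64,
    pub likes: u64,
    pub comments: u64,
}

/// Clean a batch of records:
/// - trim whitespace from titles,
/// - drop records with an empty video ID,
/// - keep one record per video ID (the one with the highest view count),
/// - return the result sorted by video ID so output is deterministic.
pub fn clean(records: Vec<VideoRecord>) -> Vec<VideoRecord> {
    let mut by_id: HashMap<String, VideoRecord> = HashMap::new();

    for mut rec in records {
        rec.title = rec.title.trim().to_string();
        if rec.video_id.is_empty() {
            continue;
        }
        // Decide first, insert afterwards, so the map is not borrowed twice.
        let keep = match by_id.get(&rec.video_id) {
            Some(existing) => rec.views > existing.views,
            None => true,
        };
        if keep {
            by_id.insert(rec.video_id.clone(), rec);
        }
    }

    let mut cleaned: Vec<VideoRecord> = by_id.into_values().collect();
    cleaned.sort_by(|a, b| a.video_id.cmp(&b.video_id));
    cleaned
}
```

Keeping the highest view count is only a proxy for "most recent snapshot"; a real pipeline would more likely compare fetch timestamps.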
- Performance Metrics: Views, likes, comments, engagement rates
- Trend Detection: Identify trending videos and viral content
- Time Series Analysis: Track performance over time
- Top Performers: Rank videos by various metrics
- Keyword Analysis: Extract insights from titles and descriptions
- Posting Optimization: Identify optimal posting times
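
To make the metrics listed above more concrete, here is a minimal sketch of how a few of them might be computed. The README does not define the formulas, so the definitions used here are assumptions: engagement rate as (likes + comments) / views, views per day as views divided by the video's age in days, and "trending" as meeting a minimum view count within the lookback window (compare `min_views_for_trending` and `lookback_days` in the configuration). The `VideoStats` type and function names are hypothetical.

```rust
/// Hypothetical per-video statistics used by the analytics step.
#[derive(Debug, Clone)]
pub struct VideoStats {
    pub video_id: String,
    pub views: u64,
    pub likes: u64,
    pub comments: u64,
    /// Age of the video in whole days at analysis time.
    pub days_since_published: u32,
}

impl VideoStats {
    /// Assumed definition: (likes + comments) / views, or 0.0 with no views.
    pub fn engagement_rate(&self) -> f64 {
        if self.views == 0 {
            return 0.0;
        }
        (self.likes + self.comments) as f64 / self.views as f64
    }

    /// Average views per day; videos younger than a day count as one day old
    /// to avoid dividing by zero.
    pub fn views_per_day(&self) -> f64 {
        self.views as f64 / self.days_since_published.max(1) as f64
    }

    /// Assumed trending rule: enough views, published within the lookback window.
    pub fn is_trending(&self, min_views: u64, lookback_days: u32) -> bool {
        self.views >= min_views && self.days_since_published <= lookback_days
    }
}

/// Return references to the `n` most-viewed videos, highest first.
pub fn top_by_views(videos: &[VideoStats], n: usize) -> Vec<&VideoStats> {
    let mut ranked: Vec<&VideoStats> = videos.iter().collect();
    ranked.sort_by(|a, b| b.views.cmp(&a.views));
    ranked.truncate(n);
    ranked
}
```

For example, a video with 1,000 views, 40 likes, and 10 comments published 10 days ago has an engagement rate of 0.05 and averages 100 views per day under these definitions.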
{
  "generated_at": "2024-01-15T10:30:00Z",
  "total_videos": 150,
  "total_views": 2500000,
  "average_views_per_video": 16666.67,
  "like_to_view_ratio": 0.045,
  "top_performing_videos": [
    {
      "video_id": "dQw4w9WgXcQ",
      "title": "Never Gonna Give You Up",
      "views": 1000000,
      "engagement_rate": 0.067,
      "views_per_day": 1234.56
    }
  ]
}

cargo build
cargo test
cargo clippy
cargo fmt

- Web API endpoint (using Axum)
- Database support (SQLite/PostgreSQL)
- Real-time streaming data processing
- Machine learning recommendations
- Dashboard UI
- Cloud deployment support
- Automated scheduling
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file for details.
Be mindful of YouTube API quotas:
- Default quota: 10,000 units per day
- Videos.list: 1 unit per request (a single request can include up to 50 video IDs)
- Search.list: 100 units per request
- PlaylistItems.list: 1 unit per request
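
The quota figures above lend themselves to a quick back-of-the-envelope estimate. The sketch below assumes that a channel fetch pages through PlaylistItems.list and then calls Videos.list once per page, that a search-based fetch does the same with Search.list, that every endpoint is billed per request, and that each page holds at most 50 items. How the application actually sequences its calls is not stated here, so the function names and the call pattern are assumptions.

```rust
/// Quota figures from the list above.
const DAILY_QUOTA: u32 = 10_000;
const SEARCH_LIST_COST: u32 = 100;
const PLAYLIST_ITEMS_LIST_COST: u32 = 1;
const VIDEOS_LIST_COST: u32 = 1;
/// Assumed maximum number of items returned (or IDs accepted) per request.
const MAX_PAGE_SIZE: u32 = 50;

/// Number of requests needed to page through `items` results.
fn pages(items: u32) -> u32 {
    (items + MAX_PAGE_SIZE - 1) / MAX_PAGE_SIZE
}

/// Estimated units to fetch `videos` videos from one channel:
/// one PlaylistItems.list call plus one Videos.list call per page.
pub fn channel_fetch_cost(videos: u32) -> u32 {
    pages(videos) * (PLAYLIST_ITEMS_LIST_COST + VIDEOS_LIST_COST)
}

/// Estimated units to fetch `videos` videos via a search query:
/// one Search.list call plus one Videos.list call per page.
pub fn search_fetch_cost(videos: u32) -> u32 {
    pages(videos) * (SEARCH_LIST_COST + VIDEOS_LIST_COST)
}

/// How many runs of a given cost fit into the default daily quota.
pub fn max_runs_per_day(cost_per_run: u32) -> u32 {
    if cost_per_run == 0 {
        0
    } else {
        DAILY_QUOTA / cost_per_run
    }
}
```

Under these assumptions, fetching 50 videos from a channel costs about 2 units, while the same fetch through a search query costs about 101 units, so roughly 99 search-based runs fit into the default daily quota. Preferring `--channel-id` over `--search-query` where possible keeps quota usage low.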
- API Key Invalid: Ensure your YouTube API key is correct and has the YouTube Data API v3 enabled
- Quota Exceeded: Check your API usage in Google Cloud Console
- Channel Not Found: Verify the channel ID is correct (starts with 'UC')
- Permission Denied: Some channels may have restricted access to their data
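
A cheap local check can catch malformed channel IDs before any quota is spent. The sketch below relies on the 'UC' prefix mentioned above and additionally assumes that channel IDs are 24 characters long and consist of ASCII letters, digits, `_`, and `-`. That format matches the example ID used in this README but is an observed convention rather than a documented guarantee, and the function name is hypothetical.

```rust
/// Heuristic check that `id` has the shape of a YouTube channel ID.
/// Assumptions: "UC" prefix, 24 characters, URL-safe base64 alphabet.
/// A `true` result does not mean the channel exists.
pub fn looks_like_channel_id(id: &str) -> bool {
    id.len() == 24
        && id.starts_with("UC")
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```

With this check, `UC_x5XG1OV2P6uZZ5FSM9Ttw` passes, while a video ID such as `dQw4w9WgXcQ` is rejected because it is too short and lacks the prefix.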
Set log level for debugging:
RUST_LOG=debug cargo run -- --fetch-data --channel-id UC_x5XG1OV2P6uZZ5FSM9Ttw