A powerful AI-powered data analysis agent that combines Google BigQuery with the Google Agent Development Kit (ADK) to enable natural language interactions with your data warehouse.
- Authentication
gcloud auth application-default login- Clone and Setup
git clone https://github.com/johanesalxd/bq-agent-app.git
cd bq-agent-app
# Install uv (if not already installed)
# Visit: https://docs.astral.sh/uv/getting-started/installation/
# Install dependencies with uv
uv sync
# Activate environment
source .venv/bin/activate
# Setup MCP Toolbox
cd setup/mcp_toolbox
# Update the script parameters for your OS before running
# Edit install-mcp-toolbox.sh and update:
# - MCP_VERSION: choose the version from https://github.com/googleapis/genai-toolbox/releases
# - MCP_OS: "linux" for Linux, "darwin" for macOS
# - MCP_ARCH: "amd64" for Intel/x64, "arm64" for Apple Silicon
# Example for macOS Apple Silicon: MCP_OS="darwin" MCP_ARCH="arm64"
chmod +x install-mcp-toolbox.sh
./install-mcp-toolbox.sh
cd ../..
# Configure the environment
cp .env.example .env
export $(cat .env | grep -v '^#' | xargs)
# Run the MCP server with custom configuration
cd setup/mcp_toolbox
./toolbox --tools-file=tools.yaml --port=5000
cd ../..
# Run ADK
uv run adk webThis section covers setting up the BQML agent with RAG corpus integration for enhanced BigQuery ML capabilities.
In addition to the basic prerequisites, the BQML agent requires:
-
Additional APIs enabled:
- Vertex AI API
- BigQuery API (already required)
- Cloud Resource Manager API
-
Additional permissions for your account:
- Vertex AI User
- BigQuery User
- BigQuery Data Viewer
-
Enable required APIs:
# Enable required Google Cloud APIs gcloud services enable aiplatform.googleapis.com gcloud services enable bigquery.googleapis.com gcloud services enable cloudresourcemanager.googleapis.com
After completing the basic setup, configure BQML-specific environment variables:
# Edit your .env file to include BQML configuration
# The basic setup already covers most variables, add these BQML-specific ones:
# BQML Agent configuration (leave empty initially)
BQML_RAG_CORPUS_NAME=The BQML agent uses a RAG (Retrieval-Augmented Generation) corpus for enhanced BQML documentation access:
# From the project root directory
uv run python setup/rag_corpus/create_bqml_corpus.pyWhat this script does:
- Checks for existing corpus: If
BQML_RAG_CORPUS_NAMEis empty in your.envfile - Creates new RAG corpus: Sets up a Vertex AI RAG corpus with text-embedding-005
- Ingests BQML documentation: Downloads and processes Google's BQML documentation from GCS
- Updates environment: Automatically writes the corpus name to your
.envfile
When the RAG corpus setup runs successfully:
Creating new BQML RAG corpus...
Corpus created: projects/123456789/locations/us-west4/ragCorpora/1234567890123456789
BQML_RAG_CORPUS_NAME 'projects/123456789/locations/us-west4/ragCorpora/1234567890123456789' written to /path/to/your/project/.env
Importing files to corpus: projects/123456789/locations/us-west4/ragCorpora/1234567890123456789
Files imported to corpus: projects/123456789/locations/us-west4/ragCorpora/1234567890123456789
# Check that BQML_RAG_CORPUS_NAME is now populated
cat .env | grep BQML_RAG_CORPUS_NAMEImportant: The BQML setup uses us-west4 as the default region for Vertex AI RAG because:
- Capacity Limitations:
us-central1has allowlisting-based access due to capacity limitations - RAG Engine Availability:
us-west4has reliable availability for Vertex AI RAG Engine
Test the RAG corpus functionality:
# Test the RAG corpus with sample queries
uv run python setup/rag_corpus/test_rag.py "What BQML model types are available?"- Agentspace Management Guide - Comprehensive guide for managing Google Cloud Agentspace agents and ReasoningEngines
- Vertex Extensions Setup Guide - Complete guide for setting up Vertex AI Extensions for code interpretation
- MCP Toolbox Deployment Guide - Deploy MCP toolbox to Google Cloud Run for production use
This project provides a comprehensive Multi-Agent System for BigQuery analytics with advanced data science capabilities.
| Feature | Multi-Agent System |
|---|---|
| Directory | bq_multi_agent_app/ |
| Setup Complexity | Moderate |
| BigQuery Operations | ✅ |
| MCP Protocol Support | ✅ |
| Python Data Science | ✅ |
| Statistical Analysis | ✅ |
| Data Visualization | ✅ |
| Multi-Agent Orchestration | ✅ |
| Additional Dependencies | MCP Toolbox |
- Advanced Analytics: Complete data science workflows with comprehensive analysis and visualizations
- Multi-Agent Architecture: Root agent orchestrates specialized sub-agents for different tasks
- MCP Integration: Uses Model Context Protocol for standardized BigQuery operations
- 🔍 Dataset Discovery: List and explore BigQuery datasets
- 📊 Table Analysis: Get detailed schema and metadata information
- 🔎 SQL Execution: Execute complex SQL queries through natural language
- 🤖 AI-Powered: Uses Gemini 2.5 Flash for intelligent query understanding
- 🔐 Flexible Authentication: Multiple authentication methods supported
- 🎯 Multi-Agent Orchestration: Root agent delegates tasks to specialized sub-agents
- 🐍 Python Analytics: Stateful code execution for advanced data science workflows
- 📈 Data Visualization: Automated chart generation with matplotlib
- 🧠 Statistical Analysis: Comprehensive statistical testing and modeling
- 🔗 MCP Integration: Uses Model Context Protocol for BigQuery operations
- 💬 Conversational Analytics: Interactive BigQuery exploration via MCP
- 📊 Time Series Forecasting: Built-in forecasting capabilities for temporal data
- 📝 Pre-defined SQL Templates: Execute common SQL patterns efficiently
Basic Operations
"What datasets are available in my project?"
"Show me the schema of the sales_data table"
"Find the top 10 customers by revenue this year"
Advanced Analytics
"Analyze sales trends over the last 12 months and create a visualization"
→ Root agent retrieves data, DS agent creates trend analysis with charts
"Build a predictive model for customer churn"
→ Root agent extracts features, DS agent trains and evaluates model
"Compare revenue across product categories with statistical testing"
→ Root agent queries data, DS agent performs statistical analysis
BQML Operations
"Create a logistic regression model for customer churn prediction"
→ BQML agent queries RAG corpus, generates CREATE MODEL statement
"What BQML model types are available for forecasting?"
→ BQML agent retrieves documentation about ARIMA_PLUS and time series models
"Show me how to evaluate a BQML model"
→ BQML agent provides ML.EVALUATE function documentation and examples
"List existing BQML models in my dataset"
→ Root agent queries BigQuery information schema for ML models
The system is built on:
- Google Agent Development Kit (ADK): Framework for building AI agents
- Gemini 2.5 Flash: Large language model for natural language understanding
- Vertex AI: Google Cloud's AI platform integration
Root Agent (bigquery_ds_agent)
├── Conversational Toolset (MCP Toolbox)
│ └── bigquery-conversational-analytics # Quick insights & answers
├── Data Retrieval Toolset (MCP Toolbox)
│ ├── bigquery-execute-sql # Raw data extraction
│ ├── bigquery-list-dataset-ids # Dataset discovery
│ ├── bigquery-get-dataset-info # Dataset metadata
│ ├── bigquery-list-table-ids # Table discovery
│ └── bigquery-get-table-info # Table schema
├── ML Analysis Toolset (MCP Toolbox)
│ ├── bigquery-forecast # TimesFM forecasting
│ └── bigquery-analyze-contribution # Contribution analysis
├── DS Sub-Agent (ds_agent)
│ ├── Python Code Execution
│ ├── Data Visualization
│ └── Statistical Analysis
└── BQML Sub-Agent (bqml_agent)
├── BQML Toolset (MCP) # SQL/BQML execution
├── RAG Response # BQML documentation queries
└── Model Listing # BigQuery ML model discovery
The agent intelligently chooses between four approaches:
PATH 1: Quick Insights (Conversational Analytics)
- For simple questions and quick answers
- Uses
bigquery-conversational-analyticsdirectly - Returns natural language insights
PATH 2: In-Depth Analysis (Data Retrieval + Data Science)
- For complex analysis and visualizations
- Uses
bigquery-execute-sql→call_data_science_agent - Provides full control over data and analysis
PATH 3: ML Analysis (TimesFM + Contribution Analysis)
- For forecasting and understanding drivers of change
- Uses
bigquery-forecast(TimesFM model) andbigquery-analyze-contribution - Quick ML insights without custom model training
PATH 4: BQML Operations (BigQuery ML)
- For custom machine learning model operations
- Delegates to BQML sub-agent → BQML toolset + RAG corpus
- Handles model creation, training, evaluation, and documentation
bq-agent-app/
├── pyproject.toml # uv package configuration
├── uv.lock # Dependency lock file
├── bq_multi_agent_app/ # Multi-Agent System
│ ├── agent.py # Root agent with MCP integration
│ ├── tools.py # MCP BigQuery tools + agent wrappers
│ ├── prompts.py # Root agent instructions
│ └── sub_agents/
│ ├── ds_agents/ # Data Science Agent
│ │ ├── agent.py # Data science agent
│ │ └── prompts.py # DS agent instructions
│ └── bqml_agents/ # BigQuery ML Agent
│ ├── agent.py # BQML agent with RAG integration
│ ├── prompts.py # BQML agent instructions
│ └── tools.py # BQML-specific tools (RAG, model listing)
├── setup/ # Setup and deployment tools
│ ├── mcp_toolbox/ # MCP Toolbox setup
│ │ ├── install-mcp-toolbox.sh # Local installation script
│ │ ├── deploy.sh # Cloud Run deployment
│ │ ├── Dockerfile # Container definition
│ │ └── MCP_TOOLBOX_GUIDE.md # Deployment guide
│ ├── rag_corpus/ # BQML RAG Corpus Setup
│ │ ├── create_bqml_corpus.py # RAG corpus creation script
│ │ └── test_rag.py # RAG corpus testing script
│ ├── vertex_extensions/ # Vertex AI Extensions Management
│ │ ├── setup_vertex_extensions.py # Create extensions
│ │ ├── cleanup_vertex_extensions.py # Clean up extensions
│ │ ├── utils.py # Shared utilities
│ │ └── VERTEX_EXTENSIONS_GUIDE.md # Setup guide
│ └── agentspace/ # Agentspace Management
│ ├── manage_agentspace.sh # Agentspace and ReasoningEngine management
│ ├── test_agent_engine.py # Testing utilities
│ └── AGENTSPACE_MANAGEMENT_GUIDE.md # Comprehensive guide
└── README.md
- Python 3.11+
- uv package manager (installation guide)
- Google Cloud Project with BigQuery enabled
- Google Cloud credentials
The Multi-Agent System requires both the MCP Toolbox and the Agent to be deployed. You can choose to run each component locally or deploy to Cloud Run based on your needs.
The MCP Toolbox provides BigQuery connectivity for the agent.
# Set environment variables
cp .env.example .env
export $(cat .env | grep -v '^#' | xargs)
# Start the MCP server with custom configuration
cd setup/mcp_toolbox
BIGQUERY_PROJECT=$BIGQUERY_PROJECT ./toolbox --tools-file=tools.yaml --port=5000# Set environment variables
cp .env.example .env
export $(cat .env | grep -v '^#' | xargs)
# Setup and deploy
cd setup/mcp_toolbox
chmod +x deploy.sh
./deploy.sh
# To deploy using a local `toolbox` binary instead of the release version, run:
# ./deploy.sh localThe script will:
- Enable required Google Cloud APIs
- Create a service account with BigQuery permissions
- Build and push the Docker image
- Deploy to Cloud Run with
--allow-unauthenticated - Provide the service URL for access
The agent connects to the MCP Toolbox (local or cloud) to provide BigQuery functionality.
# Set environment variables
cp .env.example .env
export $(cat .env | grep -v '^#' | xargs)
uv run adk web # or uv run adk run# Set environment variables
cp .env.example bq_multi_agent_app/.env
export $(cat .env | grep -v '^#' | xargs)
uv run adk deploy cloud_run \
--project=your-project-id \
--region=us-central1 \
--service_name=bq-agent-app \
--trace_to_cloud \
--with_ui \
./bq_multi_agent_appAfter deployment:
- Access your agent at the provided Cloud Run service URL
- The web UI will be available for interactive testing
- Cloud tracing is enabled for monitoring and debugging
# Set environment variables
export $(cat .env | grep -v '^#' | xargs)
uv run adk deploy agent_engine \
--staging_bucket="gs://your-project-id-adk-staging" \
--display_name="BigQuery Multi-Agent App" \
--trace_to_cloud \
--env_file=.env \
./bq_multi_agent_appOr to update existing Agent Engine:
# Set environment variables
export $(cat .env | grep -v '^#' | xargs)
uv run adk deploy agent_engine \
--staging_bucket="gs://your-project-id-adk-staging" \
--display_name="BigQuery Multi-Agent App" \
--trace_to_cloud \
--env_file=.env \
--agent_engine_id=your-agent-engine-id \
./bq_multi_agent_appAfter deployment:
- Agent is deployed to Vertex AI Agent Engine with managed sessions
- Provides programmatic access via the Agent Engine API
- Integrated with Vertex AI ecosystem for enterprise use
- Supports both synchronous and asynchronous operations
To use ADK Web frontend with Agent Engine:
agent_engine_id="your-agent-engine-id"
adk web --session_db_url=agentengine://${agent_engine_id}| MCP Toolbox | Agent | Use Case |
|---|---|---|
| Local | Local | Development and testing |
| Cloud Run | Local | Development with shared toolbox |
| Cloud Run | Cloud Run | Full production deployment |
| Cloud Run | Agent Engine | Enterprise deployment with managed sessions |
Note: Ensure your service account has the necessary BigQuery permissions for your project. For advanced MCP configurations, refer to the official documentation.
Once your agent is deployed to Agent Engine, you can manage it through Google Cloud Agentspace. The project includes comprehensive tools for Agentspace operations:
- Agent Management: List, deploy, delete, and replace agents in Agentspace
- ReasoningEngine Operations: Manage ReasoningEngine sessions and handle deletion errors
- Error Resolution: Automatically handle the common "child resources" deletion error
- Dry Run Mode: Preview operations before execution
cd setup/agentspace
# Make script executable
chmod +x manage_agentspace.sh
# Configure your project details in the script
# Edit manage_agentspace.sh and update:
# - PROJECT_ID, PROJECT_NUMBER
# - REASONING_ENGINE_ID, AS_APP
# - AGENT_DISPLAY_NAME
# Common operations
./manage_agentspace.sh list # List all agents
./manage_agentspace.sh deploy # Deploy new agent
./manage_agentspace.sh delete # Delete existing agent
./manage_agentspace.sh delete-reasoning-engine # Fix deletion errors
./manage_agentspace.sh help # Show all commandsIf you encounter the error: "The ReasoningEngine contains child resources: sessions", use:
# Preview what would be deleted
./manage_agentspace.sh delete-reasoning-engine --dry-run
# Delete sessions first, then ReasoningEngine (recommended)
./manage_agentspace.sh delete-reasoning-engine
# Or force delete everything at once
./manage_agentspace.sh delete-reasoning-engine --forceFor detailed instructions, see the Agentspace Management Guide.
- Use minimum required permissions
- Store credentials securely
- Review SQL queries executed by agents
- Consider read-only mode for production
- Google Agent Development Kit Documentation
- BigQuery Documentation
- Model Context Protocol
- MCP Toolbox Cloud Run Deployment Guide
- Blog Post: BigQuery meets Google ADK and MCP
Built with ❤️ using Google Agent Development Kit, BigQuery, and MCP