Federated Queries
Federated queries let you execute one SQL statement across multiple sources — databases, local files, and S3-compatible storage — from a single SQL Console session.
What federated queries are for
- joining data from multiple databases
- combining database tables with local files or S3-backed files
- comparing data across providers (e.g., AWS S3 vs DigitalOcean Spaces)
- validating multi-source logic before turning the result into a stream workflow
Limitations
- Federated queries run through DuckDB, which may differ from native database behavior
- Not all providers and engines behave identically — use explicit casts when types differ across sources
Setting up a multi-source session
- Open the SQL Console from Data Explorer.
- Click Manage sources in the top-right corner. This opens the Query Session panel.
- The panel lists available connections in two sections:
- DATABASES — PostgreSQL, MySQL, and other database connections
- FILES — local file connections and S3-compatible storage (DigitalOcean Spaces, MinIO, etc.)
- Check two or more connections to select them.
- For database connections, pick a database from the dropdown that appears below the selected connection. Use + DB to add another database from the same connection.
- Use + Add at the top of the section to create a new connection if needed.
When more than one source is selected, the console switches to multi-source (DuckDB) mode automatically. The header updates to show "Multi-source • N sources" with alias badges for each selected source.
Aliases
Each selected source gets an auto-generated alias based on its connection type — for example, my1 for MySQL, pg1 for PostgreSQL, aws for S3. Aliases can be edited inline in the Query Session panel.
Alias rules:
- Must start with a letter
- Letters, numbers, and underscores only
- Keep them short and readable:
pg1,my1,aws,do
Naming conventions
In multi-source mode, qualify all table references with aliases:
| Source type | Pattern | Example |
|---|---|---|
| PostgreSQL | alias.schema.table | pg1.public.actor |
| MySQL | alias.database.table | my1.sakila.film |
| S3 / files | read_* functions with alias scheme | read_parquet('aws://bucket/path/*.parquet') |
Examples
Cross-database join (MySQL + PostgreSQL)
SELECT p.first_name, p.last_name, m.title
FROM my1.sakila.film AS m
JOIN pg1.public.film_actor AS fa ON m.film_id = fa.film_id
JOIN pg1.public.actor AS p ON p.actor_id = fa.actor_id
WHERE m.title = 'ACE GOLDFINGER';
Database + S3 join
SELECT c.id, c.email, o.total
FROM pg1.public.customers c
JOIN read_parquet('aws://analytics/orders/*.parquet') o
ON c.id = o.customer_id
LIMIT 100;
Query two S3 providers
Assign each S3 connection an alias (e.g., aws, do), then use the alias as the URL scheme:
SELECT a.actor_id, a.first_name AS aws_name, b.first_name AS do_name
FROM read_parquet('aws://my-bucket/sakila/actor/*.parquet') a
JOIN read_parquet('do://my-bucket/sakila/actor/*.parquet') b
ON a.actor_id = b.actor_id
LIMIT 50;
UNION across S3 providers
SELECT 'AWS' AS source, actor_id, first_name, last_name
FROM read_parquet('aws://bucket/actor/*.parquet')
UNION ALL
SELECT 'DO' AS source, actor_id, first_name, last_name
FROM read_parquet('do://bucket/actor/*.parquet')
ORDER BY actor_id
LIMIT 100;
Supported file readers
Any DuckDB file reader that accepts a URL string works with S3 aliases:
read_parquet('alias://bucket/path/*.parquet')read_csv_auto('alias://bucket/path/*.csv')read_json_auto('alias://bucket/path/*.json')
SQL templates
The Templates panel (Ctrl+K) provides context-aware snippets for federated mode. When multiple sources are selected, templates are organized into sections:
- Joins — cross-database JOIN, UNION, database+S3 JOIN, database+file JOIN, S3+S3 JOIN
- Databases — grouped by alias (e.g.,
my1 (MySQL),pg1 (PostgreSQL)), each with:- Starter query for that alias
- List namespaces (using
duckdb_tables()) - List tables (using
duckdb_tables())
- Session —
SHOW DATABASES;to list all attached aliases
Templates are generated dynamically based on the selected connections and their aliases. A search field at the top of the panel filters across all sections.
See SQL Console — Templates for full details.
Mode transitions
- Single-source → multi-source: adding a second source switches to DuckDB mode automatically. Existing SQL is not rewritten.
- Multi-source → single-source: removing sources back to one switches back to direct mode automatically.
- Explicit rewrite: when a starter query needs alias-qualified naming, the console offers a "Rewrite starter SQL to federated naming" action.
Practical rules
- Always qualify table references with aliases in multi-source mode
- Unqualified references (e.g.,
SELECT * FROM actor) will fail in multi-source mode - Use explicit casts when types differ across sources
- S3 alias routing covers S3-compatible providers; GCS and Azure are not currently supported with alias routing
Troubleshooting
"unknown S3 alias(es) in query"
Your SQL uses a scheme like aws://..., but no source with alias aws is selected. Open the Query Session panel, select the S3 connection, and set its alias to match.
Access denied / 403 / signature errors
Credentials or region/endpoint mismatch. Verify the S3 connection credentials. For DigitalOcean/MinIO, ensure the endpoint is set correctly. For AWS, ensure the region matches the bucket's region.
Works for AWS, fails for Spaces/MinIO
Endpoint and URL style differences. Confirm the connection's endpoint is set (Spaces/MinIO usually require it). If your provider requires path-style access, configure the connection accordingly.
Related docs
- SQL Console — Query editor, autocomplete, diagnostics
- Federated Query API Workflows