Docs/Data Explorer

Federated Queries

Federated queries let you execute one SQL statement across multiple sources — databases, local files, and S3-compatible storage — from a single SQL Console session.

What federated queries are for

  • joining data from multiple databases
  • combining database tables with local files or S3-backed files
  • comparing data across providers (e.g., AWS S3 vs DigitalOcean Spaces)
  • validating multi-source logic before turning the result into a stream workflow

Limitations

  • Federated queries run through DuckDB, which may differ from native database behavior
  • Not all providers and engines behave identically — use explicit casts when types differ across sources

Setting up a multi-source session

  1. Open the SQL Console from Data Explorer.
  2. Click Manage sources in the top-right corner. This opens the Query Session panel.
  3. The panel lists available connections in two sections:
    • DATABASES — PostgreSQL, MySQL, and other database connections
    • FILES — local file connections and S3-compatible storage (DigitalOcean Spaces, MinIO, etc.)
  4. Check two or more connections to select them.
  5. For database connections, pick a database from the dropdown that appears below the selected connection. Use + DB to add another database from the same connection.
  6. Use + Add at the top of the section to create a new connection if needed.

When more than one source is selected, the console switches to multi-source (DuckDB) mode automatically. The header updates to show "Multi-source • N sources" with alias badges for each selected source.

Aliases

Each selected source gets an auto-generated alias based on its connection type — for example, my1 for MySQL, pg1 for PostgreSQL, aws for S3. Aliases can be edited inline in the Query Session panel.

Alias rules:

  • Must start with a letter
  • Letters, numbers, and underscores only
  • Keep them short and readable: pg1, my1, aws, do

Naming conventions

In multi-source mode, qualify all table references with aliases:

Source typePatternExample
PostgreSQLalias.schema.tablepg1.public.actor
MySQLalias.database.tablemy1.sakila.film
S3 / filesread_* functions with alias schemeread_parquet('aws://bucket/path/*.parquet')

Examples

Cross-database join (MySQL + PostgreSQL)

SELECT p.first_name, p.last_name, m.title
FROM my1.sakila.film AS m
JOIN pg1.public.film_actor AS fa ON m.film_id = fa.film_id
JOIN pg1.public.actor AS p ON p.actor_id = fa.actor_id
WHERE m.title = 'ACE GOLDFINGER';

Database + S3 join

SELECT c.id, c.email, o.total
FROM pg1.public.customers c
JOIN read_parquet('aws://analytics/orders/*.parquet') o
  ON c.id = o.customer_id
LIMIT 100;

Query two S3 providers

Assign each S3 connection an alias (e.g., aws, do), then use the alias as the URL scheme:

SELECT a.actor_id, a.first_name AS aws_name, b.first_name AS do_name
FROM read_parquet('aws://my-bucket/sakila/actor/*.parquet') a
JOIN read_parquet('do://my-bucket/sakila/actor/*.parquet') b
  ON a.actor_id = b.actor_id
LIMIT 50;

UNION across S3 providers

SELECT 'AWS' AS source, actor_id, first_name, last_name
FROM read_parquet('aws://bucket/actor/*.parquet')
UNION ALL
SELECT 'DO' AS source, actor_id, first_name, last_name
FROM read_parquet('do://bucket/actor/*.parquet')
ORDER BY actor_id
LIMIT 100;

Supported file readers

Any DuckDB file reader that accepts a URL string works with S3 aliases:

  • read_parquet('alias://bucket/path/*.parquet')
  • read_csv_auto('alias://bucket/path/*.csv')
  • read_json_auto('alias://bucket/path/*.json')

SQL templates

The Templates panel (Ctrl+K) provides context-aware snippets for federated mode. When multiple sources are selected, templates are organized into sections:

  • Joins — cross-database JOIN, UNION, database+S3 JOIN, database+file JOIN, S3+S3 JOIN
  • Databases — grouped by alias (e.g., my1 (MySQL), pg1 (PostgreSQL)), each with:
    • Starter query for that alias
    • List namespaces (using duckdb_tables())
    • List tables (using duckdb_tables())
  • SessionSHOW DATABASES; to list all attached aliases

Templates are generated dynamically based on the selected connections and their aliases. A search field at the top of the panel filters across all sections.

See SQL Console — Templates for full details.

Mode transitions

  • Single-source → multi-source: adding a second source switches to DuckDB mode automatically. Existing SQL is not rewritten.
  • Multi-source → single-source: removing sources back to one switches back to direct mode automatically.
  • Explicit rewrite: when a starter query needs alias-qualified naming, the console offers a "Rewrite starter SQL to federated naming" action.

Practical rules

  • Always qualify table references with aliases in multi-source mode
  • Unqualified references (e.g., SELECT * FROM actor) will fail in multi-source mode
  • Use explicit casts when types differ across sources
  • S3 alias routing covers S3-compatible providers; GCS and Azure are not currently supported with alias routing

Troubleshooting

"unknown S3 alias(es) in query"

Your SQL uses a scheme like aws://..., but no source with alias aws is selected. Open the Query Session panel, select the S3 connection, and set its alias to match.

Access denied / 403 / signature errors

Credentials or region/endpoint mismatch. Verify the S3 connection credentials. For DigitalOcean/MinIO, ensure the endpoint is set correctly. For AWS, ensure the region matches the bucket's region.

Works for AWS, fails for Spaces/MinIO

Endpoint and URL style differences. Confirm the connection's endpoint is set (Spaces/MinIO usually require it). If your provider requires path-style access, configure the connection accordingly.