LaDuck

Run LLMs inside DuckDB. Load GGUF models, generate text, compute embeddings, and classify data — all from SQL, fully local, no API keys.

LaDuck embeds llama.cpp directly into DuckDB as an extension. Models run in-process with GPU acceleration on Apple Silicon, NVIDIA, and AMD hardware.

Quick Start

LOAD 'laduck.duckdb_extension';

-- Download and load a model from HuggingFace
SELECT llm_load_model('hf://Qwen/Qwen2-0.5B-Instruct-GGUF/qwen2-0_5b-instruct-q2_k.gguf', 'qwen');

-- Generate text
SELECT llm_complete('Explain SQL joins in one sentence:', 'qwen');

-- Classify rows without training
SELECT product_name,
       llm_classify(description, 'qwen', ['electronics', 'clothing', 'food']) AS category
FROM products;

-- Semantic search with embeddings
SELECT title, list_cosine_similarity(
    llm_embed(content, 'qwen'),
    llm_embed('machine learning tutorial', 'qwen')
) AS relevance
FROM articles
ORDER BY relevance DESC
LIMIT 10;
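
The search query above calls llm_embed on every row of articles each time it runs. A minimal sketch of one way to avoid that, assuming the same articles(title, content) table and a hypothetical article_embeddings table: compute the embeddings once, store them, and embed only the search phrase at query time. Stored vectors are only comparable with vectors from the same model.

```sql
-- Hypothetical table name; articles(title, content) is the table used above.
CREATE TABLE article_embeddings AS
SELECT title, llm_embed(content, 'qwen') AS embedding
FROM articles;

-- Embed the search phrase in a CTE, then compare it with the stored vectors.
WITH needle AS (
    SELECT llm_embed('machine learning tutorial', 'qwen') AS embedding
)
SELECT a.title,
       list_cosine_similarity(a.embedding, n.embedding) AS relevance
FROM article_embeddings AS a
CROSS JOIN needle AS n
ORDER BY relevance DESC
LIMIT 10;
```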

Functions

Inference

| Function | Description | Returns |
| --- | --- | --- |
| `llm_complete(text, model [, max_tokens, temperature, top_p])` | Generate text | `VARCHAR` |
| `llm_embed(text, model)` | Compute embedding vector | `FLOAT[]` |
| `llm_classify(text, model, labels)` | Zero-shot classification (single forward pass) | `STRUCT(label VARCHAR, score FLOAT)` |
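
Because llm_classify returns a STRUCT, the label and score can be read as separate columns. The sketch below reuses the products table and the 'qwen' model from the Quick Start. The second statement assumes the optional llm_complete arguments are positional, in the order the signature lists them; the values are illustrative, not recommendations.

```sql
-- Unpack STRUCT(label VARCHAR, score FLOAT) into two plain columns.
WITH classified AS (
    SELECT product_name,
           llm_classify(description, 'qwen', ['electronics', 'clothing', 'food']) AS category
    FROM products
)
SELECT product_name,
       category.label AS label,
       category.score AS score
FROM classified;

-- Assumed positional order: max_tokens = 64, temperature = 0.2, top_p = 0.9.
SELECT llm_complete('Explain SQL joins in one sentence:', 'qwen', 64, 0.2, 0.9);
```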

Model Management

| Function | Description | Returns |
| --- | --- | --- |
| `llm_load_model(source, name [, n_gpu_layers, n_ctx])` | Load a GGUF model | `VARCHAR` |
| `llm_unload_model(name)` | Free model from memory | `VARCHAR` |
| `llm_models()` | List loaded models | Table |
| `llm_backends()` | List available compute backends (CPU, Metal, CUDA, ...) | Table |
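
A short sketch of a typical session with these functions, assuming a model was loaded under the name 'qwen' as in the Quick Start. The columns returned by llm_models() are not documented here, so the example selects all of them.

```sql
-- Table functions: call them in a FROM clause.
SELECT * FROM llm_backends();
SELECT * FROM llm_models();

-- Scalar function: free the model registered as 'qwen' when it is no longer needed.
SELECT llm_unload_model('qwen');
```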

Persistent Storage

| Function | Description | Returns |
| --- | --- | --- |
| `llm_store_model(name)` | Save loaded model into the DuckDB database | Table |
| `llm_delete_model(name)` | Remove stored model from database | Table |

llm_store_model and llm_delete_model are table functions, so call them in a FROM clause: SELECT * FROM llm_store_model('name') or SELECT * FROM llm_delete_model('name').

Model Sources

LaDuck supports three ways to load models:

-- Local file
SELECT llm_load_model('/path/to/model.gguf', 'mymodel');

-- HuggingFace (downloads and caches to ~/.cache/laduck/)
SELECT llm_load_model('hf://Qwen/Qwen2-0.5B-Instruct-GGUF/qwen2-0_5b-instruct-q2_k.gguf', 'qwen');

-- From DuckDB storage (after llm_store_model)
SELECT llm_load_model('db://qwen', 'qwen');

Portable Databases

Store models directly in the .duckdb file for fully self-contained, portable databases:

-- Load and store
SELECT llm_load_model('hf://Qwen/Qwen2-0.5B-Instruct-GGUF/qwen2-0_5b-instruct-q2_k.gguf', 'qwen');
SELECT * FROM llm_store_model('qwen');

-- Later, on another machine — just load from the database
SELECT llm_load_model('db://qwen', 'qwen');
SELECT llm_complete('Hello!', 'qwen');

Models are stored as chunked BLOBs in 50 MB chunks. Storing a 322 MB model takes about 0.5 seconds.
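
As a rough check on the chunking figures, the arithmetic can be run in DuckDB itself without any model loaded. This assumes chunks are filled in order and only the last one is partial; how LaDuck actually lays out its storage tables is not described here.

```sql
-- 322 MB split into 50 MB chunks: six full chunks plus one partial chunk.
SELECT CAST(ceil(322 / 50.0) AS INTEGER) AS chunk_count,   -- 7
       322 % 50 AS last_chunk_mb;                           -- 22
```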

GPU Backends

LaDuck uses the best available backend automatically. Use llm_backends() to see what's available:

SELECT * FROM llm_backends();
-- ┌─────────┬──────────────┐
-- │  name   │ description  │
-- ├─────────┼──────────────┤
-- │ MTL0    │ Apple M4 Max │
-- │ BLAS    │ Accelerate   │
-- │ CPU     │ Apple M4 Max │
-- └─────────┴──────────────┘

Force CPU-only inference by setting n_gpu_layers to 0:

SELECT llm_load_model('model.gguf', 'cpu_model', 0, 2048);
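
Values between zero and the model's layer count should give partial offload, assuming n_gpu_layers is passed through to llama.cpp, where it is the number of layers placed on the GPU. The sketch below is illustrative: the model name 'partial_gpu' and both numeric values are made up for the example.

```sql
-- Assumption: n_gpu_layers = 16 offloads 16 layers and keeps the rest on the CPU.
-- The fourth argument sets n_ctx, here a 4096-token context.
SELECT llm_load_model('model.gguf', 'partial_gpu', 16, 4096);

-- Confirm that the model is registered.
SELECT * FROM llm_models();
```

This can help when a model does not fit entirely in GPU memory.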

Building

Prerequisites

Nix with flakes enabled

Default (Metal on macOS, CPU on Linux)

nix develop
GEN=ninja make release

NVIDIA (CUDA)

nix build .#laduck-cuda

AMD (Vulkan — cross-platform)

nix build .#laduck-vulkan

AMD (ROCm — Linux, better performance)

nix build .#laduck-rocm

Testing

nix develop
GEN=ninja make release
build/release/test/unittest --test-dir test

Loading the Extension

duckdb -cmd "LOAD 'build/release/extension/laduck/laduck.duckdb_extension'"

Architecture

┌─────────────────────────────────────────────┐
│                  DuckDB                      │
│                                              │
│  SQL Query ──► Scalar/Table Functions        │
│                    │                         │
│         ┌──────────┴──────────┐              │
│         │   LaDuck Extension  │              │
│         │                     │              │
│         │  ┌───────────────┐  │              │
│         │  │ Model Registry│  │              │
│         │  │ (name→model)  │  │              │
│         │  └──────┬────────┘  │              │
│         │         │           │              │
│         │  ┌──────┴────────┐  │              │
│         │  │  llama.cpp    │  │              │
│         │  │  (embedded)   │  │              │
│         │  └───────────────┘  │              │
│         └─────────────────────┘              │
└─────────────────────────────────────────────┘
         │                    │
    ┌────┴────┐    ┌─────────┴────────┐
    │  GGUF   │    │  DuckDB Tables   │
    │  Files  │    │  (BLOB storage)  │
    │ (disk)  │    └──────────────────┘
    └─────────┘

License

MIT — see LICENSE.
