Irreduce compresses long prompts by splitting text into token spans, scoring each span for task relevance and global importance, then selecting the highest information-per-token spans under a tight budget while penalizing redundancy. The result retains about 95% of task performance while cutting roughly 90% of tokens, making large-context inference far cheaper and more scalable.
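The selection step above can be sketched as a greedy loop: repeatedly pick the span with the best redundancy-discounted score per token that still fits the budget. This is a minimal illustration, not Irreduce's actual implementation — the whitespace token count and Jaccard similarity are stand-ins for its real tokenizer and scorer.

```python
def tokens(span: str) -> int:
    # Crude stand-in for a real token count.
    return len(span.split())

def similarity(a: str, b: str) -> float:
    # Jaccard word overlap as a cheap redundancy proxy (an assumption;
    # the real scorer is richer).
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, len(wa | wb))

def select_spans(spans, scores, budget, redundancy_weight=0.5):
    """Greedily pick spans by score-per-token, discounting each candidate
    by its max similarity to anything already selected."""
    selected, used = [], 0
    remaining = set(range(len(spans)))
    while True:
        fits = [i for i in remaining if used + tokens(spans[i]) <= budget]
        if not fits:
            break
        def gain(i):
            penalty = max((similarity(spans[i], spans[j]) for j in selected),
                          default=0.0)
            return scores[i] * (1.0 - redundancy_weight * penalty) / tokens(spans[i])
        best = max(fits, key=gain)
        if gain(best) <= 0:
            break
        selected.append(best)
        used += tokens(spans[best])
        remaining.discard(best)
    return [spans[i] for i in sorted(selected)]

spans = ["alpha beta gamma", "alpha beta gamma delta",
         "totally different topic here"]
scores = [1.0, 1.0, 0.8]
# Under a 7-token budget the near-duplicate second span loses to the
# novel third span despite its higher raw score.
picked = select_spans(spans, scores, budget=7)
```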
- UI: http://127.0.0.1:8000/app
- API docs: http://127.0.0.1:8000/docs
- Health: http://127.0.0.1:8000/health
- Python 3.13+
```
cd server
uv sync
uv run uvicorn main:app --reload
```

Open http://127.0.0.1:8000/app.
```
cd server
source .venv/bin/activate
python -m uvicorn main:app --reload
```

- Chunks long context into spans and applies guardrails (headings, entities, code/role markers).
- Scores spans with novelty/entity/number boosts plus optional signal scoring.
- Runs greedy facility-location selection under a token budget.
- Optional paraphrase squeeze (heuristic, local LLM, or Groq).
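The heuristic variant of the paraphrase squeeze can be pictured as a small rule table of verbosity rewrites; the rules below are illustrative assumptions, not Irreduce's actual rules (its local-LLM and Groq modes replace this step entirely).

```python
import re

# Illustrative rewrite rules -- placeholders, not the project's real table.
SQUEEZES = [
    (r"\bin order to\b", "to"),
    (r"\bdue to the fact that\b", "because"),
    (r"\bit is worth noting that\b", ""),
    (r"\s{2,}", " "),  # collapse whitespace left by deletions
]

def squeeze(text: str) -> str:
    """Apply cheap rewrite rules in order; each rule either shortens a
    verbose phrase or cleans up the whitespace the deletions leave."""
    for pattern, repl in SQUEEZES:
        text = re.sub(pattern, repl, text, flags=re.IGNORECASE)
    return text.strip()
```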
- `GET /health` - liveness
- `POST /compress` - main Irreduce compression
- `POST /compress/longbench` - LongBench-style compression
- `POST /compare` - Irreduce vs TokenCo baseline
- `POST /evaluate` - quality vs savings curve
- `GET /examples` - demo scenarios
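With the server running, the compression endpoint can be exercised from any HTTP client. The field names below are guesses — confirm them against the OpenAPI schema at http://127.0.0.1:8000/docs before relying on them.

```python
import json

# Hypothetical POST /compress body; "text" and "budget_tokens" are
# assumed field names, not confirmed against the real schema.
payload = {
    "text": "...the long context to compress...",
    "budget_tokens": 512,
}
body = json.dumps(payload)

# With the server running (and the requests package installed):
# requests.post("http://127.0.0.1:8000/compress", json=payload).json()
```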
Irreduce runs without keys; extra features unlock with these environment variables:
- `TOKENC_API_KEY` or `TTC_API_KEY` - TokenCo baseline in `/compare`
- `TOKENC_MODEL` - TokenCo model override for scripts
- `TOKEN_COMPANY_API_KEY` - required for the custom compressor in scripts
- `GROQ_API_KEY` (and optional `GROQ_MODEL`) - Groq paraphrase
- `LOCAL_LLM_MODEL` (and optional `LOCAL_LLM_RUNTIME=hf`) - local signals/paraphrase
- `OPENAI_API_KEY` - evaluation scripts
- `GOOGLE_API_KEY` or `GEMINI_API_KEY` - LongBench vision script
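Since python-dotenv is already a dependency, these can live in a `.env` file for local development. The values below are placeholders, not working keys:

```
# Placeholder values -- substitute real credentials.
GROQ_API_KEY=your-groq-key
OPENAI_API_KEY=your-openai-key
LOCAL_LLM_MODEL=your-local-model-id
```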
Optional extras for local LLM:
```
uv add torch transformers accelerate sentencepiece
```

Run a quick evaluation:

```
cd server
uv run python -m cosmos.quick_eval
```

- `client/` - static demo UI
- `server/` - FastAPI API, compression engine, eval scripts
- FastAPI
- Pydantic
- Uvicorn (via FastAPI standard)
- uv (Python package manager)
- Token Company tokenc SDK
- Requests
- Google GenAI SDK (Gemini)
- Pillow
- ChromaDB
- python-dotenv
- OpenAI Python SDK
- Groq API (optional)
- Hugging Face Transformers, PyTorch, Accelerate, SentencePiece (optional)
- Google Fonts (Fraunces, Manrope, IBM Plex Mono)