Unix Reimagined | toast
Tools that save you time: simple, composable, and fun.
For humans and AI alike.
Join on Mac or Linux
curl -sSL linuxtoaster.com/install | sh
$20/yr PayGo — includes $20 in inference credits. Top off any time. BYOK and local inference are free. We collect anonymized usage data (model, token count) — never prompts.
BYOK support: OpenAI · Anthropic · Google · Mistral · Groq · Cerebras · Perplexity · xAI · OpenRouter · Together
Local support: Ollama · MLX · LM Studio · KoboldCpp · llama.cpp · vLLM · LocalAI · Jan
toast — AI in your terminal
With toast you get "sed with a brain" — pipe text in, get Unix knowledge out.
Understand anything
Legacy code. Config files. Cryptic logs. Get explanations.
Get the command you need
Describe what you want in plain English. Get the exact command.
Diagnose your system
Not sure which tab is burning the CPU? Ask.
PID 75517 — Safari WebContent, 45.3% CPU. Kill it: kill 75517
Mass updates
toast reads files, writes patches, and works with any format.
Terminal Chat
When you need a back-and-forth. Pull files into context with @.
> @models.py explain this
sure, the file contains...
> what does function...
Toast on Telegram
Talk to toast from your phone. Link your account, then message the bot.
Power Users
Simple for beginners. Deep for experts. The toaster grows with you.
Custom Personas
Drop a .persona file in any project. Toast picks it up automatically, zero config. Test it by chatting with the persona.
Pipe chains
Compose like Unix. Chain multiple transforms.
Project context
Drop a .crumbs file. AI knows your stack.
Edit a book
Iterative refinement. Each pass reads, learns, decides, refines. Gradient descent for prose.
Edit a book until done.
Let the AI decide when it's done. Loops until the command signals completion. Add a cap for safety.
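The loop-until-done pattern amounts to a bounded fixed-point loop. A sketch in Python (names here are illustrative, not toast's actual interface):

```python
def refine(text: str, step, done, cap: int = 10) -> str:
    """Apply step() repeatedly until done() signals completion or the cap is hit.

    step: one refinement pass (read, learn, decide, refine)
    done: the completion signal the AI emits
    cap:  the safety bound on total passes
    """
    for _ in range(cap):
        if done(text):
            break
        text = step(text)
    return text

# Toy run: trim a string until it is short enough.
print(refine("aaaa", lambda t: t[:-1], lambda t: len(t) <= 2))  # → aa
```

The cap guarantees termination even if the completion signal never fires.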
@file injection
In chat mode, pull files into context on the fly. Multi-file supported.
Any model
One interface, many providers. Compare models without changing your workflow.
Build a bot on your Mac in one line
Email, iMessage. One line. Your AI, your rules.
Local Inference
Start toasted and toast will use it for local inference. You can also use Ollama, MLX, LM Studio, KoboldCpp, llama.cpp, vLLM, LocalAI, or Jan as your inference provider. Full privacy, no internet required.
Usage stats
Token counts and latency per provider. Tracked locally via mmap, zero overhead.
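An mmap-backed counter, the technique named above, can be sketched like this (the file path and record layout are illustrative assumptions, not toast's actual format):

```python
import mmap
import os
import struct
import tempfile

STATS_FILE = os.path.join(tempfile.gettempdir(), "toast_stats.bin")  # illustrative path
RECORD = struct.Struct("<QQ")  # two u64 counters: total tokens, total latency (µs)

# Create the stats file once, sized to exactly one record.
if not os.path.exists(STATS_FILE) or os.path.getsize(STATS_FILE) < RECORD.size:
    with open(STATS_FILE, "wb") as f:
        f.write(b"\x00" * RECORD.size)

def bump(tokens: int, latency_us: int) -> tuple[int, int]:
    """Add to the running totals in place.

    Writes go through the page cache, so updating a counter costs no syscall
    per field — that is the "zero overhead" idea.
    """
    with open(STATS_FILE, "r+b") as f:
        with mmap.mmap(f.fileno(), RECORD.size) as m:
            t, l = RECORD.unpack_from(m, 0)
            RECORD.pack_into(m, 0, t + tokens, l + latency_us)
            return t + tokens, l + latency_us

print(bump(120, 350_000))
```

A real implementation would keep one record per provider and guard concurrent writers, but the mapping trick is the same.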
jam — The AI shell that doesn't fight you
No quoting nightmares. No expansion. No $ surprises. What you type is what you get. Type something that isn't a command, and the AI answers.
# Strings just work. No escaping.
🍞 echo "The price is $100"
The price is $100
# Environment variables. Explicit words, not sigils.
🍞 set API_KEY sk-abc123
🍞 get API_KEY
sk-abc123
# Built-in RPN for math. No bc, no expr.
🍞 100 2 / 3 *
150
# Not a command? AI answers instead of "command not found".
🍞 what processes are using port 8080
lsof -i :8080
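The built-in RPN mode is ordinary stack evaluation. A minimal sketch in Python (jam itself implements this in C):

```python
def rpn(expr: str) -> float:
    """Evaluate a space-separated RPN expression, e.g. '100 2 / 3 *'."""
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    stack: list[float] = []
    for tok in expr.split():
        if tok in ops:
            b, a = stack.pop(), stack.pop()  # right operand is on top
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

print(rpn("100 2 / 3 *"))  # → 150.0
```

No operator precedence, no parentheses: the stack order is the evaluation order, which is why no bc or expr is needed.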
Loops in plain English
Gradient descent for documents. Bounded or unbounded. The AI can decide when it's done. Add a cap for safety.
AI helps use the terminal
Builtin → RPN → PATH → AI. Type eixt and the AI tells you it's exit. The shell understands intent, not just syntax.
Did you mean: exit
Per-project context
.persona, .crumbs, .history walk up from cwd. Different folder means different project, different history, different AI behavior. Zero config.
# per project directory
AIgents see each other on the network
UDP multicast. Every jam instance on the subnet hears it. No broker. No server. No configuration. This is the nervous system.
# Send a message to every machine on the network
🍞 send status deploying
# Listen for a specific key — blocks until match
🍞 listen status
web3:status deploying
# An AI agent that monitors and summarizes the network
🍞 while listen | toast "summarize this event"
# Wait for 3 nodes to report ready
🍞 3 times listen ready
Three linuxtoaster boxes running jam are three islands — unless they can talk to each other. send and listen turn them into a fleet. No etcd. No consul. No Kubernetes. Just multicast.
Four scopes. Each is a word. Each composes with pipes.
ito — Version control for AI, and humans
Git records the what and attaches a why. ito flips it — record intent, derive diffs. 15 commands instead of 150. Single C file, ~1,100 lines. No staging area. No detached HEAD. No .gitignore. Built for AI and humans alike.
# One command to save. Intent is the source of truth.
$ ito log "added rate limiting to prevent abuse"
# Interactive history navigator — arrow keys, side-by-side diffs.
$ ito changes
# Search by intent, not by grepping diffs.
$ ito search "auth" | toast "summarize the approach"
# Sync via rsync. No GitHub required.
$ ito sync user@server:/repos/project
Intent-first snapshots
Every snapshot is a moment — why you changed, not just what changed. Six months later, ito search finds the reasoning, not just "fix stuff".
Opt-in tracking
No .gitignore. A .ito/track file lists what to snapshot. Everything else is invisible. No 10 GB of build artifacts in version control.
*.c *.h Makefile *.md
15 commands, not 150
ito log instead of git add -A && git commit -m. ito on experiment instead of git checkout -b. ito undo instead of figuring out reset vs revert.
ito undo
ito merge alice
AI-native search
Search the why-layer directly. Agents ask "find everything related to auth performance" and get real answers. Composes with toast.
Whole-file merge
Designed for real AI projects with clear ownership. Side-by-side diff: pick ours, theirs, or make your own.
Single C file
~1,100 lines. No dependencies beyond rsync and diff. Content-addressed, immutable objects. Sync is safe by construction — rsync can only add, never corrupt.
toasted — A local brain for your laptop
A from-scratch inference daemon for Apple Silicon. ~1,800 lines of C++, no Python. A 30B-parameter model running at ~100 tok/s generation, ~400 tok/s prompt reading. Zero cost per token. Zero data exposure. 128 GB RAM supports 8-bit, 6-bit, and 4-bit quantization. 64 GB supports 4-bit.
# Start the daemon. Model loads once, stays hot in GPU memory.
$ toasted start
# toast auto-detects local inference. Same interface as cloud.
$ toast "explain quicksort"
# Pipe chains and chat work locally.
$ cat auth.py | Security "audit this"
$ git diff | Reviewer
~100 tok/s generation
Mixture-of-experts routes through 8 of 512 experts per token — the knowledge of all 512 at the cost of 8. Speeds typically associated with a 7B dense model, from a 30B-parameter brain.
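Top-k expert gating, the mechanism described above, sketched in Python (the scoring and shapes are illustrative; the real router runs inside the model's kernels):

```python
import math

def route(logits: list[float], k: int = 8) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    With 512 experts and k=8, only ~1.6% of expert parameters run per token,
    which is why a 30B-parameter model can move at dense-7B speeds.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    z = [math.exp(logits[i]) for i in top]
    total = sum(z)
    return [(i, w / total) for i, w in zip(top, z)]
```

Each token's output is the weighted sum of just those k expert outputs; the other 504 experts contribute nothing to that token's compute.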
~400 tok/s prompt reading
Chunked batch prefill processes context in 32-token chunks. A 17K-token prompt prefills in ~44 seconds instead of 7 minutes. 56× faster than our first implementation.
Session cache — 0.6s to first word
Only the last message is new. toasted hashes prior conversation, restores cached state, prefills just the delta. A 125× improvement in time-to-first-token.
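The prefix-hash idea behind the session cache, sketched in Python (the cache structure and names are illustrative, not toasted's internals):

```python
import hashlib

cache: dict[str, object] = {}  # hash of conversation prefix -> saved model state

def conversation_key(messages: list[str]) -> str:
    """Hash a conversation prefix into a stable cache key."""
    h = hashlib.sha256()
    for m in messages:
        h.update(m.encode())
        h.update(b"\x00")  # delimiter so ["ab"] != ["a", "b"]
    return h.hexdigest()

def delta_to_prefill(messages: list[str]):
    """Find the longest cached prefix; only the unseen tail needs prefill."""
    for cut in range(len(messages), 0, -1):
        key = conversation_key(messages[:cut])
        if key in cache:
            return cache[key], messages[cut:]
    return None, messages

def remember(messages: list[str], state) -> None:
    """After serving a request, save the state for the full conversation."""
    cache[conversation_key(messages)] = state
```

On a follow-up message, the entire prior conversation hits the cache and only the new message is prefilled — hence sub-second time-to-first-token.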
Written in C++, not Python
Built against Apple's MLX C++ API with a hand-tuned Metal kernel for DeltaNet. No Python startup, no fragile environments. The model is a single file.
True privacy
Air-gapped environments, regulated industries, security-conscious teams. Your code never leaves the machine. No API keys. No internet required.
Zero marginal cost
The daemon loads the model once into unified memory. Metal shaders stay compiled. Cache stays warm. Every subsequent request is free — just electricity.
Requires Apple Silicon Mac. 128 GB unified memory supports 8-bit, 6-bit, and 4-bit quantization. 64 GB supports 4-bit. When toasted is running, toast automatically defaults to local inference. Cloud models still available with -p provider.
Pricing
The future of software depends on getting the balance between deterministic and non-deterministic right. Traditional Unix is deterministic — predictable, composable, reliable. AI is non-deterministic — creative, adaptive, surprising. The hard part is making them work together. That's what we're building: Unix reimagined for both AI and people. Your membership funds that work.
$20/yr PayGo to start. $49/mo Member for the full stack. Founding Partner for teams.
PayGo
- toast
- $20 in AI credits included — top off any time
- Anonymized usage data only (model, token count — never prompts)
- Custom personas via .persona files
- BYOK & local models free
- All updates
- Community Support
Member
- Everything in PayGo
- toasted local inference daemon
- Qwen3-Next-Coder on M4 at ~100 tok/s — zero cost per token
- Ever-growing list of reimagined Unix tools: jam, ito, ...
- UDP networking for agents
- Priority Support
Founding Partner
- Everything in Member
- Fund the rewrite of Unix.
- You tell us what is missing. We implement. You get the credit.
- Priority & dedicated support
- Consulting, seminars & FDE options
FAQ
How does it work?
Lightweight toast talks to local toastd, which keeps an HTTP/2 connection pool to linuxtoaster.com. Written in C to minimize latency. With BYOK, toastd connects directly to your provider—your traffic never touches our servers.
What's BYOK?
Got a PROVIDER_API_KEY set for Anthropic, Cerebras, Google Gemini, Groq, OpenAI, OpenRouter, Together, Mistral, Perplexity, and/or xAI? Use toast -p provider. Zero config.
What is a Founding Partner?
Companies funding the rewrite of Unix. Your team gets a software license, priority support, and consulting options. You're funding tools that make software simpler for all LinuxToaster users. Talk to us.
Can I run it fully offline?
Yes. Use any local backend—Ollama, MLX, LM Studio, KoboldCpp, llama.cpp, vLLM, LocalAI, or Jan. No internet, no API keys, full privacy.
What's jam?
A shell rebuilt for AI. No quoting, no expansion, no $ syntax. Strings just work. Unrecognized input goes to the AI. Includes set/get for env vars, while/times for loops, RPN math, and a UDP multicast basket for multi-machine coordination.
What's toasted?
A from-scratch local inference daemon for Apple Silicon. Written in C++ against Apple's MLX API. Loads a 30B-parameter model once, serves requests via Unix socket. ~100 tok/s generation, ~400 tok/s prefill, 0.6s time-to-first-token with session caching. 128 GB supports 8/6/4-bit quantization, 64 GB supports 4-bit.
Where's my data stored?
Locally. Context in .crumbs, conversations in .chat. Version them, grep them, delete them. Your machine, your files.
macOS? Windows?
macOS and Linux today. Windows WSL works.
What about consulting?
Consulting is available for teams that want hands-on help with deployment, integration, or training. Enterprise accounts have a Forward Deployed Engineering option.
How does billing work?
$20/yr gets you a membership and $20 in AI credits — top off anytime. AI inference is charged based on use. BYOK and local inference carry no cost. We collect anonymized usage data (which model, token count) but never your prompts. Consulting and the monthly cost of an FDE are optional paid add-ons.