JSON configuration files that define which LLM providers and models PlanExe can use. Each file is a profile targeting a different use case.
| Profile | File | Use case |
|---|---|---|
| baseline | `baseline.json` | Default. Mix of affordable cloud models + local Ollama/LM Studio. |
| custom | `custom.json` | User-curated selection for experimentation. |
| premium | `premium.json` | Higher-quality cloud models (higher cost). |
| frontier | `frontier.json` | Cutting-edge models for maximum quality. |
| custom | `anthropic_claude.json` | Anthropic Claude models only. |
| custom | `local.json` | LM Studio local models (no API keys). |
| custom | `custom_thinking_test.json` | LM Studio with thinking tokens enabled. |
Set the `PLANEXE_MODEL_PROFILE` environment variable in your `.env` file:

```
# Use one of: baseline, premium, frontier, custom
PLANEXE_MODEL_PROFILE=baseline
```

For the `custom` profile, you can override the filename:

```
PLANEXE_MODEL_PROFILE=custom
PLANEXE_LLM_CONFIG_CUSTOM_FILENAME=anthropic_claude.json
```

If unset, `baseline` is used.
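The selection logic above can be sketched as follows. Note this is an illustrative sketch, not PlanExe's actual implementation; the function name and filename-mapping details are assumptions.

```python
import os

def resolve_profile_filename() -> str:
    """Map PLANEXE_MODEL_PROFILE to a config filename (illustrative sketch)."""
    profile = os.environ.get("PLANEXE_MODEL_PROFILE", "baseline")
    if profile == "custom":
        # Only the custom profile's filename can be overridden separately.
        return os.environ.get("PLANEXE_LLM_CONFIG_CUSTOM_FILENAME", "custom.json")
    return f"{profile}.json"

os.environ.pop("PLANEXE_MODEL_PROFILE", None)
print(resolve_profile_filename())  # baseline.json (the default when unset)
```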
Each model entry in a config file looks like:

```json
{
  "openrouter-openai-gpt-4o-mini": {
    "comment": "Created Jul 18, 2024. 128,000 context. $0.15/M input. $0.60/M output.",
    "priority": 2,
    "luigi_workers": 4,
    "class": "OpenRouter",
    "arguments": {
      "model": "openai/gpt-4o-mini",
      "api_key": "${OPENROUTER_API_KEY}",
      "temperature": 0.1,
      "timeout": 60.0,
      "is_function_calling_model": false,
      "is_chat_model": true,
      "max_tokens": 8192,
      "max_retries": 5
    },
    "model_info_url": "https://openrouter.ai/openai/gpt-4o-mini",
    "pricing": {
      "input_per_million_tokens": 0.15,
      "output_per_million_tokens": 0.60
    },
    "pricing_kind": "paid"
  }
}
```

| Field | Required | Description |
|---|---|---|
| `class` | Yes | Provider class: `OpenRouter`, `OpenAI`, `Anthropic`, `OpenAILike`, `Ollama`, `LMStudio`, `ThinkingAwareOpenAILike` |
| `arguments` | Yes | Passed directly to the llama_index LLM constructor |
| `arguments.model` | Yes | Model identifier sent to the provider API |
| `arguments.api_key` | Yes (cloud) | Environment variable reference, e.g. `${OPENROUTER_API_KEY}` |
| `luigi_workers` | Yes | Max parallel tasks (1 for local, 4 for cloud) |
| `pricing_kind` | Yes | `"paid"` or `"free"` |
| `priority` | No | Auto mode fallback order (lower = tried first) |
| `pricing` | No | Fallback cost-estimation rates (USD per million tokens) |
| `comment` | No | Human-readable description |
| `model_info_url` | No | Link to provider model page |
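Putting the required fields together, a minimal cloud entry could look like the fragment below. The key and model name are placeholders, and any provider-specific `arguments` beyond `model` and `api_key` are omitted.

```json
{
  "openrouter-example-model": {
    "class": "OpenRouter",
    "luigi_workers": 4,
    "pricing_kind": "paid",
    "arguments": {
      "model": "vendor/example-model",
      "api_key": "${OPENROUTER_API_KEY}"
    }
  }
}
```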
- **OpenRouter**: Routes to many providers. Set `OPENROUTER_API_KEY`.
- **OpenAI**: Direct OpenAI API. Set `OPENAI_API_KEY`.
- **Anthropic**: Direct Anthropic API. Set `ANTHROPIC_API_KEY`. Also supports Claude Code OAuth tokens (`sk-ant-oat*` from `claude setup-token`).
- **OpenAILike**: Any OpenAI-compatible endpoint (e.g. Alibaba DashScope).
- **Ollama**: Requires Ollama installed and running. For Docker, set `base_url` to `http://host.docker.internal:11434`.
- **LMStudio**: Requires LM Studio installed and running. For Docker, set `base_url` to `http://host.docker.internal:1234/v1`.
When the user selects "Auto" in the UI, PlanExe cycles through models sorted
by `priority` (lowest first). If a model fails, the next one is tried. Only
entries with a `priority` field participate.
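The fallback ordering can be sketched like this (the function name and call interface are simplified assumptions, not PlanExe's actual code):

```python
def auto_mode_order(config: dict) -> list[str]:
    """Return model keys in Auto-mode fallback order (illustrative sketch).

    Entries without a "priority" field are excluded; the rest are sorted
    ascending, so lower priority values are tried first.
    """
    eligible = {k: v for k, v in config.items() if "priority" in v}
    return sorted(eligible, key=lambda k: eligible[k]["priority"])

config = {
    "model-a": {"priority": 2},
    "model-b": {"priority": 1},
    "model-c": {},  # no priority: never tried in Auto mode
}
print(auto_mode_order(config))  # ['model-b', 'model-a']
```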
The `pricing` field enables cost estimation for providers that don't report
cost in their API responses (OpenAI, Anthropic). Without it, `total_cost`
shows as `0.0` in `activity_overview.json`.
"pricing": {
"input_per_million_tokens": 0.15,
"output_per_million_tokens": 0.60
}OpenRouter returns cost directly in usage responses, so the pricing field
serves as a cross-check there.
LlamaIndex's Anthropic integration bypasses `self.chat()` and instrumentation
events never fire, so token counts are lost. PlanExe works around this by
patching `httpx.Client.send` to intercept the raw Anthropic API response and
capture the usage dict. This hook is installed automatically when any LLM
config is loaded (via `llm_factory._load_llm_config()`). The captured tokens
are used for both `usage_metrics.jsonl` and `activity_overview.json`.
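The interception pattern can be illustrated with a stand-in client class. This is not PlanExe's code: the real hook wraps `httpx.Client.send` and parses the Anthropic response body, but the wrap-and-delegate shape is the same.

```python
import json

captured_usage = []

class FakeResponse:
    def __init__(self, body: bytes):
        self.content = body

class FakeClient:
    """Stand-in for httpx.Client, to demonstrate the monkey-patch pattern."""
    def send(self, request):
        # Pretend the provider returned a response containing a usage dict.
        return FakeResponse(b'{"usage": {"input_tokens": 12, "output_tokens": 34}}')

_original_send = FakeClient.send

def patched_send(self, request):
    response = _original_send(self, request)
    try:
        usage = json.loads(response.content).get("usage")
        if usage:
            captured_usage.append(usage)  # later written to the metrics files
    except (ValueError, AttributeError):
        pass  # never let metrics capture break the actual request
    return response

FakeClient.send = patched_send  # install the hook

FakeClient().send(request=None)
print(captured_usage)  # [{'input_tokens': 12, 'output_tokens': 34}]
```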
- Choose the appropriate profile file (or create entries in multiple profiles).
- Pick a descriptive key (e.g. `openrouter-google-gemini-3`).
- Set `class` to the provider type and fill in `arguments`.
- Add `pricing` with rates from the provider's pricing page.
- Set `pricing_kind` to `"paid"` or `"free"`.
- Optionally set `priority` if it should participate in Auto mode.
- Validate the JSON: `python3 -c "import json; json.load(open('llm_config/<file>.json'))"`.
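Beyond checking that the JSON parses, a small script can also verify that each entry carries the required fields. The field list comes from the table above; the script itself is an illustrative sketch, not part of PlanExe.

```python
import json

REQUIRED = ("class", "arguments", "luigi_workers", "pricing_kind")

def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    for key, entry in config.items():
        for field in REQUIRED:
            if field not in entry:
                problems.append(f"{key}: missing required field '{field}'")
        if "arguments" in entry and "model" not in entry["arguments"]:
            problems.append(f"{key}: arguments.model is required")
    return problems

sample = json.loads('{"my-model": {"class": "Ollama", "arguments": {}, "luigi_workers": 1}}')
for problem in validate_config(sample):
    print(problem)
```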
To restrict which provider classes are loaded, set:

```
PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES=OpenRouter,Ollama
```

Only entries with a matching `class` will be available.
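The whitelist amounts to a membership check on each entry's `class`; a sketch of that filter (PlanExe's actual loader may differ in details):

```python
import os

def apply_whitelist(config: dict) -> dict:
    """Drop entries whose class is not in the whitelist env var (if set)."""
    raw = os.environ.get("PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES")
    if not raw:
        return config  # no whitelist: every entry is available
    allowed = {name.strip() for name in raw.split(",")}
    return {k: v for k, v in config.items() if v.get("class") in allowed}

os.environ["PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES"] = "OpenRouter,Ollama"
config = {
    "a": {"class": "OpenRouter"},
    "b": {"class": "Anthropic"},
    "c": {"class": "Ollama"},
}
print(sorted(apply_whitelist(config)))  # ['a', 'c']
```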