See how you really use AI.
X-ray your AI coding sessions across Claude Code, Cursor, ChatGPT, and 6 more tools. Discover your patterns, find wasted tokens, catch leaked secrets — all locally, nothing leaves your machine.
```bash
pip install ctxray

ctxray scan       # discover prompts from your AI tools
ctxray wrapped    # your AI coding persona + shareable card
ctxray insights   # your patterns vs research-optimal
ctxray privacy    # what sensitive data you've exposed
```

Drop ctxray into your CI as a prompt quality gate. No LLM, no API key, no network — <50ms per prompt.
```yaml
# .github/workflows/prompt-quality.yml
- uses: ctxray/ctxray@main
  with:
    score-threshold: 43   # experimentally validated quality threshold
    model: claude         # model-specific rules (claude/gpt/gemini)
    comment-on-pr: true
```

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/ctxray/ctxray
    rev: v3.0.0
    hooks:
      - id: ctxray-lint-score        # fail below quality threshold
      # or: id: ctxray-lint-claude   # Claude-specific rules + threshold
```

- Deterministic — same prompt, same score, every run. No flaky LLM-based checks.
- Air-gapped — runs in offline and private networks. All analysis stays on your infrastructure.
- Configurable — `.ctxray.toml` or `[tool.ctxray.lint]` in pyproject.toml. Per-project rules.
Full setup: GitHub Action · pre-commit · .ctxray.toml
ctxray wrapped generates a Spotify Wrapped-style report of your AI interactions — your persona (Debugger? Architect? Explorer?), top patterns, and a shareable card.
ctxray insights compares your actual prompting habits against research-backed benchmarks. Are your prompts specific enough? Do you front-load instructions? How much context do you provide?
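For a feel of one such comparison, here is a toy front-loading check in Python. The 0.80 benchmark value and the verb heuristic are illustrative assumptions, not ctxray's actual benchmarks or detectors:

```python
# Toy "habit vs benchmark" check -- benchmark value and heuristic are invented.
RESEARCH_OPTIMAL_FRONT_LOAD = 0.80  # share of prompts with the instruction up front

def front_load_ratio(prompts: list[str]) -> float:
    """Fraction of prompts whose opening fifth contains an imperative verb."""
    verbs = ("fix", "add", "refactor", "write", "explain", "implement")
    hits = sum(
        any(v in p[: max(1, len(p) // 5)].lower() for v in verbs)
        for p in prompts
    )
    return hits / len(prompts) if prompts else 0.0

mine = front_load_ratio(["fix the auth bug in login.ts", "the tests fail, refactor them"])
print(f"you: {mine:.0%} vs research-optimal: {RESEARCH_OPTIMAL_FRONT_LOAD:.0%}")  # you: 50% vs research-optimal: 80%
```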
ctxray privacy --deep scans every prompt you've sent for API keys, tokens, passwords, and PII. See exactly what you've shared with which AI tool.
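Detection of this kind is typically regex-driven. A minimal sketch of what that looks like — the pattern names and rules below are common heuristics for illustration, not ctxray's actual detector set:

```python
import re

# Common secret/PII heuristics -- illustrative, not ctxray's detector set.
PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "api_key_prefix": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}\b"),
    "password": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def scan_prompt(text: str) -> list[tuple[str, str]]:
    """Return (category, matched text) pairs for one prompt."""
    return [
        (name, m.group())
        for name, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]

hits = scan_prompt("fix login: password=hunter2 and key AKIAABCDEFGHIJKLMNOP")
print(hits)  # [('aws_access_key', 'AKIAABCDEFGHIJKLMNOP'), ('password', 'password=hunter2')]
```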
ctxray check "your prompt" scores, lints, and rewrites in one command — no LLM, <50ms.
Experimentally validated on 3000+ LLM calls across 8 models (1.5B → 27B): prompts at or above score 43 hit ~93% pass rate on executable code tests. Below 43 they average 72% or lower. ctxray tells you which side you're on and what to fix — see experiments/RESULTS.md for the full cross-model data.
ctxray check "fix the auth bug in login.ts" # threshold pass/fail + diagnostics
ctxray check "fix bug" --model claude # model-specific scoring for Claude
ctxray check "refactor middleware" --threshold 50 # custom threshold for stricter teams| Command | Description |
|---|---|
ctxray wrapped |
AI coding persona + shareable card |
ctxray insights |
Personal patterns vs research-optimal benchmarks |
ctxray tools |
Cross-tool comparison — how your Claude Code / Cursor / ChatGPT habits differ |
ctxray sessions |
Session quality scores with frustration signal detection |
ctxray agent |
Agent workflow analysis — error loops, tool patterns, efficiency |
ctxray repetition |
Cross-session repetition detection — spot recurring prompts |
ctxray patterns |
Personal prompt weaknesses — recurring gaps by task type |
ctxray distill |
Extract important turns from conversations with 6-signal scoring |
ctxray projects |
Per-project quality breakdown |
ctxray style |
Prompting fingerprint with --trends for evolution tracking |
ctxray privacy |
See what data you sent where — file paths, errors, PII exposure |
| Command | Description |
|---|---|
| `ctxray check "prompt"` | Full diagnostic — score + lint + rewrite + threshold pass/fail |
| `ctxray score "prompt"` | Research-backed 0-100 scoring with 30+ features |
| `ctxray score "prompt" --model claude` | Model-specific scoring — Claude, GPT, or Gemini adjustments |
| `ctxray rewrite "prompt"` | Rule-based improvement — filler removal, restructuring, hedging cleanup |
| `ctxray build "task"` | Build prompts from components — task, context, files, errors, constraints |
| `ctxray compress "prompt"` | 4-layer prompt compression (40-60% token savings typical) |
| `ctxray compare "a" "b"` | Side-by-side prompt analysis (or `--best-worst` for auto-selection) |
| `ctxray lint` | Configurable linter with CI/GitHub Action support |
| Command | Description |
|---|---|
| `ctxray` | Instant dashboard — prompts, sessions, avg score, top categories |
| `ctxray scan` | Auto-discover prompts from 9 AI tools |
| `ctxray report` | Full analytics: hot phrases, clusters, patterns (`--html` for dashboard) |
| `ctxray digest` | Weekly summary comparing current vs previous period |
| `ctxray template save\|list\|use` | Save and reuse your best prompts |
| `ctxray distill --export` | Recover context when a session runs out — paste into new session |
| `ctxray init` | Generate .ctxray.toml config for your project |
| Tool | Format | Auto-discovered by scan |
|---|---|---|
| Claude Code | JSONL | Yes |
| Codex CLI | JSONL | Yes |
| Cursor | .vscdb | Yes |
| Aider | Markdown | Yes |
| Gemini CLI | JSON | Yes |
| Cline (VS Code) | JSON | Yes |
| OpenClaw / OpenCode | JSON | Yes |
| ChatGPT | JSON | Via ctxray import |
| Claude.ai | JSON/ZIP | Via ctxray import |
```bash
pip install ctxray            # core (all features, zero config)
pip install ctxray[chinese]   # + Chinese prompt analysis (jieba)
pip install ctxray[mcp]       # + MCP server for Claude Code / Continue.dev / Zed

ctxray install-hook           # adds post-session hook to Claude Code
```

Capture prompts from ChatGPT, Claude.ai, and Gemini directly in your browser. Live quality badge shows prompt tier as you type — click "Rewrite & Apply" to improve and replace the text directly in the input box.
- Install the extension from Chrome Web Store or Firefox Add-ons
- Connect to the CLI: `ctxray install-extension`
- Verify: `ctxray extension-status`
Captured prompts sync locally via Native Messaging — nothing leaves your machine.
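Native Messaging hosts exchange frames over stdio: a 4-byte little-endian length prefix followed by UTF-8 JSON. A minimal sketch of such a host in Python — the message schema (a `prompt` field) is an assumption for illustration, not ctxray's actual protocol:

```python
import json
import struct
import sys

def read_message():
    """Read one Native Messaging frame: 4-byte little-endian length, then UTF-8 JSON."""
    raw_len = sys.stdin.buffer.read(4)
    if len(raw_len) < 4:
        return None  # browser closed the pipe
    (length,) = struct.unpack("<I", raw_len)
    return json.loads(sys.stdin.buffer.read(length).decode("utf-8"))

def send_message(payload: dict) -> None:
    data = json.dumps(payload).encode("utf-8")
    sys.stdout.buffer.write(struct.pack("<I", len(data)) + data)
    sys.stdout.buffer.flush()

while (msg := read_message()) is not None:
    # A real host would append to the local prompt store; the "prompt"
    # field name here is an assumption about the message schema.
    send_message({"ok": True, "chars": len(msg.get("prompt", ""))})
```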
```yaml
# .github/workflows/prompt-lint.yml
name: Prompt Quality
on: pull_request
jobs:
  lint:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: ctxray/ctxray@main
        with:
          score-threshold: 43   # experimentally validated (below = 83% failure rate)
          model: claude         # optional: model-specific rules
          strict: true
          comment-on-pr: true
```

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/ctxray/ctxray
    rev: v3.0.0
    hooks:
      - id: ctxray-lint-score      # quality threshold gate (score >= 43)
      # - id: ctxray-lint-claude   # Claude-specific rules + threshold
      # - id: ctxray-lint-gpt      # GPT-specific rules + threshold
```

```bash
ctxray lint --score-threshold 43   # exit 1 below experimentally validated threshold
ctxray lint --score-threshold 50   # or set your own bar
ctxray lint --model claude         # model-specific lint rules
ctxray lint --strict               # exit 1 on warnings
ctxray lint --json                 # machine-readable output

ctxray init                        # generates .ctxray.toml with all rules documented
```

```toml
# .ctxray.toml (or [tool.ctxray.lint] in pyproject.toml)
[lint]
score-threshold = 43   # experimentally validated quality threshold
model = "claude"       # model-specific rules (claude/gpt/gemini)

[lint.rules]
min-length = 20
short-prompt = 40
vague-prompt = true
debug-needs-reference = true
```

Prompt Science — research foundation
Scoring is calibrated against 10 peer-reviewed papers covering 30+ features across 5 dimensions:
| Dimension | What it measures | Key papers |
|---|---|---|
| Structure | Markdown, code blocks, explicit constraints | Prompt Report (2406.06608) |
| Context | File paths, error messages, I/O specs, edge cases | Zi+ (2508.03678), Google (2512.14982) |
| Position | Instruction placement relative to context | Stanford (2307.03172), Veseli+ (2508.07479), Chowdhury (2603.10123) |
| Repetition | Redundancy that degrades model attention | Google (2512.14982) |
| Clarity | Readability, sentence length, ambiguity | SPELL (EMNLP 2023), PEEM (2603.10477) |
Cross-validated findings that inform our engine:
- Position bias is architectural — present at initialization, not learned. Front-loading instructions is effective for prompts under 50% of context window (3 papers agree)
- Moderate compression improves output — rule-based filler removal doesn't just save tokens, it enhances LLM performance (2505.00019); see the sketch after this list
- Prompt quality is independently measurable — prompt-only scoring predicts output quality without seeing the response (ACL 2025, 2503.10084)
- Quality threshold at score ~43 — our own experiment (30 prompts, 5 tiers, 2 models) found a step function: below 43, 83% failure rate; above 43, 94% success (Pearson r=0.56, Spearman ρ=0.64)
- Format preferences are model-dependent — XML benefits Claude, Markdown benefits GPT, but having any structure matters more than the specific format (PromptBridge 2512.01420)
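To make the compression finding concrete, here is a minimal filler-removal sketch. The filler list and regexes are illustrative assumptions, not ctxray's actual rewrite rules:

```python
import re

# Illustrative filler rules -- the real rewrite layer has many more.
FILLER = [
    r"\b(?:please|kindly)\b",
    r"\b(?:just|really|very|basically|actually|maybe|perhaps)\b",
    r"\bif (?:possible|you can)\b",
]

def strip_filler(prompt: str) -> str:
    for pattern in FILLER:
        prompt = re.sub(pattern, "", prompt, flags=re.IGNORECASE)
    prompt = re.sub(r"\s+([?.!,])", r"\1", prompt)  # re-attach punctuation
    return re.sub(r"\s{2,}", " ", prompt).strip()

print(strip_filler("Could you please maybe just fix the auth bug if possible?"))
# -> "Could you fix the auth bug?"
```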
Model-specific scoring (--model claude/gpt/gemini) applies research-backed adjustments for each model's known preferences and sensitivities.
All analysis runs locally in <1ms per prompt. No LLM calls, no network requests.
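As an illustration of how prompt-only, rule-based scoring can be this fast and deterministic, here is a minimal sketch. The feature checks and weights are invented for the example; the real engine uses 30+ research-calibrated features across the five dimensions above:

```python
import re

# Minimal sketch of deterministic rule-based scoring. Feature checks and
# weights are invented for illustration, not ctxray's real rules.

def starts_with_instruction(p: str) -> bool:
    words = p.lower().split()
    return bool(words) and words[0] in {"fix", "add", "write", "refactor", "implement"}

FEATURES = [  # (check, weight, dimension it approximates)
    (lambda p: bool(re.search(r"\b[\w/.-]+\.\w{1,4}\b", p)), 20, "context: file path"),
    (lambda p: bool(re.search(r"(?i)error|exception|traceback", p)), 15, "context: error text"),
    (starts_with_instruction, 20, "position: instruction first"),
    (lambda p: len(p.split()) >= 8, 15, "clarity: not a stub"),
    (lambda p: "```" in p or "\n- " in p, 10, "structure: markdown"),
]

def score(prompt: str) -> int:
    """Pure functions of the text: same prompt, same score, every run."""
    return sum(w for check, w, _ in FEATURES if check(prompt))

print(score("fix the TypeError in src/auth/login.ts when the token is expired"))  # 70
```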
How it works — architecture
Data sources:
```
┌───────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│Claude Code│ │  Cursor  │ │  Aider   │ │ ChatGPT  │ │ 5 more.. │
└─────┬─────┘ └─────┬────┘ └─────┬────┘ └─────┬────┘ └─────┬────┘
      └─────────────┴────────────┼────────────┴────────────┘
                                 │
                 scan -> dedup -> store -> analyze
                                 │
               ┌─────────────────┼─────────────────┐
               v                 v                 v
          ┌──────────┐    ┌──────────────┐    ┌──────────┐
          │ insights │    │   patterns   │    │ sessions │
          │ wrapped  │    │  repetition  │    │ projects │
          │ style    │    │   privacy    │    │  agent   │
          └──────────┘    └──────────────┘    └──────────┘
```
Key design decisions:
- Pure rules, no LLM — scoring and rewriting use regex + TF-IDF + research heuristics. Deterministic, private, <1ms per prompt.
- Adapter pattern — each AI tool gets a parser that normalizes to a common `Prompt` model. Adding a new tool = one file.
- Two-layer dedup — SHA-256 for exact matches, TF-IDF cosine similarity for near-dupes (see the sketch after this list).
- Research-calibrated — 10 peer-reviewed papers inform the scoring weights.
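A minimal sketch of the two-layer dedup, assuming an illustrative 0.9 similarity threshold (scikit-learn supplies the TF-IDF and cosine-similarity pieces):

```python
import hashlib

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def dedup(prompts: list[str], near: float = 0.9) -> list[str]:
    """Layer 1: SHA-256 drops exact duplicates. Layer 2: greedy pass drops
    prompts whose TF-IDF cosine similarity to an already-kept prompt is
    >= `near` (0.9 is an illustrative choice, not ctxray's setting)."""
    seen, unique = set(), []
    for p in prompts:
        h = hashlib.sha256(p.encode("utf-8")).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(p)
    if len(unique) < 2:
        return unique
    sims = cosine_similarity(TfidfVectorizer().fit_transform(unique))
    kept: list[int] = []
    for i in range(len(unique)):
        if all(sims[i, j] < near for j in kept):
            kept.append(i)
    return [unique[i] for i in kept]

print(dedup(["fix the auth bug", "fix the auth bug", "fix the auth bug!", "add dark mode"]))
# -> ['fix the auth bug', 'add dark mode']
```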
Conversation Distillation
ctxray distill scores every turn in a conversation using 6 signals:
- Position — first/last turns carry framing and conclusions
- Length — substantial turns contain more information
- Tool trigger — turns that cause tool calls are action-driving
- Error recovery — turns that follow errors show problem-solving
- Semantic shift — topic changes mark conversation boundaries
- Uniqueness — novel phrasing vs repetitive follow-ups
Session type (debugging, feature-dev, exploration, refactoring) is auto-detected and signal weights adapt accordingly.
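A sketch of what 6-signal scoring with type-adaptive weights could look like — the weight values, signal scales, and turn fields are assumptions for illustration:

```python
# Sketch of 6-signal turn scoring with session-type-adaptive weights.
# Weight values, signal scales, and turn fields are illustrative.
WEIGHTS = {
    "debugging":   {"position": 1, "length": 1, "tool_trigger": 2,
                    "error_recovery": 3, "semantic_shift": 1, "uniqueness": 1},
    "feature-dev": {"position": 2, "length": 2, "tool_trigger": 2,
                    "error_recovery": 1, "semantic_shift": 1, "uniqueness": 1},
}

def signals(turn: dict, index: int, total: int) -> dict[str, float]:
    return {
        "position": 1.0 if index in (0, total - 1) else 0.0,   # first/last turn
        "length": min(len(turn["text"].split()) / 100, 1.0),   # substantial turns
        "tool_trigger": 1.0 if turn.get("tool_calls") else 0.0,
        "error_recovery": 1.0 if turn.get("follows_error") else 0.0,
        "semantic_shift": turn.get("topic_shift", 0.0),
        "uniqueness": turn.get("novelty", 0.0),
    }

def turn_score(turn: dict, index: int, total: int, session_type: str) -> float:
    w = WEIGHTS[session_type]
    return sum(w[name] * value for name, value in signals(turn, index, total).items())

turn = {"text": "the fix failed with a KeyError, retry with .get()", "follows_error": True}
print(turn_score(turn, index=3, total=10, session_type="debugging"))  # 3.09
```

In a debugging session the error-recovery turn above scores high; under the feature-dev weights the same turn would rank much lower.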
After Promptfoo joined OpenAI and Humanloop joined Anthropic, ctxray is the independent, open-source alternative for understanding your AI interactions.
- 100% local — your prompts never leave your machine
- No LLM required — pure rule-based analysis, <50ms per prompt
- 9 AI tools — the only tool that works across Claude Code, Cursor, ChatGPT, and more
- Research-backed — calibrated against 10 peer-reviewed papers, not vibes
Previously published as `reprompt-cli`. Same tool, new name, clean namespace.
- All analysis runs locally. No prompts leave your machine.
- `ctxray privacy` shows exactly what you've sent to which AI tool.
- Optional telemetry sends only anonymous feature vectors — never prompt text.
- Open source: audit exactly what's collected.
- PyPI: ctxray
- Chrome Extension: Chrome Web Store
- Firefox Add-on: Firefox Add-ons
- Changelog: CHANGELOG.md
See CONTRIBUTING.md for development setup and guidelines.
MIT
