ctxray

See how you really use AI.

X-ray your AI coding sessions across Claude Code, Cursor, ChatGPT, and 6 more tools. Discover your patterns, find wasted tokens, catch leaked secrets — all locally, nothing leaves your machine.

Python 3.10+ · License: MIT

Quick start

pip install ctxray

ctxray scan                    # discover prompts from your AI tools
ctxray wrapped                 # your AI coding persona + shareable card
ctxray insights                # your patterns vs research-optimal
ctxray privacy                 # what sensitive data you've exposed

ctxray demo

Works in your pipeline

Drop ctxray into your CI as a prompt quality gate. No LLM, no API key, no network — <50ms per prompt.

# .github/workflows/prompt-quality.yml
- uses: ctxray/ctxray@main
  with:
    score-threshold: 43    # experimentally validated quality threshold
    model: claude          # model-specific rules (claude/gpt/gemini)
    comment-on-pr: true

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/ctxray/ctxray
    rev: v3.0.0
    hooks:
      - id: ctxray-lint-score   # fail below quality threshold
      # or: id: ctxray-lint-claude  # Claude-specific rules + threshold

  • Deterministic — same prompt, same score, every run. No flaky LLM-based checks.
  • Air-gapped — runs in offline and private networks. All analysis stays on your infrastructure.
  • Configurable — .ctxray.toml or [tool.ctxray.lint] in pyproject.toml. Per-project rules.

Full setup: GitHub Action · pre-commit · .ctxray.toml

What you'll discover

Your AI coding persona

ctxray wrapped generates a Spotify Wrapped-style report of your AI interactions — your persona (Debugger? Architect? Explorer?), top patterns, and a shareable card.

Your prompt patterns

ctxray insights compares your actual prompting habits against research-backed benchmarks. Are your prompts specific enough? Do you front-load instructions? How much context do you provide?

Your privacy exposure

ctxray privacy --deep scans every prompt you've sent for API keys, tokens, passwords, and PII. See exactly what you've shared with which AI tool.
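
To make the idea of a local secret scan concrete, here is a minimal sketch of regex-based detection. The pattern set and the find_exposures helper are hypothetical and chosen for illustration; this README does not show the rules that ctxray privacy actually applies.

```python
import re

# Hypothetical patterns for illustration; ctxray's real rule set is not shown in this README.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[=:]\s*\S+"),
}


def find_exposures(prompt: str) -> list[tuple[str, str]]:
    """Return (kind, redacted_match) pairs for every pattern hit in a prompt."""
    hits = []
    for kind, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(prompt):
            text = match.group(0)
            # Keep only a short prefix so the report itself does not leak the secret.
            hits.append((kind, text[:4] + "..."))
    return hits
```

Because the scan is plain pattern matching, it needs no network access and gives the same answer on every run, which matches the local-only claim above.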

Full prompt diagnostic

ctxray check "your prompt" scores, lints, and rewrites in one command — no LLM, <50ms.

Experimentally validated on 3000+ LLM calls across 8 models (1.5B to 27B parameters): prompts at or above score 43 hit a ~93% pass rate on executable code tests, while prompts below 43 average 72% or lower. ctxray tells you which side you're on and what to fix. See experiments/RESULTS.md for the full cross-model data.

ctxray check "fix the auth bug in login.ts"        # threshold pass/fail + diagnostics
ctxray check "fix bug" --model claude               # model-specific scoring for Claude
ctxray check "refactor middleware" --threshold 50    # custom threshold for stricter teams

All commands

Discover your patterns

Command            Description
ctxray wrapped     AI coding persona + shareable card
ctxray insights    Personal patterns vs research-optimal benchmarks
ctxray tools       Cross-tool comparison — how your Claude Code / Cursor / ChatGPT habits differ
ctxray sessions    Session quality scores with frustration signal detection
ctxray agent       Agent workflow analysis — error loops, tool patterns, efficiency
ctxray repetition  Cross-session repetition detection — spot recurring prompts
ctxray patterns    Personal prompt weaknesses — recurring gaps by task type
ctxray distill     Extract important turns from conversations with 6-signal scoring
ctxray projects    Per-project quality breakdown
ctxray style       Prompting fingerprint with --trends for evolution tracking
ctxray privacy     See what data you sent where — file paths, errors, PII exposure

Optimize your prompts

Command                               Description
ctxray check "prompt"                 Full diagnostic — score + lint + rewrite + threshold pass/fail
ctxray score "prompt"                 Research-backed 0-100 scoring with 30+ features
ctxray score "prompt" --model claude  Model-specific scoring — Claude, GPT, or Gemini adjustments
ctxray rewrite "prompt"               Rule-based improvement — filler removal, restructuring, hedging cleanup
ctxray build "task"                   Build prompts from components — task, context, files, errors, constraints
ctxray compress "prompt"              4-layer prompt compression (40-60% token savings typical)
ctxray compare "a" "b"                Side-by-side prompt analysis (or --best-worst for auto-selection)
ctxray lint                           Configurable linter with CI/GitHub Action support
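
As an illustration of what rule-based filler removal can look like, the sketch below strips a few filler phrases with regular expressions. The phrase list and the strip_filler function are hypothetical; the rules that ctxray rewrite really applies are not listed in this README and may differ.

```python
import re

# Hypothetical filler phrases; the actual rules used by `ctxray rewrite` may differ.
FILLER_PATTERNS = [
    r"\bI was wondering if you could\b",
    r"\bcould you (?:please|kindly)\b",
    r"\bplease\b",
    r"\b(?:just|basically|actually|really)\b",
]


def strip_filler(prompt: str) -> str:
    """Remove filler phrases, then tidy leftover whitespace and capitalization."""
    result = prompt
    for pattern in FILLER_PATTERNS:
        result = re.sub(pattern, "", result, flags=re.IGNORECASE)
    # Collapse the gaps left behind by removed phrases.
    result = re.sub(r"\s+", " ", result).strip()
    # Drop spaces left in front of punctuation.
    result = re.sub(r"\s+([,.?!])", r"\1", result)
    return result[:1].upper() + result[1:]
```

Longer phrases are listed before the single words they contain, so "could you please" is removed as a unit instead of leaving "could you" behind.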

Manage

Command                        Description
ctxray                         Instant dashboard — prompts, sessions, avg score, top categories
ctxray scan                    Auto-discover prompts from 9 AI tools
ctxray report                  Full analytics: hot phrases, clusters, patterns (--html for dashboard)
ctxray digest                  Weekly summary comparing current vs previous period
ctxray template save|list|use  Save and reuse your best prompts
ctxray distill --export        Recover context when a session runs out — paste into new session
ctxray init                    Generate .ctxray.toml config for your project

Supported AI tools

Tool                 Format    Auto-discovered by scan
Claude Code          JSONL     Yes
Codex CLI            JSONL     Yes
Cursor               .vscdb    Yes
Aider                Markdown  Yes
Gemini CLI           JSON      Yes
Cline (VS Code)      JSON      Yes
OpenClaw / OpenCode  JSON      Yes
ChatGPT              JSON      Via ctxray import
Claude.ai            JSON/ZIP  Via ctxray import

Installation

pip install ctxray              # core (all features, zero config)
pip install "ctxray[chinese]"   # + Chinese prompt analysis (jieba); quotes keep zsh from globbing the brackets
pip install "ctxray[mcp]"       # + MCP server for Claude Code / Continue.dev / Zed

Auto-scan after every session

ctxray install-hook             # adds post-session hook to Claude Code

Browser extension

Capture prompts from ChatGPT, Claude.ai, and Gemini directly in your browser. A live quality badge shows the prompt tier as you type. Click "Rewrite & Apply" to improve the text and replace it in the input box.

  1. Install the extension from Chrome Web Store or Firefox Add-ons
  2. Connect to the CLI: ctxray install-extension
  3. Verify: ctxray extension-status

Captured prompts sync locally via Native Messaging — nothing leaves your machine.

CI integration

GitHub Action

# .github/workflows/prompt-lint.yml
name: Prompt Quality
on: pull_request

jobs:
  lint:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: ctxray/ctxray@main
        with:
          score-threshold: 43     # experimentally validated (below = 83% failure rate)
          model: claude           # optional: model-specific rules
          strict: true
          comment-on-pr: true

pre-commit

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/ctxray/ctxray
    rev: v3.0.0
    hooks:
      - id: ctxray-lint-score     # quality threshold gate (score >= 43)
      # - id: ctxray-lint-claude  # Claude-specific rules + threshold
      # - id: ctxray-lint-gpt    # GPT-specific rules + threshold

Direct CLI

ctxray lint --score-threshold 43  # exit 1 below experimentally validated threshold
ctxray lint --score-threshold 50  # or set your own bar
ctxray lint --model claude        # model-specific lint rules
ctxray lint --strict              # exit 1 on warnings
ctxray lint --json                # machine-readable output

Project configuration

ctxray init   # generates .ctxray.toml with all rules documented

# .ctxray.toml (or [tool.ctxray.lint] in pyproject.toml)
[lint]
score-threshold = 43   # experimentally validated quality threshold
model = "claude"       # model-specific rules (claude/gpt/gemini)

[lint.rules]
min-length = 20
short-prompt = 40
vague-prompt = true
debug-needs-reference = true

Prompt Science

Scoring is calibrated against 10 peer-reviewed papers covering 30+ features across 5 dimensions:

Dimension   What it measures                                   Key papers
Structure   Markdown, code blocks, explicit constraints        Prompt Report (2406.06608)
Context     File paths, error messages, I/O specs, edge cases  Zi+ (2508.03678), Google (2512.14982)
Position    Instruction placement relative to context          Stanford (2307.03172), Veseli+ (2508.07479), Chowdhury (2603.10123)
Repetition  Redundancy that degrades model attention           Google (2512.14982)
Clarity     Readability, sentence length, ambiguity            SPELL (EMNLP 2023), PEEM (2603.10477)

Cross-validated findings that inform our engine:

  • Position bias is architectural — present at initialization, not learned. Front-loading instructions is effective for prompts under 50% of context window (3 papers agree)
  • Moderate compression improves output — rule-based filler removal doesn't just save tokens, it enhances LLM performance (2505.00019)
  • Prompt quality is independently measurable — prompt-only scoring predicts output quality without seeing the response (ACL 2025, 2503.10084)
  • Quality threshold at score ~43 — our own experiment (30 prompts, 5 tiers, 2 models) found a step function: below 43, 83% failure rate; above 43, 94% success (Pearson r=0.56, Spearman ρ=0.64)
  • Format preferences are model-dependent — XML benefits Claude, Markdown benefits GPT, but having any structure matters more than the specific format (PromptBridge 2512.01420)
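
The step-function finding above amounts to comparing pass rates on either side of a cut-off. The sketch below shows that calculation. The records in toy_results are synthetic values invented for this example, not the experiment's data, and pass_rate_by_threshold is a hypothetical helper.

```python
def pass_rate_by_threshold(results: list[tuple[int, bool]], threshold: int = 43) -> dict[str, float]:
    """Split (score, passed) records at the threshold and return each side's pass rate."""
    below = [passed for score, passed in results if score < threshold]
    at_or_above = [passed for score, passed in results if score >= threshold]

    def rate(outcomes: list[bool]) -> float:
        # An empty side has no defined rate.
        return sum(outcomes) / len(outcomes) if outcomes else float("nan")

    return {"below": rate(below), "at_or_above": rate(at_or_above)}


# Synthetic records for illustration only; these are NOT the experiment's data.
toy_results = [(20, False), (35, False), (41, True), (43, True), (58, True), (72, True), (80, False)]
rates = pass_rate_by_threshold(toy_results)
```

A threshold is useful as a gate when the two rates differ sharply; if they were close, a fixed cut-off would tell you little.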

Model-specific scoring (--model claude/gpt/gemini) applies research-backed adjustments for each model's known preferences and sensitivities.

All analysis runs locally in <50ms per prompt. No LLM calls, no network requests.

How it works

 Data sources:
 ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
 │Claude Code│ │  Cursor  │ │  Aider   │ │ ChatGPT  │ │ 5 more.. │
 └─────┬────┘ └─────┬────┘ └─────┬────┘ └─────┬────┘ └─────┬────┘
       └─────────────┴───────────┴─────────────┴─────────────┘
                                 │
                    scan -> dedup -> store -> analyze
                                 │
              ┌──────────────────┼──────────────────┐
              v                  v                  v
        ┌──────────┐     ┌──────────────┐    ┌──────────┐
        │ insights │     │  patterns    │    │ sessions │
        │ wrapped  │     │  repetition  │    │ projects │
        │ style    │     │  privacy     │    │ agent    │
        └──────────┘     └──────────────┘    └──────────┘

Key design decisions:

  • Pure rules, no LLM — scoring and rewriting use regex + TF-IDF + research heuristics. Deterministic, private, <50ms per prompt.
  • Adapter pattern — each AI tool gets a parser that normalizes to a common Prompt model. Adding a new tool = one file.
  • Two-layer dedup — SHA-256 for exact matches, TF-IDF cosine similarity for near-dupes.
  • Research-calibrated — 10 peer-reviewed papers inform the scoring weights.
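
The two-layer dedup described in the list above can be sketched with the standard library alone. This is an illustration of the technique, not ctxray's code: the tokenizer, the smoothed IDF formula, and the 0.8 similarity cut-off are all assumptions.

```python
import hashlib
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build TF-IDF vectors with a smoothed IDF: log((1 + N) / (1 + df)) + 1."""
    tokenized = [tokens(d) for d in docs]
    df = Counter(term for doc in tokenized for term in set(doc))
    n = len(docs)
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()})
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def dedup(prompts: list[str], near_threshold: float = 0.8) -> list[str]:
    """Layer 1 drops exact duplicates by SHA-256; layer 2 drops near-duplicates by cosine similarity."""
    seen_hashes = set()
    unique = []
    for p in prompts:
        digest = hashlib.sha256(p.strip().encode("utf-8")).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            unique.append(p)

    vectors = tfidf_vectors(unique)
    kept: list[int] = []
    for i, vec in enumerate(vectors):
        # Keep a prompt only if it is not too similar to any prompt already kept.
        if all(cosine(vec, vectors[j]) < near_threshold for j in kept):
            kept.append(i)
    return [unique[i] for i in kept]
```

The hash pass is cheap and removes exact repeats first, so the pairwise similarity pass, which is quadratic in the number of prompts, runs on a smaller set.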

Conversation Distillation

ctxray distill scores every turn in a conversation using 6 signals:

  • Position — first/last turns carry framing and conclusions
  • Length — substantial turns contain more information
  • Tool trigger — turns that cause tool calls are action-driving
  • Error recovery — turns that follow errors show problem-solving
  • Semantic shift — topic changes mark conversation boundaries
  • Uniqueness — novel phrasing vs repetitive follow-ups

Session type (debugging, feature-dev, exploration, refactoring) is auto-detected and signal weights adapt accordingly.
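
One way to picture the six-signal scoring is as a weighted sum whose weights depend on the detected session type. The sketch below follows that reading. The weight values, the TurnSignals class, and the assumption that every signal is normalized to the range 0 to 1 are hypothetical; the README does not publish the values ctxray uses.

```python
from dataclasses import dataclass

# Hypothetical weights per session type; the README does not publish ctxray's values.
WEIGHTS = {
    "debugging": {"position": 0.10, "length": 0.10, "tool_trigger": 0.20,
                  "error_recovery": 0.35, "semantic_shift": 0.10, "uniqueness": 0.15},
    "exploration": {"position": 0.15, "length": 0.15, "tool_trigger": 0.10,
                    "error_recovery": 0.10, "semantic_shift": 0.30, "uniqueness": 0.20},
}


@dataclass
class TurnSignals:
    """Each signal is assumed to be pre-normalized to the range 0..1."""
    position: float
    length: float
    tool_trigger: float
    error_recovery: float
    semantic_shift: float
    uniqueness: float


def turn_score(signals: TurnSignals, session_type: str) -> float:
    """Weighted sum of the six signals using the weights for the session type."""
    weights = WEIGHTS[session_type]
    return sum(weights[name] * getattr(signals, name) for name in weights)


def distill(turns: list[TurnSignals], session_type: str, keep: int) -> list[int]:
    """Return the indices of the `keep` highest-scoring turns, in conversation order."""
    ranked = sorted(range(len(turns)), key=lambda i: turn_score(turns[i], session_type), reverse=True)
    return sorted(ranked[:keep])
```

Returning the kept indices in conversation order preserves the flow of the session, which matters if the distilled turns are pasted into a new session.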

Why ctxray?

After Promptfoo joined OpenAI and Humanloop joined Anthropic, ctxray is the independent, open-source alternative for understanding your AI interactions.

  • 100% local — your prompts never leave your machine
  • No LLM required — pure rule-based analysis, <50ms per prompt
  • 9 AI tools — the only tool that works across Claude Code, Cursor, ChatGPT, and more
  • Research-backed — calibrated against 10 peer-reviewed papers, not vibes

Previously published as reprompt-cli. Same tool, new name, clean namespace.

Privacy

  • All analysis runs locally. No prompts leave your machine.
  • ctxray privacy shows exactly what you've sent to which AI tool.
  • Optional telemetry sends only anonymous feature vectors — never prompt text.
  • Open source: audit exactly what's collected.

Contributing

See CONTRIBUTING.md for development setup and guidelines.

License

MIT