expectedparrot/katz

katz

Version-aware ledger for paper review artifacts. Stores canonical manuscript representations, issue findings, investigations, and issue spotters keyed to specific git commits.

katz does not convert manuscripts, generate issues, or run analyses. It provides stable storage and a queryable interface. Workflow belongs to agents and external tools.

Install

pip install -e .

Quick start

cd ~/papers/my-paper
katz init
katz paper register \
  --canonical manuscript.md \
  --source-format pdf \
  --source-root writeup/paper.pdf

# Add section boundaries
katz paper add-sections --sections '[{"id":"intro","title":"Introduction","byte_start":0,"byte_end":5000}]'

# Populate the spotter catalog with defaults
katz spotter init-catalog --preset default

# Browse and enable spotters for this review
katz spotter catalog
katz spotter enable overclaiming
katz spotter enable logical_gaps
katz spotter enable unclear_writing

# File an issue
katz issue write \
  --title "Unsupported causal claim" \
  --byte-start 3200 --byte-end 3280 \
  --body "Causal language used but only correlational evidence presented." \
  --spotter overclaiming

# Investigate and update
katz issue investigate --id <id> --verdict confirmed --notes "Checked results section, no causal identification."
katz issue update --id <id> --state confirmed --reason "Investigation confirmed"

Storage layout

.katz/
  ACTIVE_VERSION                    active commit SHA
  spotters/                         catalog (available to all versions)
    overclaiming.md
    logical_gaps.md
    identification_threats.md
    ...
  versions/
    {commit}/
      version.json                  registration metadata
      paper/
        manuscript.md               canonical one-sentence-per-line markdown
      paper_map.jsonl               section + sentence index
      symbol_table.json             notation definitions (written by agent)
      spotters/
        overclaiming.md             issue spotter definitions
        logical_gaps.md
      issues/
        {id}/
          issue.json                immutable original record
          status/
            20260411T145504_534957.json   state changes (append-only)
            20260411T145505_155781.json
          investigations/
            20260411T145504_844804.json   investigation records (append-only)
      chunks/
        {id}.json                   chunk definitions (written by agent)

Design principles

  • Append-only: Status changes and investigations are never overwritten; each one is written as a new file. The current state is the latest file in status/, so the full history is always preserved.
  • Git-native: Every version is keyed to a full git commit SHA.
  • Byte-anchored: Every finding references source text via half-open byte ranges [byte_start, byte_end) into the canonical manuscript.
  • Agent-first: All commands output JSON.
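
The append-only rule can be illustrated with a minimal sketch. It assumes each status record is a JSON object with "state" and "reason" keys (mirroring the status_history entries shown under katz issue show) and that the timestamped filenames sort chronologically as plain strings. The helper names append_status and current_state are hypothetical and are not part of katz.

```python
import json
import tempfile
from pathlib import Path


def append_status(status_dir: Path, stamp: str, state: str, reason: str) -> Path:
    """Write one new status record; refuse to touch an existing file."""
    path = status_dir / f"{stamp}.json"
    # Mode "x" raises FileExistsError if the file already exists,
    # so a prior record can never be overwritten.
    with path.open("x", encoding="utf-8") as fh:
        json.dump({"state": state, "reason": reason}, fh)
    return path


def current_state(status_dir: Path) -> dict:
    """Return the parsed contents of the latest status record."""
    # Names like 20260411T145504_534957.json sort chronologically
    # when compared as strings, so the last sorted entry is the newest.
    records = sorted(status_dir.glob("*.json"))
    if not records:
        raise FileNotFoundError(f"no status records in {status_dir}")
    return json.loads(records[-1].read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        status = Path(tmp)
        append_status(status, "20260411T145504_534957", "draft", "created")
        append_status(status, "20260411T145505_155781", "open", "triaged")
        print(current_state(status)["state"])  # open
```

Because nothing is rewritten in place, reading the directory in sorted order also yields the full state history.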

CLI reference

katz init

katz init

Initialize .katz/ in the current git repository.

katz paper

katz paper register --canonical <path> [--source-format pdf] [--source-root writeup/paper.pdf]

Register a canonical manuscript for the current HEAD commit. Auto-generates sentence segmentation.

katz paper status

Show paper metadata: commit, sections, sentences, validity.

katz paper add-sections --sections '<json-array>'

Append section boundary records to the paper map.
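
One way an agent might build the --sections payload is sketched below. It assumes the canonical manuscript is UTF-8 and that each section starts at a heading line that appears verbatim in the text; both are assumptions, and section_records is a hypothetical helper, not a katz function. The record keys (id, title, byte_start, byte_end) are the ones used in the Quick start example.

```python
import json


def section_records(manuscript: str, headings: list[tuple[str, str]]) -> list[dict]:
    """Build section records with half-open byte ranges.

    headings is a list of (id, heading_line) pairs in document order.
    Each section runs from its heading to the next heading (or end of file).
    """
    data = manuscript.encode("utf-8")
    starts = []
    cursor = 0
    for _sec_id, heading in headings:
        pos = data.find(heading.encode("utf-8"), cursor)
        if pos < 0:
            raise ValueError(f"heading not found: {heading!r}")
        starts.append(pos)
        cursor = pos + 1  # search for the next heading after this one
    ends = starts[1:] + [len(data)]
    return [
        {
            "id": sec_id,
            "title": heading.lstrip("# ").strip(),
            "byte_start": start,
            "byte_end": end,
        }
        for (sec_id, heading), start, end in zip(headings, starts, ends)
    ]


if __name__ == "__main__":
    text = "# Introduction\nCafé data.\n# Methods\nWe ran it.\n"
    records = section_records(text, [("intro", "# Introduction"), ("methods", "# Methods")])
    # The JSON string below is what would be passed to --sections.
    print(json.dumps(records))
```

Offsets are computed on the encoded bytes, not on characters, so non-ASCII text such as "é" shifts later sections by the extra bytes it occupies.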

katz paper section <id>

Show one section's byte range, line range, and title.

katz paper sentences [--section <id>] [--from-line N] [--to-line N]

Return the sentence index, optionally filtered.

katz paper resolve <byte-start> <byte-end>

Resolve a byte range into text, line numbers, and section.
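
To make the half-open byte range convention concrete, here is a rough sketch of what resolving a range could look like. It assumes UTF-8 encoding and 1-based line numbers; the function name resolve and the returned keys other than byte_start and byte_end are illustrative and may not match the real command's output.

```python
def resolve(manuscript: str, byte_start: int, byte_end: int) -> dict:
    """Resolve the half-open range [byte_start, byte_end) to text and lines."""
    data = manuscript.encode("utf-8")
    if not 0 <= byte_start <= byte_end <= len(data):
        raise ValueError("byte range out of bounds")
    # Slice bytes, then decode. A range that cuts a multi-byte character
    # in half raises UnicodeDecodeError here.
    text = data[byte_start:byte_end].decode("utf-8")
    # Line number = newlines before the byte, plus one (1-based, assumed).
    line_start = data.count(b"\n", 0, byte_start) + 1
    last_byte = max(byte_start, byte_end - 1)
    line_end = data.count(b"\n", 0, last_byte) + 1
    return {
        "byte_start": byte_start,
        "byte_end": byte_end,
        "text": text,
        "line_start": line_start,
        "line_end": line_end,
    }


if __name__ == "__main__":
    manuscript = "Café is open.\nThis proves that X.\n"
    print(resolve(manuscript, 15, 31))
```

Slicing the Python string directly (manuscript[15:31]) would give a different result here, because "é" is one character but two bytes in UTF-8.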

katz paper find <text> [--ignore-case] [--limit N]

Find text in the canonical manuscript. Returns byte offsets.
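
A minimal sketch of a text search that reports byte offsets follows. It assumes UTF-8 and literal (non-regex) matching; find_byte_offsets is a hypothetical name, and the real command's matching rules and output shape may differ.

```python
import re


def find_byte_offsets(manuscript, needle, ignore_case=False, limit=None):
    """Return half-open (byte_start, byte_end) pairs for each literal match."""
    if not needle:
        raise ValueError("needle must be non-empty")
    flags = re.IGNORECASE if ignore_case else 0
    hits = []
    for match in re.finditer(re.escape(needle), manuscript, flags):
        if limit is not None and len(hits) >= limit:
            break
        # Convert the character offset to a byte offset by encoding the prefix.
        start = len(manuscript[: match.start()].encode("utf-8"))
        end = start + len(match.group(0).encode("utf-8"))
        hits.append((start, end))
    return hits


if __name__ == "__main__":
    text = "Café proves it. This PROVES that.\n"
    print(find_byte_offsets(text, "proves", ignore_case=True))
```

The search runs on characters and converts each hit afterwards, which keeps case-insensitive matching simple while still returning byte offsets usable with --byte-start and --byte-end.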

katz spotter

Spotters define what to look for during review. There are two layers:

  • Catalog (.katz/spotters/) — all available spotters, shared across versions
  • Active (.katz/versions/<commit>/spotters/) — spotters enabled for a specific review

Catalog management

katz spotter init-catalog [--preset default]

Populate the catalog with default spotters. The default preset includes 13 spotters covering overclaiming, logical gaps, statistical errors, methodology, writing clarity, identification threats, and more.

katz spotter catalog [--scope section|holistic]

List all available spotters in the catalog.

katz spotter catalog-show <name>

Show a catalog spotter's full description and investigation instructions.

Enabling spotters for a review

katz spotter enable <name>

Copy a spotter from the catalog into the active version. This command is idempotent; if the spotter is already enabled, it returns success with "already_enabled": true.

katz spotter add --name "prompt_sensitivity" \
  --scope section \
  --description "Check whether results depend on specific prompt wording." \
  --investigation "Check if alternative prompts are tested."

Add a custom paper-specific spotter to the catalog and auto-enable it for the active version.

katz spotter add --file my_spotter.md [--name custom_slug]

Add from a markdown file. Validates frontmatter (scope, heading).

katz spotter list [--scope section|holistic]

List spotters enabled for the active version.

katz spotter show <name>

Show an enabled spotter's parsed content (scope, description, investigation instructions).

katz spotter remove <name>

Remove a spotter from the active version.

katz issue

Issues use a directory-per-issue layout with append-only subdirectories for status changes and investigations.

Valid states

draft | open | confirmed | rejected | resolved | wontfix

Write

katz issue write \
  --title "Short description" \
  --byte-start 3200 --byte-end 3280 \
  --body "Explanation of the issue." \
  [--spotter overclaiming] \
  [--state draft] \
  [--meta '{"severity": "major"}']

Creates issues/<id>/issue.json (immutable) and an initial status record in issues/<id>/status/. The --spotter flag validates that the named spotter is registered.

Update state

katz issue update --id <id> --state confirmed [--reason "Investigation confirmed"]

Appends a new file to issues/<id>/status/. Never overwrites prior state. The --id value may be a full id or an unambiguous prefix.
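
Prefix matching of ids can be pictured as follows. This is a sketch of the general technique, not katz's actual lookup code; resolve_issue_id and the second id in the demo are made up for illustration.

```python
def resolve_issue_id(prefix: str, known_ids: list[str]) -> str:
    """Return the single id matching prefix, or raise if none or several match."""
    if prefix in known_ids:
        return prefix  # an exact full id always wins
    matches = [issue_id for issue_id in known_ids if issue_id.startswith(prefix)]
    if not matches:
        raise KeyError(f"no issue id starts with {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"ambiguous prefix {prefix!r}: {len(matches)} matches")
    return matches[0]


if __name__ == "__main__":
    ids = [
        "4a4ea277e3f84cd1895e787179e7dd72",
        "4a4f0000000000000000000000000000",  # hypothetical second issue
    ]
    print(resolve_issue_id("4a4e", ids))
```

A prefix shared by both ids, such as "4a4", would be rejected as ambiguous rather than silently picking one.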

Investigate

katz issue investigate \
  --id <id> \
  --verdict confirmed \
  [--notes "Checked the LaTeX source, claim is not supported."] \
  [--evidence '["line 45: no causal design", "appendix omits robustness check"]']

Appends a new file to issues/<id>/investigations/. Verdicts: confirmed, rejected, uncertain.

Show

katz issue show <id>
katz issue show --ids <id1,id2,...>

Returns the full issue record with current state (derived from latest status file), status_history (all state changes), and investigations (all investigation records). IDs may be full ids or unambiguous prefixes. With --ids, returns a list of full issue records.

{
  "schema_version": 2,
  "id": "4a4ea277e3f84cd1895e787179e7dd72",
  "commit": "6f1dba8d...",
  "title": "Unsupported causal claim",
  "body": "...",
  "spotter": "overclaiming",
  "location": {
    "byte_start": 3200,
    "byte_end": 3280,
    "line_start": 25,
    "line_end": 25,
    "resolved_text": "This proves that...",
    "contains_math": false
  },
  "created_at": "2026-04-11T14:50:39Z",
  "meta": {},
  "state": "confirmed",
  "status_history": [
    {"state": "draft", "reason": "created", "timestamp": "2026-04-11T14:50:39Z"},
    {"state": "open", "reason": "triaged", "timestamp": "2026-04-11T14:50:50Z"},
    {"state": "confirmed", "reason": "investigation confirmed", "timestamp": "2026-04-11T14:51:02Z"}
  ],
  "investigations": [
    {"verdict": "confirmed", "timestamp": "2026-04-11T14:50:55Z", "notes": "..."}
  ]
}
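
An agent consuming this record might reduce it to a compact summary, as in the sketch below. It relies only on the field names in the example above and assumes status_history is ordered oldest first, as shown there; summarize is a hypothetical helper.

```python
from collections import Counter


def summarize(record: dict) -> dict:
    """Condense a full issue record into its state path and verdict counts."""
    history = record.get("status_history", [])
    latest = history[-1]["state"] if history else None
    verdicts = Counter(inv["verdict"] for inv in record.get("investigations", []))
    return {
        "id": record["id"],
        "state": record["state"],
        # The top-level state should agree with the newest history entry.
        "state_matches_history": latest == record["state"],
        "transitions": [entry["state"] for entry in history],
        "verdicts": dict(verdicts),
    }


if __name__ == "__main__":
    record = {
        "id": "4a4ea277e3f84cd1895e787179e7dd72",
        "state": "confirmed",
        "status_history": [
            {"state": "draft", "reason": "created"},
            {"state": "open", "reason": "triaged"},
            {"state": "confirmed", "reason": "investigation confirmed"},
        ],
        "investigations": [{"verdict": "confirmed", "notes": "..."}],
    }
    print(summarize(record))
```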

List

katz issue list [--state confirmed] [--section intro] [--spotter overclaiming] [--meta severity=major]

Returns issue summaries with current state, spotter, section, and meta. All filters are optional and combinable.

katz report

katz report generate [--output .katz/review.html]

Generate the HTML review report from the active katz version. The command returns JSON with the output path and artifact counts.

katz validate

katz validate [--commit <sha>]

Validates the version's structure: manuscript checksum, issue directories, status files, investigation files, and chunk records. Returns {"valid": true/false, ...}.
