
AcidicSoil/DSPyTeach

dspyteach – DSPy File Teaching Analyzer



What it does

dspyteach is a Python CLI that analyzes one or more files and generates either:

  • a teaching brief (--mode teach, default), or
  • a refactor prompt template (--mode refactor)

It supports:

  • single files and recursive directory scans
  • repeated include globs (-g/--glob)
  • directory exclusions (-ed/--exclude-dirs)
  • local providers such as Ollama and LM Studio
  • OpenAI-compatible hosted endpoints
  • mirrored output directory layouts when --output-dir is set

The package centers on:

  • dspy_file/analyze_file_cli.py – CLI entrypoint and provider setup
  • dspy_file/file_analyzer.py – teaching pipeline
  • dspy_file/refactor_analyzer.py – refactor template pipeline
  • dspy_file/file_helpers.py – file discovery, ignore rules, rendering
  • dspy_file/prompts/ – bundled refactor prompt templates

Requirements

  • Python >=3.10,<3.12 (from pyproject.toml)
  • a supported model backend:
    • Ollama
    • LM Studio
    • OpenAI-compatible API
  • uv recommended for local development
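A quick way to confirm your interpreter falls inside the documented range (a plain shell one-liner, not part of the CLI):

```shell
# Check that python3 satisfies the documented range (>=3.10,<3.12):
python3 -c 'import sys; ok = (3, 10) <= sys.version_info[:2] < (3, 12); print("ok" if ok else "unsupported")'
```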

Install

Local development

uv venv -p 3.11
source .venv/bin/activate
uv sync

From PyPI

pip install dspyteach

Smoke check:

dspyteach --help

Provider setup

Ollama

The default provider is Ollama, so no provider flag is required:

dspyteach path/to/file.md

The current default Ollama base URL is:

http://localhost:11434

LM Studio

LM Studio uses the OpenAI-compatible server at:

http://localhost:1234/v1

Example:

dspyteach path/to/project \
  --provider lmstudio \
  --model qwen_qwen3-4b-instruct-2507 \
  --api-base http://localhost:1234/v1

Hosted OpenAI-compatible provider

dspyteach path/to/project \
  --provider openai \
  --model gpt-5 \
  --api-base https://your-endpoint.example/v1 \
  --api-key YOUR_KEY

Environment variables

The CLI loads .env automatically via python-dotenv.

Common settings:

DSPYTEACH_PROVIDER=ollama
DSPYTEACH_MODEL=hf.co/Mungert/osmosis-mcp-4b-GGUF:Q5_K_M
#DSPYTEACH_API_BASE=http://localhost:1234/v1
#DSPYTEACH_API_KEY=lm-studio
#OPENAI_API_KEY=
#DSPYTEACH_LOG_PATH=.dspyteach/logs/custom.log
#DSPYTEACH_MAX_TOKENS=4000
#DSPYTEACH_LMSTUDIO_TIMEOUT_SECONDS=60

Notes:

  • DSPYTEACH_API_KEY falls back to OPENAI_API_KEY when DSPYTEACH_PROVIDER=openai
  • DSPYTEACH_LOG_PATH controls the runtime log file destination
  • DSPYTEACH_MAX_TOKENS controls the root LM token budget used during DSPy configuration
  • DSPYTEACH_LMSTUDIO_TIMEOUT_SECONDS controls the LM Studio HTTP request timeout; set it to 0, off, none, or disabled to remove the timeout entirely
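For example, removing the LM Studio timeout for every run takes one .env entry, using one of the accepted disable values listed above:

```shell
# .env — disable the LM Studio HTTP request timeout entirely
DSPYTEACH_LMSTUDIO_TIMEOUT_SECONDS=off
```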


Usage

Analyze a single file

dspyteach docs/example.md

Analyze a directory recursively

dspyteach ./repo -g '**/*.py' -g '**/*.md'

Write outputs to a separate mirrored tree

dspyteach ./repo \
  -g '**/*.md' \
  -o ./out

When --output-dir is set, the CLI mirrors the original relative directory layout inside the output directory and keeps the original filename.

Skip directories while scanning

dspyteach ./repo \
  -g '**/*README*' \
  -g '**/package.json' \
  -g '**/pyproject.toml' \
  -ed '.git,node_modules,.venv,dist,build' \
  -o ./out

Confirm each file before analysis

dspyteach ./repo -g '**/*.md' --confirm-each

Print raw predictions

dspyteach docs/example.md --raw

Globs

Include globs are relative to the path you pass to dspyteach.

Good:

dspyteach ../../../ai-apps \
  -g '**/README.md' \
  -g '**/package.json' \
  -g '**/pyproject.toml' \
  -o ../../../ai-apps/.readMes/.out

Not recommended:

# full absolute paths inside --glob are not needed
-g '~/projects/temp/ai-apps/**/README.md'

Repeat -g once per pattern.
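As an illustration of root-relative matching, a pattern like '**/*.md' behaves roughly like a find rooted at the scan path (a sketch of the matching semantics, not the CLI's actual implementation):

```shell
# Roughly what "-g '**/*.md'" matches when scanning ./repo (illustrative only):
root="./repo"
find "$root" -type f -name '*.md'
```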


Modes

Teach mode

Default mode. Generates a teaching-oriented markdown brief.

dspyteach path/to/file.md --mode teach

Refactor mode

Generates a refactor-oriented prompt template.

dspyteach path/to/file.md --mode refactor

Refactor mode supports -p/--prompt:

dspyteach ./repo --mode refactor --prompt refactor_prompt_template

If multiple bundled templates are available and you do not pass --prompt, the CLI will prompt you to choose one.


Output behavior

Current behavior:

  • if --output-dir is omitted, the CLI writes output under dspy_file/data/
  • if --output-dir is provided, the CLI writes into that directory and mirrors the source tree
  • teaching mode appends .teaching.md; refactor mode appends .refactor.md
  • --in-place requests source replacement, but real overwrites require per-file confirmation
  • if --in-place is combined with --output-dir, the CLI writes the original filename into that directory instead of overwriting the source

This keeps the original source files unchanged unless you explicitly approve each overwrite.
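The naming rules above can be sketched in plain shell (an illustration of the documented behavior, not the CLI's code; the paths are hypothetical):

```shell
# Given a scan root and a matched file, derive the mirrored output path:
root="repo"
src="repo/docs/guide.md"
out="out"
rel="${src#$root/}"            # relative path: docs/guide.md
dest="$out/$rel.teaching.md"   # teach mode appends .teaching.md
echo "$dest"                   # → out/docs/guide.md.teaching.md
```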

Example:

dspyteach ../../../ai-apps \
  -m teach \
  -g '**/*README*' \
  -o ../../../ai-apps/.readMes/.out

Logging

Runtime logging is configured once at the CLI boundary.

Default log path pattern:

.dspyteach/logs/run-YYYYMMDD-HHMMSS-<pid>.log

Each CLI run gets its own log file by default.

Override it with:

DSPYTEACH_LOG_PATH=/absolute/path/to/dspyteach.log

The logger uses a rotating file handler.
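The default per-run naming can be sketched as follows (an illustration of the documented pattern, not the CLI's code):

```shell
# Build a log path following the run-YYYYMMDD-HHMMSS-<pid>.log pattern:
log=".dspyteach/logs/run-$(date +%Y%m%d-%H%M%S)-$$.log"
echo "$log"
```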

Run state, resume, and cleanup

Batch runs persist resumable state under:

.dspyteach/runs/<run_id>/

A saved run includes:

  • manifest.json with run-level settings and status
  • per-file checkpoint JSON files under files/
  • stage data for teaching-mode resume, so cancelled files can continue from the last completed stage

By default, a fresh run gets a generated run id. You can also name one explicitly:

dspyteach ./repo -g '**/*.py' --run-id docs-pass-1

Resume a saved run:

dspyteach ./repo -g '**/*.py' --resume docs-pass-1

Resume validation is intentionally strict. The CLI expects the resumed run to match the original path, mode, provider/model settings, and scan filters.

Inspect saved runs

List saved runs:

dspyteach --list-runs

Show one run and its per-file checkpoint stages:

dspyteach --show-run docs-pass-1

Machine-readable output is also available:

dspyteach --list-runs --json
dspyteach --show-run docs-pass-1 --json

Delete or prune saved runs

Delete a single saved run:

dspyteach --delete-run docs-pass-1

Preview which runs would be pruned without deleting them:

dspyteach --prune-runs --prune-status failed --dry-run

Delete completed runs older than 14 days:

dspyteach --prune-runs --prune-status completed --prune-older-than-days 14

Preview prune results as JSON:

dspyteach --prune-runs \
  --prune-status failed,completed_with_errors \
  --prune-older-than-days 7 \
  --dry-run \
  --json

Pruning requires at least one filter:

  • --prune-status
  • --prune-older-than-days

This is intentional so pruning cannot accidentally behave like a delete-all command.


Local model cleanup

Unless --keep-provider-alive is set, the CLI attempts to free local resources after the run:

  • Ollama – stops the active model
  • LM Studio – unloads matching loaded instances

To keep the model loaded instead, pass --keep-provider-alive:

dspyteach ./repo --provider lmstudio --keep-provider-alive

Troubleshooting

  • If Ollama cannot be reached, verify it is running on http://localhost:11434
  • If LM Studio cannot be reached, verify the local server is running on http://localhost:1234/v1
  • If you are scanning large trees, prefer --output-dir plus --exclude-dirs
  • If one file fails during a batch, the CLI logs the exception and continues with the next file
  • For debugging LM Studio integration, capture console output and inspect the runtime log file

Example verbose capture:

{ dspyteach ./repo -g '**/*.md'; } |& tee dspyteach.$(date +%Y%m%d-%H%M%S).log

Releasing

Maintainer release steps are documented in the repository.

Tag workflow helper:

./scripts/tag-release.sh

Pushing a tag matching v* triggers:

  • .github/workflows/release.yml

Development notes

Targeted test runs used frequently in this repo:

uv run pytest -q

Focused examples:

uv run pytest -q tests/test_cli_connectivity.py
uv run pytest -q tests/test_file_helpers.py
uv run pytest -q tests/test_lmstudio_structured.py

Example data

Sample generated outputs live under:
