
AcidicSoil/DSPyTeach

dspyteach – DSPy File Teaching Analyzer



What it does

dspyteach is a Python CLI that analyzes one or more files and generates either:

  • a teaching brief (--mode teach, default), or
  • a refactor prompt template (--mode refactor)

It supports:

  • single files and recursive directory scans
  • repeated include globs (-g/--glob)
  • directory exclusions (-ed/--exclude-dirs)
  • local providers such as Ollama and LM Studio
  • OpenAI-compatible hosted endpoints
  • mirrored output directory layouts when --output-dir is set

The package centers on:

  • dspy_file/analyze_file_cli.py – CLI entrypoint and provider setup
  • dspy_file/file_analyzer.py – teaching pipeline
  • dspy_file/refactor_analyzer.py – refactor template pipeline
  • dspy_file/file_helpers.py – file discovery, ignore rules, rendering
  • dspy_file/prompts/ – bundled refactor prompt templates

Requirements

  • Python >=3.10,<3.12 (from pyproject.toml)
  • a supported model backend:
    • Ollama
    • LM Studio
    • OpenAI-compatible API
  • uv recommended for local development
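A quick way to confirm your interpreter falls inside the documented range (a plain shell one-liner, not part of the CLI):

```shell
# Check that python3 satisfies the documented range (>=3.10,<3.12):
python3 -c 'import sys; ok = (3, 10) <= sys.version_info[:2] < (3, 12); print("ok" if ok else "unsupported")'
```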

Install

Local development

uv venv -p 3.11
source .venv/bin/activate
uv sync

From PyPI

pip install dspyteach

Smoke check:

dspyteach --help

Provider setup

Ollama

The default provider is Ollama, so no provider flag is required:

dspyteach path/to/file.md

The current default Ollama base URL is:

http://localhost:11434

LM Studio

LM Studio uses the OpenAI-compatible server at:

http://localhost:1234/v1

Example:

dspyteach path/to/project \
  --provider lmstudio \
  --model qwen_qwen3-4b-instruct-2507 \
  --api-base http://localhost:1234/v1

Hosted OpenAI-compatible provider

dspyteach path/to/project \
  --provider openai \
  --model gpt-5 \
  --api-base https://your-endpoint.example/v1 \
  --api-key YOUR_KEY

Environment variables

The CLI loads .env automatically via python-dotenv.

Common settings:

DSPYTEACH_PROVIDER=ollama
DSPYTEACH_MODEL=hf.co/Mungert/osmosis-mcp-4b-GGUF:Q5_K_M
#DSPYTEACH_API_BASE=http://localhost:1234/v1
#DSPYTEACH_API_KEY=lm-studio
#OPENAI_API_KEY=
#DSPYTEACH_LOG_PATH=.dspyteach/logs/custom.log
#DSPYTEACH_MAX_TOKENS=4000
#DSPYTEACH_LMSTUDIO_TIMEOUT_SECONDS=60

Notes:

  • DSPYTEACH_API_KEY falls back to OPENAI_API_KEY when DSPYTEACH_PROVIDER=openai
  • DSPYTEACH_LOG_PATH controls the runtime log file destination
  • DSPYTEACH_MAX_TOKENS controls the root LM token budget used during DSPy configuration
  • DSPYTEACH_LMSTUDIO_TIMEOUT_SECONDS controls the LM Studio HTTP request timeout; set it to 0, off, none, or disabled to remove the timeout entirely
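For example, removing the LM Studio timeout for every run takes one .env entry, using one of the accepted disable values listed above:

```shell
# .env — disable the LM Studio HTTP request timeout entirely
DSPYTEACH_LMSTUDIO_TIMEOUT_SECONDS=off
```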


Usage

Analyze a single file

dspyteach docs/example.md

Analyze a directory recursively

dspyteach ./repo -g '**/*.py' -g '**/*.md'

Write outputs to a separate mirrored tree

dspyteach ./repo \
  -g '**/*.md' \
  -o ./out

When --output-dir is set, the CLI mirrors the original relative directory layout inside the output directory and keeps the original filename.

Skip directories while scanning

dspyteach ./repo \
  -g '**/*README*' \
  -g '**/package.json' \
  -g '**/pyproject.toml' \
  -ed '.git,node_modules,.venv,dist,build' \
  -o ./out

Confirm each file before analysis

dspyteach ./repo -g '**/*.md' --confirm-each

Print raw predictions

dspyteach docs/example.md --raw

Globs

Include globs are relative to the path you pass to dspyteach.

Good:

dspyteach ../../../ai-apps \
  -g '**/README.md' \
  -g '**/package.json' \
  -g '**/pyproject.toml' \
  -o ../../../ai-apps/.readMes/.out

Not recommended:

# full absolute paths inside --glob are not needed
-g '~/projects/temp/ai-apps/**/README.md'

Repeat -g once per pattern.
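As an illustration of root-relative matching, a pattern like '**/*.md' behaves roughly like a find rooted at the scan path (a sketch of the matching semantics, not the CLI's actual implementation):

```shell
# Roughly what "-g '**/*.md'" matches when scanning ./repo (illustrative only):
root="./repo"
find "$root" -type f -name '*.md'
```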


Modes

Teach mode

Default mode. Generates a teaching-oriented markdown brief.

dspyteach path/to/file.md --mode teach

Refactor mode

Generates a refactor-oriented prompt template.

dspyteach path/to/file.md --mode refactor

Refactor mode supports -p/--prompt:

dspyteach ./repo --mode refactor --prompt refactor_prompt_template

If multiple bundled templates are available and you do not pass --prompt, the CLI will prompt you to choose one.


Output behavior

Current behavior:

  • if --output-dir is omitted, the CLI writes output under dspy_file/data/
  • if --output-dir is provided, the CLI writes into that directory and mirrors the source tree
  • teaching mode appends .teaching.md; refactor mode appends .refactor.md
  • --in-place requests source replacement, but real overwrites require per-file confirmation
  • if --in-place is combined with --output-dir, the CLI writes the original filename into that directory instead of overwriting the source

This keeps the original source files unchanged unless you explicitly approve each overwrite.
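The naming rules above can be sketched in plain shell (an illustration of the documented behavior, not the CLI's code; the paths are hypothetical):

```shell
# Given a scan root and a matched file, derive the mirrored output path:
root="repo"
src="repo/docs/guide.md"
out="out"
rel="${src#$root/}"            # relative path: docs/guide.md
dest="$out/$rel.teaching.md"   # teach mode appends .teaching.md
echo "$dest"                   # → out/docs/guide.md.teaching.md
```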

Example:

dspyteach ../../../ai-apps \
  -m teach \
  -g '**/*README*' \
  -o ../../../ai-apps/.readMes/.out

Logging

Runtime logging is configured once at the CLI boundary.

Default log path pattern:

.dspyteach/logs/run-YYYYMMDD-HHMMSS-<pid>.log

Each CLI run gets its own log file by default.

Override it with:

DSPYTEACH_LOG_PATH=/absolute/path/to/dspyteach.log

The logger uses a rotating file handler.
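The default per-run naming can be sketched as follows (an illustration of the documented pattern, not the CLI's code):

```shell
# Build a log path following the run-YYYYMMDD-HHMMSS-<pid>.log pattern:
log=".dspyteach/logs/run-$(date +%Y%m%d-%H%M%S)-$$.log"
echo "$log"
```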

Run state, resume, and cleanup

Batch runs persist resumable state under:

.dspyteach/runs/<run_id>/

A saved run includes:

  • manifest.json with run-level settings and status
  • per-file checkpoint JSON files under files/
  • stage data for teaching-mode resume, so cancelled files can continue from the last completed stage

By default, a fresh run gets a generated run id. You can also name one explicitly:

dspyteach ./repo -g '**/*.py' --run-id docs-pass-1

Resume a saved run:

dspyteach ./repo -g '**/*.py' --resume docs-pass-1

Resume validation is intentionally strict. The CLI expects the resumed run to match the original path, mode, provider/model settings, and scan filters.

Inspect saved runs

List saved runs:

dspyteach --list-runs

Show one run and its per-file checkpoint stages:

dspyteach --show-run docs-pass-1

Machine-readable output is also available:

dspyteach --list-runs --json
dspyteach --show-run docs-pass-1 --json

Delete or prune saved runs

Delete a single saved run:

dspyteach --delete-run docs-pass-1

Preview which runs would be pruned without deleting them:

dspyteach --prune-runs --prune-status failed --dry-run

Delete completed runs older than 14 days:

dspyteach --prune-runs --prune-status completed --prune-older-than-days 14

Preview prune results as JSON:

dspyteach --prune-runs \
  --prune-status failed,completed_with_errors \
  --prune-older-than-days 7 \
  --dry-run \
  --json

Pruning requires at least one filter:

  • --prune-status
  • --prune-older-than-days

This is intentional so pruning cannot accidentally behave like a delete-all command.


Local model cleanup

Unless --keep-provider-alive is set, the CLI attempts to free local resources after the run:

  • Ollama – stops the active model
  • LM Studio – unloads matching loaded instances

To keep the model loaded instead, pass --keep-provider-alive:

dspyteach ./repo --provider lmstudio --keep-provider-alive

Troubleshooting

  • If Ollama cannot be reached, verify it is running on http://localhost:11434
  • If LM Studio cannot be reached, verify the local server is running on http://localhost:1234/v1
  • If you are scanning large trees, prefer --output-dir plus --exclude-dirs
  • If one file fails during a batch, the CLI logs the exception and continues with the next file
  • For debugging LM Studio integration, capture console output and inspect the runtime log file

Example verbose capture:

{ dspyteach ./repo -g '**/*.md'; } |& tee dspyteach.$(date +%Y%m%d-%H%M%S).log

Releasing

Maintainer release steps are documented in the repository.

Tag workflow helper:

./scripts/tag-release.sh

Pushing a tag matching v* triggers:

  • .github/workflows/release.yml

Development notes

Targeted test runs used frequently in this repo:

uv run pytest -q

Focused examples:

uv run pytest -q tests/test_cli_connectivity.py
uv run pytest -q tests/test_file_helpers.py
uv run pytest -q tests/test_lmstudio_structured.py

Example data

Sample generated outputs live under:
