LM Studio llms.txt Generator

Generate llms.txt, llms-full, and fallback artifacts for GitHub repositories using DSPy with LM Studio.


Overview

Use this CLI-first toolkit to produce LLM-friendly documentation bundles (llms.txt, llms-full.txt, optional llms-ctx.txt, and fallback JSON) for any GitHub repository. The generator wraps DSPy analyzers, manages LM Studio model lifecycle with the official Python SDK, and guarantees output even when the primary language model cannot respond.

Note

The pipeline validates curated links, detects default branches automatically, and writes artifacts to artifacts/<owner>/<repo>/.

Prerequisites

  • Python 3.10 or later
  • LM Studio server running locally, started from the Developer tab (Start Server) or via the CLI (lms server start --port 1234)
  • GitHub API token in GITHUB_ACCESS_TOKEN or GH_TOKEN
  • Optional: the llms_txt package when you want to produce llms-ctx.txt

Warning

Install dependencies inside a virtual environment to avoid PEP 668 “externally managed environment” errors.

Install

Create a virtual environment

python3 -m venv .venv
source .venv/bin/activate

Using uv:

uv venv
source .venv/bin/activate

Install the package with developer extras

pip install -e '.[dev]'

Using uv:

uv pip install -e '.[dev]'

Installing the editable package exposes the lmstxt CLI and the lmstxt-mcp server.

Tip

Keep the virtual environment active while running the CLI or tests so the SDK-based unload logic can import lmstudio.
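
A quick way to confirm the import works inside the active environment:

# If this raises ImportError, the SDK-based unload path cannot run
# and the pipeline falls back to HTTP and CLI unloads.
import lmstudio
print(lmstudio.__file__)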

Configure LM Studio

Load the CLI and start the server

npx lmstudio install-cli
lms server start --port 1234

The server must expose an OpenAI-compatible endpoint, commonly http://localhost:1234/v1.
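
To confirm the endpoint is reachable, here is a minimal check using the Python standard library. It assumes the default port and the /v1/models route, which is part of the OpenAI-compatible surface:

import json
from urllib.request import urlopen

# List the models the server currently exposes.
with urlopen("http://localhost:1234/v1/models") as resp:
    payload = json.load(resp)

for model in payload.get("data", []):
    print(model["id"])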

Ensure the target model is downloaded

Open LM Studio, download the model (for example qwen/qwen3-4b-2507), and confirm it appears in the Server tab.

Quick start

Run the CLI against any GitHub repository:

lmstxt https://github.com/owner/repo \
  --model qwen/qwen3-4b-2507 \
  --api-base http://localhost:1234/v1 \
  --stamp

The command writes artifacts to artifacts/owner/repo/. Use --output-dir to override the destination.

Private GitHub repositories

To run against a private repository you own, authenticate the GitHub API calls. The CLI reads a token from GITHUB_ACCESS_TOKEN or GH_TOKEN and sends it as a Bearer token; a minimal request sketch follows the token options below.

Token options:

  • Classic PAT: grant at least repo scope for private repo read access.
  • Fine-grained PAT: grant read access to Contents and Metadata for the target repo.
  • Org SSO: if your org enforces SSO, you must authorize the token in the GitHub SSO UI.
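
For reference, the authenticated call the CLI makes is equivalent to this minimal Python sketch; the repo slug is a placeholder:

import os
from urllib.request import Request, urlopen

# The token is sent as a Bearer token, matching the CLI's behavior.
token = os.environ.get("GITHUB_ACCESS_TOKEN") or os.environ["GH_TOKEN"]
req = Request(
    "https://api.github.com/repos/OWNER/REPO",  # replace with your repo
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    },
)
with urlopen(req) as resp:
    print(resp.status)  # 200 means the token can read the repository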

Environment setup examples:

export GITHUB_ACCESS_TOKEN="ghp_..."
# or
export GH_TOKEN="ghp_..."

You can also use a .env file; the CLI calls load_dotenv() on startup and reads the token into AppConfig.github_token.

GITHUB_ACCESS_TOKEN=ghp_...
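
To confirm the token is visible the same way the CLI sees it, a small check using python-dotenv (the package behind the load_dotenv() call mentioned above):

import os
from dotenv import load_dotenv

load_dotenv()  # the same call the CLI makes on startup
print(bool(os.getenv("GITHUB_ACCESS_TOKEN") or os.getenv("GH_TOKEN")))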

Run the CLI normally once the token is available:

lmstxt https://github.com/<owner>/<repo>

If you run the MCP server, the same env vars must be present in the process environment that launches lmstxt-mcp (for example, in your MCP client config).

Environment variables

Variable Description
LMSTUDIO_MODEL Default LM Studio model identifier
LMSTUDIO_BASE_URL Base URL such as http://localhost:1234/v1
LMSTUDIO_API_KEY API key for secured LM Studio deployments
OUTPUT_DIR Custom root directory for artifacts
ENABLE_CTX Set to 1 to emit llms-ctx.txt using the optional llms_txt package

Generated artifacts

Artifact Purpose
*-llms.txt Primary documentation synthesized by DSPy or the fallback heuristic
*-llms-full.txt Expanded content fetched from curated GitHub links with 404 filtering
*-llms.json Fallback JSON following LLMS_JSON_SCHEMA (only when LM fallback triggers)
*-llms-ctx.txt Optional context file created when ENABLE_CTX=1 and llms_txt is installed

Important

The pipeline always writes llms.txt and llms-full.txt, even when the language model call fails.
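
The fallback JSON can be inspected without knowing its schema; this sketch assumes only the *-llms.json naming pattern above (the path is hypothetical) and leaves the actual structure to LLMS_JSON_SCHEMA:

import json
from pathlib import Path

# Hypothetical artifact path following the naming pattern above.
data = json.loads(Path("artifacts/owner/repo/repo-llms.json").read_text())
print(sorted(data))  # top-level keys defined by LLMS_JSON_SCHEMA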

Model Context Protocol (MCP) Server

This package includes a FastMCP server that exposes the generator as an MCP tool and provides access to generated artifacts as resources.

Features

  • Asynchronous Processing: Tool calls return a run_id immediately while generation happens in the background.
  • Tools:
    • lmstxt_generate_llms_txt: Trigger llms.txt generation.
    • lmstxt_generate_llms_full: Generate llms-full.txt from an existing run.
    • lmstxt_generate_llms_ctx: Generate llms-ctx.txt (requires llms_txt).
    • lmstxt_list_runs: View recent generation history and status.
    • lmstxt_read_artifact: Read generated files with pagination support.
    • lmstxt_list_all_artifacts: List all persistent .txt artifacts on disk.
  • Resources:
    • Run-specific: lmstxt://runs/{run_id}/{artifact_name}
    • Persistent Directory: lmstxt://artifacts/{filename} (e.g., lmstxt://artifacts/owner/repo/repo-llms.txt)

Running the Server

# Default stdio transport
lmstxt-mcp
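
For a programmatic smoke test, here is a sketch using the reference mcp Python SDK (a separate dependency, not part of this package) to launch the server over stdio and list its tools:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Spawn lmstxt-mcp as a stdio subprocess, handshake, and list tools.
    params = StdioServerParameters(command="lmstxt-mcp")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())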

Client Configuration

Add to your MCP client config (e.g., claude_desktop_config.json or config.toml):

Claude Desktop / Cursor

{
  "mcpServers": {
    "lmstxt": {
      "command": "lmstxt-mcp",
      "env": {
        "GITHUB_ACCESS_TOKEN": "your_token",
        "LMSTUDIO_BASE_URL": "http://localhost:1234/v1"
      }
    }
  }
}

Codex / CLI (toml)

[mcp_servers.lmstxt]
command = "lmstxt-mcp"
startup_timeout_sec = 30
tool_timeout_sec = 30

[mcp_servers.lmstxt.env]
GITHUB_ACCESS_TOKEN = "your_token"
LMSTUDIO_BASE_URL = "http://localhost:1234/v1"

How it works

  1. Collect repository material – the GitHub client gathers the file tree, README, package files, repository visibility, and default branch.
  2. Prepare LM Studio – the manager confirms the requested model is loaded, auto-loading if necessary.
  3. Generate documentation – DSPy produces curated content; on LM failures the fallback serializer builds markdown and JSON directly.
  4. Assemble llms-full – curated links are re-fetched via raw GitHub URLs for public repos or authenticated API calls for private ones, with validation to remove dead links.
  5. Unload models safely – the workflow first uses the official lmstudio SDK (model.unload() or list_loaded_models), then falls back to HTTP and CLI unload requests.
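
A minimal sketch of the SDK path in step 5, using the calls named above from the official lmstudio package:

import lmstudio as lms

# Unload every model the server currently holds in memory; the pipeline
# falls back to HTTP and CLI unload requests if this path fails.
for model in lms.list_loaded_models():
    model.unload()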

Project layout

  • src/lms_llmsTxt/ – core generation library, DSPy analyzers, and LM Studio helpers.
  • src/lms_llmsTxt_mcp/ – MCP server implementation, asynchronous worker, and resource providers.
  • tests/ – pytest coverage for the generator pipeline and MCP server.
  • artifacts/ – sample outputs from previous runs.

Verify your setup

source .venv/bin/activate
python -m pytest

Using uv:

source .venv/bin/activate
uv run pytest

All tests should pass, confirming URL validation, fallback handling, and MCP resource exposure.

Build & verify a local package

Use this flow to rebuild locally and verify the installed CLI before tagging and publishing.

python3 -m pip install -U pip
python3 -m pip install build
python3 -m build
python3 -m pip install --force-reinstall dist/*.whl
lmstxt --help

Using uv:

uv pip install -U pip
uv pip install build
uv build
uv pip install --force-reinstall dist/*.whl
lmstxt --help

Note: this project uses setuptools_scm, so the version comes from git tags. Until you create a tag, the reported version carries a development suffix (post-release plus date).
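
To preview the version setuptools_scm will derive from the current checkout, a quick check run from the repo root (requires the setuptools_scm package):

from setuptools_scm import get_version

# Prints e.g. "0.4.0" on a tagged commit, or a dev/post suffix otherwise.
print(get_version())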

Troubleshooting

Warning

If pip install -e '.[dev]' fails with build tool errors, ensure cmake and the necessary compilers are installed.

Using uv:

uv pip install -e '.[dev]'

Tip

If the MCP server times out during generation, check lmstxt_list_runs to see if the background task is still processing. The lmstxt_generate_* tools return immediately to avoid client timeouts.

MCP Inspector

npx @modelcontextprotocol/inspector --config ./inspector.config.json --server lmstxt

Use the payloads in docs/mcp-inspector-payloads.md to verify specific tool behaviors.
