codesearch

Full-text and structural code search for a large monorepo. Runs a Typesense search server and exposes results as MCP tools so Claude can query the codebase directly without copy-pasting.

Early alpha. Expect rough edges. The only supported install path is cloning the repository and running setup.cmd.

Installation

git clone https://github.com/microsoft/tscodesearch
cd tscodesearch
setup.cmd          # Docker mode (default)
setup.cmd --wsl    # WSL mode (alternative)

setup.cmd builds the MCP server, registers it with Claude Code, creates config.json, and installs the VS Code extension. After it completes:

ts start

Then open VS Code (or reload: Ctrl+Shift+P > Reload Window) and use TsCodeSearch: Add Root to point at your source directory.

Prerequisites

Docker mode (default):

Docker Desktop installed and running
Node.js 20+

WSL mode (alternative):

Windows 11 with WSL2
Python 3.10+ available in WSL (python3 --version)
Node.js 20+

One-time setup

From a Windows command prompt or PowerShell:

setup.cmd          # Docker mode (default)
setup.cmd --wsl    # WSL mode

setup.cmd checks for Node.js then calls node setup.mjs, which:

Builds the MCP server (npm install && npm run build)
Registers the MCP server with Claude Code
Sets up the WSL environment (WSL mode only)
Creates config.json with an auto-generated API key and mode field
Installs the VS Code extension

After setup, start the service:

ts start

Then open VS Code (or reload: Ctrl+Shift+P > Reload Window) and use TsCodeSearch: Add Root to point at your source directory.

To uninstall: setup.cmd --uninstall

Adding roots

ts root --add NAME C:\path\to\source
ts restart

Or use the VS Code extension command TsCodeSearch: Add Root.

Each root gets its own Typesense collection (codesearch_NAME). Multi-root config in config.json:

{
  "api_key": "...",
  "mode": "docker",
  "roots": {
    "default": "C:/myproject/src",
    "other":   "D:/other/src"
  }
}

Use the MCP root= parameter to search a specific collection:

query_codebase("implements", "IRepository", root="other")
query_single_file("methods", file="path/to/Widget.cs", root="other")

Service management

All service commands go through ts.cmd (Windows CMD/PowerShell):

ts status                          show server health, doc count, watcher/heartbeat state
ts start                           start the service (auto-indexes if needed)
ts stop                            stop everything
ts restart                         stop then start
ts index                           re-index in background (incremental, keeps existing collection)
ts index --reset                   drop + recreate collection, then re-index
ts index --root <name>             index a specific named root
ts verify                          scan FS + repair index: add missing, re-index stale, remove orphans
ts verify --root <name>            verify a specific named root
ts verify --no-delete-orphans      repair without removing deleted-file entries
ts log                             tail the service log
ts log --indexer [-n N]            tail the indexer/verifier log (default: last 40 lines)

The same ts commands work for both Docker and WSL modes — ts.mjs reads the mode field from config.json to decide which backend to use.

Keeping the index up to date

The watcher picks up changes automatically within ~12 seconds (10 s poll interval + 2 s debounce). For large repos, or after bulk operations like a git pull or branch switch, use the MCP tools or ts verify to confirm everything is in sync.

From Claude (MCP tools)

ready()                              # poll FS + check index is in sync (synchronous, ~30 s for 200k files)
verify_index(action="proxy.php?url=https%3A%2F%2Fgithub.com%2Fstart")         # launch background repair scan
verify_index(action="proxy.php?url=https%3A%2F%2Fgithub.com%2Fstatus")        # monitor repair progress
verify_index(action="proxy.php?url=https%3A%2F%2Fgithub.com%2Fstop")          # cancel a running scan

ready() returns a summary with poll_ok (FS walk completed), index_ok (zero missing/stale/orphaned), and timing. If not ready, verify_index(action="proxy.php?url=https%3A%2F%2Fgithub.com%2Fstart") repairs the index without resetting it.

From the command line

ts verify                            # foreground repair scan (missing + stale + orphans)
ts verify --no-delete-orphans        # repair without removing deleted-file entries
ts verify --root other               # verify a specific named root

Running tests

Three modes, all via run_tests.mjs (or the run_tests.cmd wrapper):

node run_tests.mjs --docker                            # Docker mode (default)
node run_tests.mjs --wsl --destructive                 # WSL mode (erases + resets WSL index)
node run_tests.mjs --linux                             # Linux/CI

Filter by test name, class, or file:

node run_tests.mjs --wsl --destructive -k TestVerifier
node run_tests.mjs --docker tests/test_query_cs.py

The full suite is ~697 tests. Docker E2E mounts tests/ as a volume — test changes don't require rebuilding the image.

Structural query tests (no server needed)

74 tests covering all 15 query_cs modes against a synthetic C# fixture:

node run_tests.mjs --wsl --destructive tests/test_query_cs.py

File	What it tests
`test_query_cs.py`	All C# AST query modes against `tests/query_fixture.cs`
`test_indexer.py`	Indexer, semantic fields, multi-root, `extract_cs_metadata`, `index_file_list` pipeline
`test_indexer_query_consistency.py`	Cross-checks that indexer and query extract the same values from identical source
`test_watcher.py`	File watcher event handler (unit + integration)
`test_process_cs.py`	`process_file()` C# structural query API
`test_python.py`	Python metadata extraction (`extract_py_metadata`), `process_py_file()`, Python semantic fields
`test_verifier.py`	`_export_index()` (mock HTTP), `run_verify()` diff logic, full verify integration

Direct CLI usage

Full-text search (`search.py`)

# From WSL:
~/.local/indexserver-venv/bin/python search.py "Widget"
~/.local/indexserver-venv/bin/python search.py "ProcessOrder" --ext cs --sub payments
~/.local/indexserver-venv/bin/python search.py "IRepository"  --mode implements
~/.local/indexserver-venv/bin/python search.py "SaveChanges"  --mode calls
~/.local/indexserver-venv/bin/python search.py "Obsolete"     --mode attrs
~/.local/indexserver-venv/bin/python search.py "ConnectionString" --mode uses

Structural C# AST queries (`query.py`)

# Listing modes (no pattern needed)
~/.local/indexserver-venv/bin/python query.py --methods  Order.cs
~/.local/indexserver-venv/bin/python query.py --classes  Order.cs
~/.local/indexserver-venv/bin/python query.py --fields   Order.cs
~/.local/indexserver-venv/bin/python query.py --usings   Order.cs

# Pattern modes with explicit file(s) or glob
~/.local/indexserver-venv/bin/python query.py --calls     SaveChanges        "src/data/**/*.cs"
~/.local/indexserver-venv/bin/python query.py --calls     Repository.Save    "src/data/**/*.cs"
~/.local/indexserver-venv/bin/python query.py --casts     Widget             "src/**/*.cs"
~/.local/indexserver-venv/bin/python query.py --all-refs  ProcessOrder       "src/**/*.cs"
~/.local/indexserver-venv/bin/python query.py --accesses-on Widget           Order.cs
~/.local/indexserver-venv/bin/python query.py --accesses-of Status           "src/**/*.cs"
~/.local/indexserver-venv/bin/python query.py --attrs     TestMethod         "src/**/*.cs"
~/.local/indexserver-venv/bin/python query.py --declarations ProcessOrder    Order.cs
~/.local/indexserver-venv/bin/python query.py --params    SaveChanges        Order.cs

# Pattern modes with --search (Typesense finds the files automatically)
~/.local/indexserver-venv/bin/python query.py --implements IRepository       --search "IRepository"
~/.local/indexserver-venv/bin/python query.py --uses       Order             --search "Order"
~/.local/indexserver-venv/bin/python query.py --uses       ConnectionString  --uses-kind field   --search "ConnectionString"
~/.local/indexserver-venv/bin/python query.py --uses       Widget            --uses-kind param   --search "Widget"

Architecture

Two-layer search

Typesense — fast keyword/semantic search over pre-indexed metadata (class names, method names, base types, call sites, signatures, attributes, etc.). Data stored at ~/.local/typesense/ (WSL) or in a Docker volume.
tree-sitter — precise C# AST queries on the file set returned by Typesense. Skips comments and string literals, understands syntax.

Typical flow: Typesense narrows the haystack to ~50 candidate files → tree-sitter parses each one and applies the structural query.

Process topology

Docker mode (default):

┌──────────────────────────────────────────────────────────────┐
│  MCP CLIENT  (Claude ↔ tools)                                │
│  mcp_server.js  (Node.js — runs on Windows)                  │
│  Claude Code → mcp.cmd → node mcp_server.js                  │
└────────────────────────────┬─────────────────────────────────┘
                             │  HTTP  localhost:PORT+1
                             │  (indexserver management API)
┌────────────────────────────▼─────────────────────────────────┐
│  DOCKER CONTAINER                                            │
│  indexserver/api.py  (management API + thread manager)       │
│    • watcher thread    (PollingObserver)                      │
│    • heartbeat thread  (Typesense health check)              │
│    • verifier thread   (on-demand, via POST /verify)         │
│  /app/tests (volume mount)                                   │
└──────────────────────────────────────────────────────────────┘
                             │  internal
                        Typesense server
                      (Docker volume for data)

WSL mode:

┌──────────────────────────────────────────────────────────────┐
│  MCP CLIENT  (Claude ↔ tools)                                │
│  mcp_server.js  (Node.js — runs on Windows)                  │
│  Claude Code → mcp.cmd → node mcp_server.js                  │
└────────────────────────────┬─────────────────────────────────┘
                             │  HTTP  localhost:PORT+1
┌────────────────────────────▼─────────────────────────────────┐
│  INDEXSERVER  (WSL process: api.py)                          │
│    • watcher thread    (PollingObserver, /mnt/)              │
│    • heartbeat thread  (Typesense health check)              │
│    • verifier thread   (on-demand, via POST /verify)         │
│  Venv: ~/.local/indexserver-venv/                            │
└──────────────────────────────────────────────────────────────┘
                             │  data at ~/.local/typesense/
                        Typesense server (Linux binary)

MCP server is Node.js. mcp.cmd (Windows) runs node mcp_server.js; on Linux/WSL run it directly — no Python venv needed for the MCP layer. mcp_server.js communicates with the indexserver via HTTP on localhost (port PORT+1). Typesense is internal-only.

File map

Client-side (repo root)

File	Purpose
`config.py`	Shared constants: HOST, PORT, API_KEY, ROOTS, collection names. Reads `config.json`.
`search.py`	Typesense HTTP search; `search()` + `format_results()`
`query.py`	tree-sitter AST query functions + `process_file()` + `files_from_search()`
`mcp_server.ts` / `mcp_server.js`	Node.js MCP server: `query_codebase`, `query_single_file`, `ready`, `verify_index`, `service_status`, `manage_service` tools
`mcp.cmd`	Windows launcher: `node mcp_server.js` (Linux/WSL: run directly)
`ts.cmd`	Thin wrapper: `node ts.mjs %*`
`ts.mjs`	Management CLI: start/stop/restart/status/index/verify/log/root/build/setup. Reads `mode` from `config.json`.
`setup.cmd`	Thin wrapper: checks Node.js 20+, calls `node setup.mjs %*`
`setup.mjs`	Full one-time setup: build MCP, register with Claude Code, WSL env (if --wsl), config.json, start service, VS Code extension
`run_tests.cmd`	Thin wrapper: `node run_tests.mjs %*`
`run_tests.mjs`	Test runner: `--docker`, `--wsl --destructive`, or `--linux` mode

Server-side (indexserver/)

File	Purpose
`config.py`	Same constants as client config.py; also has INCLUDE_EXTENSIONS, EXCLUDE_DIRS, MAX_FILE_BYTES
`api.py`	Single indexserver process: management HTTP API + watcher/heartbeat/verifier threads
`indexer.py`	Full re-index via `os.walk` + `.gitignore` parsing + tree-sitter C#/Python metadata extraction. Shared `index_file_list()` pipeline used by both full indexer and verifier.
`verifier.py`	Index repair: compares FS mtimes against the index, re-indexes missing/stale files, removes orphaned entries. `check_ready()` for synchronous readiness checks.
`watcher.py`	Incremental updates: `PollingObserver` (10 s interval) monitors source root and upserts changes. Uses polling because inotify doesn't fire for Windows-backed `/mnt/` paths in WSL.
`start_server.py`	Downloads Typesense Linux binary; starts server process in WSL
`service.py`	CLI dispatcher for all `ts` subcommands
`smoke_test.py`	Quick sanity check that the server is up and basic queries work

Scripts

File	Purpose
`scripts/entrypoint.sh`	Docker/WSL container entry point. `--background`: start daemons and exit. `--background --disown`: start + disown (survives session end). Default: Docker foreground mode.
`scripts/wsl-setup.sh`	WSL environment setup (venv, Typesense binary) — called by `setup.mjs --wsl`
`scripts/e2e.sh`	End-to-end test script (runs inside container/WSL)
`docker/Dockerfile`	Docker image definition
`docker/docker-compose.yml`	Docker Compose configuration

Typesense schema

The collection uses tiered semantic fields extracted by tree-sitter at index time:

Tier	Fields	Used by MCP mode
T1	`base_types`	`implements`
T1	`call_sites`	`calls`
T1	`method_sigs`	`declarations`
T2	`type_refs`	`uses`
T2	`attributes`	`attrs`
T2	`usings`	—
—	`class_names`, `method_names`, `symbols`	`text`
—	`content`	`text`

Search ranking by file type: .cs (priority 3) → .h/.cpp/.c (2) → scripts/.py/.ts (1) → config/docs (0).

The subsystem field is the first path component under the source root. Use sub= to scope searches to a subsystem.

config.json

{
  "api_key": "codesearch-local",
  "mode": "docker",
  "roots": {
    "default": "C:/myproject/src"
  }
}

This file is not checked in (listed in .gitignore). It is created by setup.mjs with an auto-generated API key. The mode field is "docker" (default) or "wsl". Roots use Windows-style paths (C:/...) and are added via ts root --add or the VS Code extension.

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
.github/workflows		.github/workflows
docker		docker
indexserver		indexserver
sample		sample
scripts		scripts
tests		tests
vscode-codesearch		vscode-codesearch
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
__init__.py		__init__.py
ci-test.sh		ci-test.sh
config.py		config.py
conftest.py		conftest.py
cs_ast.py		cs_ast.py
mcp.cmd		mcp.cmd
mcp.sh		mcp.sh
mcp_server.js		mcp_server.js
mcp_server.ts		mcp_server.ts
package-lock.json		package-lock.json
package.json		package.json
pyproject.toml		pyproject.toml
query.py		query.py
requirements-dev.txt		requirements-dev.txt
requirements.txt		requirements.txt
run-server-tests.sh		run-server-tests.sh
run_tests.cmd		run_tests.cmd
run_tests.mjs		run_tests.mjs
search-typesense-util.py		search-typesense-util.py
search.py		search.py
setup.cmd		setup.cmd
setup.mjs		setup.mjs
smoke-test.cmd		smoke-test.cmd
smoke-test.sh		smoke-test.sh
test-query.sh		test-query.sh
test_e2e_modes.py		test_e2e_modes.py
ts.cmd		ts.cmd
ts.mjs		ts.mjs
tscodesearch-wsl.sh		tscodesearch-wsl.sh
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

codesearch

Installation

Prerequisites

One-time setup

Adding roots

Service management

Keeping the index up to date

From Claude (MCP tools)

From the command line

Running tests

Structural query tests (no server needed)

Direct CLI usage

Full-text search (`search.py`)

Structural C# AST queries (`query.py`)

Architecture

Two-layer search

Process topology

File map

Typesense schema

config.json

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

codesearch

Installation

Prerequisites

One-time setup

Adding roots

Service management

Keeping the index up to date

From Claude (MCP tools)

From the command line

Running tests

Structural query tests (no server needed)

Direct CLI usage

Full-text search (search.py)

Structural C# AST queries (query.py)

Architecture

Two-layer search

Process topology

File map

Typesense schema

config.json

About

Resources

License

Code of conduct

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Full-text search (`search.py`)

Structural C# AST queries (`query.py`)

Packages