Generate AI-optimized codebase summaries via static analysis.
codebase-ctx is a zero-dependency static analysis tool that reads a project directory and produces structured, token-efficient context about its setup, dependencies, scripts, language, runtime, and architecture. The output is designed to be injected into AI instruction files (CLAUDE.md, .cursorrules), passed as prompt context, or consumed programmatically by other tools.
Unlike source code dumpers that concatenate entire repositories (producing hundreds of thousands of tokens), codebase-ctx runs a pipeline of modular analyzers that each extract a specific dimension of project context and compress the results into a structured summary. A typical output is 300--800 tokens: enough to convey what a developer learns in their first hour with a codebase, compact enough to leave the vast majority of the context window for the actual task.
All analysis is deterministic, offline, and fast (under 500ms for most projects). No LLM calls, no network requests, no API keys required.
npm install codebase-ctx
Or install globally for CLI usage:
npm install -g codebase-ctx
Requires Node.js >= 18.
import { analyzeProject, analyzeDependencies, analyzeScripts } from 'codebase-ctx';
// Analyze project metadata
const project = analyzeProject('/path/to/project');
console.log(project);
// {
// name: 'my-app',
// version: '1.0.0',
// description: 'My application',
// license: 'MIT',
// language: 'TypeScript',
// runtime: 'Node.js',
// repository: 'https://github.com/user/my-app',
// nodeVersion: '>=18',
// }
// Analyze dependencies with automatic categorization
const deps = analyzeDependencies('/path/to/project');
console.log(deps.summary);
// {
// totalProduction: 5,
// totalDev: 3,
// frameworks: ['react', 'next'],
// databases: ['@prisma/client'],
// testingTools: ['vitest'],
// }
// Analyze npm scripts
const scripts = analyzeScripts('/path/to/project');
console.log(scripts.hasBuild, scripts.hasTest);
// true true
- Zero dependencies -- all analysis uses Node.js built-ins (`node:fs`, `node:path`). No AST parsers, no tree-sitter, no external tools.
- Modular analyzers -- each analyzer extracts one dimension of context (project metadata, dependencies, scripts) and runs independently. Analyzers fail gracefully when their input files are missing.
- Dependency categorization -- a built-in registry of 190+ common npm packages automatically classifies dependencies into categories: framework, database, testing, build, lint, ui, auth, api, observability, and type-definitions.
- Language detection -- infers TypeScript or JavaScript from `tsconfig.json`, file extensions in `src/`, or the presence of `typescript` in dependencies.
- Runtime detection -- infers Node.js, Bun, Deno, or Browser from `engines` fields and framework dependencies.
- Script categorization -- classifies npm scripts into build, test, lint, start, deploy, and other categories by name pattern matching.
- Token estimation -- estimates LLM token counts for any text using `ceil(chars / 4)`.
- Deterministic output -- same codebase always produces the same analysis result.
- TypeScript-first -- full type definitions shipped with the package.
`analyzeProject(projectPath)` reads package.json and inspects the project directory to produce a ProjectInfo summary.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| `projectPath` | `string` | Absolute path to the project directory |
Returns: ProjectInfo
| Field | Type | Description |
|---|---|---|
| `name` | `string \| null` | Package name from package.json, or directory name as fallback |
| `version` | `string \| null` | Package version |
| `description` | `string \| null` | Package description |
| `license` | `string \| null` | License identifier (e.g. "MIT") |
| `language` | `string` | Detected language: "TypeScript", "JavaScript", or "unknown" |
| `runtime` | `string` | Detected runtime: "Node.js", "Bun", "Deno", "Browser", or "unknown" |
| `repository` | `string \| null` | Repository URL (cleaned of git+ prefix and .git suffix) |
| `nodeVersion` | `string \| null` | Node.js version constraint from engines.node |
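The repository cleanup mentioned in the table can be pictured with a small helper. This is a minimal sketch, not the package's actual code: the name `cleanRepositoryUrl` and the handling of the object form of the `repository` field are assumptions made for illustration.

```typescript
// Hypothetical helper (not part of codebase-ctx): normalizes the
// "repository" field of package.json, which may be a string or an
// object with a "url" property.
type RepositoryField = string | { url?: string } | undefined;

function cleanRepositoryUrl(repository: RepositoryField): string | null {
  const raw = typeof repository === 'string' ? repository : repository?.url;
  if (!raw) return null;
  // Strip the "git+" prefix and the ".git" suffix, as the table describes.
  return raw.replace(/^git\+/, '').replace(/\.git$/, '');
}

console.log(cleanRepositoryUrl('git+https://github.com/user/my-app.git'));
// https://github.com/user/my-app
```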
Language detection checks in order: tsconfig.json existence, .ts/.tsx files in src/, typescript in dependencies. Falls back to "JavaScript".
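The detection order above can be sketched as a pure function over pre-collected signals. The `LanguageSignals` shape and the function name are hypothetical; the real analyzer reads these signals from disk itself.

```typescript
// Hypothetical input shape: the three signals named in the text,
// gathered by the caller.
interface LanguageSignals {
  hasTsconfig: boolean;       // tsconfig.json exists
  hasTsFilesInSrc: boolean;   // .ts/.tsx files found in src/
  dependencyNames: string[];  // names from all dependency sections
}

function detectLanguage(signals: LanguageSignals): 'TypeScript' | 'JavaScript' {
  // Checks run in the documented order; the first hit wins.
  if (signals.hasTsconfig) return 'TypeScript';
  if (signals.hasTsFilesInSrc) return 'TypeScript';
  if (signals.dependencyNames.includes('typescript')) return 'TypeScript';
  return 'JavaScript';
}

console.log(detectLanguage({ hasTsconfig: false, hasTsFilesInSrc: false, dependencyNames: ['typescript'] }));
// TypeScript
```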
Runtime detection checks in order: engines.bun, engines.deno, browser-only framework dependencies (React/Vue/Angular/Svelte without an SSR framework like Next/Nuxt), then defaults to "Node.js".
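A rough sketch of that runtime check order follows. The package names in the two lists (`react`, `vue`, `@angular/core`, `svelte`, `next`, `nuxt`) are assumptions about how the frameworks named above appear in package.json, and `detectRuntime` is a hypothetical name.

```typescript
// Hypothetical input shape for the sketch.
interface RuntimeSignals {
  engines: Record<string, string>; // the "engines" field of package.json
  dependencyNames: string[];       // names from all dependency sections
}

// Assumed npm package names for the frameworks listed in the text.
const BROWSER_FRAMEWORKS = ['react', 'vue', '@angular/core', 'svelte'];
const SSR_FRAMEWORKS = ['next', 'nuxt'];

function detectRuntime(signals: RuntimeSignals): 'Bun' | 'Deno' | 'Browser' | 'Node.js' {
  if ('bun' in signals.engines) return 'Bun';
  if ('deno' in signals.engines) return 'Deno';
  const hasAny = (names: string[]): boolean =>
    names.some((n) => signals.dependencyNames.includes(n));
  // A browser framework counts only when no SSR framework is present.
  if (hasAny(BROWSER_FRAMEWORKS) && !hasAny(SSR_FRAMEWORKS)) return 'Browser';
  return 'Node.js';
}

console.log(detectRuntime({ engines: {}, dependencyNames: ['react', 'next'] }));
// Node.js
```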
Fallback behavior: When no package.json exists, returns the directory name as name, "unknown" as language and runtime, and null for all other fields.
Example:
import { analyzeProject } from 'codebase-ctx';
const info = analyzeProject('/home/user/my-express-app');
// info.language === 'TypeScript'
// info.runtime === 'Node.js'
// info.repository === 'https://github.com/user/my-express-app'
`analyzeDependencies(projectPath)` reads package.json and extracts all dependency sections, categorizing each dependency by its purpose using a built-in registry.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| `projectPath` | `string` | Absolute path to the project directory |
Returns: DependencyInfo
| Field | Type | Description |
|---|---|---|
| `production` | `DependencyEntry[]` | Dependencies from dependencies |
| `dev` | `DependencyEntry[]` | Dependencies from devDependencies |
| `peer` | `DependencyEntry[]` | Dependencies from peerDependencies |
| `optional` | `DependencyEntry[]` | Dependencies from optionalDependencies |
| `summary` | `object` | Aggregated summary (see below) |
summary fields:
| Field | Type | Description |
|---|---|---|
| `totalProduction` | `number` | Count of production dependencies |
| `totalDev` | `number` | Count of dev dependencies |
| `frameworks` | `string[]` | Names of all framework dependencies across all sections |
| `databases` | `string[]` | Names of all database dependencies across all sections |
| `testingTools` | `string[]` | Names of all testing dependencies across all sections |
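The summary fields can be derived from the four dependency sections roughly as follows. This is an illustrative sketch with simplified local types; `summarize` is a hypothetical name, and removing duplicate names across sections is an assumption rather than documented behavior.

```typescript
// Simplified stand-ins for the package's types, for illustration only.
type SketchCategory = 'framework' | 'database' | 'testing' | 'other';
interface SketchEntry { name: string; category: SketchCategory; }
interface SketchSections {
  production: SketchEntry[];
  dev: SketchEntry[];
  peer: SketchEntry[];
  optional: SketchEntry[];
}

function summarize(sections: SketchSections) {
  const all = [...sections.production, ...sections.dev, ...sections.peer, ...sections.optional];
  // Collect names of one category across every section (deduplicated: an assumption).
  const namesIn = (category: SketchCategory): string[] =>
    Array.from(new Set(all.filter((e) => e.category === category).map((e) => e.name)));
  return {
    totalProduction: sections.production.length,
    totalDev: sections.dev.length,
    frameworks: namesIn('framework'),
    databases: namesIn('database'),
    testingTools: namesIn('testing'),
  };
}
```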
Each DependencyEntry has the shape:
{
  name: string; // Package name (e.g. "react")
  version: string; // Version range (e.g. "^18.2.0")
  category: DependencyCategory;
}
Category inference: Known packages are mapped via a built-in registry of 190+ entries. Unrecognized packages in dependencies default to "utility"; unrecognized packages in devDependencies default to "build". Packages matching @types/* are always categorized as "type-definitions".
Fallback behavior: When package.json does not exist, returns empty arrays and zero counts for all fields.
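The inference rules above can be sketched with a three-entry stand-in for the registry. `SAMPLE_REGISTRY` and `categorizeSketch` are hypothetical names, the real registry is far larger, and the order in which the registry and the `@types/` prefix are checked here is a simplification.

```typescript
type SketchDependencyCategory =
  | 'framework' | 'database' | 'testing' | 'build' | 'lint' | 'utility'
  | 'type-definitions' | 'ui' | 'auth' | 'api' | 'observability';

// Tiny stand-in for the built-in registry, for illustration only.
const SAMPLE_REGISTRY = new Map<string, SketchDependencyCategory>([
  ['react', 'framework'],
  ['vitest', 'testing'],
  ['eslint', 'lint'],
]);

function categorizeSketch(name: string, isDevDep: boolean): SketchDependencyCategory {
  // @types/* packages are always type definitions.
  if (name.startsWith('@types/')) return 'type-definitions';
  const known = SAMPLE_REGISTRY.get(name);
  if (known !== undefined) return known;
  // Unknown packages fall back by section.
  return isDevDep ? 'build' : 'utility';
}

console.log(categorizeSketch('left-pad', false)); // utility
console.log(categorizeSketch('left-pad', true));  // build
```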
Example:
import { analyzeDependencies } from 'codebase-ctx';
const deps = analyzeDependencies('/home/user/my-project');
// Inspect categorized production deps
for (const dep of deps.production) {
  console.log(`${dep.name} (${dep.category}): ${dep.version}`);
}
// react (framework): ^18.2.0
// @prisma/client (database): ^5.0.0
// winston (observability): ^3.0.0
// Use the summary for quick context
console.log(deps.summary.frameworks); // ['react']
console.log(deps.summary.databases); // ['@prisma/client']
`analyzeScripts(projectPath)` reads the scripts field from package.json and categorizes each script by name.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| `projectPath` | `string` | Absolute path to the project directory |
Returns: ScriptInfo
| Field | Type | Description |
|---|---|---|
| `scripts` | `ScriptEntry[]` | All scripts with their names, commands, and categories |
| `hasBuild` | `boolean` | true if any script is categorized as "build" |
| `hasTest` | `boolean` | true if any script is categorized as "test" |
| `hasLint` | `boolean` | true if any script is categorized as "lint" |
| `hasStart` | `boolean` | true if any script is categorized as "start" |
Each ScriptEntry has the shape:
{
  name: string; // Script name (e.g. "build")
  command: string; // Script command (e.g. "tsc")
  category: 'build' | 'test' | 'lint' | 'start' | 'deploy' | 'other';
}
Category mapping:
| Category | Matched script names |
|---|---|
| `build` | `build`, `compile`, `bundle`, `tsc`, `build:*` |
| `test` | `test`, `spec`, `e2e`, `coverage`, `test:*` |
| `lint` | `lint`, `check`, `format`, `prettier`, `lint:*` |
| `start` | `start`, `dev`, `serve`, `develop` |
| `deploy` | `deploy`, `release`, `publish` |
| `other` | Everything else |
Scripts prefixed with pre or post (e.g. pretest, postbuild, prepublishOnly) inherit the category of their base script.
Fallback behavior: When package.json has no scripts field or does not exist, returns an empty array and all boolean flags as false.
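The mapping table and the pre/post rule can be approximated as below. This is a sketch under stated assumptions: it matches exact names plus the `build:`, `test:`, and `lint:` prefixes, and strips a leading `pre` or `post` only when the full name does not match on its own. Lifecycle names such as `prepublishOnly` would need an extra rule that is not shown, and the actual matching patterns in the package may differ.

```typescript
type SketchScriptCategory = 'build' | 'test' | 'lint' | 'start' | 'deploy' | 'other';

// Exact names taken from the category mapping table.
const EXACT_NAMES: Record<Exclude<SketchScriptCategory, 'other'>, string[]> = {
  build: ['build', 'compile', 'bundle', 'tsc'],
  test: ['test', 'spec', 'e2e', 'coverage'],
  lint: ['lint', 'check', 'format', 'prettier'],
  start: ['start', 'dev', 'serve', 'develop'],
  deploy: ['deploy', 'release', 'publish'],
};

// The wildcard rows of the table (build:*, test:*, lint:*).
const COLON_PREFIXES: Array<[string, SketchScriptCategory]> = [
  ['build:', 'build'],
  ['test:', 'test'],
  ['lint:', 'lint'],
];

function matchName(name: string): SketchScriptCategory {
  for (const [category, names] of Object.entries(EXACT_NAMES)) {
    if (names.includes(name)) return category as SketchScriptCategory;
  }
  for (const [prefix, category] of COLON_PREFIXES) {
    if (name.startsWith(prefix)) return category;
  }
  return 'other';
}

function categorizeScript(name: string): SketchScriptCategory {
  // Try the full name first so that "prettier" is not read as pre + "ttier".
  const direct = matchName(name);
  if (direct !== 'other') return direct;
  const base = name.replace(/^(pre|post)/, '');
  return base === name ? 'other' : matchName(base);
}

console.log(categorizeScript('pretest'));    // test
console.log(categorizeScript('build:prod')); // build
console.log(categorizeScript('prettier'));   // lint
```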
Example:
import { analyzeScripts } from 'codebase-ctx';
const scripts = analyzeScripts('/home/user/my-project');
if (scripts.hasTest) {
  const testScripts = scripts.scripts.filter(s => s.category === 'test');
  for (const s of testScripts) {
    console.log(`${s.name}: ${s.command}`);
  }
}
// test: vitest run
// test:unit: vitest run src/
`DEPENDENCY_REGISTRY` is a Record<string, DependencyCategory> mapping 190+ common npm package names to their categories. This is the lookup table used by analyzeDependencies.
import { DEPENDENCY_REGISTRY } from 'codebase-ctx';
console.log(DEPENDENCY_REGISTRY['react']); // 'framework'
console.log(DEPENDENCY_REGISTRY['prisma']); // 'database'
console.log(DEPENDENCY_REGISTRY['vitest']); // 'testing'
console.log(DEPENDENCY_REGISTRY['typescript']); // 'build'
console.log(DEPENDENCY_REGISTRY['eslint']); // 'lint'
console.log(DEPENDENCY_REGISTRY['tailwindcss']); // 'ui'
console.log(DEPENDENCY_REGISTRY['passport']); // 'auth'
console.log(DEPENDENCY_REGISTRY['axios']); // 'api'
console.log(DEPENDENCY_REGISTRY['winston']); // 'observability'
Categories covered: framework, database, testing, build, lint, ui, auth, api, observability.
`categorize(name, isDevDep)` categorizes a single package name. It checks the built-in registry first, then the @types/* prefix, then falls back to "utility" for production dependencies or "build" for dev dependencies.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| `name` | `string` | The npm package name |
| `isDevDep` | `boolean` | Whether the package is a devDependency |
Returns: DependencyCategory -- one of "framework", "database", "testing", "build", "lint", "utility", "type-definitions", "ui", "auth", "api", "observability".
Example:
import { categorize } from 'codebase-ctx';
categorize('react', false); // 'framework'
categorize('@types/node', true); // 'type-definitions'
categorize('some-unknown-pkg', false); // 'utility'
categorize('some-unknown-pkg', true); // 'build'
`fileExists` synchronously checks whether a file exists at the given path.
import { fileExists } from 'codebase-ctx';
if (fileExists('/path/to/tsconfig.json')) {
  // TypeScript project
}
`readFileContent` reads a file synchronously and returns its contents as a UTF-8 string. It throws if the file does not exist.
import { readFileContent } from 'codebase-ctx';
const content = readFileContent('/path/to/package.json');
`readLines` reads a file and splits it into an array of lines. It throws if the file does not exist.
import { readLines } from 'codebase-ctx';
const lines = readLines('/path/to/src/index.ts');
console.log(`${lines.length} lines`);
`readJsonFile` reads and parses a JSON file. It returns null if the file does not exist or contains invalid JSON, and it never throws.
Type parameter: T -- the expected shape of the parsed JSON object.
import { readJsonFile } from 'codebase-ctx';
interface PkgJson {
  name: string;
  version: string;
}
const pkg = readJsonFile<PkgJson>('/path/to/package.json');
if (pkg) {
  console.log(pkg.name, pkg.version);
}
`estimateTokens` estimates the LLM token count for a string using the approximation Math.ceil(text.length / 4). It returns 0 for empty strings.
import { estimateTokens } from 'codebase-ctx';
const tokens = estimateTokens('Hello, world!');
// 4 (ceil(13 / 4))
`AnalyzeOptions` lists the options accepted by the analysis pipeline:
interface AnalyzeOptions {
  analyzers?: AnalyzerName[]; // Which analyzers to run (default: all)
  exclude?: string[]; // Directory/file patterns to exclude
  detailLevel?: DetailLevel; // 'minimal' | 'standard' | 'detailed'
  maxSampleFiles?: number; // Max files for pattern sampling
  maxDepth?: number; // Max directory traversal depth
}
`FormatOptions` lists the options for formatting output:
interface FormatOptions {
  detailLevel?: DetailLevel; // 'minimal' | 'standard' | 'detailed'
  formatter?: (context: CodebaseContext) => string; // Custom formatter function
  includeTokenCount?: boolean; // Append token count to output
}
`AnalyzerName` lists the valid analyzer names:
"project" | "dependencies" | "structure" | "typescript" | "api" | "scripts" | "config" | "git" | "stats" | "patterns"
`DependencyCategory` lists the valid dependency categories:
"framework" | "database" | "testing" | "build" | "lint" | "utility" | "type-definitions" | "ui" | "auth" | "api" | "observability"
`DetailLevel` controls information density in formatted output:
| Level | Target tokens | Description |
|---|---|---|
| `minimal` | 150--300 | Language, framework, architecture, entry point, build/test commands only |
| `standard` | 400--800 | All major context dimensions with summaries |
| `detailed` | 800--2,000 | Full dependency lists, complete API surface, all patterns with evidence |
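One way to use these targets is to check formatted output against the upper bound for its level. The sketch below is hypothetical and not part of the package: `TOKEN_TARGETS` copies the ranges from the table, and `approxTokens` re-implements the `ceil(chars / 4)` approximation locally so the block stands alone.

```typescript
type SketchDetailLevel = 'minimal' | 'standard' | 'detailed';

// Target ranges copied from the table above.
const TOKEN_TARGETS: Record<SketchDetailLevel, { min: number; max: number }> = {
  minimal: { min: 150, max: 300 },
  standard: { min: 400, max: 800 },
  detailed: { min: 800, max: 2000 },
};

// Same approximation the package documents: ceil(chars / 4).
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

// Returns true when the text is larger than the level's upper target.
function exceedsTarget(text: string, level: SketchDetailLevel): boolean {
  return approxTokens(text) > TOKEN_TARGETS[level].max;
}

console.log(exceedsTarget('x'.repeat(1200), 'minimal')); // false (300 tokens)
console.log(exceedsTarget('x'.repeat(1201), 'minimal')); // true (301 tokens)
```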
Supported output formats:
| Format | Description |
|---|---|
| `markdown` | Structured markdown for CLAUDE.md / .cursorrules injection |
| `json` | Machine-readable JSON for programmatic consumption |
| `compact` | Minimal token count, maximum information density |
| `custom` | User-provided formatter function via FormatOptions.formatter |
All analyzers follow a graceful degradation pattern:
- Missing `package.json`: Analyzers that depend on `package.json` return sensible defaults -- empty arrays, zero counts, `null` fields, or the directory name as a fallback project name.
- Invalid JSON: `readJsonFile` returns `null` when a file contains malformed JSON. Analyzers that use it handle the `null` case explicitly.
- Missing files: `fileExists` returns `false`; analyzers skip analysis for files that do not exist rather than throwing.
- No matching data: Analyzers return empty result objects (empty arrays, `false` flags) rather than throwing when the data they look for is absent.
The general contract: individual analyzers never throw. If an analyzer cannot extract data, it returns a typed fallback value. Only infrastructure-level errors (directory does not exist, permission denied) produce exceptions.
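The degradation pattern can be illustrated with a null-returning JSON reader feeding an analyzer that substitutes a typed fallback. Both `readJsonOrNull` and `analyzeScriptsSketch` are hypothetical names written for this sketch. Note that the catch-all here also swallows errors such as permission failures, which the contract above says should surface as exceptions.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Returns the parsed JSON, or null when the file is missing or malformed.
function readJsonOrNull<T>(filePath: string): T | null {
  try {
    return JSON.parse(fs.readFileSync(filePath, 'utf8')) as T;
  } catch {
    return null;
  }
}

interface ScriptFlags { scriptNames: string[]; hasTest: boolean; }

// Falls back to an empty, fully typed result instead of throwing.
function analyzeScriptsSketch(projectPath: string): ScriptFlags {
  const pkg = readJsonOrNull<{ scripts?: Record<string, string> }>(
    path.join(projectPath, 'package.json'),
  );
  const scriptNames = Object.keys(pkg?.scripts ?? {});
  return { scriptNames, hasTest: scriptNames.includes('test') };
}

// A fresh temporary directory has no package.json.
const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ctx-sketch-'));
console.log(analyzeScriptsSketch(emptyDir));
// { scriptNames: [], hasTest: false }
```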
Run multiple analyzers against the same project and assemble the results:
import { analyzeProject, analyzeDependencies, analyzeScripts } from 'codebase-ctx';
const projectPath = '/home/user/my-app';
const project = analyzeProject(projectPath);
const deps = analyzeDependencies(projectPath);
const scripts = analyzeScripts(projectPath);
// Build a context summary for prompt injection
const summary = [
  `Project: ${project.name} v${project.version}`,
  `Language: ${project.language}, Runtime: ${project.runtime}`,
  `Frameworks: ${deps.summary.frameworks.join(', ') || 'none'}`,
  `Databases: ${deps.summary.databases.join(', ') || 'none'}`,
  `Testing: ${deps.summary.testingTools.join(', ') || 'none'}`,
  `Build: ${scripts.hasBuild ? 'yes' : 'no'}, Test: ${scripts.hasTest ? 'yes' : 'no'}`,
].join('\n');
console.log(summary);
You can use categorize alongside your own logic to handle packages not in the built-in registry:
import { categorize, DEPENDENCY_REGISTRY } from 'codebase-ctx';
import type { DependencyCategory } from 'codebase-ctx';
const CUSTOM_REGISTRY: Record<string, DependencyCategory> = {
  'my-internal-framework': 'framework',
  '@company/auth-sdk': 'auth',
};
function customCategorize(name: string, isDev: boolean): DependencyCategory {
  if (CUSTOM_REGISTRY[name]) return CUSTOM_REGISTRY[name];
  return categorize(name, isDev);
}
Use estimateTokens to verify that your assembled context fits within a model's context window:
import { estimateTokens } from 'codebase-ctx';
const context = '... assembled context string ...';
const tokens = estimateTokens(context);
const MODEL_LIMIT = 128_000;
const TASK_BUDGET = MODEL_LIMIT - tokens;
console.log(`Context: ${tokens} tokens, leaving ${TASK_BUDGET} for the task`);
The Analyzer interface provides a contract for custom analyzer implementations:
import type { Analyzer, CodebaseContext, OutputFormat, FormatOptions } from 'codebase-ctx';
const myAnalyzer: Analyzer = {
  async analyze(projectPath?: string): Promise<CodebaseContext> {
    // Run analysis and return a CodebaseContext
    throw new Error('analyze is not implemented yet');
  },
  async analyzeAndFormat(
    projectPath?: string,
    outputFormat?: OutputFormat,
    formatOptions?: FormatOptions,
  ): Promise<string> {
    // Analyze and return a formatted string
    throw new Error('analyzeAndFormat is not implemented yet');
  },
};
This package is written in TypeScript and ships type declarations (dist/index.d.ts). All public interfaces and type aliases are exported from the package root:
import type {
  // Core result types
  CodebaseContext,
  ProjectInfo,
  DependencyInfo,
  DependencyEntry,
  ScriptInfo,
  ScriptEntry,
  // Additional analyzer result types
  StructureInfo,
  DirectoryEntry,
  TypeScriptInfo,
  APISurface,
  APIEntry,
  ConfigInfo,
  ConfigEntry,
  GitInfo,
  StatsInfo,
  FileStat,
  LanguageStat,
  PatternInfo,
  DetectedPattern,
  AnalysisMeta,
  // Option and configuration types
  AnalyzeOptions,
  FormatOptions,
  AnalyzerConfig,
  Analyzer,
  // String union types
  DetailLevel,
  OutputFormat,
  AnalyzerName,
  DependencyCategory,
} from 'codebase-ctx';
Compiled with target: ES2022, module: commonjs, strict: true.
License: MIT