A Go implementation of Git repository hotspot analysis, based on the Bugspots bug prediction heuristic. This tool analyzes Git repositories to identify files and commits most likely to contain bugs based on historical change patterns.
bugspots-go provides three analysis modes:
File hotspot analysis (`analyze`) examines your Git repository's commit history and calculates a risk score for each file based on:
- Commit Frequency (30%): Files changed frequently may have more issues
- Code Churn (25%): High volume of added/deleted lines indicates instability
- Recency (20%): Recently changed files are more likely to have new bugs
- Burst Activity (15%): Concentrated changes in short time periods suggest rushed work
- Ownership (10%): Many contributors may indicate unclear ownership
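The weighting above can be read as a weighted sum of per-file factors. The following is a minimal sketch under the assumption that each factor has already been normalized to the range [0, 1]; the type and function names (`FileFactors`, `RiskScore`, `DefaultWeights`) are hypothetical and are not taken from the bugspots-go source. The actual normalization is described in docs/SCORING.md.

```go
package main

import "fmt"

// FileFactors holds one value per scoring factor.
// Assumption: every factor is already normalized to [0, 1].
type FileFactors struct {
	Commit, Churn, Recency, Burst, Ownership float64
}

// DefaultWeights mirrors the percentages listed above (30/25/20/15/10).
var DefaultWeights = FileFactors{
	Commit:    0.30,
	Churn:     0.25,
	Recency:   0.20,
	Burst:     0.15,
	Ownership: 0.10,
}

// RiskScore returns the weighted sum of the normalized factors.
func RiskScore(f, w FileFactors) float64 {
	return f.Commit*w.Commit +
		f.Churn*w.Churn +
		f.Recency*w.Recency +
		f.Burst*w.Burst +
		f.Ownership*w.Ownership
}

func main() {
	// Illustrative factor values, not real measurements.
	f := FileFactors{Commit: 0.8, Churn: 0.6, Recency: 0.9, Burst: 0.73, Ownership: 0.5}
	fmt.Printf("risk score: %.2f\n", RiskScore(f, DefaultWeights))
}
```

Because the weights sum to 1, a file whose factors are all at their maximum scores 1.0, and each weighted term corresponds to one entry of the `breakdown` object in the JSON output.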
Commit risk analysis (`commits`) scores individual commits for Just-In-Time (JIT) defect prediction using research-backed metrics:
- Diffusion Metrics (35%): Number of files (NF), directories (ND), and subsystems (NS) affected
- Size Metrics (35%): Lines added (LA) and lines deleted (LD)
- Change Entropy (30%): How spread out the changes are across files (Shannon entropy)
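As an illustration of the change entropy metric, here is a minimal sketch of Shannon entropy over the number of changed lines per file. Dividing by log2(n) so that the result falls in [0, 1] is an assumption made for this example, and `ChangeEntropy` is a hypothetical name; see docs/SCORING.md for the formula the tool actually uses.

```go
package main

import (
	"fmt"
	"math"
)

// ChangeEntropy returns the Shannon entropy of the per-file changed-line
// counts of one commit, divided by log2(n) so the result lies in [0, 1].
// 0 means all changes are in a single file; 1 means changes are spread evenly.
func ChangeEntropy(linesPerFile []int) float64 {
	total := 0
	for _, n := range linesPerFile {
		total += n
	}
	if total == 0 || len(linesPerFile) < 2 {
		return 0
	}
	h := 0.0
	for _, n := range linesPerFile {
		if n <= 0 {
			continue // a file with no changed lines contributes nothing
		}
		p := float64(n) / float64(total)
		h -= p * math.Log2(p)
	}
	return h / math.Log2(float64(len(linesPerFile)))
}

func main() {
	fmt.Println(ChangeEntropy([]int{10, 10, 10, 10})) // evenly spread change
	fmt.Println(ChangeEntropy([]int{37, 1, 1, 1}))    // concentrated change
}
```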
Change coupling analysis (`coupling`) detects file pairs that frequently change together, which can indicate hidden dependencies:
- Jaccard Coefficient: Similarity measure between file change sets
- Confidence: Probability that file B changes when file A changes
- Lift: How much more likely files change together than by chance
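The three coupling measures can be computed from four commit counts. The sketch below uses the standard textbook definitions of Jaccard, confidence, and lift; the names `PairMetrics` and `Coupling` are hypothetical, and the exact formulas used by bugspots-go may differ in detail (see docs/SCORING.md).

```go
package main

import "fmt"

// PairMetrics holds the coupling measures for an ordered file pair (A, B).
type PairMetrics struct {
	Jaccard, Confidence, Lift float64
}

// Coupling computes the measures from raw counts:
//   commitsA, commitsB: commits touching A and B respectively
//   coCommits:          commits touching both A and B
//   totalCommits:       all commits analyzed
func Coupling(commitsA, commitsB, coCommits, totalCommits int) PairMetrics {
	union := commitsA + commitsB - coCommits
	if commitsA == 0 || commitsB == 0 || union == 0 || totalCommits == 0 {
		return PairMetrics{}
	}
	co := float64(coCommits)
	return PairMetrics{
		// |A and B| / |A or B|
		Jaccard: co / float64(union),
		// P(B changes | A changes)
		Confidence: co / float64(commitsA),
		// P(A and B) / (P(A) * P(B))
		Lift: co * float64(totalCommits) / (float64(commitsA) * float64(commitsB)),
	}
}

func main() {
	m := Coupling(10, 10, 5, 100)
	fmt.Printf("jaccard=%.2f confidence=%.2f lift=%.2f\n", m.Jaccard, m.Confidence, m.Lift)
}
```

A lift above 1 means the two files change together more often than would be expected if their changes were independent.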
cd bugspots-go
go build -o bugspots-go .
# Analyze current directory
./bugspots-go analyze
# Analyze specific repository
./bugspots-go analyze --repo /path/to/repo
# Analyze with date range
./bugspots-go analyze --repo /path/to/repo --since 2025-01-01 --top 30
# Show score breakdown
./bugspots-go analyze --repo /path/to/repo --explain
# Export to JSON
./bugspots-go analyze --repo /path/to/repo --format json --output hotspots.json
# Analyze commits for risk
./bugspots-go commits
# Analyze commits with date range
./bugspots-go commits --repo /path/to/repo --since 2025-01-01 --top 20
# Filter by risk level
./bugspots-go commits --repo /path/to/repo --risk-level high
# Show detailed score breakdown
./bugspots-go commits --repo /path/to/repo --explain
# Export to JSON
./bugspots-go commits --repo /path/to/repo --format json --output commits.json
# Analyze file coupling patterns
./bugspots-go coupling
# Analyze with minimum thresholds
./bugspots-go coupling --repo /path/to/repo --min-co-commits 5 --min-jaccard 0.2
# Show top coupling pairs
./bugspots-go coupling --repo /path/to/repo --top-pairs 100
# Skip large refactoring commits
./bugspots-go coupling --repo /path/to/repo --max-files 30
# Export to JSON
./bugspots-go coupling --repo /path/to/repo --format json --output coupling.json
# Console output (default)
./bugspots-go analyze --repo /path/to/repo
# JSON output
./bugspots-go analyze --repo /path/to/repo --format json --output hotspots.json
# CSV output
./bugspots-go analyze --repo /path/to/repo --format csv --output hotspots.csv
# Markdown output (great for PR comments)
./bugspots-go analyze --repo /path/to/repo --format markdown --output hotspots.md

bugspots-go identifies bugfix commits by matching commit messages against regex patterns. By default, the following patterns are used:
| Pattern | Matches |
|---|---|
| `\bfix(ed\|es)?\b` | fix, fixed, fixes |
| `\bbug\b` | bug |
| `\bhotfix\b` | hotfix |
| `\bpatch\b` | patch |

You can customize these patterns via CLI flags or the configuration file:
# Override with custom patterns (replaces defaults)
./bugspots-go analyze --repo /path/to/repo \
--bug-patterns "\bfix(ed|es)?\b" \
--bug-patterns "\bresolve[ds]?\b" \
--bug-patterns "\bcve-\d+"
# Match Conventional Commits style
./bugspots-go analyze --repo /path/to/repo \
--bug-patterns "^fix(\(.+\))?:"

Or in .bugspots.json:
{
"bugfix": {
"patterns": [
"\\bfix(ed|es)?\\b",
"\\bbug\\b",
"\\bhotfix\\b",
"\\bresolve[ds]?\\b"
]
}
}

All patterns are case-insensitive. CLI flags override config file settings. For detailed pattern syntax, see docs/SCORING.md.
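To make the matching behavior concrete, here is a minimal sketch of case-insensitive bugfix detection using Go's standard `regexp` package. The function names (`CompilePatterns`, `IsBugfix`) are hypothetical, and prepending `(?i)` is one assumed way to get case-insensitive matching; the tool's own implementation may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// DefaultPatterns are the default bugfix patterns listed above.
var DefaultPatterns = []string{
	`\bfix(ed|es)?\b`,
	`\bbug\b`,
	`\bhotfix\b`,
	`\bpatch\b`,
}

// CompilePatterns compiles each pattern with the (?i) flag so that
// matching ignores case.
func CompilePatterns(patterns []string) ([]*regexp.Regexp, error) {
	compiled := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile("(?i)" + p)
		if err != nil {
			return nil, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		compiled = append(compiled, re)
	}
	return compiled, nil
}

// IsBugfix reports whether the commit message matches any pattern.
func IsBugfix(message string, patterns []*regexp.Regexp) bool {
	for _, re := range patterns {
		if re.MatchString(message) {
			return true
		}
	}
	return false
}

func main() {
	patterns, err := CompilePatterns(DefaultPatterns)
	if err != nil {
		panic(err)
	}
	for _, msg := range []string{"Fixed nil pointer in parser", "Add prefix handling"} {
		fmt.Printf("%q bugfix=%v\n", msg, IsBugfix(msg, patterns))
	}
}
```

The `\b` word boundaries matter: without them, a message such as "Add prefix handling" would match `fix` inside "prefix".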
You can filter which files to analyze using glob patterns, either via CLI flags or configuration file.
# Include only Go source files (excluding tests)
./bugspots-go analyze --include "**/*.go" --exclude "**/*_test.go"
# Analyze specific directories
./bugspots-go analyze --include "src/**" --include "apps/**"
# Exclude common non-source files
./bugspots-go analyze --exclude "**/vendor/**" --exclude "**/node_modules/**"
# Exclude generated files by extension
./bugspots-go analyze --exclude "**/*.pb.go" --exclude "**/*.gen.go" --exclude "**/*.min.js"
# Combine include and exclude
./bugspots-go analyze --repo /path/to/repo \
--include "src/**" --include "internal/**" \
--exclude "**/testdata/**" --exclude "**/*_mock.go"

Create a .bugspots.json file in your project root or home directory:
{
"filters": {
"include": ["src/**", "apps/**", "internal/**"],
"exclude": [
"**/vendor/**",
"**/node_modules/**",
"**/testdata/**",
"**/*.min.js",
"**/*.pb.go"
]
}
}

Then run with config:
./bugspots-go analyze --config .bugspots.json

Note: CLI flags override config file settings.
Powered by doublestar/v4, supporting rich glob patterns:
| Pattern | Matches | Example |
|---|---|---|
| `**/*.ext` | Files with extension anywhere | `**/*.go`, `**/*.js` |
| `dir/**` | All files under directory | `src/**`, `apps/**` |
| `**/name/**` | Directory anywhere | `**/vendor/**`, `**/testdata/**` |
| `*` | Any characters (except `/`) | `src/*.go` (single level) |
| `?` | Single character | `file?.go` |
| `{a,b}` | Alternatives | `**/*.{js,ts}` |

- Exclude takes precedence: If a file matches any exclude pattern, it's skipped
- Include empty = all files: If no include patterns specified, all non-excluded files are included
- Include specified = allowlist: If include patterns exist, only matching files are included
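The three precedence rules above can be sketched as a small decision function. This example is illustrative only: it uses the standard library's `path.Match` as a stand-in matcher, which does not understand `**` or `{a,b}`, whereas bugspots-go itself uses doublestar/v4. The names `matchAny` and `ShouldAnalyze` are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// matchAny reports whether name matches at least one pattern.
// Stand-in matcher: path.Match supports *, ?, and [...] only.
func matchAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// ShouldAnalyze applies the precedence rules:
//  1. any exclude match skips the file,
//  2. an empty include list accepts every remaining file,
//  3. otherwise the file must match an include pattern.
func ShouldAnalyze(name string, include, exclude []string) bool {
	if matchAny(exclude, name) {
		return false
	}
	if len(include) == 0 {
		return true
	}
	return matchAny(include, name)
}

func main() {
	include := []string{"*.go"}
	exclude := []string{"*_test.go"}
	for _, name := range []string{"main.go", "main_test.go", "README.md"} {
		fmt.Printf("%-14s analyze=%v\n", name, ShouldAnalyze(name, include, exclude))
	}
}
```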
# Go projects: analyze source code only
./bugspots-go analyze --include "**/*.go" --exclude "**/*_test.go" --exclude "**/vendor/**"
# JavaScript projects: exclude minified and dependencies
./bugspots-go analyze --include "src/**" --exclude "**/*.min.js" --exclude "**/node_modules/**"
# Monorepo: specific packages only
./bugspots-go analyze --include "packages/core/**" --include "packages/api/**"
# Exclude generated code
./bugspots-go analyze --exclude "**/*.pb.go" --exclude "**/*.gen.go" --exclude "**/mocks/**"

| Option | Alias | Description | Default |
|---|---|---|---|
| `--repo <PATH>` | `-r` | Path to Git repository | Current directory |
| `--since <DATE>` | | Start date for analysis (YYYY-MM-DD) | All history |
| `--until <DATE>` | | End date for analysis | Now |
| `--branch <NAME>` | `-b` | Branch to analyze | HEAD |
| `--rename-detect <MODE>` | | Rename detection: off, simple (exact), aggressive (similarity) | simple |
| `--top <N>` | `-n` | Number of top results | 50 |
| `--format <FORMAT>` | `-f` | Output format: console, json, csv, markdown | console |
| `--output <PATH>` | `-o` | Output file path | stdout |
| `--explain` | `-e` | Include score breakdown | false |
| `--config <PATH>` | `-c` | Configuration file path | .bugspots.json |
| `--include <PATTERN>` | | Glob patterns to include (repeatable) | All files |
| `--exclude <PATTERN>` | | Glob patterns to exclude (repeatable) | None |

| Option | Description | Default |
|---|---|---|
| `--half-life <DAYS>` | Half-life for recency decay (days) | 30 |
| `--bug-patterns <REGEX>` | Regex patterns for bugfix commit detection (repeatable) | See Bugfix Keywords |
| `--diff <REFSPEC>` | Analyze only files changed between refs (e.g., `origin/main...HEAD`) | |
| `--ci-threshold <SCORE>` | Exit with non-zero status if any file exceeds this risk score | |
| `--include-complexity` | Include file complexity (line count) in scoring | false |

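To illustrate what a half-life means for the recency factor, the sketch below uses plain exponential decay, where a file's recency value halves every `--half-life` days. That this is the exact decay curve used by bugspots-go is an assumption, and `HalfLifeDecay` is a hypothetical name; docs/SCORING.md has the authoritative formula.

```go
package main

import (
	"fmt"
	"math"
)

// HalfLifeDecay returns a recency value in (0, 1]: 1 for a file changed
// today, 0.5 after one half-life, 0.25 after two, and so on.
func HalfLifeDecay(ageDays, halfLifeDays float64) float64 {
	if halfLifeDays <= 0 {
		return 0 // guard against division by zero; treated as no recency signal
	}
	if ageDays < 0 {
		ageDays = 0 // timestamps in the future are treated as "now"
	}
	return math.Pow(0.5, ageDays/halfLifeDays)
}

func main() {
	for _, age := range []float64{0, 15, 30, 60, 90} {
		fmt.Printf("age=%3.0f days  recency=%.3f\n", age, HalfLifeDecay(age, 30))
	}
}
```

A shorter half-life makes the score focus on very recent work; a longer one lets older changes keep contributing.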
| Option | Alias | Description | Default |
|---|---|---|---|
| `--risk-level <LEVEL>` | `-l` | Filter by risk: high, medium, all | all |

| Option | Description | Default |
|---|---|---|
| `--min-co-commits <N>` | Minimum co-commits to consider coupling | 3 |
| `--min-jaccard <FLOAT>` | Minimum Jaccard coefficient threshold | 0.1 |
| `--max-files <N>` | Maximum files per commit (skip large commits) | 50 |
| `--top-pairs <N>` | Number of top coupled pairs to report | 50 |

Create a .bugspots.json or specify with --config:
{
"scoring": {
"halfLifeDays": 30,
"weights": {
"commit": 0.20,
"churn": 0.20,
"recency": 0.15,
"burst": 0.10,
"ownership": 0.10,
"bugfix": 0.15,
"complexity": 0.10
}
},
"bugfix": {
"patterns": [
"\\bfix(ed|es)?\\b",
"\\bbug\\b",
"\\bhotfix\\b",
"\\bpatch\\b"
]
},
"burst": {
"windowDays": 7
},
"commitScoring": {
"weights": {
"diffusion": 0.35,
"size": 0.35,
"entropy": 0.30
},
"thresholds": {
"high": 0.7,
"medium": 0.4
}
},
"coupling": {
"minCoCommits": 3,
"minJaccardThreshold": 0.1,
"maxFilesPerCommit": 50,
"topPairs": 50
},
"filters": {
"include": ["src/**", "apps/**"],
"exclude": [
"**/vendor/**",
"**/testdata/**",
"**/*.min.js",
"**/*.pb.go"
]
}
}

For detailed documentation of all scoring formulas, normalization methods, and calculation algorithms, see docs/SCORING.md.
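As an example of how the `commitScoring.thresholds` values might be applied, the sketch below maps a commit risk score to a level. Treating the thresholds as inclusive lower bounds and naming the lowest level "low" are both assumptions, and `RiskLevel` is a hypothetical function name.

```go
package main

import "fmt"

// RiskLevel classifies a commit risk score using the configured thresholds.
// Assumptions: thresholds are inclusive, and scores below `medium` are "low".
func RiskLevel(score, high, medium float64) string {
	switch {
	case score >= high:
		return "high"
	case score >= medium:
		return "medium"
	default:
		return "low"
	}
}

func main() {
	// Thresholds taken from the sample configuration: high 0.7, medium 0.4.
	for _, s := range []float64{0.85, 0.55, 0.20} {
		fmt.Printf("score=%.2f level=%s\n", s, RiskLevel(s, 0.7, 0.4))
	}
}
```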
File Hotspots (Top 10)
Repository: /path/to/repo
Period: 2025-01-01 to 2025-02-04
+----+-----------------------------+-------+---------+-----------+--------------+--------------+-------+
| # | Path | Score | Commits | Churn | Last Modified| Contributors | Burst |
+----+-----------------------------+-------+---------+-----------+--------------+--------------+-------+
| 1 | src/core/engine.go | 0.82 | 18 | +420/-390 | 2025-02-01 | 5 | 0.73 |
| 2 | src/api/controller.go | 0.77 | 15 | +300/-200 | 2025-01-29 | 4 | 0.65 |
| 3 | src/services/handler.go | 0.71 | 12 | +250/-180 | 2025-01-28 | 3 | 0.58 |
+----+-----------------------------+-------+---------+-----------+--------------+--------------+-------+
Note: Risk score is an indicator, not a definitive measure of bugs.
{
"repo": "/path/to/repo",
"since": "2025-01-01T00:00:00Z",
"until": "2025-02-04T00:00:00Z",
"generatedAt": "2025-02-04T12:34:56Z",
"items": [
{
"path": "src/core/engine.go",
"riskScore": 0.82,
"metrics": {
"commitCount": 18,
"addedLines": 420,
"deletedLines": 390,
"lastModified": "2025-02-01T09:10:00Z",
"contributorCount": 5,
"burstScore": 0.73
},
"breakdown": {
"commitComponent": 0.25,
"churnComponent": 0.22,
"recencyComponent": 0.18,
"burstComponent": 0.11,
"ownershipComponent": 0.06
}
}
]
}

{
"repo": "/path/to/repo",
"since": "2025-01-01T00:00:00Z",
"until": "2025-02-04T00:00:00Z",
"generatedAt": "2025-02-04T12:34:56Z",
"items": [
{
"sha": "abc1234",
"message": "Refactor auth module",
"author": "Developer <[email protected]>",
"when": "2025-02-01T10:00:00Z",
"riskScore": 0.85,
"riskLevel": "high",
"metrics": {
"fileCount": 12,
"directoryCount": 5,
"subsystemCount": 3,
"linesAdded": 450,
"linesDeleted": 200,
"changeEntropy": 0.78
},
"breakdown": {
"diffusionComponent": 0.30,
"sizeComponent": 0.32,
"entropyComponent": 0.23
}
}
]
}

{
"repo": "/path/to/repo",
"since": "2025-01-01T00:00:00Z",
"until": "2025-02-04T00:00:00Z",
"generatedAt": "2025-02-04T12:34:56Z",
"totalCommitsAnalyzed": 150,
"pairs": [
{
"fileA": "src/api/handler.go",
"fileB": "src/api/handler_test.go",
"coCommits": 25,
"jaccard": 0.83,
"confidence": 0.89,
"lift": 4.2
}
]
}

Run weekly to identify files that need attention:
./bugspots-go analyze --repo . --since $(date -d "7 days ago" +%Y-%m-%d) --format markdown --output weekly-hotspots.md

Add to your CI pipeline to warn about changes to high-risk files:
- name: Check Hotspots
  run: |
    ./bugspots-go analyze --repo . --format json --output hotspots.json
    ./bugspots-go commits --repo . --risk-level high --format json --output risky-commits.json

Generate a list of high-risk files for AI code review:
./bugspots-go analyze --repo . --top 10 --format json | jq '.items[].path'

Detect Hidden Dependencies
Find files that should be reviewed together:
./bugspots-go coupling --repo . --min-jaccard 0.5 --format markdown

- Risk is not Bugs: The risk score is an indicator based on change patterns, not a definitive measure of bugs. Use it to prioritize review efforts, not as absolute truth.
- Context Matters: A high-risk score might indicate a file that's actively being improved, not necessarily one that's problematic.
- Large Commits: The coupling analysis automatically skips commits with many files (configurable via `--max-files`) to avoid noise from refactoring or merge commits.

go test ./...
go fmt ./...

- github.com/urfave/cli/v2 - Command-line interface framework
- github.com/go-git/go-git/v5 - Git repository interaction
- github.com/fatih/color - Colored console output
- github.com/olekukonko/tablewriter - Table output formatting
- github.com/bmatcuk/doublestar/v4 - Glob pattern matching

bugspots-go/
├── app.go # Entry point
├── cmd/
│ ├── root.go # CLI configuration, common flags
│ ├── analyze.go # File hotspot analysis command
│ ├── commits.go # JIT commit risk analysis command
│ ├── coupling.go # Change coupling analysis command
│ └── calibrate.go # Score weight calibration command
├── config/
│ └── config.go # Configuration structures
├── internal/
│ ├── git/
│ │ ├── models.go # CommitInfo, FileChange, CommitChangeSet
│ │ └── reader.go # Git history reader (go-git)
│ ├── scoring/
│ │ ├── normalization.go # NormLog, RecencyDecay, MinMax
│ │ ├── file_scorer.go # 5-factor file scoring
│ │ └── commit_scorer.go # JIT commit scoring
│ ├── aggregation/
│ │ ├── file_metrics.go # File-level metrics aggregation
│ │ └── commit_metrics.go # Commit-level metrics calculation
│ ├── burst/
│ │ └── sliding_window.go # O(n) burst score calculation
│ ├── entropy/
│ │ └── shannon.go # Shannon entropy calculation
│ ├── coupling/
│ │ └── analyzer.go # Change coupling analysis
│ └── output/
│ ├── formatter.go # Output interfaces
│ ├── console.go # Console table output
│ ├── json.go # JSON output
│ ├── csv.go # CSV output
│ └── markdown.go # Markdown output
└── go.mod
- igrigorik/bugspots - Original Ruby implementation
- Google Engineering Tools Blog - Original research
MIT License
Inspired by the original bugspots algorithm by Ilya Grigorik and JIT defect prediction research.