docs: static analysis feature opportunities from review taxonomy#797
carlos-alm merged 5 commits into main
Conversation
…onomy

Analyze 1,017 inline review comments across 300 PRs to identify which problem classes codegraph could catch with new deterministic static analysis features. Proposes 18 new check predicates organized into prioritized tiers, with estimated catch rates per problem class.
Claude finished @carlos-alm's task — Review Complete ✓
**Root Cause Analysis**

**Problem Being Solved:** This PR addresses the strategic question of which static analysis features codegraph should prioritize, by systematically analyzing 1,017 review comments across 300 PRs to quantify what types of issues could be caught by deterministic static analysis.

**Why This Approach:** Rather than building features based on intuition, this data-driven approach identifies the highest-impact static analysis opportunities.
**Risk Assessment:** LOW RISK — This is purely documentation that provides strategic planning data. No code changes, no dependencies added, no breaking changes.

**Backlog Compliance**
**Content Quality Assessment**

**Strengths**
**Technical Accuracy**

**Minor Issues**
**Strategic Value**

This analysis provides exceptional strategic value for several reasons:
The 18 proposed features are well-scoped and represent genuine value-adds that would materially improve codegraph's utility for AI agents.

**Recommendations**

APPROVE — with minor suggestions:
**Final Assessment**

This PR represents exactly the kind of evidence-based strategic thinking that mature projects need. The analysis quality is high, the methodology is sound, and the proposed features are well-aligned with codegraph's mission.
**Greptile Summary**

This PR adds three companion documents — a static-analysis feature opportunities report, a review coverage analysis, and a review taxonomy — derived from categorizing 1,017 inline review comments across 300 PRs. It is a docs-only change with no code impact. The previously flagged issues are resolved: the taxonomy's Category Summary now has 16 rows summing to exactly 1,017, and the Executive Summary correctly distinguishes ~41% deterministic from ~15% heuristic coverage rather than attributing all 55% to deterministic analysis. Three smaller documentation quality issues remain (all P2):
**Confidence Score: 5/5**

Safe to merge — docs-only change with no code impact; all prior P1 concerns are resolved. Both previously flagged issues (taxonomy totals and deterministic/heuristic conflation) are correctly addressed. Remaining findings are P2 doc-quality issues that do not affect correctness or decisions made from these reports: docs/reports/static-analysis-feature-opportunities.md (lines 12 and 136–147) and generated/REVIEW_COVERAGE_ANALYSIS.md (ROI table, lines 318–342).

**Important Files Changed**
**Flowchart**

%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["1,017 review findings\n(300 PRs)"] --> B{Feasibility}
B -->|"7.6% (77)"| C["Already catchable"]
B -->|"32.8% (334)"| D["Feasible new feature"]
B -->|"14.9% (152)"| E["Partially feasible\nheuristic"]
B -->|"44.6% (454)"| F["Not feasible\nrequires LLM"]
C --> G["~41% deterministic\n411 findings"]
D --> G
G --> I["Tier A - Quick Wins A1-A7\n~130 caught"]
G --> J["Tier B - CFG Checks B1-B3\n+60 caught"]
G --> K["Tier C - AST+Dataflow C1-C3\n+64 caught"]
G --> L["Tier D - Compound D1-D2\n+8 caught"]
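The flowchart's counts can be reconciled mechanically; a quick sketch (all numbers copied from the flowchart above, variable names mine):

```typescript
// Feasibility split of the 1,017 review findings, as drawn in the flowchart.
const feasibility = {
  alreadyCatchable: 77,    // "Already catchable"
  feasibleNewFeature: 334, // "Feasible new feature"
  heuristicOnly: 152,      // "Partially feasible heuristic"
  notFeasible: 454,        // "Not feasible / requires LLM"
};

const total = Object.values(feasibility).reduce((a, b) => a + b, 0); // 1017
const deterministic =
  feasibility.alreadyCatchable + feasibility.feasibleNewFeature;     // 411 (~41%)

// Tier A–D catch estimates from the flowchart's right-hand column.
const tierCatches = [130, 60, 64, 8];
const totalCaught = tierCatches.reduce((a, b) => a + b, 0);          // 262, i.e. ~260
```

The tier total (262) is the source of the "~260 catchable" figure discussed later in the thread.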
Reviews (3): Last reviewed commit: "fix(docs): correct stale-reference catch..."
| #545 | `\|\|` operator incorrectly used for numeric quality-gain fallback | P1 | `optimiser.js` |
| #775 | `parenDepthDelta` changes comma-split evaluation order | P2 | `query-builder.ts` |
| #722 | `->` right-assignment operator not handled correctly | P2 | `r.ts` |
| #708 | Kotlin `logical_node_type` covers only `&&`, not `\|\|` | P2 | `complexity.rs` |
| #642 | Qualifier ambiguity for bare `feat:` commits | P2 | `SKILL.md` |
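The #545 row is an instance of a well-known JS/TS pitfall: `||` falls back on any falsy value, so a legitimate numeric 0 is silently replaced. A minimal sketch (function names hypothetical, not taken from `optimiser.js`):

```typescript
// BUG pattern: `||` treats 0 as "missing", so a zero quality gain
// is silently replaced by the fallback value.
function gainWithOr(measured?: number): number {
  return measured || 1.0;
}

// Fix pattern: nullish coalescing only falls back on undefined/null.
function gainWithNullish(measured?: number): number {
  return measured ?? 1.0;
}
```

This is exactly the shape of check a deterministic analyzer can catch: flag `||` whose left operand is statically known to be numeric.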
</details>

#### String & Encoding (4) — P1: 1 | P2: 3

<details>
<summary>Show 4 comments</summary>
Category Summary totals don't add up to 1,017
The 11 categories in the Category Summary table sum to 850, not 1,017 — leaving 167 findings unaccounted for. Three subcategories visible in the Top 15 list have no corresponding top-level category row:
- Titan Pipeline / Procedure Gap — 55 comments
- Stale / Outdated References — 52 comments
- Shell Script / Check Logic Bug — 48 comments
Those three alone account for 155 of the 167 gap (the remainder likely lives in other unlisted subcategories). Either these need their own top-level category rows (e.g. Stale References, Process / Shell Scripts), or they need to be rolled up into an existing category (e.g. "Process" for Titan Pipeline and Shell Script) with the counts corrected accordingly. As written, the table claims 100% categorization but is missing ~16% of the findings.
Fixed — added 5 missing top-level category rows (Titan Pipeline: 55, Stale/Outdated References: 52, Shell Script: 48, Display/Output: 9, Other: 3) to the Category Summary table so the rows now sum to 1,017.
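The fix reconciles exactly; a quick sketch using the counts stated in the reply above:

```typescript
// The 11 original Category Summary rows summed to 850.
const previousTotal = 850;

// The 5 rows added by the fix, with their counts.
const addedRows: Record<string, number> = {
  "Titan Pipeline": 55,
  "Stale/Outdated References": 52,
  "Shell Script": 48,
  "Display/Output": 9,
  "Other": 3,
};

const added = Object.values(addedRows).reduce((a, b) => a + b, 0); // 167
const newTotal = previousTotal + added;                            // 1017
```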
### Impact by Problem Class

| Problem class | Review findings | Catchable | Catch rate | Primary data source |
|---|---|---|---|---|
"55% deterministic" claim conflates deterministic and heuristic
The Executive Summary states:
> ~55% of issues could be caught by deterministic static analysis
But REVIEW_COVERAGE_ANALYSIS.md breaks this down explicitly:
| Feasibility | Count | % |
|---|---|---|
| Already catchable | 77 | 7.6% |
| Feasible new feature | 334 | 32.8% |
| Subtotal (deterministic) | 411 | ~41% |
| Partially feasible (heuristic) | 152 | 14.9% |
| Combined | 563 | ~55% |
The 55% figure is only reached by including "partially feasible" (heuristic) findings that the coverage analysis itself describes as having false positives. The coverage analysis correctly states "~41% deterministic … another ~15% heuristically." The Executive Summary should mirror that distinction — claiming all 55% are "deterministic" overstates the precision of the proposed checks.
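The distinction the comment asks for can be stated numerically (counts copied from the feasibility table above; note 411/1,017 is ~40.4%, which the reports round to "~41%"):

```typescript
const already = 77;    // already catchable deterministically
const feasible = 334;  // catchable with a feasible new deterministic feature
const heuristic = 152; // only partially feasible, with false positives
const total = 1017;

// What may honestly be called deterministic coverage: ~40.4%.
const deterministicPct = ((already + feasible) / total) * 100;

// The combined figure includes heuristic checks: ~55.4%.
const combinedPct = ((already + feasible + heuristic) / total) * 100;
```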
Fixed — the Executive Summary now correctly distinguishes ~41% deterministic from ~15% heuristic coverage, matching the coverage analysis breakdown. The combined ~55% is still stated but no longer attributed entirely to deterministic analysis.
- Add 5 missing top-level category rows to REVIEW_TAXONOMY.md Category Summary (Titan Pipeline, Stale References, Shell Script, Display, Other) so rows total 1,017 instead of 850
- Distinguish deterministic (~41%) from heuristic (~15%) coverage in Executive Summary instead of claiming all 55% as deterministic
- Fix "95% feasible" to "91% feasible" for Prepared Statement in Loop (10/11 = 91%, consistent with companion report)
- Clarify ~244 vs ~260 addressable count: ~260 is total catchable by problem class, ~244 is the subset addressed by the 18 proposed features
Addressed all Greptile feedback in ea8fad8:
…on totals

Stale/Outdated References feasibility: 45% -> 52% (27/52, matching the actual feasible count from the coverage breakdown). Feature report Impact table and A4 description updated from 23/44% to 27/52%. Priority Distribution P1/P2/P3 corrected to match Category Summary column sums: P1 355 -> 344, P2 521 -> 522, P3/unmarked 139 -> 149.
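The corrected percentages in this commit check out arithmetically; a small sketch (the `pct` helper is mine, not project code):

```typescript
// Round a fraction to a whole percent, as the reports do.
const pct = (n: number, d: number): number => Math.round((n / d) * 100);

const staleRefsNew = pct(27, 52);     // 52: corrected Stale/Outdated References feasibility
const staleRefsOld = pct(23, 52);     // 44: the old 23/44% figure
const preparedStmt = pct(10, 11);     // 91: Prepared Statement in Loop, not 95
```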
Addressed both remaining Greptile findings in 67d3855:
**Summary**
Replaces #792 (rebased on current main).
**Test plan**