All notable changes to this project are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- `lib/validation.sh` module with `check_artifacts()` (inspects `.aidd/` project-level assertions against a per-artifact severity catalog and writes `.artifacts-check.json`) and `suggest_post_interview_cleanup()` (prompts operators to close the interview → doc2feature loop when responses exist without an `assertions.md`). Integrated into `aidd.sh` via a new `--check-artifacts` mode and an end-of-interview nudge.
- `audits/ASSERTIONS.md`: Project Assertions / Invariants Audit framework for verifying the codebase honors every invariant declared in `.aidd/assertions.md`.
- Audit-review bundle (2026-04-23): full `/audit-review` pass across every stale audit definition. Thirteen audit files touched:
  - `SECURITY.md` v3.2 → v3.3: added 10 checklist rows across password security, SQL injection prevention, rate limiting, transport headers, WebSocket security, and encryption; hoisted the MFA QR `dangerouslySetInnerHTML` exemption; narrowed rate-limit production-scope wording.
  - `DEPLOYMENT.md` v2.0 → v3.0: removed all Vercel/Sentry/Next.js/winston/Codecov content; rebuilt around containerized self-host with health checks, metrics, structured logging, alerting, audit log archival, and backup verification. Now consolidates the retired `MONITORING.md`.
  - `PERFORMANCE.md` v1.0 → v2.0: Prisma → Drizzle examples; added ResponsiveContainer ban + recharts v3 `<Cell>` deprecation + `useContainerWidth` hook; React 19 concurrent features section (`useTransition`, `useDeferredValue`); manual memoization flagged unless profiling-justified.
  - `LIGHTHOUSE.md` v1.0 → v2.0: FID purged, INP documented with lab-measurement caveat (use TBT as proxy), crawltest integration added, pre-2026-04-05 LCP/FCP crawler measurement-artifact note.
  - `DEVOPS.md` v1.0 → v2.0: restructured as CI-process-only with hard scope boundary to DEPLOYMENT; SHA-pinned actions, frozen-lockfile, branch protection, dependabot, gitleaks, CODEOWNERS guidance; IaC/Terraform section removed.
  - `TESTING.md` v1.0 → v2.0: full rewrite around the architectural rule No Unit Test Frameworks. Every vitest/jest/RTL/cypress/playwright mention now appears only as an anti-pattern flag; the canonical 14-step `smoke:qc` pipeline is encoded verbatim from STACK.md.
  - `DEAD_CODE.md` v1.2 → v1.3: `knip` + `knip.json` elevated to first-class; added `check:feature-integration` + React.lazy renamed-default adapter + dual SQLite/Postgres schema exclusion + CSS-only imports + orphaned Zustand stores; new Boundaries-vs-Adjacent-Audits matrix.
  - `CONVEX.md` v1.1 → v1.2: fixed two factual errors (fabricated `convex.json` environment block; `process.memoryUsage()` not in V8 runtime); added sections for Convex Components (stable 1.17+), Cron Jobs, HTTP Actions, Clerk+Convex auth wire-up.
  - `TECHDEBT.md` v1.x → v2.0: cross-audit relationships, false positives, missing stack checks. Subsequent v2.0 re-audit fixed a stale footer date and the `configLoader.ts` → `configSecrets.ts` env-exception reference; broadened the React Compiler exception clause; labeled the bash-only pre-audit commands.
  - `SCHEMA_CONSTRAINTS.md` v1.2 → v1.3: softened grep commands with false-positive warnings; added real-world template unique-column inventory (incl. `apiKeys.keyHash` intentionally non-unique due to bcrypt salting); added dual-composite uniqueness pattern (`oauthAccounts`); added Drizzle partial unique index syntax; clarified dual-dialect reality (apps carry both `schema/` and `schema-pg/`).
  - `SPERNAKITV1.md` v2.0 → v2.1 and `SPERNAKITV2.md` v3.0 → v3.1: archived with lifecycle banner, retained read-only for historical reference; superseded by `SPERNAKIT.md` v3.4.
  - `MONITORING.md`: deleted. Content consolidated into `DEPLOYMENT.md`.
- Breaking (operators must migrate): `DEFAULT_SPEC_FILE` flipped from `app_spec.txt` to `spec.md`. All prompts, scripts, help messages, docs, and benchmark fixtures now reference `.aidd/spec.md`. Projects with an existing `.aidd/app_spec.txt` must rename the file (`git mv .aidd/app_spec.txt .aidd/spec.md`); there is no backward-compat fallback. XML-formatted specs should be converted to markdown headings during the migration.
- `zrun` system prompt now explicitly forbids `npm`/`npx` (previously just preferred `bun`/`bunx`).
- `docs/audit_guide.md` updated: removed `MONITORING` from audit inventory tables and example command lines; updated `DEPLOYMENT` description to reflect its consolidated scope; updated `DEVOPS` description to drop IaC and focus on CI/quality gates; bumped `TECHDEBT` estimated time to 2-6h to match its v2.0 frontmatter.
- Corrected future-dated `last_updated` frontmatter on two audit definitions that would have misled the `/audit-review` freshness logic: `TECHDEBT.md` (`2026-06-28` → `2026-04-23`) and `SCHEMA_CONSTRAINTS.md` (`2026-07-11` → `2026-03-30`). Also corrected a matching future-dated audit report in aidd-web (`SCHEMA_CONSTRAINTS-2026-05-18.md` → `-2026-04-23.md`).
- Benchmark harness now understands zrun providers. `tools/run-benchmark.mjs` forwards `stack.provider` as `ZRUN_PROVIDER` on spawn so a zrun-ollama stack and a zrun-zhipu stack can coexist on the leaderboard instead of collapsing into one row. `buildRunMatrix` also honours per-task `scoredRepetitions`/`warmupRepetitions` overrides so a diagnostic task can run at a different cadence from the 1+3 standard.
- New `quiz` task + `benchmarks/fixtures/quiz/` fixture: a codebase comprehension diagnostic that asks the agent to read bundled snapshots of `zrun/src/tools/index.ts`, `zrun/src/config.ts`, and `benchmarks/manifest.json` and answer four identifier-lookup questions. Scored by regex hit-rate per question with an ANSI-stripping iteration-log fallback for models that emit answers to stdout instead of calling `write_file`.
- Four new Ollama stacks in `benchmarks/manifest.json` covering the models the quiz was run against: `zrun-ollama-llama32-native`, `zrun-ollama-qwen35-9b-native`, `zrun-ollama-gpt-oss-20b-native`, and `zrun-ollama-qwen25-cline-native`. Existing `zrun-glm51-native` gains an explicit `provider: "zhipu"` for schema consistency.
- `tools/quiz-rank.mjs` renders the quiz rows from `leaderboard.json` as a ranked markdown table with per-question hit breakdown pulled from `runs.jsonl` notes.
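
The quiz scoring described above (regex hit-rate with an ANSI-stripped log fallback) could look roughly like the sketch below. Everything here is hypothetical: `QuizQuestion`, `stripAnsi`, and `scoreQuiz` are illustrative names, and the real scorer in the harness may weigh or combine sources differently.

```typescript
// Hypothetical shape; the real quiz fixture format is not shown in this changelog.
interface QuizQuestion {
  id: string;
  pattern: RegExp; // should be non-global so repeated .test() calls are stateless
}

// Removes simple ANSI CSI sequences such as colour codes (ESC [ params letter).
function stripAnsi(text: string): string {
  return text.replace(/\u001b\[[0-9;]*[A-Za-z]/g, "");
}

// Prefers the written answer file; falls back to the ANSI-stripped iteration
// log when the model printed its answers to stdout instead (assumed behaviour).
function scoreQuiz(
  questions: QuizQuestion[],
  answerFile: string | null,
  iterationLog: string,
): { hits: string[]; hitRate: number } {
  const haystack = answerFile ?? stripAnsi(iterationLog);
  const hits = questions.filter((q) => q.pattern.test(haystack)).map((q) => q.id);
  const hitRate = questions.length === 0 ? 0 : hits.length / questions.length;
  return { hits, hitRate };
}
```

Stripping escape sequences before matching matters because a colour code in the middle of an identifier would otherwise defeat the regex.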
- Ollama provider default model changed from `llama3.1:8b` to `gpt-oss:20b`. In the 2026-04-19 codebase comprehension quiz, `gpt-oss:20b` was the only installed local model to score a clean sweep; `llama3.2:latest`, `qwen3.5:9b`, and `qwen2.5-7b-cline` all failed at the tool-call or response-writing layer. Users who want the older default can pin it via `providers.ollama.model` in `zrun/config.json` or pass `--model llama3.1:8b` per run. `zrun/config.json.example` and `docs/ollama-integration.md` updated to match.
- Removed `zrun/config.ollama.example.json`. The combined `zrun/config.json.example` covers every shape this file did: an Ollama-only setup is just `config.json.example` with `defaultProvider` flipped to `"ollama"` and/or the `zhipu` block removed. Reduces maintenance surface and the chance of the two examples drifting out of sync on future changes.
- ZRun config.json gains a multi-provider shape: both Zhipu AI and Ollama settings can live in the same file under `providers.zhipu` and `providers.ollama`, alongside an optional `defaultProvider` field. No more hand-editing JSON to swap backends.
- Runtime provider selection via four cascading signals (highest first): the `--provider <name>` CLI flag, the `ZRUN_PROVIDER` env var, auto-inference from `--model <name>` when the model is uniquely owned by one provider's registry (e.g. `--model llama3.1:8b` routes to ollama, `--model glm-5.1` routes to zhipu), and `defaultProvider` in the config.
- Ollama provider registry now carries a curated list of known model families (`llama3.1`, `qwen2.5`, `codellama`, `deepseek-coder-v2`, `gpt-oss`, etc.) used for `--model X → provider` inference. The actual installed model list is still discovered at runtime from `/api/tags`.
- Startup banner now shows both provider and model (e.g. `[zrun] Provider: ollama Model: llama3.1:8b`) so it's always obvious which backend fired for a given run.
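
A minimal sketch of the four-signal cascade described above, under stated assumptions: `Signals`, `KNOWN_MODELS`, `inferFromModel`, and `resolveProvider` are hypothetical names, the registry contents are abbreviated, matching on the part before `:` is a guess at how families are compared, and the final `"zhipu"` fallback is assumed rather than confirmed for this release.

```typescript
type ProviderName = "zhipu" | "ollama";

// Illustrative registry only; the real lists live in zrun's provider registry.
const KNOWN_MODELS: Record<ProviderName, string[]> = {
  zhipu: ["glm-5.1"],
  ollama: ["llama3.1", "qwen2.5", "codellama", "deepseek-coder-v2", "gpt-oss"],
};

interface Signals {
  cliProvider?: ProviderName; // --provider <name>
  envProvider?: ProviderName; // ZRUN_PROVIDER
  cliModel?: string; // --model <name>
  defaultProvider?: ProviderName; // defaultProvider in the config file
}

// Returns a provider only when exactly one registry claims the model family.
function inferFromModel(model: string): ProviderName | undefined {
  const family = model.split(":")[0] ?? model; // "llama3.1:8b" -> "llama3.1"
  const owners = (Object.keys(KNOWN_MODELS) as ProviderName[]).filter((p) =>
    KNOWN_MODELS[p].includes(family),
  );
  return owners.length === 1 ? owners[0] : undefined;
}

// Highest-priority signal that is present wins.
function resolveProvider(s: Signals): ProviderName {
  return (
    s.cliProvider ??
    s.envProvider ??
    (s.cliModel ? inferFromModel(s.cliModel) : undefined) ??
    s.defaultProvider ??
    "zhipu" // assumed last-resort fallback
  );
}
```

For example, `resolveProvider({ cliModel: "llama3.1:8b", defaultProvider: "zhipu" })` resolves to `"ollama"` because model inference outranks the config default.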
- `ZRunFileConfig` (on-disk) is now distinct from `ZRunConfig` (resolved). The normalizer promotes a legacy flat config (`{ apiKey, model, baseUrl }` at top level) into `providers.zhipu` automatically, so existing setups continue to work without any migration.
- API-key-missing error now names the right config path (`providers.zhipu.apiKey`) instead of the deprecated flat `apiKey`.
- `zrun/config.json.example` rewritten to the multi-provider shape with both zhipu and ollama entries populated as a starting point.
- Root `package.json` and `VERSION` aligned at `0.14.0`; `zrun/package.json` bumped `0.3.0 → 0.4.0`.
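
The legacy-config promotion mentioned above might be sketched as follows. The types and the `normalizeFileConfig` function are hypothetical stand-ins, not the actual `ZRunFileConfig` or normalizer, and the rule that explicit `providers.zhipu` values win over legacy top-level ones is an assumption.

```typescript
interface ProviderSettings {
  apiKey?: string;
  model?: string;
  baseUrl?: string;
}

// Hypothetical on-disk shape: multi-provider, legacy flat, or a mix of both.
interface FileConfigSketch extends ProviderSettings {
  defaultProvider?: string;
  providers?: Record<string, ProviderSettings>;
}

interface NormalizedSketch {
  defaultProvider: string | undefined;
  providers: Record<string, ProviderSettings>;
}

function normalizeFileConfig(file: FileConfigSketch): NormalizedSketch {
  const providers: Record<string, ProviderSettings> = { ...(file.providers ?? {}) };

  // Collect only the legacy top-level keys that are actually present.
  const legacy: ProviderSettings = {};
  if (file.apiKey !== undefined) legacy.apiKey = file.apiKey;
  if (file.model !== undefined) legacy.model = file.model;
  if (file.baseUrl !== undefined) legacy.baseUrl = file.baseUrl;

  if (Object.keys(legacy).length > 0) {
    // Assumed precedence: explicit providers.zhipu values override legacy ones.
    providers["zhipu"] = { ...legacy, ...(providers["zhipu"] ?? {}) };
  }
  return { defaultProvider: file.defaultProvider, providers };
}
```

With this shape, a file containing only `{ "apiKey": "...", "model": "glm-5.1" }` normalizes to the same structure as one that nests those values under `providers.zhipu`.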
- ZRun gains a provider abstraction layer. `ZRunConfig` now carries a `provider` field (defaults to `"zhipu"` so existing configs keep working unchanged) and both `apiKey` and `baseUrl` are optional; each provider declares its own defaults via `src/providers/index.ts`.
- Ollama support via `provider: "ollama"`. Uses Ollama's OpenAI-compatible endpoint, health-probes `/api/tags` with a 5s timeout on both the probe and the model listing so a hung local instance can no longer stall CLI startup. When `model` is unspecified, auto-selection prefers the provider's declared `defaultModel` if installed, else sorts alphabetically so repeat runs are deterministic.
- `zrun/test-ollama.ts`: a read-only `/api/tags` probe that reports installed models and exits non-zero when the server is unreachable or has no models pulled.
- `docs/ollama-integration.md`: end-to-end Ollama setup, model recommendations, troubleshooting, migration from Zhipu AI, and security notes.
- `.aidd/features/` records for `zrun-provider-abstraction`, `zrun-ollama-integration`, and `zrun-ollama-docs-testing`.
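
The model auto-selection rule for Ollama described above can be illustrated with a small pure function. `pickModel` is a hypothetical name, and the network probe of `/api/tags` is deliberately left out; the function only takes the already-discovered list of installed models.

```typescript
// Picks a model deterministically from what the local server reports as installed.
function pickModel(
  installed: string[],
  requested?: string,
  defaultModel?: string,
): string | undefined {
  // An explicitly configured model always wins.
  if (requested) return requested;
  // Otherwise prefer the provider's declared default, if it is actually installed.
  if (defaultModel && installed.includes(defaultModel)) return defaultModel;
  // Otherwise take the first name in sorted order so repeat runs agree.
  // Copy before sorting to avoid mutating the caller's array.
  return [...installed].sort()[0];
}
```

Returning `undefined` for an empty list leaves the caller free to report that no models are pulled.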
- `src/config.ts` `loadConfig` is now async to accommodate the Ollama health probe. The only call site is `src/index.ts`, which awaits it at startup.
- `src/client.ts` no longer reuses the provider name as a dummy API key; providers that don't require auth (Ollama) get a named `NO_AUTH_DUMMY_KEY` sentinel so the OpenAI SDK constructor accepts a non-empty value.
- Root `package.json` version aligned to `VERSION` (`0.11.2 → 0.13.0`); the two had drifted during the 0.12.0 release.
- Zhipu default `baseUrl` in `src/config.ts` was missing the `/api/` path prefix (`https://api.z.ai/coding/paas/v4` → `https://api.z.ai/api/coding/paas/v4`). `config.json.example` already had the correct URL, so users who copied the example were unaffected; only users relying on the in-code default would have hit a 404 at first request.
- Coordinator mode (`--coordinator`) for fleet-level analysis. A coordinator agent reads a fleet summary JSON, evaluates every project across all signal dimensions, identifies cross-project patterns, and produces structured suggestions for the aidd-web approval queue. Requires `--fleet-summary` and `--coordinator-output` paths; optional `--suggestion-schema` for output schema reference. Forces single-iteration execution.
- Coordinator prompt template (`prompts/coordinator.md`) with task type selection matrix, risk-level heuristics, cross-project pattern detection, and structured JSON output schema.
- Feature IDs now use clean descriptive slugs (e.g., `user-authentication`, `dashboard-page`) instead of the `feature-{timestamp}-{random}` format. Remediations and audit findings retain their existing prefixed formats. Updated validation regex, prompts, documentation, and the `generate-features.sh` tool to match.
- Updated prettier from 3.8.2 to 3.8.3.
- Codex CLI rate limit detection now works correctly. Previously, rate limits from Codex (`"You've hit your usage limit"`) were missed due to an unbound `PATTERN_RATE_LIMIT` variable in the `bash -c` subprocess and a pattern mismatch (`"hit your limit"` vs `"hit your usage limit"`). Iterations fell through as generic `exit=1` failures instead of sleeping until the rate limit reset.
- Widened `PATTERN_RATE_LIMIT` from `"hit your limit"` to `"hit your"` to match both Claude Code/ZRun and Codex rate limit message formats.
- Added `[RATE_LIMITED]` tag detection to `monitor_coprocess_output` as a secondary detection path for JSON-parsed CLIs.
- Extended `parse_rate_limit_reset` regex to handle Codex's `"try again at 5:19 AM"` format and uppercase AM/PM.
- Fixed latent unbound variable in Claude Code's `json-parser.sh` (same root cause, currently unreachable but would surface if stream-json format changed).
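
The detection and reset parsing described above live in bash; the TypeScript below is only an illustrative sketch of the same idea, not a port of `parse_rate_limit_reset`. The names `isRateLimited` and `secondsUntilReset` are hypothetical, and the full Codex message format beyond the quoted fragments is assumed.

```typescript
// Widened pattern: matches both "hit your limit" and "hit your usage limit".
const RATE_LIMIT = /hit your/;

// Case-insensitive so "5:19 AM" and "5:19 am" both match.
const RESET_TIME = /try again at (\d{1,2}):(\d{2})\s*(am|pm)/i;

function isRateLimited(output: string): boolean {
  return RATE_LIMIT.test(output) || output.includes("[RATE_LIMITED]");
}

// Seconds from `now` until the next occurrence of the stated local time,
// or null when the message carries no recognizable reset time.
function secondsUntilReset(message: string, now: Date): number | null {
  const m = RESET_TIME.exec(message);
  if (!m) return null;
  let hour = Number(m[1]) % 12; // 12 AM -> 0; 12 PM -> 12 after the shift below
  if ((m[3] ?? "").toLowerCase() === "pm") hour += 12;
  const reset = new Date(now.getTime());
  reset.setHours(hour, Number(m[2]), 0, 0);
  // If that time has already passed today, the reset is tomorrow.
  if (reset.getTime() <= now.getTime()) reset.setDate(reset.getDate() + 1);
  return Math.round((reset.getTime() - now.getTime()) / 1000);
}
```

A caller would sleep for the returned number of seconds before retrying the iteration rather than recording a generic failure.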
- Benchmark harness with fixture manifests, disposable workspaces, cohort comparisons, and machine-readable result artifacts.
- Benchmark documentation covering fixtures, outputs, and report generation.
- New benchmark package scripts for full runs, dry runs, and report-only rebuilds.
- Interview mode (`--interview [FILE]`) for iterative codebase Q&A. Processes one question per iteration from `.aidd/questions.md` (or a supplied file), auto-detecting `## heading` or `?`-terminated line formats. Writes per-question responses to `.aidd/responses/responseN.md` and maintains a `responses.md` index of status and links. The interview prompt is read-only: the agent is forbidden from modifying code, features, changelogs, or other metadata, and git-based stuck detection is skipped because only `responses/` is expected to change. Progress is tracked by response-file count, so partial runs resume at the next unanswered question and the loop exits cleanly once every question has a matching response file.
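
As a rough illustration of the question detection and resume logic described above, the sketch below parses either format and derives the next question from the response-file count. `parseQuestions` and `nextQuestionIndex` are hypothetical names, the rule that headings take precedence when both formats appear is an assumption, and the index returned is zero-based regardless of how `responseN.md` is numbered.

```typescript
// Extracts questions from either "## heading" lines or "?"-terminated lines.
function parseQuestions(markdown: string): string[] {
  const lines = markdown.split(/\r?\n/).map((line) => line.trim());
  const headings = lines
    .filter((line) => line.startsWith("## "))
    .map((line) => line.slice(3).trim());
  // Assumed precedence: if any headings exist, treat the file as heading-format.
  if (headings.length > 0) return headings;
  return lines.filter((line) => line.endsWith("?"));
}

// Progress is derived from how many response files already exist.
// Returns the zero-based index of the next question, or null when done.
function nextQuestionIndex(questionCount: number, responseFileCount: number): number | null {
  return responseFileCount < questionCount ? responseFileCount : null;
}
```

Because progress depends only on counts, an interrupted run picks up at the first question without a response file.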
- Codex CLI execution now uses the bash-oriented bypass path to avoid Windows sandbox shell mismatches during unattended runs.
- Feature validation now requires a `dependencies` array, with `[]` as the canonical empty value.
- ZRun defaults now allow longer sessions by raising `maxTurns` to 500.
- Prompting and audit guidance now enforce a stronger verification gate before audit findings are converted into remediation work.
- Codex CLI backend with `codex exec --json` support and JSONL parsing.
- CLI-specific prompt prelude loading from `prompts/_cli/*`.
- React Composition Patterns and refreshed React Best Practices audits.
- Idle nudge timeout increased to reduce premature interruptions.