SmallClaw Self-Repair System — Design & Implementation Plan

Goal: SmallClaw should be able to detect errors in its own background tasks, analyze their root cause in its own source code, propose a fix, wait for your explicit approval, apply the patch, rebuild, and report back — all over Telegram.


The Vision (Plain English)

  1. SmallClaw is running a background task while you're away
  2. It hits an error — maybe a bug in a tool, a type mismatch, a broken import
  3. Instead of just dying silently, it captures the full error + stack trace
  4. You come back and say: "Hey Claw, what happened with that task? Can you figure out the fix?"
  5. SmallClaw reads its own source, analyzes the error, and replies: "Found it. Here's what broke and why. Want me to fix it?"
  6. You say: "Yes, go ahead"
  7. It applies a surgical patch, rebuilds, restarts, and messages you: "Done. Back online."

Or even more autonomously: it proactively messages you when it hits an error — "I hit a bug in task-runner.ts. I think I know how to fix it. Want me to analyze it properly and propose a patch?"


What Already Exists (Don't Rebuild)

| Component | File | Status |
| --- | --- | --- |
| Background task engine | src/gateway/task-runner.ts | ✅ Complete |
| Multi-step task loop | src/gateway/task-store.ts | ✅ Complete |
| Error capture in tasks | TaskState.error field | ✅ Complete |
| File read/write/edit tools | src/tools/files.ts | ✅ Complete |
| apply_patch tool (unified diff) | src/tools/files.ts | ✅ Complete |
| Self-update (git pull + rebuild + restart) | src/tools/self-update.ts | ✅ Complete |
| Telegram proactive messaging | telegram-channel.ts | ✅ Complete |
| needs_approval job status | src/types.ts | ✅ Complete |
| Personality / soul files | workspace/SOUL.md, IDENTITY.md | ✅ Complete |

The Two Critical Gaps

Gap 1 — The AI Can't Read Its Own Source Code

The read / edit tools are path-locked to workspace/. The src/ directory is completely invisible to the AI. This is the single biggest blocker.

Fix: Add a read-only read_source tool that exposes src/ files to the AI. Separately, add a patch_source tool that applies a unified diff to src/ files. patch_source requires an approval_token to execute, and that token is issued only once you explicitly approve the repair (you saying "yes, go ahead", or replying /approve <repair-id>).

Gap 2 — No SELF.md — The AI Doesn't Know Its Own Architecture

The AI has SOUL.md (who it is) and TOOLS.md (what tools it has) but nothing that tells it:

  • Where the source files live
  • What each file does
  • How the build process works
  • What the error log locations are

Fix: Create workspace/SELF.md — a map of SmallClaw's own architecture that gets injected into the system prompt like the other workspace files. The AI can then reason about where a bug would live given an error message.


Implementation Plan (Phased)

Phase 1 — Self-Knowledge (SELF.md)

Create workspace/SELF.md with:

  • Full source tree map with one-line descriptions of each file
  • Build process explanation (npm run build → dist/)
  • Error log locations (gateway.log, gateway.err.log)
  • How the task runner captures errors
  • Where to look for stack traces

This costs nothing to implement — it's just a markdown file — but it dramatically improves the AI's ability to reason about errors.

Deliverable: workspace/SELF.md
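If hand-maintaining the map turns out to be tedious, SELF.md could be rendered from a small structured description instead. The sketch below is only one possible shape: renderSelfMap, its input type, and the heading layout are all hypothetical, and the entries in the usage example are taken from the "What Already Exists" table.

```typescript
// Hypothetical input shape for generating workspace/SELF.md.
interface SourceEntry {
  path: string;        // e.g. "src/gateway/task-runner.ts"
  description: string; // one-line summary of what the file does
}

interface SelfMapInput {
  sources: SourceEntry[];
  buildCommand: string;   // e.g. "npm run build"
  buildOutputDir: string; // e.g. "dist/"
  logFiles: string[];     // e.g. ["gateway.log", "gateway.err.log"]
}

// Renders the architecture map as markdown text (layout is an assumption).
export function renderSelfMap(input: SelfMapInput): string {
  const lines: string[] = ["# SELF.md", "", "## Source map"];
  for (const s of input.sources) {
    lines.push(`- ${s.path}: ${s.description}`);
  }
  lines.push("", "## Build");
  lines.push(`- ${input.buildCommand} writes to ${input.buildOutputDir}`);
  lines.push("", "## Error logs");
  for (const f of input.logFiles) {
    lines.push(`- ${f}`);
  }
  return lines.join("\n") + "\n";
}

// Usage example with entries from the existing-components table.
const selfMap = renderSelfMap({
  sources: [
    { path: "src/gateway/task-runner.ts", description: "Background task engine" },
    { path: "src/tools/files.ts", description: "File read/write/edit tools" },
  ],
  buildCommand: "npm run build",
  buildOutputDir: "dist/",
  logFiles: ["gateway.log", "gateway.err.log"],
});
```

A plain hand-written markdown file works just as well; generating it only pays off if the source tree changes often.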


Phase 2 — Source Reading Tool (read_source)

A new tool that lets the AI read files from src/ (read-only, no writes).

// src/tools/source-access.ts
read_source({ path: 'gateway/telegram-channel.ts', start_line: 1, num_lines: 50 })
list_source({ path: 'gateway' })  // list files in a src/ subdirectory

Security: Read-only. Path is always resolved relative to src/. No writes, no deletes, no traversal outside src/.

Deliverable: src/tools/source-access.ts, registered in registry.ts
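The "no traversal outside src/" rule is the part most worth getting right, so here is a minimal sketch of how the path check and the line window might look. resolveSourcePath and sliceLines are hypothetical helper names, start_line is assumed to be 1-based (as the example call suggests), and symlinks are not handled here.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolves a requested path against the src/ root and
// rejects anything that would land outside it (e.g. "../package.json").
export function resolveSourcePath(srcRoot: string, requested: string): string {
  const root = path.resolve(srcRoot);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  // An escaping path is relative-with-".." or absolute (other drive on Windows).
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`Path escapes src/: ${requested}`);
  }
  return target;
}

// Hypothetical helper: returns num_lines lines starting at a 1-based start_line.
export function sliceLines(content: string, startLine: number, numLines: number): string {
  const lines = content.split(/\r?\n/);
  const start = Math.max(0, startLine - 1);
  return lines.slice(start, start + Math.max(0, numLines)).join("\n");
}
```

Checking the relative path after resolution, rather than scanning the input for "..", means inputs like gateway/../tools/files.ts (which stay inside src/) are still accepted while absolute paths and real escapes are rejected.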


Phase 3 — The Repair Proposal Flow

Add a propose_repair tool. This tool:

  1. Takes an error message + optional stack trace
  2. Uses the AI's knowledge of the source (via read_source) to identify the likely file and line
  3. Generates a unified diff patch
  4. Stores the patch in a pending state (does NOT apply it yet)
  5. Formats a clear human-readable proposal and sends it to Telegram
  6. Waits for your /approve <repair-id> or /reject <repair-id> command

The patch is stored as a JSON file in .smallclaw/pending-repairs/. The proposal sent to Telegram looks like this:

Pending repair #3:
━━━━━━━━━━━━━━━━━━━━━━━━
📍 File: src/tools/files.ts
❌ Error: Cannot read property 'path' of undefined (line 42)
🔍 Cause: args object not validated before destructuring
🩹 Fix: Add null-check guard before line 42

--- a/src/tools/files.ts
+++ b/src/tools/files.ts
@@ -40,2 +40,5 @@
 export async function executeRead(args: ReadToolArgs) {
+  if (!args || typeof args.path !== 'string') {
+    return { success: false, error: 'path is required' };
+  }
   const absPath = resolveWorkspacePath(args.path);

━━━━━━━━━━━━━━━━━━━━━━━━
Reply /approve 3 to apply, or /reject 3 to discard.

Deliverable: src/tools/self-repair.ts
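For the pending state, a sketch of what one JSON file per repair could look like. The PendingRepair field names and the three helper functions are assumptions made for illustration; only the directory and the "one JSON file per repair" idea come from the plan above.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical record shape, mirroring the fields shown in the proposal message.
export interface PendingRepair {
  id: number;
  file: string;      // e.g. "src/tools/files.ts"
  error: string;
  cause: string;
  fix: string;
  patch: string;     // unified diff text
  createdAt: string; // ISO timestamp
}

// Writes <dir>/<id>.json and returns the file path.
export function savePendingRepair(dir: string, repair: PendingRepair): string {
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${repair.id}.json`);
  fs.writeFileSync(file, JSON.stringify(repair, null, 2), "utf8");
  return file;
}

// Returns the stored repair, or null if no such pending repair exists.
export function loadPendingRepair(dir: string, id: number): PendingRepair | null {
  const file = path.join(dir, `${id}.json`);
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, "utf8")) as PendingRepair;
}

// Deletes the pending file; returns false if there was nothing to delete.
export function discardPendingRepair(dir: string, id: number): boolean {
  const file = path.join(dir, `${id}.json`);
  if (!fs.existsSync(file)) return false;
  fs.rmSync(file);
  return true;
}
```

Keeping the id numeric means the file name is built from a number rather than from user-supplied text, which sidesteps path tricks in the /approve and /reject arguments.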


Phase 4 — Apply + Rebuild (The Confirmation Gate)

When you reply /approve <id>:

  1. Load the pending repair from .smallclaw/pending-repairs/<id>.json
  2. Check the patch still applies cleanly (git apply --check)
  3. Apply it to src/
  4. Run npm run build
  5. If build passes → restart gateway → message "Fixed and back online ✅"
  6. If build fails → revert the patch → message "Build failed after patch, reverted ❌. Here's the compiler error:"

The /reject <id> command just deletes the pending file and messages "Repair discarded."
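Recognizing the two commands can be a small pure function, sketched below. parseRepairCommand is a hypothetical name; the optional @botname suffix is included because Telegram appends it to commands in group chats.

```typescript
export type RepairCommand =
  | { action: "approve"; id: number }
  | { action: "reject"; id: number };

// Parses "/approve 3" or "/reject 3" (optionally "/approve@SomeBot 3").
// Returns null for anything else, so ordinary chat text is never treated as approval.
export function parseRepairCommand(text: string): RepairCommand | null {
  const m = /^\/(approve|reject)(?:@\w+)?\s+(\d+)$/.exec(text.trim());
  if (!m) return null;
  const action = m[1];
  const id = Number(m[2]);
  if ((action !== "approve" && action !== "reject") || !Number.isSafeInteger(id)) {
    return null;
  }
  return { action, id };
}
```

Requiring the exact command plus a numeric id is deliberately strict: a message like "approve it I guess" falls through to the normal chat path instead of triggering a patch.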

Deliverable: Approval handling in telegram-channel.ts + src/tools/self-repair.ts
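The check, apply, build, revert sequence could be sketched as below. This assumes the diff has been written to a patch file that git apply can read, and it takes the command runner as a parameter so the flow can be exercised without touching a real repository. applyApprovedRepair, CommandRunner, and the outcome names are hypothetical; restarting the gateway and messaging Telegram are left to the caller.

```typescript
export interface CommandResult {
  ok: boolean;    // true when the command exited successfully
  output: string; // combined stdout/stderr, used for the failure message
}

// Injected so tests can fake it; a real runner might wrap child_process.spawnSync.
export type CommandRunner = (cmd: string, args: string[]) => CommandResult;

export type ApplyOutcome =
  | { status: "applied" }
  | { status: "stale"; output: string }                 // patch no longer applies cleanly
  | { status: "apply_failed"; output: string }
  | { status: "build_failed_reverted"; output: string } // compiler output for the report
  | { status: "revert_failed"; output: string };        // needs manual attention

export function applyApprovedRepair(patchFile: string, run: CommandRunner): ApplyOutcome {
  // Step 2: make sure the patch still applies before touching src/.
  const check = run("git", ["apply", "--check", patchFile]);
  if (!check.ok) return { status: "stale", output: check.output };

  // Step 3: apply it.
  const applied = run("git", ["apply", patchFile]);
  if (!applied.ok) return { status: "apply_failed", output: applied.output };

  // Step 4: rebuild.
  const build = run("npm", ["run", "build"]);
  if (build.ok) return { status: "applied" };

  // Step 6: build failed, so reverse the same patch.
  const revert = run("git", ["apply", "-R", patchFile]);
  if (!revert.ok) return { status: "revert_failed", output: revert.output };
  return { status: "build_failed_reverted", output: build.output };
}
```

Returning a tagged outcome instead of sending messages directly keeps the Telegram wording in one place: the handler in telegram-channel.ts can map each status to its reply.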


Phase 5 — Proactive Error Reporting (Optional / Future)

When a background task fails with an error that looks like a source code bug (stack trace points to src/ or dist/), SmallClaw automatically:

  1. Captures the error + stack
  2. Does a quick analysis (does the stack point to a known source file?)
  3. Messages you: "Task X failed with what looks like a source bug. Want me to analyze it?"

This makes the whole loop feel truly autonomous — it notices, it tells you, it waits for your go-ahead.
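The "does the stack point to a known source file?" check could start as a simple heuristic like the one below. findSourceFrames is a hypothetical name and the regular expression is an assumption about what the stack lines look like (paths containing src/ or dist/, followed by a line number); frames inside node_modules are skipped because their dist/ folders are not SmallClaw's own code.

```typescript
export interface SourceFrame {
  file: string; // path starting at src/ or dist/
  line: number;
}

// Extracts stack frames that point into src/ or dist/, ignoring node_modules.
export function findSourceFrames(stack: string): SourceFrame[] {
  const frames: SourceFrame[] = [];
  const frameRe = /(?:^|[\/\\(\s])((?:src|dist)[\/\\][^\s():]+\.(?:ts|js)):(\d+)/;
  for (const rawLine of stack.split("\n")) {
    if (rawLine.includes("node_modules")) continue;
    const m = frameRe.exec(rawLine);
    if (!m) continue;
    const [, file, line] = m;
    if (file && line) {
      frames.push({ file, line: Number(line) });
    }
  }
  return frames;
}

// A task failure "looks like a source bug" if at least one frame matched.
export function looksLikeSourceBug(stack: string): boolean {
  return findSourceFrames(stack).length > 0;
}
```

This only decides whether to send the "want me to analyze it?" message; the actual analysis still happens after you reply.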


Data Flow Diagram

Background Task Running
        │
        ▼
   Error Occurs
        │
        ├─── Stack trace captured in TaskState.error
        │
        ▼
You: "Claw, analyze that error"
        │
        ▼
AI reads SELF.md → knows which file to look at
        │
        ▼
AI calls read_source() → reads the actual source file
        │
        ▼
AI generates unified diff patch
        │
        ▼
propose_repair() → stores patch, sends Telegram proposal
        │
        ▼
You: "/approve 3"
        │
        ▼
patch_source() → applies diff to src/
        │
        ▼
npm run build
        │
   ┌────┴────┐
   │         │
PASS       FAIL
   │         │
Restart   Revert + notify
   │
Message: "Fixed ✅"

Security Model

| Action | Allowed | Requires |
| --- | --- | --- |
| Read source files | AI can do autonomously | |
| List source files | AI can do autonomously | |
| Analyze error + propose patch | AI can do autonomously | |
| Apply patch to source | 🔒 | Your explicit /approve <id> |
| Run build | 🔒 | Triggered only after your approval |
| Restart gateway | 🔒 | Triggered only after successful build |
| Modify workspace files | Already permitted (existing tools) | |

The AI cannot apply any source changes without an explicit approval command from you. Period.
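One way to enforce that rule in code is a one-time approval token that patch_source must present. The sketch below assumes tokens are held in memory and issued by the /approve handler; the ApprovalTokens class and its method names are hypothetical.

```typescript
import { randomBytes } from "node:crypto";

// In-memory store of one-time approval tokens (hypothetical design).
export class ApprovalTokens {
  private readonly tokens = new Map<string, number>(); // token -> repair id

  // Called only from the /approve handler, never from an AI-callable tool.
  issue(repairId: number): string {
    const token = randomBytes(16).toString("hex");
    this.tokens.set(token, repairId);
    return token;
  }

  // Valid once, and only for the repair it was issued for.
  consume(token: string, repairId: number): boolean {
    const issuedFor = this.tokens.get(token);
    if (issuedFor !== repairId) return false;
    this.tokens.delete(token);
    return true;
  }
}
```

Because the token is random and single-use, the model cannot guess one or replay an old approval against a different patch; patch_source would simply refuse to run when consume returns false.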


File Checklist

  • workspace/SELF.md: architecture map for the AI
  • src/tools/source-access.ts: read_source and list_source tools
  • src/tools/self-repair.ts: propose_repair tool + patch storage
  • src/gateway/telegram-channel.ts: /approve and /reject command handlers
  • src/tools/registry.ts: register the new tools
  • CHANGELOG.md: document the feature when shipped

Open Questions / Decisions Needed

  1. Model capability: Self-repair requires the AI to write valid unified diffs. Qwen3:4b may struggle with this — consider gating propose_repair behind the secondary/orchestration model if one is configured.

  2. Build output: Should build errors be sent in full to Telegram (could be long) or truncated? Suggest: first 50 lines of compiler output, with a /browse link to the full log.

  3. Repair history: Should accepted/rejected repairs be logged to workspace/memory/? Recommended yes — gives the AI long-term awareness of what bugs it has found and fixed.

  4. Auto-propose threshold: Should the AI proactively propose repairs without being asked, or only when you explicitly ask? Recommend: proactive notification ("I found a bug") but passive proposal ("want me to analyze it?") — never auto-apply.
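For question 2, the "first 50 lines" suggestion is small enough to sketch. truncateBuildOutput is a hypothetical helper; the default of 50 comes from the suggestion above, and how the full log is linked is left open.

```typescript
// Keeps the first maxLines lines of compiler output and notes how many were dropped.
export function truncateBuildOutput(output: string, maxLines = 50): string {
  const lines = output.split(/\r?\n/);
  if (lines.length <= maxLines) return output;
  const omitted = lines.length - maxLines;
  return lines.slice(0, maxLines).join("\n") + `\n… (${omitted} more lines omitted)`;
}
```

A character cap may also be needed on top of the line cap, since a single Telegram text message is limited to 4096 characters and compiler lines can be long.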