
feat: feedback system, adaptive retrieval, governance audit trail, API hardening, and comprehensive e2e tests #65

Merged
XuPeng-SH merged 17 commits into matrixorigin:main from XuPeng-SH:enhance_system
Mar 20, 2026

Conversation

@XuPeng-SH
Collaborator

What type of PR is this?

  • feat (new feature)
  • fix (bug fix)
  • docs (documentation)
  • style (formatting, no code change)
  • refactor (code change that neither fixes a bug nor adds a feature)
  • perf (performance improvement)
  • test (adding or updating tests)
  • chore (maintenance, tooling)
  • build / ci (build or CI changes)

Which issue(s) this PR fixes

Fixes #

What this PR does / why we need it

1. Feedback & Adaptive Retrieval System

  • New mem_retrieval_feedback table for explicit relevance signals (useful/irrelevant/outdated/wrong)
  • New mem_user_retrieval_params table for per-user adaptive scoring parameters
  • record_feedback() validates signals, verifies memory ownership, updates denormalized counters
  • search_hybrid_from_scored() applies feedback adjustment: (1 + fw * (useful - 0.5*negative)).clamp(0.5, 2.0)
  • DefaultScoringPlugin auto-tunes feedback_weight based on signal ratios (≥10 feedback threshold)
  • REST endpoints: POST /v1/memories/:id/feedback, GET /v1/feedback/stats, GET /v1/feedback/by-tier, GET/PUT /v1/retrieval-params, POST /v1/retrieval-params/tune
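The feedback adjustment above can be sketched as a small pure function. This is a minimal illustration, not the project's code: the function names are hypothetical, and it assumes `useful` and `negative` are the denormalized per-memory counters, with `negative` aggregating the irrelevant/outdated/wrong signals.

```rust
/// Sketch of the multiplier quoted in the PR description:
/// (1 + fw * (useful - 0.5 * negative)).clamp(0.5, 2.0)
/// Assumption: `useful` and `negative` are per-memory feedback counts.
pub fn feedback_multiplier(feedback_weight: f64, useful: u32, negative: u32) -> f64 {
    // Negative signals count half as much as useful ones.
    let signal = f64::from(useful) - 0.5 * f64::from(negative);
    // The clamp keeps feedback from more than halving or doubling a score.
    (1.0 + feedback_weight * signal).clamp(0.5, 2.0)
}

/// Hypothetical helper: scale a base relevance score by the multiplier.
pub fn adjust_score(base: f64, feedback_weight: f64, useful: u32, negative: u32) -> f64 {
    base * feedback_multiplier(feedback_weight, useful, negative)
}
```

With a weight of 0.1, a memory with 4 useful and 2 negative signals gets a multiplier of about 1.3, while heavily downvoted or upvoted memories saturate at 0.5 and 2.0 respectively.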

2. Governance Audit Trail Enhancement

  • Governance operations now record structured payloads: {"quarantined": N}, {"cleaned_stale": N}, etc.
  • All 5 governance operations (archive_working, cleanup_stale, quarantine, compress_redundant, cleanup_orphaned_incrementals) include detailed audit logs
  • mem_edit_log redesigned: target_ids JSON → memory_id VARCHAR(64) + payload JSON, no PK, CLUSTER BY, UUID v7 for edit_id

3. API Error Handling Improvements

  • New MemoriaError::Validation variant for input validation errors
  • New api_err_typed() function maps error variants to proper HTTP status codes:
    • NotFound → 404
    • Validation/InvalidMemoryType/InvalidTrustTier → 422
    • Blocked → 403
    • Others → 500
  • Applied to record_feedback and store_memory handlers
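The variant-to-status mapping listed above could look roughly like the following. This is a sketch under assumptions: the enum payloads, the `Internal` catch-all variant, and the function name `status_for` are hypothetical, and the real api_err_typed() presumably returns a framework response type rather than a bare status code.

```rust
/// Hypothetical mirror of the error variants named in the PR description.
#[derive(Debug)]
pub enum MemoriaError {
    NotFound(String),
    Validation(String),
    InvalidMemoryType(String),
    InvalidTrustTier(String),
    Blocked(String),
    /// Assumed catch-all standing in for every other variant.
    Internal(String),
}

/// Map an error variant to the HTTP status code listed in the PR.
pub fn status_for(err: &MemoriaError) -> u16 {
    match err {
        MemoriaError::NotFound(_) => 404,
        MemoriaError::Validation(_)
        | MemoriaError::InvalidMemoryType(_)
        | MemoriaError::InvalidTrustTier(_) => 422,
        MemoriaError::Blocked(_) => 403,
        // Everything else is treated as a server-side failure.
        _ => 500,
    }
}
```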

4. Prometheus Metrics & Admin Config

  • GET /metrics endpoint: Prometheus text exposition format with memoria_memories_total, memoria_users_total, memoria_auth_failures_total, etc.
  • GET /admin/config (master-key-only): runtime config view with redacted DB password
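For readers unfamiliar with the Prometheus text exposition format, a minimal renderer might look like this. The `Metric` struct and `render_metrics` function are hypothetical, and the metric kind (gauge versus counter) chosen for each Memoria metric is an assumption; only the metric names come from this PR.

```rust
/// Hypothetical in-memory description of one metric sample.
pub struct Metric<'a> {
    pub name: &'a str,
    pub help: &'a str,
    /// Prometheus metric type, e.g. "counter" or "gauge".
    pub kind: &'a str,
    pub value: u64,
}

/// Render samples as Prometheus text exposition:
/// a HELP line, a TYPE line, then "name value".
pub fn render_metrics(metrics: &[Metric<'_>]) -> String {
    let mut out = String::new();
    for m in metrics {
        out.push_str(&format!("# HELP {} {}\n", m.name, m.help));
        out.push_str(&format!("# TYPE {} {}\n", m.name, m.kind));
        out.push_str(&format!("{} {}\n", m.name, m.value));
    }
    out
}
```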

5. MCP Tool Surface Reduction

  • 5 tools hidden from list() but still callable via REST/direct invocation: memory_rebuild_index, memory_get_retrieval_params, memory_tune_params, memory_extract_entities, memory_link_entities
  • Tool count: 18 → 13 in public listing

6. Comprehensive E2E Test Coverage

  • Existing fixes: test_api_feedback_invalid_signal (422 for invalid signal), test_api_tune_retrieval_params (COALESCE fix for empty feedback)
  • New API tests: /metrics, /v1/snapshots/:name/rollback, /v1/entities, /admin/config (with master-key auth)
  • Concurrency tests: parallel stores, entity extraction race condition, concurrent feedback
  • Pressure tests: batch store at 100-item limit
  • Graceful degradation: nonexistent snapshot, feedback on deleted memory, correct behavior after purge

7. Documentation & Templates Sync

  • All 8 markdown templates (Kiro, Cursor, Claude) updated with feedback/adaptive retrieval guidance
  • Steering rules synchronized across all 3 agent platforms
  • API reference and architecture skills updated

Bug Fixes

  • Fixed get_feedback_stats() NULL handling with COALESCE for empty feedback tables
  • Fixed race condition in upsert_entity(): INSERT-first, catch "Duplicate entry" error
  • Fixed batch_upsert_memory_entity_links() to use multi-row INSERT with ON DUPLICATE KEY UPDATE

Feedback weight closed-loop:
- search_hybrid_from_scored() accepts feedback_weight parameter
- retrieve_inner() loads per-user params from mem_user_retrieval_params
- Fulltext fallback path also uses per-user feedback_weight
- Hardcoded 0.1 replaced at all 3 scoring call sites
- .max().min() → .clamp() (clippy)

Observability:
- GET /metrics — Prometheus text exposition format
  memoria_memories_total, memoria_users_total, memoria_feedback_total,
  memoria_graph_nodes_total, memoria_graph_edges_total,
  memoria_snapshots_total, memoria_branches_total,
  memoria_async_tasks, memoria_governance_last_run_timestamp, memoria_info

Config visibility:
- GET /admin/config — runtime config with redacted DB password

Tests:
- test_tuning_affects_scoring rewritten to verify scoring math directly
  (avoids MatrixOne fulltext index flakiness on consecutive retrieves)
- Tool count tests updated: 15 → 18

Two processes with identical MEMORIA_INSTANCE_ID would previously
share the same holder_id, causing the re-entrant lock path to let
both processes acquire the governance lock simultaneously.

Fix: Config::from_env() appends the OS process ID to instance_id:
  MEMORIA_INSTANCE_ID=pod-0 → holder_id = 'pod-0-12345'

This ensures each process has a unique holder_id regardless of
the configured base name, while preserving re-entrant behavior
within a single process.
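A minimal sketch of the PID suffix described above, assuming the holder ID is simply the base name and the process ID joined by a hyphen (the function names are hypothetical; the actual logic lives in Config::from_env()):

```rust
/// Build a holder ID from a configured base name and an explicit PID.
/// Split out so the formatting can be checked without a real process ID.
pub fn holder_id_with_pid(base: &str, pid: u32) -> String {
    format!("{}-{}", base, pid)
}

/// Build the holder ID for the current process using the OS process ID.
pub fn holder_id(base: &str) -> String {
    holder_id_with_pid(base, std::process::id())
}
```

Because the PID is fixed for the lifetime of a process, repeated calls within one process yield the same holder_id, which is what keeps the re-entrant path working.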

Tests:
- test_governance_daily_tunes_params_in_db: verifies DefaultGovernanceStrategy
  Daily task actually writes updated feedback_weight to mem_user_retrieval_params
- test_duplicate_instance_id_lock_is_exclusive: verifies PID suffix prevents
  two processes with same base instance_id from both acquiring the lock

tune_params() has three branches, followed by a clamp:
- useful_ratio > 0.7  → weight * 1.1 (was tested)
- negative_ratio > 0.5 → weight * 0.9 (NEW)
- neutral zone         → weight unchanged (NEW)
- weight clamped [0.05, 0.2] (NEW)

Added test_tune_params_negative_feedback_decreases_weight covering
all four cases via in-memory MockStore (no DB required).
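
The branches listed above can be sketched as follows. This is an illustration under assumptions, not the actual tune_params(): the function name and signature are hypothetical, the ratios are assumed to be counts divided by total feedback, and the 10-item minimum is taken from the "≥10 feedback threshold" mentioned in the PR description and assumed to leave the weight untouched when not met.

```rust
/// Assumed minimum number of feedback items before tuning applies.
pub const MIN_FEEDBACK: u32 = 10;
pub const WEIGHT_MIN: f64 = 0.05;
pub const WEIGHT_MAX: f64 = 0.2;

/// Sketch of the three tuning branches plus the final clamp.
pub fn tune_feedback_weight(weight: f64, useful: u32, negative: u32, total: u32) -> f64 {
    // Assumption: too little feedback means no tuning at all.
    if total < MIN_FEEDBACK {
        return weight;
    }
    let useful_ratio = f64::from(useful) / f64::from(total);
    let negative_ratio = f64::from(negative) / f64::from(total);
    let tuned = if useful_ratio > 0.7 {
        weight * 1.1 // feedback is mostly useful: trust it more
    } else if negative_ratio > 0.5 {
        weight * 0.9 // feedback is mostly negative: trust it less
    } else {
        weight // neutral zone: leave unchanged
    };
    tuned.clamp(WEIGHT_MIN, WEIGHT_MAX)
}
```
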
Logging:
- Add TraceLayer: every request logs method+path+status+latency_ms
  (error level for 5xx, warn for 4xx, debug for 2xx/3xx)
- Auth failures now logged with warn! + token prefix (first 8 chars)
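
As a rough illustration of the logging rules above, the status-to-level mapping and the token prefix could be written as below. The `LogLevel` enum and both function names are hypothetical stand-ins (the PR uses TraceLayer and the warn! macro directly), and treating anything outside 4xx/5xx as debug is an assumption.

```rust
/// Hypothetical stand-in for the tracing levels named in the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum LogLevel {
    Error,
    Warn,
    Debug,
}

/// Pick a log level from the HTTP response status.
pub fn level_for_status(status: u16) -> LogLevel {
    match status {
        500..=599 => LogLevel::Error,
        400..=499 => LogLevel::Warn,
        // 2xx/3xx per the commit message; other codes assumed to be debug too.
        _ => LogLevel::Debug,
    }
}

/// Return at most the first 8 characters of a token for safe logging.
/// Iterating by `char` avoids slicing in the middle of a UTF-8 sequence.
pub fn token_prefix(token: &str) -> String {
    token.chars().take(8).collect()
}
```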

Input validation (DoS prevention):
- content: reject empty or >32 KiB
- top_k: clamp to [1, 100] on retrieve and search
- batch_store: reject >100 items; per-item content size check
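
The limits above could be enforced with checks along these lines. This is a sketch with hypothetical names and error strings; it assumes content size is measured in bytes and that "empty" means a zero-length string.

```rust
pub const MAX_CONTENT_BYTES: usize = 32 * 1024; // 32 KiB
pub const MAX_BATCH_ITEMS: usize = 100;

/// Reject empty content and content larger than 32 KiB (byte length assumed).
pub fn validate_content(content: &str) -> Result<(), String> {
    if content.is_empty() {
        return Err("content must not be empty".to_string());
    }
    if content.len() > MAX_CONTENT_BYTES {
        return Err(format!("content exceeds {} bytes", MAX_CONTENT_BYTES));
    }
    Ok(())
}

/// Clamp top_k into [1, 100] instead of rejecting the request.
pub fn clamp_top_k(top_k: i64) -> i64 {
    top_k.clamp(1, 100)
}

/// Reject oversized batches, then apply the per-item content check.
pub fn validate_batch(items: &[&str]) -> Result<(), String> {
    if items.len() > MAX_BATCH_ITEMS {
        return Err(format!("batch exceeds {} items", MAX_BATCH_ITEMS));
    }
    for (i, item) in items.iter().enumerate() {
        validate_content(*item).map_err(|e| format!("item {}: {}", i, e))?;
    }
    Ok(())
}
```

Clamping top_k rather than rejecting it keeps well-meaning clients working while still bounding query cost; content and batch size are rejected outright because silently truncating stored data would be surprising.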

Prometheus metrics (new counters):
- memoria_auth_failures_total: incremented on every 401/403
- memoria_sensitivity_blocks_total: incremented when sensitivity
  filter blocks a store request

Merged plan-memory-integration into goal-driven-evolution.md:
- Query memory before starting multi-step tasks (GOAL, LESSON, ANTIPATTERN)
- Register goals for multi-session work
- Track steps during execution (working type)
- Capture user feedback immediately (procedural type)
- Iteration review and cleanup workflow
- Changed inclusion from agent_requested to always

Applies to all AI tools: Kiro plan panel, Cursor Composer, Claude multi-step.

Merged plan-memory-integration into goal-driven-evolution.md:
- Query memory before starting multi-step tasks (GOAL, LESSON, ANTIPATTERN)
- Changed inclusion from agent_requested to always

Preserved all original content:
- 📋 PLAN structured storage with steps and risks
- 👍 FEEDBACK for positive experiences
- 🔄 RETRO with Completed [M/N], Key insight, Next improvements
- Bootstrap queries RETRO for context restoration
- Branch merge complete workflow (diff, checkout, merge, delete)
- Goal completion records Iterations count and Final approach

Added:
- When Goal is Abandoned section
- Don't create goals for quick fixes rule

Applies to all AI tools: Kiro plan panel, Cursor Composer, Claude multi-step.

- README: added memory_feedback to Core Tools table
- API Reference skill: added Feedback & Adaptive Retrieval section
  - POST /v1/memories/{id}/feedback
  - GET /v1/feedback/stats
  - GET /v1/feedback/by-tier
  - POST /v1/retrieval-params/tune
  - GET /v1/retrieval-params
- Steering rules (kiro/cursor/claude): added memory_feedback to Read tools
  with usage guidance (when to call, signal types)
@XuPeng-SH XuPeng-SH merged commit 539fd8a into matrixorigin:main Mar 20, 2026
4 checks passed