MMR post-retrieval dedup for brain_search (R86 Q4)#242

Merged
EtanHey merged 4 commits into main from feat/mmr-dedup on Apr 16, 2026
Conversation

@EtanHey (Owner) commented Apr 14, 2026

Design Summary

  • Implements the R86 Q4 MMR post-retrieval reranking stage in SearchMixin.hybrid_search.
  • Overfetches a 50-candidate pool per retrieval leg, keeps existing RRF plus importance/recency/KG boosts, then reranks the top candidate pool with NumPy cosine MMR at lambda=0.65.
  • Uses float embeddings from chunk_vectors for the similarity matrix so near-duplicate chunks lose top slots without changing existing filters or MCP response formatting.
  • Adds a regression test proving the highest-relevance result stays first while a near-duplicate is pushed behind a distinct relevant result.
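For readers unfamiliar with the algorithm, the greedy selection loop described above can be sketched as follows. This is an illustrative NumPy sketch, not the PR's code; only `_MMR_LAMBDA = 0.65` is taken from the PR, and the toy scores and vectors at the bottom are invented.

```python
import numpy as np

_MMR_LAMBDA = 0.65  # relevance/diversity trade-off used in this PR

def mmr_rerank(scores, embeddings, lam=_MMR_LAMBDA):
    """Greedy MMR: repeatedly pick the candidate maximizing
    lam * relevance - (1 - lam) * max-similarity-to-already-selected."""
    rel = np.asarray(scores, dtype=np.float32)
    rel = rel / rel.max() if rel.max() > 0 else np.ones_like(rel)
    m = np.asarray(embeddings, dtype=np.float32)
    m = m / np.maximum(np.linalg.norm(m, axis=1, keepdims=True), 1e-12)
    cos = np.clip(m @ m.T, -1.0, 1.0)
    np.fill_diagonal(cos, 0.0)  # a candidate never penalizes itself

    selected = [int(np.argmax(rel))]  # most relevant result stays first
    remaining = set(range(len(rel))) - set(selected)
    while remaining:
        idxs = sorted(remaining)
        penalty = cos[idxs][:, selected].max(axis=1)
        mmr = lam * rel[idxs] - (1.0 - lam) * penalty
        best = idxs[int(np.argmax(mmr))]
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: the first two candidates are near-duplicates, the third is
# distinct but slightly less relevant.
order = mmr_rerank([1.0, 0.99, 0.8],
                   [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]])
# The near-duplicate is pushed behind the distinct result.
```

With these toy inputs the duplicate (index 1) drops below the distinct candidate (index 2), which is exactly the regression-test behavior described above.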

Test Output

$ pytest tests/test_hybrid_search.py tests/test_search_quality.py -x -v
============================== 31 passed in 2.76s ==============================
$ pytest
Repo-wide suite is blocked in this environment at:
- tests/test_eval_baselines.py::TestEntityRouting::test_avi_simon_entity

Failure:
- RuntimeError: Cannot send a request, as the client has been closed.

Observed cause:
- The live embedding model load attempted network access to https://huggingface.co/BAAI/bge-large-en-v1.5/resolve/main/adapter_config.json, which is blocked while network access is restricted.

Note

Add MMR post-retrieval deduplication to hybrid_search in SearchMixin

  • Adds _mmr_rerank_scored_results to search_repo.py, implementing Maximal Marginal Relevance re-ranking that balances relevance and diversity using cosine similarity over chunk embeddings (_MMR_LAMBDA = 0.65).
  • Adds _load_chunk_embeddings helper to fetch float32 embeddings from the chunk_vectors table for MMR input.
  • hybrid_search now fetches a minimum of 50 candidates per leg (_MMR_CANDIDATE_LIMIT) instead of only n_results * 3, increasing the pool available for re-ranking.
  • Changes the default since_hours for brain_enrich across the CLI, MCP tool schema, and handler from 24 to 8760 (1 year), configurable via BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS.
  • Behavioral Change: hybrid_search result ordering will change for callers that relied on pre-MMR ranking; the minimum candidate fetch size also increases, which may affect latency.
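The new pool sizing described above can be captured in one line. The exact expression is an assumption based on this summary; `_MMR_CANDIDATE_LIMIT` is from the PR, the helper name is mine.

```python
_MMR_CANDIDATE_LIMIT = 50  # minimum candidate pool per retrieval leg (from this PR)

def candidate_fetch_count(n_results: int) -> int:
    # Previously the pool was n_results * 3; it is now floored at 50 so
    # MMR has enough near-duplicates to choose among for small n_results.
    return max(n_results * 3, _MMR_CANDIDATE_LIMIT)
```

For typical `n_results` of 5 or 10 this means fetching 50 candidates per leg instead of 15 or 30, which is the latency trade-off the behavioral-change note calls out.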
📊 Macroscope summarized d545514. 5 files reviewed, 4 issues evaluated, 2 issues filtered, 1 comment posted

🗂️ Filtered Issues

src/brainlayer/mcp/__init__.py — 0 comments posted, 1 evaluated, 1 filtered
  • line 61: Same issue as code object 0: the int() conversion of BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS at module import time will raise an unhelpful ValueError if the environment variable is set to a non-numeric string, crashing the MCP server initialization. [ Cross-file consolidated ]
src/brainlayer/mcp/enrich_handler.py — 0 comments posted, 1 evaluated, 1 filtered
  • line 12: If the BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS environment variable is set to a non-integer string (e.g., "abc" or an empty string ""), int() will raise an unhandled ValueError at module import time, crashing the application with no helpful error message about the misconfiguration. [ Cross-file consolidated ]
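Both filtered issues point at the same fragility: calling `int(os.environ.get(...))` at module import time. A hedged sketch of a defensive alternative follows; the helper name `env_int` is mine, not from the PR.

```python
import os
import warnings

def env_int(name: str, default: int) -> int:
    """Parse an integer environment variable, falling back to the default
    (with a warning) instead of raising ValueError at import time."""
    raw = os.environ.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        return int(raw)
    except ValueError:
        warnings.warn(f"{name}={raw!r} is not an integer; using {default}")
        return default

DEFAULT_REALTIME_ENRICH_SINCE_HOURS = env_int(
    "BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", 8760
)
```

A misconfigured value like `"abc"` or `""` then degrades to the default with a clear warning instead of crashing MCP server initialization.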

Summary by CodeRabbit

  • New Features

    • Hybrid search now applies maximal marginal relevance (MMR)-based reranking to the final result set.
  • Configuration & Enhancements

    • Enrichment command's default lookback time period is now customizable via the BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS environment variable (defaults to 8760 hours, approximately one year).
    • Configuration applies consistently to both CLI and MCP tool integrations.
    • Help documentation updated to display the configured default value.


coderabbitai bot commented Apr 14, 2026

📝 Walkthrough
The pull request centralizes enrichment time-window defaults across CLI, MCP, and handler layers by introducing a DEFAULT_REALTIME_ENRICH_SINCE_HOURS constant sourced from an environment variable (default 8760 hours). Additionally, a new MMR-based reranking mechanism is added to hybrid search to improve result diversity and deduplicate near-duplicate candidates.

Changes

Cohort / File(s) → Summary

  • Enrichment Defaults Configuration (src/brainlayer/cli/__init__.py, src/brainlayer/mcp/__init__.py, src/brainlayer/mcp/enrich_handler.py): Introduced the DEFAULT_REALTIME_ENRICH_SINCE_HOURS constant (derived from the BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS env var, default 8760). Updated since_hours parameter defaults in the CLI enrich command, the MCP brain_enrich tool schema/routing, and the _brain_enrich handler from the hardcoded 24 to the new constant. Updated help/schema text to reflect the computed default.
  • MMR Reranking Implementation (src/brainlayer/search_repo.py): Added a numpy dependency and a new _mmr_rerank_scored_results() method that performs diversity-aware reranking using the MMR algorithm. Updated hybrid_search() to dynamically compute candidate_fetch_count based on n_results, over-fetch candidates from both the semantic and FTS5 legs, and apply MMR reranking before truncation.
  • MMR Reranking Tests (tests/test_hybrid_search.py): Added test_mmr_rerank_dedupes_near_duplicates(), validating that MMR reranking prioritizes distinct results over near-duplicate pairs, ensuring diversity in top-N selections.
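A test like the one summarized above needs a fixture where the duplicate pair is nearly parallel in embedding space while the distinct result is not. A sketch of that setup follows; the chunk ids mirror the ones discussed in review, while the vector values are invented for illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative embeddings: "dup-secondary" is almost identical to
# "dup-primary", while "distinct-relevant" points elsewhere.
vecs = {
    "dup-primary":       [1.0, 0.0],
    "dup-secondary":     [0.999, 0.01],
    "distinct-relevant": [0.0, 1.0],
}
```

Because the duplicate pair's cosine similarity is close to 1 and the distinct result's is near 0, MMR's diversity penalty demotes dup-secondary below distinct-relevant while leaving dup-primary in first place.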

Sequence Diagram(s)

sequenceDiagram
    participant HybridSearch as hybrid_search()
    participant RRF as RRF Scoring
    participant MMR as MMR Reranking
    participant Embeddings as Embedding Loader
    participant Similarity as Cosine Similarity
    participant Output as Final Results

    HybridSearch->>HybridSearch: Fetch candidates (3x over-fetch)
    HybridSearch->>RRF: Combine semantic + FTS5 scores
    RRF->>HybridSearch: Return scored candidates
    HybridSearch->>MMR: Call _mmr_rerank_scored_results()
    MMR->>Embeddings: Load candidate embeddings
    Embeddings->>MMR: Return float embeddings
    MMR->>Similarity: Compute cosine similarity matrix
    Similarity->>MMR: Return similarity scores
    MMR->>MMR: Iteratively select diverse subset (MMR algorithm)
    MMR->>HybridSearch: Return reranked candidates
    HybridSearch->>Output: Truncate to n_results & return

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 Hopping through defaults with a magical hop,
Eight thousand seven-sixty won't stop!
MMR diversifies, near-duplicates scatter,
Results now fresher—that's all that matters!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 63.64%, below the required 80.00% threshold. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)

  • Title check: ✅ Passed. The title "MMR post-retrieval dedup for brain_search (R86 Q4)" clearly describes the main change (MMR-based deduplication) and the component affected (brain_search). It is concise, specific, and accurately reflects the primary objective of the pull request.
  • Description check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.


@EtanHey (Owner, Author) commented Apr 16, 2026

Please review this PR, especially commit 00e08f5 (Fix enrichment default since-hours for MCP/CLI). Root cause: both the MCP and CLI defaults used a 24-hour window, which capped the candidate pool at ~40 rows versus the expected ~19K. This commit bumps the default since-hours to 8760 and makes it configurable via an environment variable.

@EtanHey marked this pull request as ready for review on April 16, 2026 at 08:03

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b983f953c5


            remaining.remove(best_idx)

        reranked = [mmr_candidates[idx] for idx in selected]
        return reranked + fallback_candidates + tail_candidates

P1: Preserve ranking for high-score chunks without embeddings

The new return order reranked + fallback_candidates + tail_candidates forces every chunk missing a float embedding to rank below all MMR-eligible candidates, even when its fused score is higher. In hybrid_search, this can hide top FTS-only results (e.g., deferred-embedding chunks with no chunk_vectors row) from the requested n_results, which is a retrieval-quality regression introduced by this reranker. Please keep fallback items in score-respecting order relative to the reranked set so missing-vector hits are not systematically suppressed.
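One way to satisfy this, sketched under the assumption that candidates are `(score, chunk_id)` tuples (the names below are mine, not the PR's): interleave the MMR output back into the original score-ranked slots, so missing-vector hits keep their positions instead of being appended last.

```python
def interleave_fallbacks(top_candidates, reranked_order, embedded_ids):
    """Fill the slots originally occupied by vector-backed candidates
    with the MMR-reranked sequence, leaving candidates without a stored
    embedding at their original score-ranked positions."""
    it = iter(reranked_order)
    return [next(it) if cand[1] in embedded_ids else cand
            for cand in top_candidates]

# "fts-only" has no chunk_vectors row but the highest fused score.
top = [(0.95, "fts-only"), (0.9, "a"), (0.7, "b")]  # original score order
reranked = [(0.7, "b"), (0.9, "a")]                 # MMR output (vector-backed only)
merged = interleave_fallbacks(top, reranked, {"a", "b"})
# "fts-only" stays first; the vector-backed slots follow MMR order.
```

The vector-backed results still appear in MMR order relative to each other, while the FTS-only hit is no longer systematically suppressed.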



@coderabbitai (bot) left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/brainlayer/cli/__init__.py`:
- Around line 27-29: Move the DEFAULT_REALTIME_ENRICH_SINCE_HOURS constant into
a single canonical module (e.g., enrichment_controller) and replace the three
duplicate definitions by importing that constant where needed; specifically,
define DEFAULT_REALTIME_ENRICH_SINCE_HOURS once in enrichment_controller (using
the same os.environ fallback logic) and in modules that currently declare it
(the modules containing DEFAULT_REALTIME_ENRICH_SINCE_HOURS in
mcp.enrich_handler, mcp.__init__, and cli.__init__) remove the local definition
and add from ..enrichment_controller import DEFAULT_REALTIME_ENRICH_SINCE_HOURS
(or the correct relative import) so all consumers use the single source of
truth.

In `@src/brainlayer/mcp/__init__.py`:
- Around line 61-63: Replace the duplicated constant definition
DEFAULT_REALTIME_ENRICH_SINCE_HOURS in __init__.py by importing it from
enrich_handler (which is already imported in this module) so there is a single
source of truth; after switching to the import, remove the local assignment and
also drop the now-unused import os at top of the file if nothing else uses it.

In `@src/brainlayer/search_repo.py`:
- Around line 155-188: The current return (reranked + fallback_candidates +
tail_candidates) always places all no-embedding candidates after all
vector-backed ones; instead merge reranked and fallback_candidates back together
and stable-sort by the original relevance score (candidate[0]) so fallback items
keep position relative to scores but the reranked items keep their MMR ordering
when scores are equal. Concretely, after computing reranked, form merged =
reranked + fallback_candidates, then do merged_sorted = sorted(merged,
key=lambda c: c[0], reverse=True) (stable sort preserves reranked/tie order),
and return merged_sorted + tail_candidates; references: _load_chunk_embeddings,
mmr_candidates, fallback_candidates, reranked, tail_candidates.

In `@tests/test_hybrid_search.py`:
- Around line 293-341: The test currently only calls the private helper
_mmr_rerank_scored_results, so a regression in the public wiring (hybrid_search)
could break behavior unnoticed; add a call to store.hybrid_search(...) using the
same query/context (e.g., "oauth token rotation") with parameters that trigger
MMR and n_results=3, then assert the returned result ids mirror the expectations
from the private test (top result is "dup-primary" and the top two are not both
the duplicate pair, and "distinct-relevant" appears in the top two) to ensure
the public hybrid_search path forwards the correct candidate pool into
_mmr_rerank_scored_results.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b338ad88-c72b-431e-afc0-3fe1d7ee76d0

📥 Commits

Reviewing files that changed from the base of the PR and between 15ab4e9 and b983f95.

📒 Files selected for processing (5)
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
  • src/brainlayer/mcp/enrich_handler.py
  • src/brainlayer/search_repo.py
  • tests/test_hybrid_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.13)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

**/*.py: Use paths.py:get_db_path() for all database path resolution; all scripts and CLI must use this function rather than hardcoding paths
When performing bulk database operations: stop enrichment workers first, checkpoint WAL before and after, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks, and checkpoint every 3 batches

Files:

  • src/brainlayer/mcp/enrich_handler.py
  • tests/test_hybrid_search.py
  • src/brainlayer/search_repo.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
src/brainlayer/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/brainlayer/**/*.py: Use retry logic on SQLITE_BUSY errors; each worker must use its own database connection to handle concurrency safely
Classification must preserve ai_code, stack_trace, and user_message verbatim; skip noise entries entirely and summarize build_log and dir_listing entries (structure only)
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via enrichment_controller.py, and Ollama as offline last-resort; allow override via BRAINLAYER_ENRICH_BACKEND env var
Configure enrichment rate via BRAINLAYER_ENRICH_RATE environment variable (default 0.2 = 12 RPM)
Implement chunk lifecycle columns: superseded_by, aggregated_into, archived_at on chunks table; exclude lifecycle-managed chunks from default search; allow include_archived=True to show history
Implement brain_supersede with safety gate for personal data (journals, notes, health/finance); use soft-delete for brain_archive with timestamp
Add supersedes parameter to brain_store for atomic store-and-replace operations
Run linting and formatting with: ruff check src/ && ruff format src/
Run tests with pytest
Use PRAGMA wal_checkpoint(FULL) before and after bulk database operations to prevent WAL bloat

Files:

  • src/brainlayer/mcp/enrich_handler.py
  • src/brainlayer/search_repo.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
🧠 Learnings (17)
📓 Common learnings
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T23:32:14.543Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment rate configurable via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)
📚 Learning: 2026-04-02T23:32:14.543Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T23:32:14.543Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment rate configurable via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)

Applied to files:

  • src/brainlayer/mcp/enrich_handler.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Configure enrichment rate via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)

Applied to files:

  • src/brainlayer/mcp/enrich_handler.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-11T16:54:45.631Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-04-11T16:54:45.631Z
Learning: Applies to `src/brainlayer/enrichment_controller.py` and `src/brainlayer/pipeline/rate_limiter.py`: Gemini API calls in the enrichment pipeline are gated by a token bucket rate limiter. The rate is controlled by `BRAINLAYER_ENRICH_RATE` (default `5/s`, burst `10`) to keep throughput inside the Gemini Flex intended envelope. This default supersedes the earlier 0.2 (12 RPM) default for the Gemini Flex integration path.

Applied to files:

  • src/brainlayer/mcp/enrich_handler.py
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-03-22T15:55:22.017Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 100
File: src/brainlayer/enrichment_controller.py:175-199
Timestamp: 2026-03-22T15:55:22.017Z
Learning: In `src/brainlayer/enrichment_controller.py`, the `parallel` parameter in `enrich_local()` is intentionally kept in the function signature (currently unused, suppressed with `# noqa: ARG001`) for API stability. Parallel local enrichment via a thread pool or process pool is planned for a future iteration. Do not flag this as dead code requiring removal.

Applied to files:

  • src/brainlayer/mcp/enrich_handler.py
📚 Learning: 2026-04-11T23:47:49.756Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-04-11T23:47:49.756Z
Learning: Applies to `src/brainlayer/enrichment_controller.py`: `service_tier="flex"` is the intentional default for all Gemini enrichment calls. Pass-2 enrichment is asynchronous backlog work where 1–15 minute latency is acceptable, and the 50% Gemini Flex Inference discount materially reduces backlog cost. This is locked by R84b design (§8 Q2). The `BRAINLAYER_GEMINI_SERVICE_TIER` environment variable is purely an operational escape hatch (e.g. `standard`), not the intended runtime default. Do not flag `service_tier="flex"` as a concern on this code path.

Applied to files:

  • src/brainlayer/mcp/enrich_handler.py
  • src/brainlayer/cli/__init__.py
📚 Learning: 2026-04-04T23:24:03.159Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-04T23:24:03.159Z
Learning: Applies to src/brainlayer/{vector_store,search}*.py : Chunk lifecycle: implement columns `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search

Applied to files:

  • src/brainlayer/search_repo.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Implement chunk lifecycle columns: `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search; allow `include_archived=True` to show history

Applied to files:

  • src/brainlayer/search_repo.py
📚 Learning: 2026-04-04T23:49:56.485Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-04T23:49:56.485Z
Learning: Applies to src/brainlayer/vector_store.py : Use sqlite-vec with APSW for vector storage and retrieval

Applied to files:

  • src/brainlayer/search_repo.py
📚 Learning: 2026-04-04T15:22:02.740Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 198
File: hooks/brainlayer-prompt-search.py:241-259
Timestamp: 2026-04-04T15:22:02.740Z
Learning: In `hooks/brainlayer-prompt-search.py` (Python), `record_injection_event()` is explicitly best-effort telemetry: silent `except sqlite3.Error: pass` is intentional — table non-existence or lock failures are acceptable silent failures. `sqlite3.connect(timeout=2)` is the file-open timeout; `PRAGMA busy_timeout` governs per-statement lock-wait. The `DEADLINE_MS` (450ms) guard applies only to the FTS search phase, not to this side-channel write.

Applied to files:

  • src/brainlayer/search_repo.py
  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T00:22:38.541Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T00:22:38.541Z
Learning: Applies to src/**/*.py : Store and resolve database path using `paths.py:get_db_path()` — supports environment variable override or defaults to canonical path `~/.local/share/brainlayer/brainlayer.db`

Applied to files:

  • src/brainlayer/cli/__init__.py
📚 Learning: 2026-04-03T11:34:19.303Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T11:34:19.303Z
Learning: Applies to src/brainlayer/cli.py : Use Typer CLI framework for command-line interface in `src/brainlayer/`

Applied to files:

  • src/brainlayer/cli/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via `enrichment_controller.py`, and Ollama as offline last-resort; allow override via `BRAINLAYER_ENRICH_BACKEND` env var

Applied to files:

  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-01T01:24:44.281Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment backend priority: Groq (primary/cloud) → Gemini (fallback) → Ollama (offline last-resort), configurable via `BRAINLAYER_ENRICH_BACKEND` environment variable

Applied to files:

  • src/brainlayer/cli/__init__.py
  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-01T01:24:44.281Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/mcp/*.py : MCP tools include: brain_search, brain_store, brain_recall, brain_entity, brain_expand, brain_update, brain_digest, brain_get_person, brain_tags, brain_supersede, brain_archive (legacy brainlayer_* aliases still supported)

Applied to files:

  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-03-17T01:04:22.497Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-03-17T01:04:22.497Z
Learning: Applies to src/brainlayer/mcp/**/*.py and brain-bar/Sources/BrainBar/MCPRouter.swift: The 8 required MCP tools are `brain_search`, `brain_store`, `brain_recall`, `brain_entity`, `brain_expand`, `brain_update`, `brain_digest`, `brain_tags`. `brain_tags` is the 8th tool, replacing `brain_get_person`, as defined in the Phase B spec merged in PR `#72`. The Python MCP server already implements `brain_tags`. Legacy `brainlayer_*` aliases must be maintained for backward compatibility.

Applied to files:

  • src/brainlayer/mcp/__init__.py
📚 Learning: 2026-03-14T02:20:54.656Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-14T02:20:54.656Z
Learning: Applies to **/*.py : Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work

Applied to files:

  • src/brainlayer/mcp/__init__.py
🔇 Additional comments (6)
src/brainlayer/search_repo.py (1)

15-15: NumPy is already declared as a runtime dependency in pyproject.toml (line 34: "numpy>=1.22,<3.0"), so the import on line 15 does not create an import-time failure.

> Likely an incorrect or invalid review comment.
src/brainlayer/mcp/enrich_handler.py (2)

5-5: LGTM! Well-structured environment-configurable default.

The constant is cleanly defined and makes the 8760-hour (1-year) default configurable via BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS. This is consistent with the project's pattern for environment-based configuration.

Also applies to: 12-14


19-26: LGTM! Default correctly applied to the handler function.

The since_hours parameter default is now sourced from the module-level constant, ensuring consistency between the handler, MCP schema, and CLI.

src/brainlayer/mcp/__init__.py (2)

1050-1058: LGTM! Tool schema correctly updated with dynamic default and descriptive help text.

The schema now reflects the configurable default value and the description includes the actual default for user clarity.


1347-1355: LGTM! Call routing fallback correctly uses the constant.

The brain_enrich handler correctly falls back to DEFAULT_REALTIME_ENRICH_SINCE_HOURS when since_hours is not provided in arguments.

src/brainlayer/cli/__init__.py (1)

885-896: LGTM! CLI option correctly updated with dynamic default.

The --since-hours option now uses the configurable constant for its default, and the help text clearly displays the current default value.

Comment thread src/brainlayer/cli/__init__.py Outdated
Comment on lines +27 to +29
DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
)

🧹 Nitpick | 🔵 Trivial

Centralize DEFAULT_REALTIME_ENRICH_SINCE_HOURS to avoid triple definition.

This constant is now defined identically in three files:

  • src/brainlayer/mcp/enrich_handler.py (lines 12-14)
  • src/brainlayer/mcp/__init__.py (lines 61-63)
  • src/brainlayer/cli/__init__.py (lines 27-29)

Consider defining it once in a shared location (e.g., enrichment_controller.py since it's already the canonical location for enrichment logic, or a dedicated constants module) and importing it where needed. This ensures consistency if the default or env var name ever changes.

♻️ Proposed approach
  1. Add the constant to enrichment_controller.py:
DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
)
  2. Import in all consuming modules:
from ..enrichment_controller import DEFAULT_REALTIME_ENRICH_SINCE_HOURS
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/cli/__init__.py` around lines 27 - 29, Move the
DEFAULT_REALTIME_ENRICH_SINCE_HOURS constant into a single canonical module
(e.g., enrichment_controller) and replace the three duplicate definitions by
importing that constant where needed; specifically, define
DEFAULT_REALTIME_ENRICH_SINCE_HOURS once in enrichment_controller (using the
same os.environ fallback logic) and in modules that currently declare it (the
modules containing DEFAULT_REALTIME_ENRICH_SINCE_HOURS in mcp.enrich_handler,
mcp.__init__, and cli.__init__) remove the local definition and add from
..enrichment_controller import DEFAULT_REALTIME_ENRICH_SINCE_HOURS (or the
correct relative import) so all consumers use the single source of truth.

Comment thread src/brainlayer/mcp/__init__.py Outdated
Comment on lines +61 to +63
DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
)

🧹 Nitpick | 🔵 Trivial

Consider importing the constant from enrich_handler instead of redefining it.

This constant is already defined identically in enrich_handler.py (line 12-14), which this module already imports from (line 38). Importing it would eliminate duplication and ensure a single source of truth.

♻️ Proposed refactor
-DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
-    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
-)
+from .enrich_handler import DEFAULT_REALTIME_ENRICH_SINCE_HOURS

And remove the import os addition on line 5 if it's no longer needed elsewhere in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/mcp/__init__.py` around lines 61 - 63, Replace the duplicated
constant definition DEFAULT_REALTIME_ENRICH_SINCE_HOURS in __init__.py by
importing it from enrich_handler (which is already imported in this module) so
there is a single source of truth; after switching to the import, remove the
local assignment and also drop the now-unused import os at top of the file if
nothing else uses it.

Comment on lines +155 to +188
        embeddings_by_id = self._load_chunk_embeddings([candidate[1] for candidate in top_candidates])
        mmr_candidates = [candidate for candidate in top_candidates if candidate[1] in embeddings_by_id]
        if len(mmr_candidates) < 2:
            return scored

        fallback_candidates = [candidate for candidate in top_candidates if candidate[1] not in embeddings_by_id]
        relevance = np.array([candidate[0] for candidate in mmr_candidates], dtype=np.float32)
        rel_max = float(relevance.max())
        if rel_max > 0.0:
            normalized_relevance = relevance / rel_max
        else:
            normalized_relevance = np.ones_like(relevance)

        matrix = np.stack([embeddings_by_id[candidate[1]] for candidate in mmr_candidates]).astype(np.float32, copy=False)
        norms = np.linalg.norm(matrix, axis=1, keepdims=True)
        norms = np.maximum(norms, 1e-12)
        cosine = np.clip((matrix / norms) @ (matrix / norms).T, -1.0, 1.0)
        np.fill_diagonal(cosine, 0.0)

        selected: list[int] = [int(np.argmax(normalized_relevance))]
        remaining = set(range(len(mmr_candidates))) - set(selected)

        while remaining:
            remaining_indices = sorted(remaining)
            diversity_penalty = cosine[remaining_indices][:, selected].max(axis=1)
            mmr_scores = (_MMR_LAMBDA * normalized_relevance[remaining_indices]) - (
                (1.0 - _MMR_LAMBDA) * diversity_penalty
            )
            best_idx = remaining_indices[int(np.argmax(mmr_scores))]
            selected.append(best_idx)
            remaining.remove(best_idx)

        reranked = [mmr_candidates[idx] for idx in selected]
        return reranked + fallback_candidates + tail_candidates

⚠️ Potential issue | 🟠 Major

Preserve score order for candidates without float vectors.

`return reranked + fallback_candidates + tail_candidates` demotes every no-embedding candidate behind all embedding-backed ones. A result that was originally ranked near the top can disappear from the visible `n_results` solely because `_load_chunk_embeddings()` missed its vector.

💡 Proposed fix
         embeddings_by_id = self._load_chunk_embeddings([candidate[1] for candidate in top_candidates])
         mmr_candidates = [candidate for candidate in top_candidates if candidate[1] in embeddings_by_id]
         if len(mmr_candidates) < 2:
             return scored

-        fallback_candidates = [candidate for candidate in top_candidates if candidate[1] not in embeddings_by_id]
+        embedding_backed_ids = {candidate[1] for candidate in mmr_candidates}
         relevance = np.array([candidate[0] for candidate in mmr_candidates], dtype=np.float32)
         rel_max = float(relevance.max())
         if rel_max > 0.0:
             normalized_relevance = relevance / rel_max
         else:
@@
-        reranked = [mmr_candidates[idx] for idx in selected]
-        return reranked + fallback_candidates + tail_candidates
+        reranked_iter = iter(mmr_candidates[idx] for idx in selected)
+        reranked_top = [
+            next(reranked_iter) if candidate[1] in embedding_backed_ids else candidate
+            for candidate in top_candidates
+        ]
+        return reranked_top + tail_candidates
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/brainlayer/search_repo.py` around lines 155-188: the current return
(reranked + fallback_candidates + tail_candidates) always places all
no-embedding candidates after all vector-backed ones. Instead, merge reranked
and fallback_candidates back together and stable-sort by the original relevance
score (candidate[0]) so fallback items keep their score-based position while the
reranked items keep their MMR ordering when scores are equal. Concretely, after
computing reranked, form merged = reranked + fallback_candidates, then
merged_sorted = sorted(merged, key=lambda c: c[0], reverse=True) (a stable sort
preserves reranked/tie order), and return merged_sorted + tail_candidates.
References: _load_chunk_embeddings, mmr_candidates, fallback_candidates,
reranked, tail_candidates.
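The positional re-weave from the proposed fix above can be sketched standalone. This is illustrative only: simplified `(score, chunk_id)` pairs stand in for the repo's richer candidate tuples, and `weave` is a hypothetical name, not the repo's helper.

```python
# Sketch of the positional re-weave: MMR-reordered candidates fill the
# slots originally held by embedding-backed candidates, while candidates
# without a vector keep their original score-based slots.

def weave(top_candidates, mmr_order, embedding_backed_ids):
    reranked_iter = iter(mmr_order)
    return [
        next(reranked_iter) if cand[1] in embedding_backed_ids else cand
        for cand in top_candidates
    ]

top = [
    (0.99, "dup-primary"),
    (0.98, "dup-secondary"),
    (0.97, "no-vector"),        # no float embedding available
    (0.94, "distinct-relevant"),
]
backed = {"dup-primary", "dup-secondary", "distinct-relevant"}
# Suppose MMR promoted the distinct result ahead of the near-duplicate.
mmr_order = [(0.99, "dup-primary"), (0.94, "distinct-relevant"), (0.98, "dup-secondary")]

print([c[1] for c in weave(top, mmr_order, backed)])
# → ['dup-primary', 'distinct-relevant', 'no-vector', 'dup-secondary']
```

The no-vector candidate stays in third place rather than being pushed behind every embedding-backed result.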

Comment on lines +293 to +341
    def test_mmr_rerank_dedupes_near_duplicates(self, store):
        def embedding(primary: float, secondary: float = 0.0) -> list[float]:
            vector = [0.0] * 1024
            vector[0] = primary
            vector[1] = secondary
            return vector

        _insert_chunk(
            store,
            chunk_id="dup-primary",
            content="oauth token rotation incident rollback and session repair",
            embedding=embedding(1.0, 0.0),
            importance=6.0,
        )
        _insert_chunk(
            store,
            chunk_id="dup-secondary",
            content="oauth token rotation incident rollback and session repair duplicate notes",
            embedding=embedding(0.999, 0.001),
            importance=6.0,
        )
        _insert_chunk(
            store,
            chunk_id="distinct-relevant",
            content="oauth token rotation migration checklist and recovery guide",
            embedding=embedding(0.72, 0.69),
            importance=5.5,
        )
        _insert_chunk(
            store,
            chunk_id="distinct-supporting",
            content="oauth token rotation audit trail and operator runbook",
            embedding=embedding(0.65, 0.75),
            importance=5.0,
        )

        scored = [
            (0.99, "dup-primary", "oauth token rotation incident rollback and session repair", {}, 0.01),
            (0.98, "dup-secondary", "oauth token rotation incident rollback and session repair duplicate notes", {}, 0.02),
            (0.94, "distinct-relevant", "oauth token rotation migration checklist and recovery guide", {}, 0.12),
            (0.9, "distinct-supporting", "oauth token rotation audit trail and operator runbook", {}, 0.15),
        ]

        reranked = store._mmr_rerank_scored_results(scored, n_results=3)
        ids = [item[1] for item in reranked[:3]]

        assert ids[0] == "dup-primary", ids
        assert "distinct-relevant" in ids[:2], ids
        assert set(ids[:2]) != {"dup-primary", "dup-secondary"}, ids

🧹 Nitpick | 🔵 Trivial

Cover the public hybrid_search() path as well.

This regression only exercises the private helper, so it will still pass if the hybrid_search() wiring regresses and never feeds the expected candidate pool into MMR. Please add one assertion through store.hybrid_search(...) to protect the behavior that actually changed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_hybrid_search.py` around lines 293 - 341, The test currently only
calls the private helper _mmr_rerank_scored_results, so a regression in the
public wiring (hybrid_search) could break behavior unnoticed; add a call to
store.hybrid_search(...) using the same query/context (e.g., "oauth token
rotation") with parameters that trigger MMR and n_results=3, then assert the
returned result ids mirror the expectations from the private test (top result is
"dup-primary" and the top two are not both the duplicate pair, and
"distinct-relevant" appears in the top two) to ensure the public hybrid_search
path forwards the correct candidate pool into _mmr_rerank_scored_results.


from ..paths import get_db_path

DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760"))

🟢 Low cli/__init__.py:27

At module import time, int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")) throws ValueError when the environment variable contains a non-numeric string. Since this happens during CLI initialization, the error appears as an internal traceback rather than a user-friendly configuration message. Consider validating the value and raising a clear typer.BadParameter or SystemExit with a descriptive message.

+_raw_since_hours = os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
+try:
+    DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(_raw_since_hours)
+except ValueError:
+    console.print(f"[bold red]Error:[/] BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS must be an integer, got: {_raw_since_hours!r}")
+    raise typer.Exit(1)
+del _raw_since_hours
Also found in 2 other location(s)

src/brainlayer/mcp/__init__.py:61

Same issue as code object 0: the int() conversion of BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS at module import time will raise an unhelpful ValueError if the environment variable is set to a non-numeric string, crashing the MCP server initialization.

src/brainlayer/mcp/enrich_handler.py:12

If the BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS environment variable is set to a non-integer string (e.g., "abc" or an empty string ""), int() will raise an unhandled ValueError at module import time, crashing the application with no helpful error message about the misconfiguration.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file src/brainlayer/cli/__init__.py around line 27:

At module import time, `int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760"))` throws `ValueError` when the environment variable contains a non-numeric string. Since this happens during CLI initialization, the error appears as an internal traceback rather than a user-friendly configuration message. Consider validating the value and raising a clear `typer.BadParameter` or `SystemExit` with a descriptive message.

Evidence trail:
src/brainlayer/cli/__init__.py lines 27: `DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760"))` - module-level code with no validation or exception handling around the int() call.

Also found in 2 other location(s):
- src/brainlayer/mcp/__init__.py:61 -- Same issue as code object 0: the `int()` conversion of `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` at module import time will raise an unhelpful `ValueError` if the environment variable is set to a non-numeric string, crashing the MCP server initialization.
- src/brainlayer/mcp/enrich_handler.py:12 -- If the `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` environment variable is set to a non-integer string (e.g., `"abc"` or an empty string `""`), `int()` will raise an unhandled `ValueError` at module import time, crashing the application with no helpful error message about the misconfiguration.
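The defensive parse suggested above can be sketched as a small helper. The function name and error wording are illustrative, not the repo's actual code; a real fix in the CLI module would likely exit via `typer.Exit` instead of a bare `SystemExit`.

```python
import os

# Hypothetical helper: parse the env override defensively so a bad value
# produces a clear configuration error instead of an import-time traceback.
def read_since_hours(env: str = "BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS",
                     default: int = 8760) -> int:
    raw = os.environ.get(env, str(default))
    try:
        return int(raw)
    except ValueError:
        raise SystemExit(f"Error: {env} must be an integer, got: {raw!r}")

os.environ["BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS"] = "24"
print(read_since_hours())  # → 24
```

Deferring the call to command execution time (rather than module scope) would also keep unrelated CLI commands loadable when the variable is misconfigured.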


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d545514c00



# 1. Semantic search leg — prefer binary vectors, fall back to float vectors
# when the binary index is unavailable (for example readonly live DBs).
candidate_fetch_count = max(n_results * 3, _MMR_CANDIDATE_LIMIT)

P2 Badge Avoid multiplying the new candidate floor into 500-KNN scans

hybrid_search now forces candidate_fetch_count to at least 50, but the binary path still applies its existing filtered-query overfetch (effective_k = n_results * 10 in _binary_search). For entity_id/source_filter_like/correction-filtered searches, this turns the intended 50-candidate pool into 500-vector KNN probes, which is a significant latency regression and can trigger MCP timeouts on larger DBs. The MMR floor should be decoupled from the binary leg’s internal overfetch multiplier.
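One way to decouple the two knobs can be sketched as below. This is a hedged illustration only: `knn_probe_count` and the `filtered` flag are assumptions, not the repo's actual signature; the constants mirror the values named in this thread (MMR floor of 50, `n_results * 10` filtered overfetch, `n_results * 3` default).

```python
# Sketch: apply the binary leg's filtered-query overfetch multiplier to
# the caller's n_results only, then take the MMR candidate floor last,
# so the floor is never itself multiplied into 500-vector KNN probes.
_MMR_CANDIDATE_LIMIT = 50
_BINARY_FILTER_OVERFETCH = 10

def knn_probe_count(n_results: int, filtered: bool) -> int:
    effective_k = n_results * (_BINARY_FILTER_OVERFETCH if filtered else 3)
    return max(effective_k, _MMR_CANDIDATE_LIMIT)

print(knn_probe_count(5, filtered=True))   # → 50, not 500
print(knn_probe_count(5, filtered=False))  # → 50
print(knn_probe_count(20, filtered=True))  # → 200
```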



from ..paths import get_db_path

DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760"))

P2 Badge Guard env-based enrich default parsing at import time

This import-time int(...) cast will raise ValueError if BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS is set to a non-integer string, which prevents the CLI module from loading at all (including unrelated commands). Because this is user-controlled environment input, failing hard during module import is brittle; parse defensively (or defer validation to command execution) so one bad env value does not brick the entire CLI.


@EtanHey EtanHey merged commit fd04ecd into main Apr 16, 2026
6 checks passed
