MMR post-retrieval dedup for brain_search (R86 Q4) #242
Conversation
📝 Walkthrough
The pull request centralizes enrichment time-window defaults across CLI, MCP, and handler layers by introducing a shared, env-configurable `DEFAULT_REALTIME_ENRICH_SINCE_HOURS` constant.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant HybridSearch as hybrid_search()
    participant RRF as RRF Scoring
    participant MMR as MMR Reranking
    participant Embeddings as Embedding Loader
    participant Similarity as Cosine Similarity
    participant Output as Final Results
    HybridSearch->>HybridSearch: Fetch candidates (3x over-fetch)
    HybridSearch->>RRF: Combine semantic + FTS5 scores
    RRF->>HybridSearch: Return scored candidates
    HybridSearch->>MMR: Call _mmr_rerank_scored_results()
    MMR->>Embeddings: Load candidate embeddings
    Embeddings->>MMR: Return float embeddings
    MMR->>Similarity: Compute cosine similarity matrix
    Similarity->>MMR: Return similarity scores
    MMR->>MMR: Iteratively select diverse subset (MMR algorithm)
    MMR->>HybridSearch: Return reranked candidates
    HybridSearch->>Output: Truncate to n_results & return
```
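The "iteratively select diverse subset" step in the diagram is a classic greedy MMR loop. A minimal, self-contained sketch (illustrative only: `lam` stands in for the repo's `_MMR_LAMBDA`, and the normalized-relevance and zero-diagonal cosine inputs are assumptions mirroring the implementation under review):

```python
import numpy as np

def mmr_select(relevance: np.ndarray, cosine: np.ndarray, k: int, lam: float = 0.7) -> list[int]:
    """Greedy MMR: trade relevance against similarity to already-picked items.

    relevance: (n,) normalized relevance scores.
    cosine: (n, n) pairwise similarity with a zeroed diagonal.
    Returns the indices of up to k selected candidates, in selection order.
    """
    selected = [int(np.argmax(relevance))]  # seed with the most relevant item
    remaining = set(range(len(relevance))) - set(selected)
    while remaining and len(selected) < k:
        idx = sorted(remaining)
        # Penalty: each candidate's worst-case similarity to the picked set
        penalty = cosine[idx][:, selected].max(axis=1)
        scores = lam * relevance[idx] - (1.0 - lam) * penalty
        best = idx[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two near-duplicates and one distinct item, the second duplicate is deferred in favor of the distinct candidate, which is exactly the dedup behavior this PR targets.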
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Please review this PR, especially commit 00e08f5 (Fix enrichment default since-hours for MCP/CLI). Root cause: both the MCP and CLI defaults used a 24-hour window, which capped candidates to ~40 rows vs the expected ~19K. This commit bumps the default since-hours to 8760 and makes it env-configurable.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b983f953c5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```python
            remaining.remove(best_idx)

        reranked = [mmr_candidates[idx] for idx in selected]
        return reranked + fallback_candidates + tail_candidates
```
Preserve ranking for high-score chunks without embeddings
The new return order reranked + fallback_candidates + tail_candidates forces every chunk missing a float embedding to rank below all MMR-eligible candidates, even when its fused score is higher. In hybrid_search, this can hide top FTS-only results (e.g., deferred-embedding chunks with no chunk_vectors row) from the requested n_results, which is a retrieval-quality regression introduced by this reranker. Please keep fallback items in score-respecting order relative to the reranked set so missing-vector hits are not systematically suppressed.
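One score-respecting alternative to the criticized return order, sketched under the assumption that each candidate is a tuple whose first element is the fused score (this is a hypothetical helper, not the repo's code):

```python
def merge_preserving_scores(reranked, fallback, tail):
    # Stable sort: fallback (no-embedding) candidates re-enter at their
    # score-appropriate positions instead of being appended wholesale.
    # Caveat: across unequal scores this re-imposes score order over the
    # MMR picks; the MMR ordering survives only among equal scores.
    merged = sorted(reranked + fallback, key=lambda c: c[0], reverse=True)
    return merged + tail
```

A high-scoring FTS-only hit such as a deferred-embedding chunk would then stay visible within the requested `n_results` instead of being pushed behind every vector-backed candidate.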
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/brainlayer/cli/__init__.py`:
- Around line 27-29: Move the DEFAULT_REALTIME_ENRICH_SINCE_HOURS constant into
a single canonical module (e.g., enrichment_controller) and replace the three
duplicate definitions by importing that constant where needed; specifically,
define DEFAULT_REALTIME_ENRICH_SINCE_HOURS once in enrichment_controller (using
the same os.environ fallback logic) and in modules that currently declare it
(the modules containing DEFAULT_REALTIME_ENRICH_SINCE_HOURS in
mcp.enrich_handler, mcp.__init__, and cli.__init__) remove the local definition
and add from ..enrichment_controller import DEFAULT_REALTIME_ENRICH_SINCE_HOURS
(or the correct relative import) so all consumers use the single source of
truth.
In `@src/brainlayer/mcp/__init__.py`:
- Around line 61-63: Replace the duplicated constant definition
DEFAULT_REALTIME_ENRICH_SINCE_HOURS in __init__.py by importing it from
enrich_handler (which is already imported in this module) so there is a single
source of truth; after switching to the import, remove the local assignment and
also drop the now-unused import os at top of the file if nothing else uses it.
In `@src/brainlayer/search_repo.py`:
- Around line 155-188: The current return (reranked + fallback_candidates +
tail_candidates) always places all no-embedding candidates after all
vector-backed ones; instead merge reranked and fallback_candidates back together
and stable-sort by the original relevance score (candidate[0]) so fallback items
keep position relative to scores but the reranked items keep their MMR ordering
when scores are equal. Concretely, after computing reranked, form merged =
reranked + fallback_candidates, then do merged_sorted = sorted(merged,
key=lambda c: c[0], reverse=True) (stable sort preserves reranked/tie order),
and return merged_sorted + tail_candidates; references: _load_chunk_embeddings,
mmr_candidates, fallback_candidates, reranked, tail_candidates.
In `@tests/test_hybrid_search.py`:
- Around line 293-341: The test currently only calls the private helper
_mmr_rerank_scored_results, so a regression in the public wiring (hybrid_search)
could break behavior unnoticed; add a call to store.hybrid_search(...) using the
same query/context (e.g., "oauth token rotation") with parameters that trigger
MMR and n_results=3, then assert the returned result ids mirror the expectations
from the private test (top result is "dup-primary" and the top two are not both
the duplicate pair, and "distinct-relevant" appears in the top two) to ensure
the public hybrid_search path forwards the correct candidate pool into
_mmr_rerank_scored_results.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b338ad88-c72b-431e-afc0-3fe1d7ee76d0
📒 Files selected for processing (5)
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
- src/brainlayer/mcp/enrich_handler.py
- src/brainlayer/search_repo.py
- tests/test_hybrid_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests
**/*.py: Use `paths.py:get_db_path()` for all database path resolution; all scripts and CLI must use this function rather than hardcoding paths
When performing bulk database operations: stop enrichment workers first, checkpoint WAL before and after, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks, and checkpoint every 3 batches
Files:
- src/brainlayer/mcp/enrich_handler.py
- tests/test_hybrid_search.py
- src/brainlayer/search_repo.py
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
src/brainlayer/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/brainlayer/**/*.py: Use retry logic on `SQLITE_BUSY` errors; each worker must use its own database connection to handle concurrency safely
Classification must preserve `ai_code`, `stack_trace`, and `user_message` verbatim; skip `noise` entries entirely and summarize `build_log` and `dir_listing` entries (structure only)
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via `enrichment_controller.py`, and Ollama as offline last-resort; allow override via `BRAINLAYER_ENRICH_BACKEND` env var
Configure enrichment rate via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)
Implement chunk lifecycle columns `superseded_by`, `aggregated_into`, `archived_at` on the chunks table; exclude lifecycle-managed chunks from default search; allow `include_archived=True` to show history
Implement `brain_supersede` with safety gate for personal data (journals, notes, health/finance); use soft-delete for `brain_archive` with timestamp
Add `supersedes` parameter to `brain_store` for atomic store-and-replace operations
Run linting and formatting with: `ruff check src/ && ruff format src/`
Run tests with `pytest`
Use `PRAGMA wal_checkpoint(FULL)` before and after bulk database operations to prevent WAL bloat
Files:
- src/brainlayer/mcp/enrich_handler.py
- src/brainlayer/search_repo.py
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
🧠 Learnings (17)
📓 Common learnings
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T23:32:14.543Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment rate configurable via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)
📚 Learning: 2026-04-02T23:32:14.543Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T23:32:14.543Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment rate configurable via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)
Applied to files:
- src/brainlayer/mcp/enrich_handler.py
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Configure enrichment rate via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)
Applied to files:
- src/brainlayer/mcp/enrich_handler.py
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-11T16:54:45.631Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-04-11T16:54:45.631Z
Learning: Applies to `src/brainlayer/enrichment_controller.py` and `src/brainlayer/pipeline/rate_limiter.py`: Gemini API calls in the enrichment pipeline are gated by a token bucket rate limiter. The rate is controlled by `BRAINLAYER_ENRICH_RATE` (default `5/s`, burst `10`) to keep throughput inside the Gemini Flex intended envelope. This default supersedes the earlier 0.2 (12 RPM) default for the Gemini Flex integration path.
Applied to files:
- src/brainlayer/mcp/enrich_handler.py
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
📚 Learning: 2026-03-22T15:55:22.017Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 100
File: src/brainlayer/enrichment_controller.py:175-199
Timestamp: 2026-03-22T15:55:22.017Z
Learning: In `src/brainlayer/enrichment_controller.py`, the `parallel` parameter in `enrich_local()` is intentionally kept in the function signature (currently unused, suppressed with `# noqa: ARG001`) for API stability. Parallel local enrichment via a thread pool or process pool is planned for a future iteration. Do not flag this as dead code requiring removal.
Applied to files:
src/brainlayer/mcp/enrich_handler.py
📚 Learning: 2026-04-11T23:47:49.756Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-04-11T23:47:49.756Z
Learning: Applies to `src/brainlayer/enrichment_controller.py`: `service_tier="flex"` is the intentional default for all Gemini enrichment calls. Pass-2 enrichment is asynchronous backlog work where 1–15 minute latency is acceptable, and the 50% Gemini Flex Inference discount materially reduces backlog cost. This is locked by R84b design (§8 Q2). The `BRAINLAYER_GEMINI_SERVICE_TIER` environment variable is purely an operational escape hatch (e.g. `standard`), not the intended runtime default. Do not flag `service_tier="flex"` as a concern on this code path.
Applied to files:
- src/brainlayer/mcp/enrich_handler.py
- src/brainlayer/cli/__init__.py
📚 Learning: 2026-04-04T23:24:03.159Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-04T23:24:03.159Z
Learning: Applies to src/brainlayer/{vector_store,search}*.py : Chunk lifecycle: implement columns `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search
Applied to files:
src/brainlayer/search_repo.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Implement chunk lifecycle columns: `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search; allow `include_archived=True` to show history
Applied to files:
src/brainlayer/search_repo.py
📚 Learning: 2026-04-04T23:49:56.485Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-04T23:49:56.485Z
Learning: Applies to src/brainlayer/vector_store.py : Use sqlite-vec with APSW for vector storage and retrieval
Applied to files:
src/brainlayer/search_repo.py
📚 Learning: 2026-04-04T15:22:02.740Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 198
File: hooks/brainlayer-prompt-search.py:241-259
Timestamp: 2026-04-04T15:22:02.740Z
Learning: In `hooks/brainlayer-prompt-search.py` (Python), `record_injection_event()` is explicitly best-effort telemetry: silent `except sqlite3.Error: pass` is intentional — table non-existence or lock failures are acceptable silent failures. `sqlite3.connect(timeout=2)` is the file-open timeout; `PRAGMA busy_timeout` governs per-statement lock-wait. The `DEADLINE_MS` (450ms) guard applies only to the FTS search phase, not to this side-channel write.
Applied to files:
- src/brainlayer/search_repo.py
- src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T00:22:38.541Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T00:22:38.541Z
Learning: Applies to src/**/*.py : Store and resolve database path using `paths.py:get_db_path()` — supports environment variable override or defaults to canonical path `~/.local/share/brainlayer/brainlayer.db`
Applied to files:
src/brainlayer/cli/__init__.py
📚 Learning: 2026-04-03T11:34:19.303Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T11:34:19.303Z
Learning: Applies to src/brainlayer/cli.py : Use Typer CLI framework for command-line interface in `src/brainlayer/`
Applied to files:
src/brainlayer/cli/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via `enrichment_controller.py`, and Ollama as offline last-resort; allow override via `BRAINLAYER_ENRICH_BACKEND` env var
Applied to files:
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-01T01:24:44.281Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment backend priority: Groq (primary/cloud) → Gemini (fallback) → Ollama (offline last-resort), configurable via `BRAINLAYER_ENRICH_BACKEND` environment variable
Applied to files:
- src/brainlayer/cli/__init__.py
- src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-01T01:24:44.281Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/mcp/*.py : MCP tools include: brain_search, brain_store, brain_recall, brain_entity, brain_expand, brain_update, brain_digest, brain_get_person, brain_tags, brain_supersede, brain_archive (legacy brainlayer_* aliases still supported)
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-03-17T01:04:22.497Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-03-17T01:04:22.497Z
Learning: Applies to src/brainlayer/mcp/**/*.py and brain-bar/Sources/BrainBar/MCPRouter.swift: The 8 required MCP tools are `brain_search`, `brain_store`, `brain_recall`, `brain_entity`, `brain_expand`, `brain_update`, `brain_digest`, `brain_tags`. `brain_tags` is the 8th tool, replacing `brain_get_person`, as defined in the Phase B spec merged in PR `#72`. The Python MCP server already implements `brain_tags`. Legacy `brainlayer_*` aliases must be maintained for backward compatibility.
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-03-14T02:20:54.656Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-14T02:20:54.656Z
Learning: Applies to **/*.py : Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Applied to files:
src/brainlayer/mcp/__init__.py
🔇 Additional comments (6)
src/brainlayer/search_repo.py (1)
15-15: NumPy is already declared as a runtime dependency in `pyproject.toml` (line 34: `"numpy>=1.22,<3.0"`), so the import on line 15 does not create an import-time failure.
> Likely an incorrect or invalid review comment.

src/brainlayer/mcp/enrich_handler.py (2)
5-5: LGTM! Well-structured environment-configurable default. The constant is cleanly defined and makes the 8760-hour (1-year) default configurable via `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS`. This is consistent with the project's pattern for environment-based configuration. Also applies to: 12-14
19-26: LGTM! Default correctly applied to the handler function. The `since_hours` parameter default is now sourced from the module-level constant, ensuring consistency between the handler, MCP schema, and CLI.

src/brainlayer/mcp/__init__.py (2)
1050-1058: LGTM! Tool schema correctly updated with dynamic default and descriptive help text. The schema now reflects the configurable default value, and the description includes the actual default for user clarity.
1347-1355: LGTM! Call routing fallback correctly uses the constant. The `brain_enrich` handler correctly falls back to `DEFAULT_REALTIME_ENRICH_SINCE_HOURS` when `since_hours` is not provided in arguments.

src/brainlayer/cli/__init__.py (1)
885-896: LGTM! CLI option correctly updated with dynamic default. The `--since-hours` option now uses the configurable constant for its default, and the help text clearly displays the current default value.
```python
DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
)
```
🧹 Nitpick | 🔵 Trivial
Centralize DEFAULT_REALTIME_ENRICH_SINCE_HOURS to avoid triple definition.
This constant is now defined identically in three files:
- src/brainlayer/mcp/enrich_handler.py (lines 12-14)
- src/brainlayer/mcp/__init__.py (lines 61-63)
- src/brainlayer/cli/__init__.py (lines 27-29)
Consider defining it once in a shared location (e.g., enrichment_controller.py since it's already the canonical location for enrichment logic, or a dedicated constants module) and importing it where needed. This ensures consistency if the default or env var name ever changes.
♻️ Proposed approach
1. Add the constant to `enrichment_controller.py`:

```python
DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
)
```

2. Import in all consuming modules:

```python
from ..enrichment_controller import DEFAULT_REALTIME_ENRICH_SINCE_HOURS
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/brainlayer/cli/__init__.py` around lines 27 - 29, Move the
DEFAULT_REALTIME_ENRICH_SINCE_HOURS constant into a single canonical module
(e.g., enrichment_controller) and replace the three duplicate definitions by
importing that constant where needed; specifically, define
DEFAULT_REALTIME_ENRICH_SINCE_HOURS once in enrichment_controller (using the
same os.environ fallback logic) and in modules that currently declare it (the
modules containing DEFAULT_REALTIME_ENRICH_SINCE_HOURS in mcp.enrich_handler,
mcp.__init__, and cli.__init__) remove the local definition and add from
..enrichment_controller import DEFAULT_REALTIME_ENRICH_SINCE_HOURS (or the
correct relative import) so all consumers use the single source of truth.
```python
DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
)
```
🧹 Nitpick | 🔵 Trivial
Consider importing the constant from enrich_handler instead of redefining it.
This constant is already defined identically in enrich_handler.py (line 12-14), which this module already imports from (line 38). Importing it would eliminate duplication and ensure a single source of truth.
♻️ Proposed refactor

```diff
-DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
-    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
-)
+from .enrich_handler import DEFAULT_REALTIME_ENRICH_SINCE_HOURS
```

And remove the `import os` addition on line 5 if it's no longer needed elsewhere in this file.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(
-    os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
-)
+from .enrich_handler import DEFAULT_REALTIME_ENRICH_SINCE_HOURS
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/brainlayer/mcp/__init__.py` around lines 61 - 63, Replace the duplicated
constant definition DEFAULT_REALTIME_ENRICH_SINCE_HOURS in __init__.py by
importing it from enrich_handler (which is already imported in this module) so
there is a single source of truth; after switching to the import, remove the
local assignment and also drop the now-unused import os at top of the file if
nothing else uses it.
```python
embeddings_by_id = self._load_chunk_embeddings([candidate[1] for candidate in top_candidates])
mmr_candidates = [candidate for candidate in top_candidates if candidate[1] in embeddings_by_id]
if len(mmr_candidates) < 2:
    return scored

fallback_candidates = [candidate for candidate in top_candidates if candidate[1] not in embeddings_by_id]
relevance = np.array([candidate[0] for candidate in mmr_candidates], dtype=np.float32)
rel_max = float(relevance.max())
if rel_max > 0.0:
    normalized_relevance = relevance / rel_max
else:
    normalized_relevance = np.ones_like(relevance)

matrix = np.stack([embeddings_by_id[candidate[1]] for candidate in mmr_candidates]).astype(np.float32, copy=False)
norms = np.linalg.norm(matrix, axis=1, keepdims=True)
norms = np.maximum(norms, 1e-12)
cosine = np.clip((matrix / norms) @ (matrix / norms).T, -1.0, 1.0)
np.fill_diagonal(cosine, 0.0)

selected: list[int] = [int(np.argmax(normalized_relevance))]
remaining = set(range(len(mmr_candidates))) - set(selected)

while remaining:
    remaining_indices = sorted(remaining)
    diversity_penalty = cosine[remaining_indices][:, selected].max(axis=1)
    mmr_scores = (_MMR_LAMBDA * normalized_relevance[remaining_indices]) - (
        (1.0 - _MMR_LAMBDA) * diversity_penalty
    )
    best_idx = remaining_indices[int(np.argmax(mmr_scores))]
    selected.append(best_idx)
    remaining.remove(best_idx)

reranked = [mmr_candidates[idx] for idx in selected]
return reranked + fallback_candidates + tail_candidates
```
Preserve score order for candidates without float vectors.
`return reranked + fallback_candidates + tail_candidates` demotes every no-embedding candidate behind all embedding-backed ones. A result that was originally ranked near the top can disappear from the visible `n_results` solely because `_load_chunk_embeddings()` missed its vector.
💡 Proposed fix

```diff
 embeddings_by_id = self._load_chunk_embeddings([candidate[1] for candidate in top_candidates])
 mmr_candidates = [candidate for candidate in top_candidates if candidate[1] in embeddings_by_id]
 if len(mmr_candidates) < 2:
     return scored
-fallback_candidates = [candidate for candidate in top_candidates if candidate[1] not in embeddings_by_id]
+embedding_backed_ids = {candidate[1] for candidate in mmr_candidates}
 relevance = np.array([candidate[0] for candidate in mmr_candidates], dtype=np.float32)
 rel_max = float(relevance.max())
 if rel_max > 0.0:
     normalized_relevance = relevance / rel_max
 else:
@@
-reranked = [mmr_candidates[idx] for idx in selected]
-return reranked + fallback_candidates + tail_candidates
+reranked_iter = iter(mmr_candidates[idx] for idx in selected)
+reranked_top = [
+    next(reranked_iter) if candidate[1] in embedding_backed_ids else candidate
+    for candidate in top_candidates
+]
+return reranked_top + tail_candidates
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/brainlayer/search_repo.py` around lines 155 - 188, The current return
(reranked + fallback_candidates + tail_candidates) always places all
no-embedding candidates after all vector-backed ones; instead merge reranked and
fallback_candidates back together and stable-sort by the original relevance
score (candidate[0]) so fallback items keep position relative to scores but the
reranked items keep their MMR ordering when scores are equal. Concretely, after
computing reranked, form merged = reranked + fallback_candidates, then do
merged_sorted = sorted(merged, key=lambda c: c[0], reverse=True) (stable sort
preserves reranked/tie order), and return merged_sorted + tail_candidates;
references: _load_chunk_embeddings, mmr_candidates, fallback_candidates,
reranked, tail_candidates.
```python
def test_mmr_rerank_dedupes_near_duplicates(self, store):
    def embedding(primary: float, secondary: float = 0.0) -> list[float]:
        vector = [0.0] * 1024
        vector[0] = primary
        vector[1] = secondary
        return vector

    _insert_chunk(
        store,
        chunk_id="dup-primary",
        content="oauth token rotation incident rollback and session repair",
        embedding=embedding(1.0, 0.0),
        importance=6.0,
    )
    _insert_chunk(
        store,
        chunk_id="dup-secondary",
        content="oauth token rotation incident rollback and session repair duplicate notes",
        embedding=embedding(0.999, 0.001),
        importance=6.0,
    )
    _insert_chunk(
        store,
        chunk_id="distinct-relevant",
        content="oauth token rotation migration checklist and recovery guide",
        embedding=embedding(0.72, 0.69),
        importance=5.5,
    )
    _insert_chunk(
        store,
        chunk_id="distinct-supporting",
        content="oauth token rotation audit trail and operator runbook",
        embedding=embedding(0.65, 0.75),
        importance=5.0,
    )

    scored = [
        (0.99, "dup-primary", "oauth token rotation incident rollback and session repair", {}, 0.01),
        (0.98, "dup-secondary", "oauth token rotation incident rollback and session repair duplicate notes", {}, 0.02),
        (0.94, "distinct-relevant", "oauth token rotation migration checklist and recovery guide", {}, 0.12),
        (0.9, "distinct-supporting", "oauth token rotation audit trail and operator runbook", {}, 0.15),
    ]

    reranked = store._mmr_rerank_scored_results(scored, n_results=3)
    ids = [item[1] for item in reranked[:3]]

    assert ids[0] == "dup-primary", ids
    assert "distinct-relevant" in ids[:2], ids
    assert set(ids[:2]) != {"dup-primary", "dup-secondary"}, ids
```
🧹 Nitpick | 🔵 Trivial
Cover the public hybrid_search() path as well.
This regression test only exercises the private helper, so it will still pass if the `hybrid_search()` wiring regresses and never feeds the expected candidate pool into MMR. Please add one assertion through `store.hybrid_search(...)` to protect the behavior that actually changed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_hybrid_search.py` around lines 293 - 341, The test currently only
calls the private helper _mmr_rerank_scored_results, so a regression in the
public wiring (hybrid_search) could break behavior unnoticed; add a call to
store.hybrid_search(...) using the same query/context (e.g., "oauth token
rotation") with parameters that trigger MMR and n_results=3, then assert the
returned result ids mirror the expectations from the private test (top result is
"dup-primary" and the top two are not both the duplicate pair, and
"distinct-relevant" appears in the top two) to ensure the public hybrid_search
path forwards the correct candidate pool into _mmr_rerank_scored_results.
```python
from ..paths import get_db_path

DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760"))
```
🟢 Low: `cli/__init__.py:27`
At module import time, `int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760"))` throws `ValueError` when the environment variable contains a non-numeric string. Since this happens during CLI initialization, the error appears as an internal traceback rather than a user-friendly configuration message. Consider validating the value and raising a clear `typer.BadParameter` or `SystemExit` with a descriptive message.
```diff
+_raw_since_hours = os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760")
+try:
+    DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(_raw_since_hours)
+except ValueError:
+    console.print(f"[bold red]Error:[/] BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS must be an integer, got: {_raw_since_hours!r}")
+    raise typer.Exit(1)
+del _raw_since_hours
```

Also found in 2 other location(s):
- src/brainlayer/mcp/__init__.py:61 -- Same issue: the `int()` conversion of `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` at module import time will raise an unhelpful `ValueError` if the environment variable is set to a non-numeric string, crashing the MCP server initialization.
- src/brainlayer/mcp/enrich_handler.py:12 -- If the `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` environment variable is set to a non-integer string (e.g., `"abc"` or an empty string `""`), `int()` will raise an unhandled `ValueError` at module import time, crashing the application with no helpful error message about the misconfiguration.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d545514c00
```python
# 1. Semantic search leg — prefer binary vectors, fall back to float vectors
# when the binary index is unavailable (for example readonly live DBs).
candidate_fetch_count = max(n_results * 3, _MMR_CANDIDATE_LIMIT)
```
Avoid multiplying the new candidate floor into 500-KNN scans
`hybrid_search` now forces `candidate_fetch_count` to at least 50, but the binary path still applies its existing filtered-query overfetch (`effective_k = n_results * 10` in `_binary_search`). For `entity_id`/`source_filter_like`/correction-filtered searches, this turns the intended 50-candidate pool into 500-vector KNN probes, which is a significant latency regression and can trigger MCP timeouts on larger DBs. The MMR floor should be decoupled from the binary leg's internal overfetch multiplier.
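One way to decouple the two knobs, sketched under the assumption that the binary leg can take an explicit k (the names `plan_fetch_counts` and `overfetch` are hypothetical, not the repo's actual API):

```python
_MMR_CANDIDATE_LIMIT = 50  # MMR pool floor introduced by this PR

def plan_fetch_counts(n_results: int, overfetch: int = 10) -> tuple[int, int]:
    """Size the MMR pool and the binary leg's filtered-query KNN probe
    independently: the overfetch multiplier applies to the caller's
    n_results, not to the floored MMR pool, so a request for 5 results
    probes max(5 * 10, 50) = 50 vectors instead of 50 * 10 = 500."""
    mmr_pool = max(n_results * 3, _MMR_CANDIDATE_LIMIT)  # candidates MMR re-ranks
    binary_k = max(n_results * overfetch, mmr_pool)      # KNN probe size
    return mmr_pool, binary_k
```

With `n_results=5` this yields a 50-candidate pool and a 50-vector probe, whereas passing the floored count through the binary leg's internal multiplier yields 500.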
```python
from ..paths import get_db_path


DEFAULT_REALTIME_ENRICH_SINCE_HOURS = int(os.environ.get("BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS", "8760"))
```
Guard env-based enrich default parsing at import time
This import-time `int(...)` cast will raise `ValueError` if `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` is set to a non-integer string, which prevents the CLI module from loading at all (including unrelated commands). Because this is user-controlled environment input, failing hard during module import is brittle; parse defensively (or defer validation to command execution) so one bad env value does not brick the entire CLI.
Design Summary
- MMR re-ranking added inside `SearchMixin.hybrid_search`.
- Relevance/diversity trade-off `lambda=0.65`.
- Uses `chunk_vectors` for the similarity matrix so near-duplicate chunks lose top slots without changing existing filters or MCP response formatting.

Test Output
Note
Add MMR post-retrieval deduplication to `hybrid_search` in `SearchMixin`:

- Adds `_mmr_rerank_scored_results` to `search_repo.py`, implementing Maximal Marginal Relevance re-ranking that balances relevance and diversity using cosine similarity over chunk embeddings (`_MMR_LAMBDA = 0.65`).
- Adds a `_load_chunk_embeddings` helper to fetch float32 embeddings from the `chunk_vectors` table for MMR input.
- `hybrid_search` now fetches a minimum of 50 candidates per leg (`_MMR_CANDIDATE_LIMIT`) instead of only `n_results * 3`, increasing the pool available for re-ranking.
- Raises the default `since_hours` for `brain_enrich` across the CLI, MCP tool schema, and handler from 24 to 8760 (1 year), configurable via `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS`.
- `hybrid_search` result ordering will change for callers that relied on pre-MMR ranking; the minimum candidate fetch size also increases, which may affect latency.

📊 Macroscope summarized d545514. 5 files reviewed, 4 issues evaluated, 2 issues filtered, 1 comment posted
🗂️ Filtered Issues

- `src/brainlayer/mcp/__init__.py` — 0 comments posted, 1 evaluated, 1 filtered
  - The `int()` conversion of `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` at module import time will raise an unhelpful `ValueError` if the environment variable is set to a non-numeric string, crashing the MCP server initialization. [Cross-file consolidated]
- `src/brainlayer/mcp/enrich_handler.py` — 0 comments posted, 1 evaluated, 1 filtered
  - If the `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` environment variable is set to a non-integer string (e.g., `"abc"` or an empty string `""`), `int()` will raise an unhandled `ValueError` at module import time, crashing the application with no helpful error message about the misconfiguration. [Cross-file consolidated]

Summary by CodeRabbit
New Features
Configuration & Enhancements
- Enrichment time-window defaults are now configurable via the `BRAINLAYER_DEFAULT_ENRICH_SINCE_HOURS` environment variable (defaults to 8760 hours, approximately one year).