
feat: FTS fallback + query expansion for memory search #18304

Merged
steipete merged 4 commits into openclaw:main from irchelper:feat/fts-query-expansion
Feb 16, 2026

Conversation


@irchelper irchelper commented Feb 16, 2026

Depends on #17800 (OAuth support).

Summary

When no embedding provider is configured, fall back to full-text search instead of disabling memory search. Includes LLM-based query expansion for better FTS results.

Changes

  • FTS fallback in memory manager when embeddings unavailable
  • Query expansion: extract keywords from conversational queries using LLM
  • Null safety guards for FTS-only mode
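
The fallback described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `resolveSearchMode`, `searchMemory`, the `KeywordSearch` and `VectorSearch` callbacks, and the naive merge are all hypothetical names and choices made for the example. Only the `EmbeddingProvider | null` shape and the `'hybrid'` / `'fts-only'` mode values come from the PR description.

```typescript
// Hypothetical types for illustration; the real interfaces in the PR may differ.
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
}

type SearchMode = "hybrid" | "fts-only";

interface SearchResult {
  id: string;
  snippet: string;
}

// Stand-ins for the keyword and vector back ends (assumed, not from the PR).
type KeywordSearch = (query: string, limit: number) => Promise<SearchResult[]>;
type VectorSearch = (embedding: number[], limit: number) => Promise<SearchResult[]>;

// A null provider means embeddings are unavailable, so only FTS can be used.
function resolveSearchMode(provider: EmbeddingProvider | null): SearchMode {
  return provider === null ? "fts-only" : "hybrid";
}

async function searchMemory(
  query: string,
  provider: EmbeddingProvider | null,
  keywordSearch: KeywordSearch,
  vectorSearch: VectorSearch,
  limit = 10,
): Promise<{ mode: SearchMode; results: SearchResult[] }> {
  const mode = resolveSearchMode(provider);
  if (provider === null) {
    // Degrade to keyword search instead of reporting the tool as disabled.
    return { mode, results: await keywordSearch(query, limit) };
  }
  const embedding = await provider.embed(query);
  const [byVector, byKeyword] = await Promise.all([
    vectorSearch(embedding, limit),
    keywordSearch(query, limit),
  ]);
  // Naive merge (an assumption): vector hits first, then unseen keyword hits.
  const seen = new Set<string>();
  const merged: SearchResult[] = [];
  for (const result of [...byVector, ...byKeyword]) {
    if (!seen.has(result.id)) {
      seen.add(result.id);
      merged.push(result);
    }
  }
  return { mode, results: merged.slice(0, limit) };
}
```

Returning the mode alongside the results lets a caller tell whether it received hybrid or keyword-only matches.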

Depends on #17800

Greptile Summary

Adds FTS-only fallback mode when no embedding provider is configured, with LLM-based query expansion to improve keyword extraction from conversational queries. Also includes OAuth support for Gemini API.

Key changes:

  • FTS fallback when embeddings unavailable: memory search degrades gracefully to keyword search instead of failing
  • Query expansion extracts keywords from conversational queries (e.g., "that thing we discussed about the API" → ["discussed", "API"])
  • Null safety throughout: provider: EmbeddingProvider | null with guards in all embedding operations
  • OAuth support for Gemini via parseGeminiAuth helper (JSON format with token/projectId)
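
To make the OAuth bullet concrete, here is a minimal sketch of how a credential string might be classified and turned into request headers. The function names `parseGeminiAuthSketch` and `buildAuthHeaders`, and the `GeminiAuth` return shape, are assumptions for illustration. The PR only states that `parseGeminiAuth` detects the `{"token": "...", "projectId": "..."}` JSON format and that OAuth uses a Bearer token instead of the `x-goog-api-key` header.

```typescript
// Hypothetical return shape; the helper in the PR may return something different.
type GeminiAuth =
  | { kind: "api-key"; apiKey: string }
  | { kind: "oauth"; token: string; projectId?: string };

function parseGeminiAuthSketch(raw: string): GeminiAuth {
  const trimmed = raw.trim();
  if (trimmed.startsWith("{")) {
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null) {
        const { token, projectId } = parsed as { token?: unknown; projectId?: unknown };
        if (typeof token === "string" && token.length > 0) {
          return typeof projectId === "string"
            ? { kind: "oauth", token, projectId }
            : { kind: "oauth", token };
        }
      }
    } catch {
      // Not valid JSON: fall through and treat the value as a plain API key.
    }
  }
  return { kind: "api-key", apiKey: trimmed };
}

function buildAuthHeaders(auth: GeminiAuth): Record<string, string> {
  // OAuth credentials go in a Bearer token; plain keys use x-goog-api-key.
  return auth.kind === "oauth"
    ? { Authorization: `Bearer ${auth.token}` }
    : { "x-goog-api-key": auth.apiKey };
}
```

Anything that does not parse as the OAuth JSON shape is treated as a traditional API key, which is one way to keep existing configurations working.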

Issues found:

  • FTS-only mode may over-fetch results when searching multiple keywords (see inline comment)

Confidence Score: 4/5

  • Safe to merge with one optimization opportunity in keyword search logic
  • The implementation is well-structured with comprehensive null safety guards and good test coverage. The FTS fallback logic is sound, but there's an efficiency issue where multiple keyword searches could fetch more results than necessary before merging. The OAuth support is straightforward and the query expansion uses sensible stop-word lists for both English and Chinese.
  • Check src/memory/manager.ts:244-246 for the keyword search batching issue

Last reviewed commit: de98782

康熙 added 4 commits February 16, 2026 18:16
Add parseGeminiAuth() to detect OAuth JSON format ({"token": "...", "projectId": "..."})
and use Bearer token authentication instead of x-goog-api-key header.

This allows OAuth users (using gemini-cli-auth extension) to use memory_search
with Gemini embedding API.

Extract parseGeminiAuth() to shared infra module and use it in both
embeddings-gemini.ts and inline-data.ts.

Previously, inline-data.ts directly set x-goog-api-key header without
handling OAuth JSON format. Now it properly supports both traditional
API keys and OAuth tokens.

…aw#17725)

When no embedding provider is available (e.g., OAuth mode without API keys),
memory_search now falls back to FTS-only mode instead of returning disabled: true.

Changes:
- embeddings.ts: return null provider with reason instead of throwing
- manager.ts: handle null provider, use FTS-only search mode
- manager-search.ts: allow searching all models when provider is undefined
- memory-tool.ts: expose search mode in results

The search results now include a 'mode' field indicating 'hybrid' or 'fts-only'.

When searching in FTS-only mode (no embedding provider), extract meaningful
keywords from conversational queries using LLM to improve search results.

Changes:
- New query-expansion module with keyword extraction
- Supports English and Chinese stop word filtering
- Null safety guards for FTS-only mode (provider can be null)
- Lint compliance fixes for string iteration

This helps users find relevant memory entries even with vague queries.
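
As a rough illustration of the stop-word filtering mentioned in the last commit, the sketch below extracts keywords from a conversational query without calling an LLM. The stop-word list and the `extractKeywords` name are assumptions made for this example. The PR's query-expansion module has its own lists and also covers Chinese, which this ASCII-only tokenizer does not handle.

```typescript
// Tiny illustrative stop-word list (an assumption); the PR's lists are not shown here.
const STOP_WORDS = new Set<string>([
  "a", "an", "the", "that", "this", "thing", "we", "i", "you",
  "about", "of", "to", "in", "on", "and", "or", "what", "was", "did",
]);

function extractKeywords(query: string): string[] {
  // Split on anything that is not an ASCII letter or digit.
  const tokens = query.split(/[^A-Za-z0-9]+/).filter((t) => t.length > 0);
  const seen = new Set<string>();
  const keywords: string[] = [];
  for (const token of tokens) {
    const key = token.toLowerCase();
    // Skip stop words and case-insensitive duplicates.
    if (STOP_WORDS.has(key) || seen.has(key)) continue;
    seen.add(key);
    keywords.push(token); // keep the original casing, e.g. "API"
  }
  return keywords;
}

// extractKeywords("that thing we discussed about the API")
// returns ["discussed", "API"]
```

Each surviving keyword can then be passed to the FTS query separately, which is what makes vague conversational phrasing searchable.
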
@openclaw-barnacle openclaw-barnacle bot added the agents (Agent runtime and tooling) and size: L labels on Feb 16, 2026

@greptile-apps greptile-apps bot left a comment


12 files reviewed, 1 comment


Comment on lines +244 to +246
const resultSets = await Promise.all(
  searchTerms.map((term) => this.searchKeyword(term, candidates).catch(() => [])),
);

Each keyword search uses the candidates limit, but merging happens afterward, so the total number of results fetched could be as high as searchTerms.length * candidates.

For example, if the user searches "API discussion design" and three keywords are extracted, this fetches up to 600 results (3 * 200) before merging.

Suggested change
- const resultSets = await Promise.all(
-   searchTerms.map((term) => this.searchKeyword(term, candidates).catch(() => [])),
- );
+ // Search with each keyword and merge results (limit per-keyword to avoid over-fetching)
+ const perTermLimit = Math.min(candidates, Math.ceil(candidates / searchTerms.length));
+ const resultSets = await Promise.all(
+   searchTerms.map((term) => this.searchKeyword(term, perTermLimit).catch(() => [])),
+ );
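
The arithmetic behind the suggestion can be checked with a small standalone sketch. The `perTermLimit` function reuses the expression from the suggested change. The `Hit` type and the `mergeHits` strategy (keep the highest score per id, then sort and truncate) are assumptions for illustration, since the PR's actual merge logic is not shown in this conversation.

```typescript
interface Hit {
  id: string;
  score: number;
}

// Same expression as in the suggested change.
function perTermLimit(candidates: number, termCount: number): number {
  return Math.min(candidates, Math.ceil(candidates / termCount));
}

// Hypothetical merge: keep the best-scoring hit per id, then take the top `limit`.
function mergeHits(resultSets: Hit[][], limit: number): Hit[] {
  const best = new Map<string, Hit>();
  for (const set of resultSets) {
    for (const hit of set) {
      const prev = best.get(hit.id);
      if (prev === undefined || hit.score > prev.score) {
        best.set(hit.id, hit);
      }
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// With candidates = 200 and 3 keywords:
// perTermLimit(200, 3) is 67, so at most 3 * 67 = 201 rows are fetched
// instead of 3 * 200 = 600.
```

One trade-off of splitting the budget evenly is that a single highly relevant keyword can no longer contribute more than its share of rows, so whether the cap is worthwhile depends on how the merged results are ranked.
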

@steipete steipete merged commit bcab246 into openclaw:main Feb 16, 2026
26 checks passed
@irchelper irchelper deleted the feat/fts-query-expansion branch February 17, 2026 00:43
