feat: add llms.txt discovery as default agent behavior by yolo-maxi · Pull Request #18158 · openclaw/openclaw

yolo-maxi · 2026-02-16T15:21:11Z

Summary

Adds automatic llms.txt awareness so OpenClaw agents check for /llms.txt or /.well-known/llms.txt when exploring new domains.
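
As a rough illustration of the two locations named above, the sketch below derives both candidate URLs from any page URL on a domain. The helper name `llmsTxtCandidates` is hypothetical and not part of this PR, which only changes prompt text.

```typescript
// Hypothetical helper (not in the PR): list the two llms.txt locations
// mentioned in the summary for the origin of a given page URL.
function llmsTxtCandidates(pageUrl: string): string[] {
  // URL.origin is scheme + host (+ port), without path or query.
  const { origin } = new URL(pageUrl);
  return [`${origin}/llms.txt`, `${origin}/.well-known/llms.txt`];
}

console.log(llmsTxtCandidates("https://example.com/docs/page?x=1"));
```

An agent following the new prompt section would presumably try these in order and move on if neither exists.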

Changes

System prompt: New llms.txt Discovery section (full mode only, when web_fetch is available) instructing agents to check for llms.txt files when visiting new domains
web_fetch tool: Updated description to mention llms.txt discovery

Why

llms.txt is an emerging standard (like robots.txt for AI) that helps site owners describe how AI agents should interact with their content and APIs. Making this a default agent behavior:

Helps the ecosystem adopt agent-native web experiences
Gives agents better context about how to use a site's resources
Zero cost when the file doesn't exist (agents just move on)
Respects site owner preferences for AI interaction

Details

The system prompt section is:

Only included in full prompt mode (not subagents)
Only included when web_fetch tool is available
Instructs agents not to warn when llms.txt is missing (most sites don't have one yet)
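
The conditions above can be sketched as a small section builder. This is a minimal sketch, not the code from the PR: the parameter type `PromptSectionParams` and the prompt wording are assumptions made for illustration, and only the two guards (prompt mode and `web_fetch` availability) come from the description.

```typescript
// Hypothetical parameter shape; the real one in src/agents/system-prompt.ts may differ.
type PromptSectionParams = {
  isMinimal: boolean; // true for subagent/minimal prompt modes
  availableTools: Set<string>;
};

function buildLlmsTxtSection(params: PromptSectionParams): string[] {
  // Guard 1: skip the section outside full prompt mode.
  if (params.isMinimal) {
    return [];
  }
  // Guard 2: skip the section when web_fetch is not available.
  if (!params.availableTools.has("web_fetch")) {
    return [];
  }
  // Illustrative wording only; the actual prompt text is not shown on this page.
  return [
    "## llms.txt Discovery",
    "When visiting a new domain, check for /llms.txt or /.well-known/llms.txt.",
    "If neither file exists, continue without warning the user.",
    "",
  ];
}
```

Returning an empty array when a guard fails lets the caller concatenate section outputs without special-casing omitted sections.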

TypeScript compiles cleanly with no errors.

Greptile Summary

This PR adds automatic llms.txt awareness to OpenClaw agents. When the web_fetch tool is available and the prompt mode is full (not subagents), the system prompt now includes a new "llms.txt Discovery" section instructing agents to check for /llms.txt or /.well-known/llms.txt when visiting new domains. The web_fetch tool description is also updated to mention this behavior.

The buildLlmsTxtSection function follows the same pattern as other section builders (buildVoiceSection, buildMemorySection, etc.) with proper isMinimal and tool-availability guards
The section is correctly excluded for subagent/minimal/none prompt modes
One logic gap: the guard only checks for web_fetch availability, but the prompt text mentions "via web_fetch or browser" — agents with only browser available won't see this section
The existing minimal-mode test (system-prompt.e2e.test.ts) does not include web_fetch in toolNames and does not assert that the new section is excluded, so the new behavior has no direct test coverage

Confidence Score: 4/5

This PR is safe to merge — it only adds prompt text and a tool description update with no runtime logic changes.
The changes are low-risk (prompt text additions only, no runtime behavior changes). The section builder follows established patterns and has proper guards. One minor logic inconsistency exists between the guard condition and the prompt text regarding the browser tool, but this won't cause failures — it just means agents with only browser available won't get the llms.txt hint. No test coverage was added for the new section.
src/agents/system-prompt.ts — the buildLlmsTxtSection guard/text mismatch regarding browser tool should be reconciled

Last reviewed commit: 731963a

Add automatic llms.txt awareness so agents check for /llms.txt or /.well-known/llms.txt when exploring new domains.

Changes:
- System prompt: new 'llms.txt Discovery' section (full mode only, when web_fetch is available) instructing agents to check for llms.txt files when visiting new domains
- web_fetch tool: updated description to mention llms.txt discovery

llms.txt is an emerging standard (like robots.txt for AI) that helps site owners describe how AI agents should interact with their content. Making this a default behavior helps the ecosystem adopt agent-native web experiences.

Ref: https://llmstxt.org

greptile-apps

2 files reviewed, 1 comment

greptile-apps · 2026-02-16T15:23:52Z

src/agents/system-prompt.ts

+  }
+  if (!params.availableTools.has("web_fetch")) {


Guard only checks web_fetch, but text mentions browser

The prompt text on line 159 says "via web_fetch or browser", but the gate at line 154 only checks for web_fetch. If an agent has browser available but not web_fetch, the section is silently omitted — even though the agent could still check for llms.txt via the browser tool.

Consider also gating on browser:

Suggested change

-  }
-  if (!params.availableTools.has("web_fetch")) {
+  if (!params.availableTools.has("web_fetch") && !params.availableTools.has("browser")) {
+    return [];
+  }

Alternatively, if the intent is to only include this when web_fetch is present, the prompt text on line 159 should drop the "or browser" mention to avoid confusion.
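
To make the difference between the two options concrete, here is a small sketch comparing the guard from the PR with the guard from the suggestion, expressed as "should the section be included" predicates. The function names are hypothetical; only the tool names `web_fetch` and `browser` come from the review.

```typescript
type ToolSet = ReadonlySet<string>;

// Guard as written in the PR: only web_fetch enables the section.
function includeWithWebFetchOnly(tools: ToolSet): boolean {
  return tools.has("web_fetch");
}

// Guard as suggested in the review: either tool enables the section.
// Equivalent to negating `!has("web_fetch") && !has("browser")`.
function includeWithEitherTool(tools: ToolSet): boolean {
  return tools.has("web_fetch") || tools.has("browser");
}

const browserOnly: ToolSet = new Set(["browser"]);
console.log(includeWithWebFetchOnly(browserOnly)); // false: section silently omitted
console.log(includeWithEitherTool(browserOnly)); // true: section included
```

The two predicates differ only for agents that have `browser` but not `web_fetch`, which is exactly the case the comment describes.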


sebslight · 2026-02-17T14:12:28Z

Reverted in e6683a6 (reverts merge commit e368c36).

This was an accidental merge, so we rolled it back.

yolo-maxi · 2026-02-17T17:01:56Z

Hey @sebslight — saw the revert. Understood if the merge was accidental, but the PR itself is intentional and all CI checks passed. Would love a proper review when you get a chance.

Happy to address any feedback or restructure the approach if needed.

HenryLoenwind · 2026-02-18T10:58:21Z

Not sure if this is now in or not; it's in the changelog. Anyway:

"Zero cost" is not right. Without a way for the agent to keep track of what "new domains" are, this is two additional tool calls producing tokens to pay for. And with such a way, it's one tool call to read the file (even more tokens) or grep for the domain (if the agent is smart).

It'd be better to keep track of domains and cache the llms.txt in the code, and then give agents the llms.txt the first time they access that domain in a session alongside the fetch result. After scanning the llms.txt for prompt injection. Very heavily. We want them to follow those instructions, but not that hard. Those are basically skills that are downloaded from random sites the websearch has found...oh...on second thought, I don't want my agent to do that!
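
A minimal sketch of the caching idea in this comment, assuming a per-session object that remembers which origins have been seen and hands back llms.txt content only on the first visit. The class name `LlmsTxtSessionCache` and the injected `fetchText` function are hypothetical, and the prompt-injection scanning the comment calls for is deliberately not implemented here.

```typescript
// Hypothetical fetcher: resolves to the body text, or null if the file is missing.
type Fetcher = (url: string) => Promise<string | null>;

class LlmsTxtSessionCache {
  // origin -> cached llms.txt content (null when the site has none)
  private readonly seen = new Map<string, string | null>();
  private readonly fetchText: Fetcher;

  constructor(fetchText: Fetcher) {
    this.fetchText = fetchText;
  }

  // Returns llms.txt content the first time an origin is visited in this
  // session, and null on later visits or when no file exists.
  async firstVisit(pageUrl: string): Promise<string | null> {
    const { origin } = new URL(pageUrl);
    if (this.seen.has(origin)) {
      return null; // already handled for this session
    }
    let content: string | null = null;
    for (const path of ["/llms.txt", "/.well-known/llms.txt"]) {
      content = await this.fetchText(origin + path);
      if (content !== null) break;
    }
    // NOTE: a real implementation would scan `content` for prompt injection
    // before exposing it to the agent; that step is omitted in this sketch.
    this.seen.set(origin, content);
    return content;
  }
}
```

With this shape, the lookup cost is paid in code once per origin per session rather than by the agent through extra tool calls.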

openclaw-barnacle bot added the labels "agents" (Agent runtime and tooling) and "size: XS" Feb 16, 2026

greptile-apps bot reviewed Feb 16, 2026

steipete merged commit e368c36 into openclaw:main Feb 16, 2026
26 checks passed

sebslight self-assigned this Feb 17, 2026

steipete merged 1 commit into openclaw:main from yolo-maxi:feature/llms-txt-discovery