LGTM!
Now that the 1M context window is GA for Sonnet 4.6 and Opus 4.6 (see https://claude.com/blog/1m-context-ga), the context-1m-2025-08-07 beta header is no longer needed for these two models.
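For illustration, here is a minimal sketch of what dropping the header could look like on the gateway side. The function name `build_anthropic_headers`, the `GA_1M_MODELS` set, and the model identifier strings are hypothetical and not taken from the AI Gateway code. Only the `anthropic-beta` header name and the `context-1m-2025-08-07` value come from this thread.

```python
# Hypothetical sketch; names and model IDs below are assumptions, not AI Gateway code.
LONG_CONTEXT_BETA = "context-1m-2025-08-07"

# Assumed identifiers for the models where the 1M window is GA.
GA_1M_MODELS = {"claude-opus-4-6", "claude-sonnet-4-6"}


def build_anthropic_headers(model: str, wants_long_context: bool) -> dict[str, str]:
    """Return extra request headers, adding the beta flag only where still needed."""
    headers: dict[str, str] = {}
    if wants_long_context and model not in GA_1M_MODELS:
        # Assumption: models outside the GA set still opt in via the beta header.
        headers["anthropic-beta"] = LONG_CONTEXT_BETA
    return headers
```

With this shape, a long-context request to either GA model sends no extra header, while any other model keeps its current behavior.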
From Anthropic:
The 1M token context window is now generally available for Claude Opus 4.6 and Sonnet 4.6 via API. Both models include the full window at standard pricing—$5/$25 per million tokens for Opus 4.6 and $3/$15 for Sonnet 4.6. Previously there were separate rate limits above and below 200K tokens. We’ve simplified this to a single rate limit for the full context window. As part of this, we’ve raised your base rate limit on Opus 4.6 to 18M to accommodate your existing long context usage with room to grow. Note this applies only to the Gitlab Production (managed account) org. As always, please reach out to request additional increases.
We are also updating pricing multipliers in https://gitlab.com/gitlab-org/customers-gitlab-com/-/merge_requests/15070
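For a rough sense of what flat pricing means per request, here is a small sketch using the prices quoted above. Reading each pair as (input, output) price per million tokens is an assumption, and the model keys and the `estimate_cost_usd` helper are hypothetical, not part of any GitLab or Anthropic code.

```python
# Prices quoted above, in USD per million tokens; reading each pair as
# (input, output) is an assumption. Model keys are hypothetical labels.
PRICES_PER_MTOK = {
    "opus-4.6": (5.0, 25.0),
    "sonnet-4.6": (3.0, 15.0),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost with one flat rate across the whole context window."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: an 800K-token prompt with a 4K-token reply on each model.
for name in PRICES_PER_MTOK:
    print(name, estimate_cost_usd(name, 800_000, 4_000))
```

Because the rate is the same above and below 200K tokens, the estimate needs no branch on prompt size.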
Numbered steps to set up and validate the change are strongly suggested.
@wortschi fair point. Should be fine if we already have context info in the MR description.
@romaneisner would you be able to help review it?
Added two comments.
suggestion (non-blocking): it would be great to add the Anthropic announcement, or a link to it, to make clear that this feature is supported only on Opus 4.6 and Sonnet 4.6.
suggestion (nitpick): the 1_000_000 format is generally more readable; I don't need to count the zeros.
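If the value lives in Python code, underscores are accepted as digit separators in numeric literals (PEP 515), so both spellings are the same integer. The constant name below is hypothetical.

```python
# Hypothetical constant name; underscores in numeric literals do not change the value.
MAX_CONTEXT_TOKENS = 1_000_000

print(MAX_CONTEXT_TOKENS == 1000000)  # True
```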
Hi @bastirehm @wortschi @timzallmann, now that the Claude 1M context window is GA at standard pricing, we need to add support in ai-gateway to leverage it. cc @achueshev
As a GitLab Duo user, I want to leverage the full 1M context window now available for Claude Opus 4.6 and Sonnet 4.6, so I can work with larger codebases and documents in a single conversation without context truncation.
Anthropic announced general availability of the 1M context window (March 13, 2026) with:
Update AI Gateway to support the full 1M context window for Claude Opus 4.6 and Sonnet 4.6 models.
Update Model Configuration
models.yml to reflect 1M (1,000,000) token context window for Claude Opus 4.6 and Sonnet 4.6
Remove Beta Header Requirements
anthropic-beta header logic for long-context requests
Testing
Benefits:
Acceptance Criteria:
Hey @bluenoodles, have you made any progress or hit any blockers on this? It turns out this feature needs to be prioritized soon. If you are unable to deliver, feel free to let us know. We could also suggest other issues that you can work on.
@bcardoso- would you be able to do the maintainer review of this short MR that adds the compaction docs?
Addressed
Junming Huang (4bb1219b) at 16 Mar 13:22
fix: update doc yaml example
@GitLabDuo that should be addressed by looking at the yaml example and code itself.
@GitLabDuo should be fine.
Hi @alejandro @fpiva, we are adding a conversation history compaction feature to DAP, which requires making an LLM call that is not triggered by the user. I have the following questions and hope you can help:
fixed
Junming Huang (790ef606) at 16 Mar 13:04
fix: align compaction docs with actual implementation
Junming Huang (9f07b662) at 16 Mar 12:50
fix: address review feedback for compaction docs
@eduardobonet can you help do the initial review of this small MR?