feat(bedrock): add prompt caching support for custom ARNs and inference profiles#1

Closed
marcelloceschia wants to merge 7 commits into dev from fix/bedrock-prompt-caching-custom-arn

@marcelloceschia
Owner

Summary

  • Enable prompt caching for Bedrock models that support it (Claude, Nova)
  • Add a 'caching' option for custom ARNs and inference profiles whose names do not contain "claude"
  • Disable caching for Llama, Mistral, Cohere models (not supported)
  • Add comprehensive tests for all caching scenarios

Fixes

  • Prompt cache not supported for custom ARN models
  • 1M context window not configurable

Usage

Users can now configure custom ARNs like:

{
  "provider": {
    "amazon-bedrock": {
      "models": {
        "arn:aws:bedrock:...:application-inference-profile/xxx": {
          "options": { "caching": true },
          "limit": { "context": 1000000, "output": 32000 }
        }
      }
    }
  }
}
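
For readers wiring this into typed code, the fragment above could be modelled roughly as follows. This is only a sketch: the interface names and the `getModelOverride` helper are hypothetical, inferred from the JSON example alone, and are not taken from the project's actual schema or from this PR's diff.

```typescript
// Shape inferred from the JSON example only; NOT the project's real schema.
interface ModelOverride {
  options?: { caching?: boolean };
  limit?: { context?: number; output?: number };
}

interface ProviderConfig {
  models?: Record<string, ModelOverride>;
}

interface Config {
  provider?: Record<string, ProviderConfig>;
}

// Hypothetical lookup helper: returns the override for one model, if any.
function getModelOverride(
  config: Config,
  providerId: string,
  modelId: string,
): ModelOverride | undefined {
  return config.provider?.[providerId]?.models?.[modelId];
}

// The same values as the JSON example, with the ARN left elided as in the PR.
const exampleArn = "arn:aws:bedrock:...:application-inference-profile/xxx";

const exampleConfig: Config = {
  provider: {
    "amazon-bedrock": {
      models: {
        [exampleArn]: {
          options: { caching: true },
          limit: { context: 1000000, output: 32000 },
        },
      },
    },
  },
};

const override = getModelOverride(exampleConfig, "amazon-bedrock", exampleArn);
```

Keying the override by the full ARN string mirrors the example: the ARN itself carries no model family name, so the per-model entry is the only place the caching flag and limits can come from.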

Test Plan

  • Added 15+ tests covering all caching scenarios
  • Claude models → Caching ✅
  • Nova models → Caching ✅
  • Llama/Mistral/Cohere → No Caching ❌
  • Custom ARNs with options.caching → Configurable
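
The scenarios listed above can be sketched as a single decision function. This is a minimal illustration, not the code from this PR: the function name, the substring matching on the model ID, and the precedence (unsupported families always win, then an explicit `options.caching`, then the supported families) are all assumptions.

```typescript
// Hypothetical helper; names and precedence are illustrative assumptions.
type ModelOptions = { caching?: boolean };

// Families the PR description lists as supporting prompt caching.
const SUPPORTED_FAMILIES = ["claude", "nova"];
// Families the PR description lists as not supporting it.
const UNSUPPORTED_FAMILIES = ["llama", "mistral", "cohere"];

function shouldUsePromptCaching(
  modelId: string,
  options: ModelOptions = {},
): boolean {
  const id = modelId.toLowerCase();

  // Assumption: a known unsupported family is never cached, even if opted in.
  if (UNSUPPORTED_FAMILIES.some((family) => id.includes(family))) {
    return false;
  }

  // An explicit option decides for custom ARNs and inference profiles.
  if (options.caching !== undefined) {
    return options.caching;
  }

  // Otherwise fall back to detecting a supported family in the ID.
  return SUPPORTED_FAMILIES.some((family) => id.includes(family));
}

// Example calls; the ARN is left elided exactly as in the PR description.
const customArn = "arn:aws:bedrock:...:application-inference-profile/xxx";
const withoutOption = shouldUsePromptCaching(customArn);
const withOption = shouldUsePromptCaching(customArn, { caching: true });
```

Under these assumptions, a custom ARN with no family name in it stays uncached until `options.caching` is set, which matches the "Configurable" row of the test plan.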

…ce profiles

- Enable prompt caching for Bedrock models that support it (Claude, Nova)
- Add 'caching' option for custom ARNs/inference profiles without claude in name
- Disable caching for Llama, Mistral, Cohere models (not supported)
- Add comprehensive tests for all caching scenarios

Fixes #1: Prompt cache not supported for custom ARN models
Fixes anomalyco#2: 1M context window not configurable

Users can now configure custom ARNs like:
```json
{
  "provider": {
    "amazon-bedrock": {
      "models": {
        "arn:aws:bedrock:...:application-inference-profile/xxx": {
          "options": { "caching": true },
          "limit": { "context": 1000000, "output": 32000 }
        }
      }
    }
  }
}
```
@marcelloceschia force-pushed the fix/bedrock-prompt-caching-custom-arn branch from 2ed846f to e349074 on March 7, 2026 17:53
@github-actions

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

marcelloceschia added a commit that referenced this pull request Mar 14, 2026
@github-actions

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.
