Support gpt-5.1 model in Tiktoken tokenizer #7556
Merged
ericstj merged 2 commits into dotnet:main on Nov 19, 2025
Pull Request Overview
This PR adds support for the gpt-5.1 model in the Tiktoken tokenizer implementation, aligning with an open feature request in the official Tiktoken library.
Key changes:
- Added gpt-5.1 model mapping to the O200kBase encoding in the tokenizer configuration
- Extended test coverage to include the new gpt-5.1 model across multiple test scenarios
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/Microsoft.ML.Tokenizers/Model/TiktokenTokenizer.cs | Added gpt-5.1 model entries to prefix and exact match lookup tables for O200kBase encoding |
| test/Microsoft.ML.Tokenizers.Tests/TiktokenTests.cs | Added GPT5_1 tokenizer instance and included it in encoding tests and test data parameters |
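
As a rough illustration of how a model name might be resolved through an exact-match table and a prefix table, here is a minimal, self-contained sketch. It is not the code from TiktokenTokenizer.cs: the table names, the function name `encoding_name_for_model`, the specific entries, and the exact-first lookup order are all assumptions made for the example.

```python
# Illustrative sketch only: table contents and names are assumptions,
# not the actual entries in TiktokenTokenizer.cs.

# Exact model names mapped to an encoding name.
EXACT_MODEL_TO_ENCODING = {
    "gpt-5.1": "o200k_base",
    "gpt-4o": "o200k_base",
    "gpt-4": "cl100k_base",
}

# Prefixes that cover versioned or dated model names.
MODEL_PREFIX_TO_ENCODING = [
    ("gpt-5.1-", "o200k_base"),
    ("gpt-4o-", "o200k_base"),
    ("gpt-4-", "cl100k_base"),
]


def encoding_name_for_model(model_name: str) -> str:
    """Resolve a model name to an encoding name: exact match first, then prefix."""
    exact = EXACT_MODEL_TO_ENCODING.get(model_name)
    if exact is not None:
        return exact
    for prefix, encoding in MODEL_PREFIX_TO_ENCODING:
        if model_name.startswith(prefix):
            return encoding
    raise KeyError(f"No encoding registered for model {model_name!r}")
```

With tables like these, a bare name such as "gpt-5.1" is resolved by the exact-match table, while a suffixed name such as a hypothetical "gpt-5.1-preview" falls through to the prefix table. This is why a new model is typically registered in both places.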
Codecov Report
✅ All modified and coverable lines are covered by tests.
| Coverage Diff | main | #7556 | +/- |
|---|---|---|---|
| Coverage | 69.02% | 69.02% | -0.01% |
| Files | 1482 | 1482 | |
| Lines | 274093 | 274096 | +3 |
| Branches | 28266 | 28266 | |
| Hits | 189183 | 189184 | +1 |
| Misses | 77527 | 77528 | +1 |
| Partials | 7383 | 7384 | +1 |
stephentoub approved these changes on Nov 19, 2025
ericstj approved these changes on Nov 19, 2025
There is an open issue requesting the same support in the official Tiktoken library: openai/tiktoken#464.