Prepare CharacterTracker for advanced font features#30608
Merged
QuLogic merged 5 commits intomatplotlib:text-overhaulfrom Oct 2, 2025
Merged
Prepare CharacterTracker for advanced font features#30608QuLogic merged 5 commits intomatplotlib:text-overhaulfrom
CharacterTracker for advanced font features#30608QuLogic merged 5 commits intomatplotlib:text-overhaulfrom
Conversation
58874b6 to
5afd71b
Compare
CharacterTracker.track implementationCharacterTracker for advanced font features
Member
Author
|
Note, I think the original commit was small, and the remaining ended up small enough, that I just put them all in this PR. |
5 tasks
anntzer
reviewed
Sep 29, 2025
anntzer
reviewed
Sep 29, 2025
anntzer
reviewed
Sep 29, 2025
anntzer
reviewed
Sep 29, 2025
anntzer
reviewed
Sep 29, 2025
anntzer
reviewed
Sep 29, 2025
anntzer
reviewed
Sep 29, 2025
| print("%%BeginProlog", file=fh) | ||
| if not mpl.rcParams['ps.useafm']: | ||
| Ndict += len(ps_renderer._character_tracker.used) | ||
| Ndict += sum(map(len, ps_renderer._character_tracker.used.values()), 0) |
Contributor
There was a problem hiding this comment.
The 0 at the end is unneeded.
anntzer
reviewed
Sep 29, 2025
No need to repeat the calculation of subset blocks, but instead offload it to `track_glyph`.
5e89363 to
662ac58
Compare
Instead of splitting fonts into `subset_size` blocks and writing text as character code modulo `subset_size`, compress the blocks by doing two things: 1. Preserve the character code if it lies in the first block. This keeps ASCII (for Type 3) and the Basic Multilingual Plane (for Type 42) as their normal codes. 2. Push everything else into the next spot in the next block, splitting by `subset_size` as necessary. This should reduce the number of additional font subsets to embed.
If mixing languages, sometimes a single character may use different glyphs in one document. In that case, we need to give it a new character code in the next subset, since subset 0 is preserving character codes.
662ac58 to
c21a0b0
Compare
Member
Author
|
OK, I've handled all your comments, I think. I also fixed subsetting in the PostScript backend, noted above. There are 3 test image changes:
|
anntzer
reviewed
Sep 30, 2025
anntzer
approved these changes
Sep 30, 2025
Contributor
anntzer
left a comment
There was a problem hiding this comment.
Just a minor point regarding a comment.
For ligatures or complex shapings, multiple characters may map to a single glyph. In this case, we still want to output a single character code for the string using the font subset, but the `ToUnicode` map should give back all the characters.
Previously, this was supposed to "upgrade" type 3 to type 42 if the number of glyphs overflowed. However, as `CharacterTracker` can suggest a new subset for other reasons (i.e., multiple glyphs for the same character or a glyph for multiple characters may go to a second subset), we do need proper subset handling here as well. Since that is now done, we can drop the "promotion" from type 3 to type 42, as we don't get too many glyphs in each embedded font.
c21a0b0 to
d781040
Compare
ksunden
approved these changes
Oct 2, 2025
Member
Author
|
Removed the image changes (and moved them to the |
Member
Author
|
Linting issues are known (#30626), so merging over those. |
wavebyrd
pushed a commit
to wavebyrd/matplotlib
that referenced
this pull request
Mar 13, 2026
Prepare `CharacterTracker` for advanced font features
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR summary
The original code split fonts into subsets based on the character modulo the subset size (determined by font type limits). This had some limitations:
ordon a multi-char string.To fix this,
CharacterTrackernow tracks characters and glyphs more closely. Specifically,With these changes, the complex/font features/languages tests in #30607 produce correct results.
PR checklist