[mypyc] Use native integers for some sequence indexing operations#19426
Merged
For example, when iterating over a list, now we use a native integer
for the index (which is not exposed to the user). Previously we used
tagged integers, but in these use cases they provide no real benefit.
This simplifies the IR and should slightly improve performance, as fewer
tagged int to native int conversions are needed.
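To make the difference concrete, here is a minimal Python model of the two index representations. It assumes that a short tagged integer stores a value `n` as `n << 1`, which is consistent with the `<< 1`, `>> 1` and `+ 2` operations in the old IR shown in this description. The helper names are hypothetical and not part of mypyc.

```python
# Illustrative model only; these helpers are hypothetical, not mypyc APIs.

def tag(n: int) -> int:
    """Encode a small integer as a short tagged int (assumed: value << 1)."""
    return n << 1


def untag(t: int) -> int:
    """Decode a short tagged int back to a native integer."""
    return t >> 1


def get_item_tagged(items: list[object], tagged_index: int) -> object:
    # A tagged index has to be converted before it can address an element.
    return items[untag(tagged_index)]


def get_item_native(items: list[object], native_index: int) -> object:
    # A native index can be used directly, with no conversion step.
    return items[native_index]


items = ["a", "b", "c"]
# Both representations reach the same element; only the conversion differs.
same = get_item_tagged(items, tag(2)) == get_item_native(items, 2)
```

Since the index is never visible to user code, the tagged form only adds conversions here without enabling anything in return.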
Multiple ops have to be migrated in one go, as they interact with
each other. Changing only a subset of them would actually generate
more verbose IR, because a bunch of extra coercions would be needed
wherever a migrated op meets an unmigrated one.
List of impacted statements:
* For loop over sequence
* Assignment like `x, y = a` for tuple/list rvalue
* Dict iteration
* List comprehension
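As a rough illustration, the statement kinds listed above correspond to Python code like the following. The function names are hypothetical and chosen only for this sketch. The code behaves the same whether interpreted or compiled; the change only affects the hidden index that mypyc generates internally.

```python
# Hypothetical example functions; the names are illustrative only.

def sum_list(a: list[int]) -> int:
    total = 0
    for x in a:  # for loop over a sequence
        total += x
    return total


def unpack_pair(a: tuple[int, int]) -> int:
    x, y = a  # assignment with a tuple/list rvalue
    return x - y


def join_keys(d: dict[str, int]) -> str:
    out = ""
    for k in d:  # dict iteration
        out += k
    return out


def doubled(a: list[int]) -> list[int]:
    return [x * 2 for x in a]  # list comprehension
```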
For example, consider this function:
```
def foo(a: list[int]) -> None:
    for x in a:
        pass
```
Old generated IR was like this:
```
def foo(a):
    a :: list
    r0 :: short_int
    r1 :: ptr
    r2 :: native_int
    r3 :: short_int
    r4 :: bit
    r5 :: native_int
    r6, r7 :: ptr
    r8 :: native_int
    r9 :: ptr
    r10 :: object
    r11 :: int
    r12 :: short_int
    r13 :: None
L0:
    r0 = 0
L1:
    r1 = get_element_ptr a ob_size :: PyVarObject
    r2 = load_mem r1 :: native_int*
    r3 = r2 << 1
    r4 = r0 < r3 :: signed
    if r4 goto L2 else goto L5 :: bool
L2:
    r5 = r0 >> 1
    r6 = get_element_ptr a ob_item :: PyListObject
    r7 = load_mem r6 :: ptr*
    r8 = r5 * 8
    r9 = r7 + r8
    r10 = load_mem r9 :: builtins.object*
    inc_ref r10
    r11 = unbox(int, r10)
    dec_ref r10
    if is_error(r11) goto L6 (error at foo:2) else goto L3
L3:
    dec_ref r11 :: int
L4:
    r12 = r0 + 2
    r0 = r12
    goto L1
L5:
    return 1
L6:
    r13 = <error> :: None
    return r13
```
Now the generated IR is simpler:
```
def foo(a):
    a :: list
    r0 :: native_int
    r1 :: ptr
    r2 :: native_int
    r3 :: bit
    r4, r5 :: ptr
    r6 :: native_int
    r7 :: ptr
    r8 :: object
    r9 :: int
    r10 :: native_int
    r11 :: None
L0:
    r0 = 0
L1:
    r1 = get_element_ptr a ob_size :: PyVarObject
    r2 = load_mem r1 :: native_int*
    r3 = r0 < r2 :: signed
    if r3 goto L2 else goto L5 :: bool
L2:
    r4 = get_element_ptr a ob_item :: PyListObject
    r5 = load_mem r4 :: ptr*
    r6 = r0 * 8
    r7 = r5 + r6
    r8 = load_mem r7 :: builtins.object*
    inc_ref r8
    r9 = unbox(int, r8)
    dec_ref r8
    if is_error(r9) goto L6 (error at foo:2) else goto L3
L3:
    dec_ref r9 :: int
L4:
    r10 = r0 + 1
    r0 = r10
    goto L1
L5:
    return 1
L6:
    r11 = <error> :: None
    return r11
```
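The index arithmetic of the two IR listings can be mimicked in plain Python to check that both loops visit the same elements. This is only a sketch: the variable names echo the IR registers, `len(a)` stands in for loading `ob_size`, and `a[i]` stands in for the pointer arithmetic and `load_mem`. The function names are hypothetical.

```python
def visited_old(a: list[int]) -> list[int]:
    """Model of the old loop: the index r0 is a short tagged int."""
    visited = []
    r0 = 0  # tagged zero
    while True:
        r3 = len(a) << 1  # tag the native length before comparing
        if not (r0 < r3):
            break
        r5 = r0 >> 1  # untag the index before addressing the element
        visited.append(a[r5])
        r0 = r0 + 2  # tagged increment by one
    return visited


def visited_new(a: list[int]) -> list[int]:
    """Model of the new loop: the index r0 is a native int."""
    visited = []
    r0 = 0
    while True:
        r2 = len(a)  # native length, compared directly
        if not (r0 < r2):
            break
        visited.append(a[r0])  # index used as is
        r0 = r0 + 1
    return visited
```

The new loop drops the two shift operations per iteration that the old loop needed to move between the tagged and native forms.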
p-sawicki reviewed and approved these changes on Jul 11, 2025.