Skip to content

Bytecode parity#7507

Merged
youknowone merged 3 commits intoRustPython:mainfrom
youknowone:bytecode-parity
Mar 27, 2026
Merged

Bytecode parity#7507
youknowone merged 3 commits intoRustPython:mainfrom
youknowone:bytecode-parity

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Mar 26, 2026

Summary by CodeRabbit

  • Internal Improvements
    • Faster compilation for non-async for-loops over literal sequences.
    • More accurate handling of yields and exception-stack depth for resumed code.
    • Refined scope and cell-variable classification for more reliable module/class behavior.
    • Reduced unnecessary propagation of certain implicit class/annotation variables between scopes.
    • Improved ordering for boolean short-circuit evaluation and more consistent string constant handling.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

Modifies compiler codegen and symbol-table behavior: reworks cellvar classification and module/class conditional-annotation handling, emits module MakeCell and adjusts RESUME ordering, optimizes for over list literals, reorders boolean short-circuit ToBool emission, refines yield/exception-depth marking, and tightens PEP 649 annotation/free-name propagation.

Changes

Cohort / File(s) Summary
Cellvar / Conditional-annotation / Bytecode emission
crates/codegen/src/compile.rs
Collects cellvars as (name, flags) and classifies parameter cellvars via SymbolFlags::PARAMETER; inserts __conditional_annotations__ for Class and Module; emits Instruction::MakeCell for module cellvars before RESUME.
Loop & Boolean short-circuit
crates/codegen/src/compile.rs
compile_for detects non-async for over list literals without starred elements and builds a tuple of elements instead of compiling the list; emit_short_circuit_test now emits Instruction::ToBool after Copy { i: 1 } and before conditional jumps.
String constant representation
crates/codegen/src/compile.rs
Docs updated to “surrogate literals” and ConstantData::Str construction uses explicit struct initializer (ConstantData::Str { value: ... }) consistently.
Yield / exception-depth labeling
crates/codegen/src/ir.rs
label_exception_targets computes last_yield_except_depth differently: direct YIELD_VALUE records current except depth; yield-from/await record (stack.len() - 1) to account for internal SETUP_FINALLY.
PEP 649 / annotation free-name propagation
crates/codegen/src/symboltable.rs
Skip fallback __classdict__ requirement when future_annotations is active; merge annotation-scope free names into parent newfree; suppress propagation of implicit class-scope frees through function-like intermediate scopes; restrict conditional-annotation registration to AnnAssign when future_annotations is false and route AnnAssign targets to dedicated scan function.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐇 I hop through scopes with tiny feet,
Cells and annotations I neatly meet,
Loops that skip the list’s slow art,
Short-circuits whisper ToBool’s part,
Yields now know their depth — joy in my heart! 🎉

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Bytecode parity' is vague and does not clearly convey the specific changes made, which include cellvar classification, conditional annotations handling, for-loop optimization, short-circuit instruction ordering, and constant construction updates. Consider using a more specific title that describes the main changes, such as 'Improve bytecode generation for cellvars, annotations, and for-loops' or 'Align bytecode output with CPython implementation'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the bytecode-parity branch 2 times, most recently from 4cd51b6 to 71f8b14 Compare March 26, 2026 12:39
@github-actions
Copy link
Contributor

github-actions bot commented Mar 26, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] test: cpython/Lib/test/test_compile.py (TODO: 27)
[x] test: cpython/Lib/test/test_compiler_assemble.py
[x] test: cpython/Lib/test/test_compiler_codegen.py
[x] test: cpython/Lib/test/test_peepholer.py (TODO: 27)

dependencies:

dependent tests: (no tests depend on compile)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone force-pushed the bytecode-parity branch 2 times, most recently from 6dfb579 to be0e81b Compare March 26, 2026 15:03
Compiler changes:
- Emit TO_BOOL in and/or short-circuit evaluation (COPY+TO_BOOL+JUMP)
- Add module-level __conditional_annotations__ cell (PEP 649)
- Only set conditional annotations for AnnAssign, not function params
- Skip __classdict__ cell when future annotations are active
- Convert list literals to tuples in for-loop iterables
- Fix cell variable ordering: parameters first, then alphabetical
- Fix RESUME DEPTH1 flag for yield-from/await
- Don't propagate __classdict__/__conditional_annotations__ freevar
  through regular functions — only annotation/type-param scopes
- Inline string compilation path
@youknowone youknowone marked this pull request as ready for review March 27, 2026 00:35
SIGSEGV in _finalizer_registry dict access under aggressive GC
and thread switching. Root cause is dict thread-safety in VM.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)

1799-1806: Extract the module __conditional_annotations__ setup into one helper.

compile_program now open-codes the same cellvar insertion and MAKE_CELL emission that enter_scope also knows about, but the real module path still bypasses enter_scope. Keeping both init paths in sync will be easy to miss on the next parity tweak.

As per coding guidelines "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code."

Also applies to: 1810-1817

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 1799 - 1806, Extract the
duplicated module-level __conditional_annotations__ setup into a single helper
(e.g., add a method like emit_module_conditional_annotations or
init_conditional_annotations) and call it from both compile_program and
enter_scope; specifically, move the logic that checks has_module_cond_ann,
inserts "__conditional_annotations__" into
current_code_info().metadata.cellvars, and emits the MAKE_CELL sequence into
that helper so both code paths (the open‑coded path in compile_program and the
one in enter_scope) invoke the same routine, passing only the differing value
(has_module_cond_ann or the current scope) as needed to avoid duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 5460-5477: The list→tuple optimization should be skipped when
compiling the iterable of an async for so GET_AITER sees the original list type;
modify the block that currently checks ast::Expr::List (the one that compiles
elts and emits Instruction::BuildTuple) to also verify we are NOT on the
async-for path (e.g., add a guard like !self.is_compiling_async_for() or check
the parent node kind) before building the tuple; if in an async for, fall
through to the existing self.compile_expression(iter)? branch so the list
literal is preserved and error messages remain source-faithful.

In `@crates/codegen/src/symboltable.rs`:
- Around line 300-301: The branch checking symbol_table.needs_classdict and
symbol_table.future_annotations can be wrong for nested class symbol tables
because SymbolTable::new() defaults future_annotations to false and finish()
only sets the top-level module; propagate the parent's future_annotations into
any nested SymbolTable instances when creating them (or have finish() propagate
the flag into child class tables) so nested class tables correctly see the from
__future__ import annotations flag before this check; update the code that
constructs nested symbol tables (where SymbolTable::new() is called for classes)
to copy parent.symbol_table.future_annotations into the child
SymbolTable.future_annotations, ensuring needs_classdict logic uses the correct
flag.

---

Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 1799-1806: Extract the duplicated module-level
__conditional_annotations__ setup into a single helper (e.g., add a method like
emit_module_conditional_annotations or init_conditional_annotations) and call it
from both compile_program and enter_scope; specifically, move the logic that
checks has_module_cond_ann, inserts "__conditional_annotations__" into
current_code_info().metadata.cellvars, and emits the MAKE_CELL sequence into
that helper so both code paths (the open‑coded path in compile_program and the
one in enter_scope) invoke the same routine, passing only the differing value
(has_module_cond_ann or the current scope) as needed to avoid duplication.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 7ba5c833-290b-4a20-8322-a1a86c7ba71e

📥 Commits

Reviewing files that changed from the base of the PR and between 9282a87 and af963bc.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_peepholer.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_bool_op.snap is excluded by !**/*.snap
📒 Files selected for processing (3)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/codegen/src/symboltable.rs

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 1113-1116: The module conditional-annotation preamble must be
emitted identically for both entry paths; modify compile_program_single() to
share the same logic used by compile_program() by checking
ste.has_conditional_annotations (or computing the conditional-annotations flag
first) and then invoking the common preamble emissions that insert the
root-module cell into cellvar_cache (insert "__conditional_annotations__") and
emit the MAKE_CELL preamble instead of falling through to RESUME; refactor so
the differing entry value is determined first and the shared logic that updates
cellvar_cache and emits the MAKE_CELL preamble is called from both
compile_program() and compile_program_single(), ensuring bytecode parity (also
apply same refactor to the other occurrence around the code referenced at
~1799-1817).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 44fb417d-3c03-4e6a-9d7b-1706d30790c5

📥 Commits

Reviewing files that changed from the base of the PR and between af963bc and 860a3db.

⛔ Files ignored due to path filters (1)
  • Lib/test/_test_multiprocessing.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/symboltable.rs

Comment on lines +1113 to 1116
if ste.has_conditional_annotations
&& matches!(scope_type, CompilerScope::Class | CompilerScope::Module)
{
cellvar_cache.insert("__conditional_annotations__".to_string());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Keep the module conditional-annotation preamble shared across module entry paths.

This now wires the root-module cell + MAKE_CELL preamble only through compile_program(). compile_program_single() still goes straight to RESUME, so the same annotated module will compile differently between exec and single, which leaves a bytecode-parity gap in exactly the area this PR is tightening.

As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".

Also applies to: 1799-1817

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 1113 - 1116, The module
conditional-annotation preamble must be emitted identically for both entry
paths; modify compile_program_single() to share the same logic used by
compile_program() by checking ste.has_conditional_annotations (or computing the
conditional-annotations flag first) and then invoking the common preamble
emissions that insert the root-module cell into cellvar_cache (insert
"__conditional_annotations__") and emit the MAKE_CELL preamble instead of
falling through to RESUME; refactor so the differing entry value is determined
first and the shared logic that updates cellvar_cache and emits the MAKE_CELL
preamble is called from both compile_program() and compile_program_single(),
ensuring bytecode parity (also apply same refactor to the other occurrence
around the code referenced at ~1799-1817).

@youknowone youknowone merged commit 3a8fb76 into RustPython:main Mar 27, 2026
16 checks passed
@youknowone youknowone deleted the bytecode-parity branch March 27, 2026 03:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant