Skip to content

Bytecode parity#7475

Open
youknowone wants to merge 37 commits intoRustPython:mainfrom
youknowone:bytecode-parity-phase1
Open

Bytecode parity#7475
youknowone wants to merge 37 commits intoRustPython:mainfrom
youknowone:bytecode-parity-phase1

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Mar 20, 2026

Summary by CodeRabbit

Release Notes

  • Chores
    • Updated GitHub Actions workflow infrastructure with pinned action versions and improved caching.
    • Enhanced CI/CD pipeline with new dependency installation actions, security linting, and format verification jobs.
    • Upgraded workspace dependencies including Ruff, pyo3, and cryptographic crates.
    • Added automated Python library status updates via scheduled workflows.
    • Expanded spell-check dictionaries for RustPython and CPython terms.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR significantly refactors GitHub Actions CI/CD infrastructure, adds automated environment setup for RustPython development, introduces new composite actions for dependency installation, reorganizes Dependabot configuration groups, expands spell-check dictionaries, and upgrades Cargo dependencies. It replaces the auto-committing PR format workflow with a check-only format verification workflow and introduces agentic workflows for automating library updates.

Changes

Cohort / File(s) Summary
CI Workflow Refactoring
.github/workflows/ci.yaml
Converted hard-coded OS-specific test logic to matrix-driven configuration; replaced exotic_targets job with matrix-driven cargo_check; introduced CPU-core detection for parallel test execution; added cargo-shear and security-lint (zizmor) jobs; pinned action versions and updated rust-cache configuration; consolidated macOS dependency setup to use new composite action.
Format Verification & Linting
.github/workflows/pr-format.yaml, .github/workflows/pr-auto-commit.yaml
Added new non-committing format-check workflow (pr-format.yaml) with actionlint, cargo fmt, ruff, and opcode metadata generation that posts inline suggestions on failure; deleted old auto-commit workflow that directly modified PR branches.
Periodic CI Workflows
.github/workflows/cron-ci.yaml, .github/workflows/lib-deps-check.yaml, .github/workflows/release.yml, .github/workflows/update-doc-db.yml
Pinned checkout action to commit hash and added persist-credentials configuration; upgraded action versions (sticky-pull-request-comment v2→v3, upload/download-artifact versions); refactored variable handling in branch/PR creation steps.
Library Update Automation
.github/workflows/update-libs-status.yaml, .github/workflows/upgrade-pylib.lock.yml, .github/workflows/upgrade-pylib.md
Added new workflows for automated Python library synchronization: update-libs-status (tracks Lib changes against CPython issue); upgrade-pylib.lock.yml (agentic workflow with Copilot agent execution, threat detection, and safe output handling); upgrade-pylib.md (module selection and quick-upgrade workflow with PR automation).
Composite GitHub Actions
.github/actions/install-linux-deps/action.yml, .github/actions/install-macos-deps/action.yml, .github/aw/actions-lock.json
Added new composite actions for conditional dependency installation on Linux (gcc-multilib, musl-tools, gcc-aarch64-linux-gnu, clang) and macOS (autoconf, automake, libtool, openssl); added pinned action lock file for gh-aw actions.
Development Environment Setup
.claude/scripts/setup-env.sh, .claude/settings.json
Added Bash startup script that switches to RustPython directory, checks/upgrades Python 3.13+, clones CPython v3.14.3 source if needed, and supports force-linking; configured SessionStart hook in Claude settings to execute setup script.
Spell-Check Configuration
.cspell.json, .cspell.dict/cpython.txt, .cspell.dict/rust-more.txt, .cspell.dict/rustpython.txt
Enabled compound-word handling in cspell; added new RustPython dictionary with domain-specific terms; expanded CPython terms dictionary; reorganized word lists from inline config to external dictionary file; adjusted rust-more dictionary entries.
Build & Repository Configuration
.github/dependabot.yml, .gitattributes, .gitignore, Cargo.toml
Reorganized Dependabot cargo groups (removed cranelift from old group, added to wasmtime; introduced crypto and random groups; added npm weekly updates); marked GitHub Actions lock files as linguist-generated; added gitignore pattern for sysconfig vars; upgraded rustpython-vm features (added "gc"), bumped pyo3/workspace/Ruff dependency versions, added parking_lot_core patch.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • ShaharNaveh
  • fanninpm
  • coolreader18

Poem

🐰 A rabbit hops through workflows new,
With Python clones and CI too,
Format checks and lib-updates flow,
Setup scripts guide the dev to go,
Build and test shall dance in glee!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Bytecode parity' is vague and does not clearly convey what specific changes are being made in this pull request. Consider using a more descriptive title that specifies the main objective, such as 'Implement bytecode parity with CPython 3.14' or 'Align bytecode generation with CPython standards'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the bytecode-parity-phase1 branch from ab3cbd9 to 7ff591f Compare March 20, 2026 11:41
@github-actions
Copy link
Contributor

github-actions bot commented Mar 20, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] test: cpython/Lib/test/test_descr.py (TODO: 39)
[ ] test: cpython/Lib/test/test_descrtut.py (TODO: 3)

dependencies:

dependent tests: (no tests depend on descr)

[x] lib: cpython/Lib/dis.py
[ ] test: cpython/Lib/test/test_dis.py (TODO: 35)

dependencies:

  • dis

dependent tests: (70 tests)

  • dis: test__opcode test_ast test_code test_compile test_compiler_assemble test_dis test_dtrace test_fstring test_inspect test_monitoring test_opcache test_patma test_peepholer test_positional_only_arg
    • bdb: test_bdb
    • inspect: test_abc test_argparse test_asyncgen test_buffer test_builtin test_collections test_coroutines test_decimal test_enum test_functools test_generators test_grammar test_ntpath test_operator test_posixpath test_pydoc test_signal test_sqlite3 test_traceback test_type_annotations test_types test_typing test_unittest test_yield_from test_zipimport test_zoneinfo
      • ast: test_compiler_codegen test_future_stmt test_site test_ssl test_type_comments test_ucn test_unparse
      • asyncio: test_asyncio test_contextlib_async test_logging test_os test_sys_settrace test_unittest
      • cmd: test_cmd
      • dataclasses: test__colorize test_copy test_ctypes test_genericalias test_pprint test_regrtest
      • importlib.metadata: test_importlib
      • pkgutil: test_pkgutil test_runpy
      • rlcompleter: test_rlcompleter
      • trace: test_trace
      • xmlrpc.server: test_docxmlrpc test_xmlrpc
    • modulefinder: test_importlib test_modulefinder

[x] test: cpython/Lib/test/test_compile.py (TODO: 31)
[x] test: cpython/Lib/test/test_compiler_assemble.py
[x] test: cpython/Lib/test/test_compiler_codegen.py
[x] test: cpython/Lib/test/test_peepholer.py (TODO: 29)

dependencies:

dependent tests: (no tests depend on compile)

[ ] test: cpython/Lib/test/test_super.py (TODO: 3)

dependencies:

dependent tests: (no tests depend on super)

[x] test: cpython/Lib/test/test_sys.py (TODO: 7)
[ ] test: cpython/Lib/test/test_syslog.py (TODO: 2)
[ ] test: cpython/Lib/test/test_sys_setprofile.py
[ ] test: cpython/Lib/test/test_sys_settrace.py (TODO: 44)
[ ] test: cpython/Lib/test/test_audit.py
[ ] test: cpython/Lib/test/audit-tests.py

dependencies:

dependent tests: (225 tests)

  • sys: regrtestdata test___all__ test__colorize test__locale test__osx_support test_android test_annotationlib test_argparse test_array test_ast test_asyncio test_audit test_bdb test_bigaddrspace test_bigmem test_bisect test_buffer test_builtin test_bytes test_bz2 test_c_locale_coercion test_calendar test_call test_class test_cmath test_cmd test_cmd_line test_cmd_line_script test_code test_code_module test_codeccallbacks test_codecs test_collections test_compile test_compileall test_complex test_concurrent_futures test_context test_contextlib test_coroutines test_csv test_ctypes test_datetime test_dbm test_dbm_sqlite3 test_decimal test_descr test_dict test_difflib test_dis test_doctest test_doctest2 test_docxmlrpc test_dtrace test_dynamic test_dynamicclassattribute test_email test_ensurepip test_enum test_enumerate test_eof test_except_star test_exceptions test_faulthandler test_fcntl test_file test_file_eintr test_fileinput test_fileio test_float test_fork1 test_format test_fractions test_frame test_frozen test_functools test_future_stmt test_gc test_generators test_genericpath test_getopt test_glob test_grammar test_gzip test_hash test_hashlib test_http_cookiejar test_httpservers test_importlib test_inspect test_int test_io test_iter test_itertools test_json test_largefile test_launcher test_list test_listcomps test_locale test_logging test_long test_lzma test_mailbox test_marshal test_math test_memoryio test_memoryview test_metaclass test_mimetypes test_mmap test_monitoring test_msvcrt test_multiprocessing_forkserver test_multiprocessing_main_handling test_multiprocessing_spawn test_netrc test_ntpath test_numeric_tower test_operator test_optparse test_ordered_dict test_os test_osx_env test_pathlib test_patma test_peepholer test_pickle test_pkg test_pkgutil test_platform test_plistlib test_popen test_posix test_posixpath test_print test_property test_pty test_pwd test_py_compile test_pyclbr test_pydoc test_pyexpat test_quopri test_raise test_range test_re test_regrtest test_repl test_reprlib test_resource test_runpy test_sax test_scope test_script_helper test_select test_selectors test_shutil test_signal test_site test_slice test_smtplib test_socket test_sqlite3 test_ssl test_stable_abi_ctypes test_stat test_statistics test_str test_strftime test_string_literals test_strtod test_struct test_subprocess test_support test_sys test_sys_setprofile test_sys_settrace test_sysconfig test_syslog test_tarfile test_tempfile test_termios test_threadedtempfile test_threading test_threading_local test_threadsignals test_time test_timeit test_tomllib test_tools test_trace test_traceback test_tuple test_type_comments test_types test_typing test_unicode_file test_unicode_file_functions test_unicodedata test_unittest test_univnewlines test_urllib test_urllib2 test_urllib2net test_urlparse test_utf8_mode test_uuid test_venv test_wait3 test_wait4 test_wave test_weakref test_webbrowser test_winconsoleio test_winreg test_with test_wsgiref test_xml_etree test_xmlrpc test_zipapp test_zipfile test_zipfile64 test_zipimport test_zlib

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone marked this pull request as ready for review March 20, 2026 12:18
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 6732-6735: The comment above the ToBool emission is misleading
because ast::Expr::Compare may produce non-bool values; update the comment near
the if !matches!(expression, ast::Expr::Compare) check to state that the Compare
fast path is safe due to the jump opcode’s truthiness handling (not because
Compare always yields a bool), and clarify that everything else needs
Instruction::ToBool before branching; reference the emit!(self,
Instruction::ToBool) call and the ast::Expr::Compare pattern in the comment.
- Around line 4626-4649: The current block always emits and stores an empty
tuple to __static_attributes__ because enter_scope() initializes
static_attributes to Some(empty) but nothing is added before this emit; update
compile.rs to skip emitting/storing __static_attributes__ when the collected
attrs vector is empty (i.e., check the
code_stack.last().unwrap().static_attributes contents and only call
emit_load_const/Instruction::StoreName/name("__static_attributes__") if attrs is
non-empty), or alternatively change enter_scope() to initialize
static_attributes as None and only create/populate it when the collector runs;
target the symbols static_attributes, enter_scope, emit_load_const,
Instruction::StoreName and name("__static_attributes__") when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: df81d4bb-8612-4e9c-a28a-4f556bc8453f

📥 Commits

Reviewing files that changed from the base of the PR and between 2ef77f8 and 7ff591f.

⛔ Files ignored due to path filters (4)
  • Lib/test/test_descr.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snap is excluded by !**/*.snap
📒 Files selected for processing (1)
  • crates/codegen/src/compile.rs

Comment on lines +4626 to +4649
// Emit __static_attributes__ tuple
{
let attrs: Vec<String> = self
.code_stack
.last()
.unwrap()
.static_attributes
.as_ref()
.map(|s| s.iter().cloned().collect())
.unwrap_or_default();
self.emit_load_const(ConstantData::Tuple {
elements: attrs
.into_iter()
.map(|s| ConstantData::Str { value: s.into() })
.collect(),
});
let static_attrs_name = self.name("__static_attributes__");
emit!(
self,
Instruction::StoreName {
namei: static_attrs_name
}
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

__static_attributes__ is always serialized as () here.

enter_scope() initializes every class with Some(IndexSet::default()) on Line 1120, but this file never inserts into that set before reaching this block. The new store therefore overwrites every class body’s __static_attributes__ with an empty tuple, including any explicit assignment made in body. Either populate static_attributes before this point or skip the store until the set is non-empty.

Minimal guard to avoid clobbering classes until the collector exists
-        // Emit __static_attributes__ tuple
-        {
-            let attrs: Vec<String> = self
-                .code_stack
-                .last()
-                .unwrap()
-                .static_attributes
-                .as_ref()
-                .map(|s| s.iter().cloned().collect())
-                .unwrap_or_default();
-            self.emit_load_const(ConstantData::Tuple {
-                elements: attrs
-                    .into_iter()
-                    .map(|s| ConstantData::Str { value: s.into() })
-                    .collect(),
-            });
-            let static_attrs_name = self.name("__static_attributes__");
-            emit!(
-                self,
-                Instruction::StoreName {
-                    namei: static_attrs_name
-                }
-            );
-        }
+        if let Some(attrs) = self
+            .code_stack
+            .last()
+            .and_then(|info| info.static_attributes.as_ref())
+            .filter(|attrs| !attrs.is_empty())
+        {
+            self.emit_load_const(ConstantData::Tuple {
+                elements: attrs
+                    .iter()
+                    .cloned()
+                    .map(|s| ConstantData::Str { value: s.into() })
+                    .collect(),
+            });
+            let static_attrs_name = self.name("__static_attributes__");
+            emit!(self, Instruction::StoreName { namei: static_attrs_name });
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 4626 - 4649, The current block
always emits and stores an empty tuple to __static_attributes__ because
enter_scope() initializes static_attributes to Some(empty) but nothing is added
before this emit; update compile.rs to skip emitting/storing
__static_attributes__ when the collected attrs vector is empty (i.e., check the
code_stack.last().unwrap().static_attributes contents and only call
emit_load_const/Instruction::StoreName/name("__static_attributes__") if attrs is
non-empty), or alternatively change enter_scope() to initialize
static_attributes as None and only create/populate it when the collector runs;
target the symbols static_attributes, enter_scope, emit_load_const,
Instruction::StoreName and name("__static_attributes__") when making the change.

Comment on lines +6732 to +6735
// Compare already produces a bool; everything else needs TO_BOOL
if !matches!(expression, ast::Expr::Compare(_)) {
emit!(self, Instruction::ToBool);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

The Compare rationale is inaccurate.

Rich comparisons can return arbitrary objects; the reason this fast path is safe is the jump opcode’s truthiness handling, not that Compare itself always yields a bool. Please reword the comment so future refactors do not depend on the wrong invariant.

Suggested comment update
-                // Compare already produces a bool; everything else needs TO_BOOL
+                // Rich comparisons may return non-bool objects; skipping TO_BOOL
+                // here is only correct because PopJumpIf* performs truthiness
+                // conversion itself.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Compare already produces a bool; everything else needs TO_BOOL
if !matches!(expression, ast::Expr::Compare(_)) {
emit!(self, Instruction::ToBool);
}
// Rich comparisons may return non-bool objects; skipping TO_BOOL
// here is only correct because PopJumpIf* performs truthiness
// conversion itself.
if !matches!(expression, ast::Expr::Compare(_)) {
emit!(self, Instruction::ToBool);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 6732 - 6735, The comment above
the ToBool emission is misleading because ast::Expr::Compare may produce
non-bool values; update the comment near the if !matches!(expression,
ast::Expr::Compare) check to state that the Compare fast path is safe due to the
jump opcode’s truthiness handling (not because Compare always yields a bool),
and clarify that everything else needs Instruction::ToBool before branching;
reference the emit!(self, Instruction::ToBool) call and the ast::Expr::Compare
pattern in the comment.

@youknowone youknowone force-pushed the bytecode-parity-phase1 branch from 7ff591f to 904bdaf Compare March 20, 2026 13:56
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
crates/codegen/src/compile.rs (2)

6732-6735: ⚠️ Potential issue | 🟡 Minor

Reword the Compare fast-path comment.

Rich comparisons can return arbitrary objects. Skipping TO_BOOL here is only safe because PopJumpIf* performs truthiness conversion itself.

Suggested comment update
-                // Compare already produces a bool; everything else needs TO_BOOL
+                // Rich comparisons may return non-bool objects; skipping TO_BOOL
+                // here is only correct because PopJumpIf* performs truthiness
+                // conversion itself.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 6732 - 6735, Update the comment
above the Compare fast-path to note that rich comparisons may return arbitrary
objects and that we only skip emitting Instruction::ToBool for
ast::Expr::Compare because the subsequent conditional jump opcodes
(PopJumpIfTrue / PopJumpIfFalse / other PopJumpIf* handlers) perform the
truthiness conversion themselves; reference the match on expression
(ast::Expr::Compare), the emit!(self, Instruction::ToBool) call, and the
PopJumpIf* jump handlers to make the reason explicit.

4626-4649: ⚠️ Potential issue | 🟠 Major

__static_attributes__ is still always written as ().

This file only initializes static_attributes; it never populates it before this block runs. As written, every class body overwrites __static_attributes__ with an empty tuple, and it does so after compiling body, so it also clobbers an explicit class-body assignment.

Suggested guard until the collector is wired up
-        // Emit __static_attributes__ tuple
-        {
-            let attrs: Vec<String> = self
-                .code_stack
-                .last()
-                .unwrap()
-                .static_attributes
-                .as_ref()
-                .map(|s| s.iter().cloned().collect())
-                .unwrap_or_default();
-            self.emit_load_const(ConstantData::Tuple {
-                elements: attrs
-                    .into_iter()
-                    .map(|s| ConstantData::Str { value: s.into() })
-                    .collect(),
-            });
-            let static_attrs_name = self.name("__static_attributes__");
-            emit!(
-                self,
-                Instruction::StoreName {
-                    namei: static_attrs_name
-                }
-            );
-        }
+        if let Some(attrs) = self
+            .code_stack
+            .last()
+            .and_then(|info| info.static_attributes.as_ref())
+            .filter(|attrs| !attrs.is_empty())
+        {
+            self.emit_load_const(ConstantData::Tuple {
+                elements: attrs
+                    .iter()
+                    .cloned()
+                    .map(|s| ConstantData::Str { value: s.into() })
+                    .collect(),
+            });
+            let static_attrs_name = self.name("__static_attributes__");
+            emit!(self, Instruction::StoreName { namei: static_attrs_name });
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 4626 - 4649, The code
unconditionally emits and stores __static_attributes__ as an empty tuple because
static_attributes is never populated; fix by guarding the emit: only call
emit_load_const + Instruction::StoreName for __static_attributes__ when
self.code_stack.last().unwrap().static_attributes is Some(vec) and the vec is
non-empty (i.e., contains attributes), so it doesn't overwrite an explicit
class-body assignment or write an empty tuple; use the existing
name("__static_attributes__"), emit_load_const, and the Instruction::StoreName
path when the Option has data.
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)

2512-2512: Encode the new 1-based container depth in one place.

These five handlers now rely on the same subtle rule: the opcode arg is 1-based, nth_value() is 0-based, and the lookup happens after popping the transient operand(s). Repeating raw - 1 here makes that contract easy to miss or accidentally revert later; a tiny helper or short comment would make the parity rule much clearer.

Also applies to: 2522-2522, 2864-2864, 3195-3195, 3205-3205

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/frame.rs` at line 2512, Create a small helper to centralize the
1-based→0-based conversion and use it from the handlers instead of repeating `-
1`; for example add a method on the same type (e.g., fn arg_to_index(&self, arg:
u32) -> usize or fn nth_arg_value(&self, arg: u32) -> Value) and replace
occurrences like `let obj = self.nth_value(i.get(arg) - 1);` in the handlers
referenced (the sites at the current diff and the ones you noted: the uses at
~2522, ~2864, ~3195, ~3205) to call that helper so the parity rule is enforced
in one place and clearly named. Ensure the helper documents that opcode args are
1-based and nth_value is 0-based.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 1652-1660: The return-epilogue predicate (is_return_block) is too
broad — it matches any LOAD_CONST + RETURN_VALUE; narrow it by matching the
actual LoadConst payload to ensure it's specifically loading None: change the
second match clause that currently checks
AnyInstruction::Real(Instruction::LoadConst { .. }) to explicitly match the
constant variant for None (e.g. AnyInstruction::Real(Instruction::LoadConst {
value: Constant::None }) or the equivalent None representation used in this IR),
while keeping the subsequent AnyInstruction::Real(Instruction::ReturnValue)
check; use the existing symbols is_return_block, last_insts,
AnyInstruction::Real, Instruction::LoadConst and Instruction::ReturnValue to
locate and update the condition.

---

Duplicate comments:
In `@crates/codegen/src/compile.rs`:
- Around line 6732-6735: Update the comment above the Compare fast-path to note
that rich comparisons may return arbitrary objects and that we only skip
emitting Instruction::ToBool for ast::Expr::Compare because the subsequent
conditional jump opcodes (PopJumpIfTrue / PopJumpIfFalse / other PopJumpIf*
handlers) perform the truthiness conversion themselves; reference the match on
expression (ast::Expr::Compare), the emit!(self, Instruction::ToBool) call, and
the PopJumpIf* jump handlers to make the reason explicit.
- Around line 4626-4649: The code unconditionally emits and stores
__static_attributes__ as an empty tuple because static_attributes is never
populated; fix by guarding the emit: only call emit_load_const +
Instruction::StoreName for __static_attributes__ when
self.code_stack.last().unwrap().static_attributes is Some(vec) and the vec is
non-empty (i.e., contains attributes), so it doesn't overwrite an explicit
class-body assignment or write an empty tuple; use the existing
name("__static_attributes__"), emit_load_const, and the Instruction::StoreName
path when the Option has data.

---

Nitpick comments:
In `@crates/vm/src/frame.rs`:
- Line 2512: Create a small helper to centralize the 1-based→0-based conversion
and use it from the handlers instead of repeating `- 1`; for example add a
method on the same type (e.g., fn arg_to_index(&self, arg: u32) -> usize or fn
nth_arg_value(&self, arg: u32) -> Value) and replace occurrences like `let obj =
self.nth_value(i.get(arg) - 1);` in the handlers referenced (the sites at the
current diff and the ones you noted: the uses at ~2522, ~2864, ~3195, ~3205) to
call that helper so the parity rule is enforced in one place and clearly named.
Ensure the helper documents that opcode args are 1-based and nth_value is
0-based.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 7134ce0a-4869-417e-9966-1ac435da0e6a

📥 Commits

Reviewing files that changed from the base of the PR and between 7ff591f and 904bdaf.

⛔ Files ignored due to path filters (4)
  • Lib/test/test_descr.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snap is excluded by !**/*.snap
📒 Files selected for processing (3)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/vm/src/frame.rs

Comment on lines +1652 to +1660
let is_return_block = last_insts.len() == 2
&& matches!(
last_insts[0].instr,
AnyInstruction::Real(Instruction::LoadConst { .. })
)
&& matches!(
last_insts[1].instr,
AnyInstruction::Real(Instruction::ReturnValue)
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Return-epilogue detection is too broad for the documented behavior

Line 1652 only checks LOAD_CONST + RETURN_VALUE, but the function is documented to duplicate only LOAD_CONST None + RETURN_VALUE. This can duplicate explicit constant returns as well, which risks bytecode parity/tracing mismatches. Please tighten the predicate to verify the loaded constant is specifically None.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1652 - 1660, The return-epilogue
predicate (is_return_block) is too broad — it matches any LOAD_CONST +
RETURN_VALUE; narrow it by matching the actual LoadConst payload to ensure it's
specifically loading None: change the second match clause that currently checks
AnyInstruction::Real(Instruction::LoadConst { .. }) to explicitly match the
constant variant for None (e.g. AnyInstruction::Real(Instruction::LoadConst {
value: Constant::None }) or the equivalent None representation used in this IR),
while keeping the subsequent AnyInstruction::Real(Instruction::ReturnValue)
check; use the existing symbols is_return_block, last_insts,
AnyInstruction::Real, Instruction::LoadConst and Instruction::ReturnValue to
locate and update the condition.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/codegen/src/compile.rs (1)

3502-3507: ⚠️ Potential issue | 🟡 Minor

The inline stack math still describes the old 0-based operand scheme.

These comments still derive i = 2 / i = 0, but the emitted instructions are now i: 3 / i: 1 because the VM interprets these operands as 1-based depths. Please update the explanation so it matches the code.

Also applies to: 3532-3536

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 3502 - 3507, The inline comments
in the LIST_APPEND handling are computing stack index using a 0-based operand
scheme while the VM uses 1-based depths; update the explanatory math around
Instruction::ListAppend { i: 3 } (and the analogous block at the location
covering 3532-3536) to show the 1-based depth calculation so the derivation
matches the emitted i values (e.g., explain that nth_value uses 1-based depth so
list is at depth 3 leading to i: 3, and adjust the other comment similarly).
♻️ Duplicate comments (2)
crates/codegen/src/compile.rs (2)

6745-6748: ⚠️ Potential issue | 🟡 Minor

The Compare rationale is still inaccurate.

Rich comparisons can return non-bool objects. The fast path is safe because PopJumpIf* performs truthiness handling, not because Compare itself guarantees a bool.

Suggested comment update
-                // Compare already produces a bool; everything else needs TO_BOOL
+                // Rich comparisons may return non-bool objects; skipping TO_BOOL
+                // here is only safe because PopJumpIf* performs truthiness
+                // conversion itself.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 6745 - 6748, Update the inline
comment above the ToBool emission to correctly state that Compare may return
non-bool objects and that the reason we skip emitting Instruction::ToBool for
ast::Expr::Compare(_) is not that Compare guarantees a bool but because the
fast-path jump instructions (e.g., PopJumpIfTrue/PopJumpIfFalse) perform
truthiness handling themselves; replace the current "Compare already produces a
bool" wording with a note that Compare can return non-bool values and we rely on
PopJumpIf*'s truthiness semantics rather than forcing a ToBool.

4626-4649: ⚠️ Potential issue | 🟠 Major

__static_attributes__ is still always overwritten with ().

Line 4628 builds attrs from static_attributes, but this file never inserts into that set after class scopes initialize it to Some(IndexSet::default()) on Line 1120. Every class body will therefore store an empty tuple here and clobber any explicit __static_attributes__ assignment from body.

Possible guard
-        // Emit __static_attributes__ tuple
-        {
-            let attrs: Vec<String> = self
-                .code_stack
-                .last()
-                .unwrap()
-                .static_attributes
-                .as_ref()
-                .map(|s| s.iter().cloned().collect())
-                .unwrap_or_default();
-            self.emit_load_const(ConstantData::Tuple {
-                elements: attrs
-                    .into_iter()
-                    .map(|s| ConstantData::Str { value: s.into() })
-                    .collect(),
-            });
-            let static_attrs_name = self.name("__static_attributes__");
-            emit!(
-                self,
-                Instruction::StoreName {
-                    namei: static_attrs_name
-                }
-            );
-        }
+        if let Some(attrs) = self
+            .code_stack
+            .last()
+            .and_then(|info| info.static_attributes.as_ref())
+            .filter(|attrs| !attrs.is_empty())
+        {
+            self.emit_load_const(ConstantData::Tuple {
+                elements: attrs
+                    .iter()
+                    .cloned()
+                    .map(|s| ConstantData::Str { value: s.into() })
+                    .collect(),
+            });
+            let static_attrs_name = self.name("__static_attributes__");
+            emit!(self, Instruction::StoreName { namei: static_attrs_name });
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 4626 - 4649, The compiler
unconditionally emits and stores __static_attributes__ (via emit_load_const +
Instruction::StoreName with name from name("__static_attributes__")), but
code_stack.last().static_attributes is initialized to Some(empty) and never
populated, so every class body is clobbered with an empty tuple; change the code
in compile.rs so you only emit and StoreName for __static_attributes__ when the
current code_stack.last().static_attributes is present and non-empty (i.e.,
check static_attributes.as_ref().map(|s| !s.is_empty()).unwrap_or(false)) so
explicit assignments in the class body are not overwritten.
🧹 Nitpick comments (2)
scripts/dis_dump.py (1)

20-21: Consider adding a comment clarifying PRECALL is CPython-only.

PRECALL was removed in CPython 3.12 and is not present in RustPython's opcode table. A brief comment would help future readers understand why it's included.

📝 Suggested comment
 # Non-semantic filler instructions to skip
+# Note: PRECALL is CPython 3.10-3.11 only (removed in 3.12), included for compatibility
 SKIP_OPS = frozenset({"CACHE", "PRECALL"})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/dis_dump.py` around lines 20 - 21, Add a brief clarifying comment
above the SKIP_OPS definition explaining that the PRECALL opcode is included for
compatibility with older CPython versions, was removed in CPython 3.12, and does
not exist in RustPython’s opcode table; reference the SKIP_OPS constant and the
PRECALL entry so readers know why PRECALL is present even though some Python
implementations (like RustPython) won't have it.
crates/codegen/src/compile.rs (1)

1141-1157: Please add snapshot coverage for the new parity-sensitive shapes.

This file already has disassembly snapshot helpers, but there is no direct coverage here for the prologue reorder before RESUME or the imported-name LOAD_ATTR / PUSH_NULL call path. A tiny snapshot for each would make parity regressions much easier to catch.

Also applies to: 7517-7535

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 1141 - 1157, Add snapshot tests
to cover the parity-sensitive prologue reorder before RESUME and the
imported-name call path that emits LOAD_ATTR / PUSH_NULL: locate the disassembly
snapshot helpers in crates/codegen/src/compile.rs and add two small snapshots —
one that exercises the MAKE_CELL / CopyFreeVars emission sequence immediately
before RESUME (to catch prologue reorder regressions) and another that builds an
imported-name call site which emits LOAD_ATTR followed by PUSH_NULL (to catch
that call path); wire these into the existing snapshot test harness so they run
with other disassembly snapshots and assert stable output.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 4651-4662: The code stores the dereferenced dict for
"__classdict__" into the name "__classdictcell__", but it should store the cell
object to match the "__classcell__" pattern; in the block gated by
self.current_symbol_table().needs_classdict (using
get_cell_var_index("__classdict__") and name("__classdictcell__") and the
subsequent emit StoreName), replace the Instruction::LoadDeref { i:
classdict_idx } emission with the cell-loading pseudo-instruction
(PseudoInstruction::LoadClosure { i: classdict_idx.to_u32() } or the equivalent
cell loader) so the actual cell object is stored under "__classdictcell__"
rather than the dereferenced dict.

In `@crates/codegen/src/symboltable.rs`:
- Around line 295-307: The code currently sets symbol_table.needs_classdict
whenever any immediate child scope is CompilerScope::Function or
::AsyncFunction; change this so needs_classdict is set only when there is an
actual classdict demand—i.e., check each child scope for evidence of needing
__classdict__ (for example inspect the child's newfree/cell capture flags or an
annotation/type-parameter marker used for PEP 649) instead of just matching on
CompilerScope::Function/AsyncFunction; update the loop over
symbol_table.sub_tables to test the child's fields (e.g., t.newfree or a
dedicated annotation/type-param flag) and only set symbol_table.needs_classdict
= true when those indicators are present.

In `@scripts/dis_dump.py`:
- Around line 82-83: The mapping _RP_CMP_OPS currently includes an invalid key 0
duplicating "<"; remove the 0 key from _RP_CMP_OPS and ensure callers use a safe
lookup (for example, replace direct indexing with a .get on _RP_CMP_OPS when
resolving comparison strings, e.g., in the code that computes cmp_str from
inst.arg) so unknown enum values produce a clear default like "?cmp{n}" instead
of silently mapping to "<". This change references _RP_CMP_OPS and the bytecode
ComparisonOperator values (inst.arg) used when resolving cmp_str.

---

Outside diff comments:
In `@crates/codegen/src/compile.rs`:
- Around line 3502-3507: The inline comments in the LIST_APPEND handling are
computing stack index using a 0-based operand scheme while the VM uses 1-based
depths; update the explanatory math around Instruction::ListAppend { i: 3 } (and
the analogous block at the location covering 3532-3536) to show the 1-based
depth calculation so the derivation matches the emitted i values (e.g., explain
that nth_value uses 1-based depth so list is at depth 3 leading to i: 3, and
adjust the other comment similarly).

---

Duplicate comments:
In `@crates/codegen/src/compile.rs`:
- Around line 6745-6748: Update the inline comment above the ToBool emission to
correctly state that Compare may return non-bool objects and that the reason we
skip emitting Instruction::ToBool for ast::Expr::Compare(_) is not that Compare
guarantees a bool but because the fast-path jump instructions (e.g.,
PopJumpIfTrue/PopJumpIfFalse) perform truthiness handling themselves; replace
the current "Compare already produces a bool" wording with a note that Compare
can return non-bool values and we rely on PopJumpIf*'s truthiness semantics
rather than forcing a ToBool.
- Around line 4626-4649: The compiler unconditionally emits and stores
__static_attributes__ (via emit_load_const + Instruction::StoreName with name
from name("__static_attributes__")), but code_stack.last().static_attributes is
initialized to Some(empty) and never populated, so every class body is clobbered
with an empty tuple; change the code in compile.rs so you only emit and
StoreName for __static_attributes__ when the current
code_stack.last().static_attributes is present and non-empty (i.e., check
static_attributes.as_ref().map(|s| !s.is_empty()).unwrap_or(false)) so explicit
assignments in the class body are not overwritten.

---

Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 1141-1157: Add snapshot tests to cover the parity-sensitive
prologue reorder before RESUME and the imported-name call path that emits
LOAD_ATTR / PUSH_NULL: locate the disassembly snapshot helpers in
crates/codegen/src/compile.rs and add two small snapshots — one that exercises
the MAKE_CELL / CopyFreeVars emission sequence immediately before RESUME (to
catch prologue reorder regressions) and another that builds an imported-name
call site which emits LOAD_ATTR followed by PUSH_NULL (to catch that call path);
wire these into the existing snapshot test harness so they run with other
disassembly snapshots and assert stable output.

In `@scripts/dis_dump.py`:
- Around line 20-21: Add a brief clarifying comment above the SKIP_OPS
definition explaining that the PRECALL opcode is included for compatibility with
older CPython versions, was removed in CPython 3.12, and does not exist in
RustPython’s opcode table; reference the SKIP_OPS constant and the PRECALL entry
so readers know why PRECALL is present even though some Python implementations
(like RustPython) won't have it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: e7a2283a-8db4-4f7f-b366-2e756e9fca89

📥 Commits

Reviewing files that changed from the base of the PR and between 904bdaf and c9a5f48.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_dis.py is excluded by !Lib/**
📒 Files selected for processing (5)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/codegen/src/symboltable.rs
  • scripts/compare_bytecode.py
  • scripts/dis_dump.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/codegen/src/ir.rs

Comment on lines +82 to +83
# RustPython's ComparisonOperator enum values → operator strings
_RP_CMP_OPS = {0: "<", 1: "<", 2: ">", 3: "!=", 4: "==", 5: "<=", 6: ">="}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Key 0 in _RP_CMP_OPS appears incorrect.

According to the ComparisonOperator enum in crates/compiler-core/src/bytecode/oparg.rs, valid values start at 1 (Less = 0b001). There is no variant with value 0. The current mapping duplicates "<" for both keys 0 and 1, which could mask bugs where an invalid comparison operator is encountered.

🛠️ Suggested fix: remove key 0 or handle it explicitly
 # RustPython's ComparisonOperator enum values → operator strings
-_RP_CMP_OPS = {0: "<", 1: "<", 2: ">", 3: "!=", 4: "==", 5: "<=", 6: ">="}
+_RP_CMP_OPS = {1: "<", 2: ">", 3: "!=", 4: "==", 5: "<=", 6: ">="}

If a fallback is needed, use .get() with a default in the calling code instead:

cmp_str = _RP_CMP_OPS.get(inst.arg, "?cmp%d" % inst.arg)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# RustPython's ComparisonOperator enum values → operator strings
_RP_CMP_OPS = {0: "<", 1: "<", 2: ">", 3: "!=", 4: "==", 5: "<=", 6: ">="}
# RustPython's ComparisonOperator enum values → operator strings
_RP_CMP_OPS = {1: "<", 2: ">", 3: "!=", 4: "==", 5: "<=", 6: ">="}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/dis_dump.py` around lines 82 - 83, The mapping _RP_CMP_OPS currently
includes an invalid key 0 duplicating "<"; remove the 0 key from _RP_CMP_OPS and
ensure callers use a safe lookup (for example, replace direct indexing with a
.get on _RP_CMP_OPS when resolving comparison strings, e.g., in the code that
computes cmp_str from inst.arg) so unknown enum values produce a clear default
like "?cmp{n}" instead of silently mapping to "<". This change references
_RP_CMP_OPS and the bytecode ComparisonOperator values (inst.arg) used when
resolving cmp_str.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
crates/codegen/src/ir.rs (1)

1710-1718: ⚠️ Potential issue | 🟠 Major

Return-epilogue detection still matches any constant, not just None.

The function documentation at line 1694 states it duplicates LOAD_CONST None + RETURN_VALUE, but the predicate at lines 1711-1714 matches any Instruction::LoadConst { .. } without verifying the constant is None. This could incorrectly duplicate explicit non-None returns.

To match the documented behavior and CPython's semantics, verify the loaded constant is specifically None:
,

🐛 Proposed fix
     let is_return_block = last_insts.len() == 2
-        && matches!(
-            last_insts[0].instr,
-            AnyInstruction::Real(Instruction::LoadConst { .. })
-        )
+        && matches!(last_insts[0].instr, AnyInstruction::Real(Instruction::LoadConst { .. }))
+        && {
+            let const_idx = u32::from(last_insts[0].arg) as usize;
+            // Only match if loading None (implicit return epilogue)
+            // This check would require access to self.metadata.consts
+            // Consider passing consts as a parameter or restructuring
+            true // TODO: Verify constant is None
+        }
         && matches!(
             last_insts[1].instr,
             AnyInstruction::Real(Instruction::ReturnValue)
         );

Note: The function currently takes &mut [Block] and doesn't have access to the constants table. You may need to refactor to pass &CodeUnitMetadata or the consts set to properly verify the constant is ConstantData::None.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 1710 - 1718, The return-epilogue
detection currently treats any Instruction::LoadConst as a None-return; update
the predicate so it only matches a load of the actual None constant by checking
the LoadConst's index/value against the constants table (ConstantData::None). To
do this, refactor the enclosing function (where is_return_block is computed) to
accept the code unit's metadata or consts slice (e.g., add a &CodeUnitMetadata
or &consts parameter), then replace the loose matches on Instruction::LoadConst
with a check that the referenced constant equals ConstantData::None before
recognizing the pattern; keep references to Instruction::LoadConst,
Instruction::ReturnValue, ConstantData::None and the function that computes
is_return_block to locate and change the logic.
crates/codegen/src/compile.rs (2)

4626-4649: ⚠️ Potential issue | 🟠 Major

Guard this store until static_attributes is actually populated.

Line 1120 still initializes every class scope with Some(IndexSet::default()), and this file still never inserts into that set before reaching this block. As written, every class body gets __static_attributes__ = () here, which also overwrites any explicit assignment from the class body.

💡 Minimal guard
-        // Emit __static_attributes__ tuple
-        {
-            let attrs: Vec<String> = self
-                .code_stack
-                .last()
-                .unwrap()
-                .static_attributes
-                .as_ref()
-                .map(|s| s.iter().cloned().collect())
-                .unwrap_or_default();
-            self.emit_load_const(ConstantData::Tuple {
-                elements: attrs
-                    .into_iter()
-                    .map(|s| ConstantData::Str { value: s.into() })
-                    .collect(),
-            });
-            let static_attrs_name = self.name("__static_attributes__");
-            emit!(
-                self,
-                Instruction::StoreName {
-                    namei: static_attrs_name
-                }
-            );
-        }
+        if let Some(attrs) = self
+            .code_stack
+            .last()
+            .and_then(|info| info.static_attributes.as_ref())
+            .filter(|attrs| !attrs.is_empty())
+        {
+            self.emit_load_const(ConstantData::Tuple {
+                elements: attrs
+                    .iter()
+                    .cloned()
+                    .map(|s| ConstantData::Str { value: s.into() })
+                    .collect(),
+            });
+            let static_attrs_name = self.name("__static_attributes__");
+            emit!(self, Instruction::StoreName { namei: static_attrs_name });
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 4626 - 4649, The code currently
always emits a StoreName for __static_attributes__ even when the collected attrs
is empty; change the block so you first build the attrs Vec from
self.code_stack.last().unwrap().static_attributes, then only call
emit_load_const(ConstantData::Tuple{...}) and emit the Instruction::StoreName
for self.name("__static_attributes__") if attrs is non-empty (i.e., skip
emitting the tuple/store when attrs.is_empty()) so you don't overwrite explicit
assignments from the class body.

6732-6735: ⚠️ Potential issue | 🟡 Minor

The Compare fast-path comment is still misleading.

The rationale on Line 6732 still assumes ast::Expr::Compare produces a bool. Rich comparisons can return non-bool objects; skipping ToBool here is only safe because the following PopJumpIf* handles truthiness.

📝 Suggested rewording
-                // Compare already produces a bool; everything else needs TO_BOOL
+                // Rich comparisons may return non-bool objects; skipping TO_BOOL
+                // here is only correct because PopJumpIf* performs truthiness
+                // conversion itself.
Do Python rich comparison methods like __lt__ and __eq__ have to return bool, or can they return arbitrary objects whose truthiness is only evaluated later in a conditional context?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 6732 - 6735, Update the
misleading comment above the Compare fast-path (around the code that checks
ast::Expr::Compare and emits Instruction::ToBool) to state that Python rich
comparison methods can return non-bool objects and therefore do not necessarily
produce a boolean; clarify that the reason the ToBool is skipped for
ast::Expr::Compare is safe only because the subsequent PopJumpIf* bytecode
instructions evaluate truthiness later, not because Compare always yields a
bool. Reference the symbols expression, ast::Expr::Compare, Instruction::ToBool,
and PopJumpIf* in the comment so future readers understand the exact rationale.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 575-590: The fall-through propagation loop marks next blocks
without confirming the current block is reachable; modify the loop that iterates
using current: BlockIdx(0) so that before marking next (using next =
self.blocks[current.idx()].next) you first check reachable[current.idx()] is
true and only then, if next != BlockIdx::NULL and the last instruction is not a
scope-exit or unconditional-jump (the existing
instructions.last().is_some_and(|ins| ins.instr.is_scope_exit() ||
ins.instr.is_unconditional_jump())), set reachable[next.idx()] = true; this
ensures unreachable blocks (like a terminal block 0) do not cause their
fall-through successors to be marked reachable.

In `@crates/vm/src/builtins/type.rs`:
- Around line 2136-2158: The branch handling the unreachable identifier
__classdictcell__ should be removed and replaced with unified cell-handling
logic that matches the compiler-emitted name __classdict__; extract the
duplicated pattern (reading attrs via typ.attributes.write(), fetching
attrs.get(identifier!(vm, ...)), converting via PyCellRef::try_from_object,
calling cell.set(Some(...)), and attrs.shift_remove(...)) into a small helper
(e.g., handle_class_cell(vm, &mut attrs, identifier!(vm, NAME), value)) and
invoke it for both identifier!(vm, __classcell__) with typ.clone().into() and
identifier!(vm, __classdict__) with dict.clone().into(), ensuring error mapping
stays the same.

---

Duplicate comments:
In `@crates/codegen/src/compile.rs`:
- Around line 4626-4649: The code currently always emits a StoreName for
__static_attributes__ even when the collected attrs is empty; change the block
so you first build the attrs Vec from
self.code_stack.last().unwrap().static_attributes, then only call
emit_load_const(ConstantData::Tuple{...}) and emit the Instruction::StoreName
for self.name("__static_attributes__") if attrs is non-empty (i.e., skip
emitting the tuple/store when attrs.is_empty()) so you don't overwrite explicit
assignments from the class body.
- Around line 6732-6735: Update the misleading comment above the Compare
fast-path (around the code that checks ast::Expr::Compare and emits
Instruction::ToBool) to state that Python rich comparison methods can return
non-bool objects and therefore do not necessarily produce a boolean; clarify
that the reason the ToBool is skipped for ast::Expr::Compare is safe only
because the subsequent PopJumpIf* bytecode instructions evaluate truthiness
later, not because Compare always yields a bool. Reference the symbols
expression, ast::Expr::Compare, Instruction::ToBool, and PopJumpIf* in the
comment so future readers understand the exact rationale.

In `@crates/codegen/src/ir.rs`:
- Around line 1710-1718: The return-epilogue detection currently treats any
Instruction::LoadConst as a None-return; update the predicate so it only matches
a load of the actual None constant by checking the LoadConst's index/value
against the constants table (ConstantData::None). To do this, refactor the
enclosing function (where is_return_block is computed) to accept the code unit's
metadata or consts slice (e.g., add a &CodeUnitMetadata or &consts parameter),
then replace the loose matches on Instruction::LoadConst with a check that the
referenced constant equals ConstantData::None before recognizing the pattern;
keep references to Instruction::LoadConst, Instruction::ReturnValue,
ConstantData::None and the function that computes is_return_block to locate and
change the logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: fdaf8ac7-6090-45f8-a49c-d4dfd32821db

📥 Commits

Reviewing files that changed from the base of the PR and between c9a5f48 and 93e955c.

⛔ Files ignored due to path filters (1)
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (6)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/vm/src/builtins/type.rs
  • crates/vm/src/vm/context.rs
  • scripts/compare_bytecode.py
  • scripts/dis_dump.py
✅ Files skipped from review due to trivial changes (1)
  • scripts/dis_dump.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/compare_bytecode.py

Comment on lines +2136 to +2158
{
let mut attrs = typ.attributes.write();
if let Some(cell) = attrs.get(identifier!(vm, __classcell__)) {
let cell = PyCellRef::try_from_object(vm, cell.clone()).map_err(|_| {
vm.new_type_error(format!(
"__classcell__ must be a nonlocal cell, not {}",
cell.class().name()
))
})?;
cell.set(Some(typ.clone().into()));
attrs.shift_remove(identifier!(vm, __classcell__));
}
if let Some(cell) = attrs.get(identifier!(vm, __classdictcell__)) {
let cell = PyCellRef::try_from_object(vm, cell.clone()).map_err(|_| {
vm.new_type_error(format!(
"__classdictcell__ must be a nonlocal cell, not {}",
cell.class().name()
))
})?;
cell.set(Some(dict.clone().into()));
attrs.shift_remove(identifier!(vm, __classdictcell__));
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify where class cell names are emitted/handled.
rg -n --type=rust -C4 '__classcell__|__classdictcell__|needs_classdict|StoreDeref' \
  crates/codegen/src/compile.rs \
  crates/vm/src/builtins/type.rs \
  crates/vm/src/vm/context.rs

Repository: RustPython/RustPython

Length of output: 6183


🏁 Script executed:

# Search for where __classdictcell__ is set or populated
rg -n '__classdictcell__' --type=rust

Repository: RustPython/RustPython

Length of output: 456


🏁 Script executed:

# Look for the broader context of how __classdict__ is handled in codegen
rg -n -B5 -A5 '__classdict__' crates/codegen/src/compile.rs | head -80

Repository: RustPython/RustPython

Length of output: 4644


🏁 Script executed:

# Check if there's any code that converts __classdict__ to __classdictcell__
rg -n 'classdict' --type=rust crates/ | grep -i 'dict\|cell'

Repository: RustPython/RustPython

Length of output: 4952


🏁 Script executed:

# Check the git history or any comments explaining the intent
rg -n -B3 -A3 'classdictcell' --type=rust crates/vm/src/builtins/type.rs

Repository: RustPython/RustPython

Length of output: 801


🏁 Script executed:

# Look for where dict value might be transformed or wrapped as a cell
rg -n 'dict.*cell|cell.*dict' --type=rust crates/vm/src/builtins/type.rs | head -20

Repository: RustPython/RustPython

Length of output: 366


Remove the unreachable __classdictcell__ branch and unify the cell handling logic.

The __classdictcell__ handler at line 2148 is unreachable dead code. The compiler emits __classdict__ as a cell variable (in crates/codegen/src/compile.rs lines 1030, 4592), never __classdictcell__. This naming mismatch breaks bytecode parity—the __classcell__ branch (lines 2138–2147) correctly handles what the compiler emits, but the __classdictcell__ branch can never execute.

Additionally, both cell branches duplicate the same pattern: try-convert to PyCellRef, set the value, and remove from attributes. Extract the cell handling into a shared helper to avoid duplication, then fix the name to match what the compiler actually emits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/builtins/type.rs` around lines 2136 - 2158, The branch handling
the unreachable identifier __classdictcell__ should be removed and replaced with
unified cell-handling logic that matches the compiler-emitted name
__classdict__; extract the duplicated pattern (reading attrs via
typ.attributes.write(), fetching attrs.get(identifier!(vm, ...)), converting via
PyCellRef::try_from_object, calling cell.set(Some(...)), and
attrs.shift_remove(...)) into a small helper (e.g., handle_class_cell(vm, &mut
attrs, identifier!(vm, NAME), value)) and invoke it for both identifier!(vm,
__classcell__) with typ.clone().into() and identifier!(vm, __classdict__) with
dict.clone().into(), ensuring error mapping stays the same.

@youknowone youknowone force-pushed the bytecode-parity-phase1 branch 3 times, most recently from acda3d0 to 0f2cab9 Compare March 22, 2026 07:56
- Emit TO_BOOL before POP_JUMP_IF_TRUE/FALSE in the general case
  of compile_jump_if (Compare expressions excluded since they
  already produce a bool)
- Module-level __doc__: use STORE_NAME instead of STORE_GLOBAL
- Class body __module__: use LOAD_NAME instead of LOAD_GLOBAL
- Class body: store __firstlineno__ before __doc__
Emit MAKE_CELL for each cell variable and COPY_FREE_VARS N for
free variables at the start of each code object, before RESUME.
These instructions are no-ops in the VM but align the bytecode
with CPython 3.14's output.
Store a tuple of attribute names (currently always empty) as
__static_attributes__ in the class namespace, matching CPython
3.14's class body epilogue. Attribute name collection from
self.xxx accesses is a follow-up task.
test_iter_keys, test_iter_values, test_iter_items now pass
because class bodies emit __static_attributes__ and
__firstlineno__, matching the expected dict key set.
Switch LIST_APPEND, LIST_EXTEND, SET_ADD, SET_UPDATE, MAP_ADD
from 0-based to 1-based stack depth argument, matching CPython's
PEEK(oparg) convention. Adjust the VM to subtract 1 before
calling nth_value.
When the call target is an attribute of an imported name (e.g.,
logging.getLogger()), use plain LOAD_ATTR (method_flag=0) with
a separate PUSH_NULL instead of method-mode LOAD_ATTR. This
matches CPython 3.14's behavior which avoids the method call
optimization for module attribute access.
When the last block in a code object is exactly LOAD_CONST None +
RETURN_VALUE (the implicit return), duplicate these instructions
into blocks that would otherwise fall through to it. This matches
CPython 3.14's behavior of giving each code path its own explicit
return instruction.
Set needs_classdict=true for class scopes that contain function
definitions (def/async def), matching CPython 3.14's behavior for
PEP 649 deferred annotation support. Also restore the Compare
expression check in compile_jump_if to skip TO_BOOL for comparison
operations.
Store the __classdict__ cell reference as __classdictcell__ in
the class namespace when the class has __classdict__ as a cell
variable. Uses LOAD_DEREF (RustPython separates cell vars from
fast locals unlike CPython's unified array).
Run basic dead code elimination (truncating instructions after
RETURN_VALUE/RAISE/JUMP within blocks) at all optimization
levels, not just optimize > 0. CPython always removes this dead
code during assembly.
Only use plain LOAD_ATTR + PUSH_NULL for imports at module or
class scope. Function-local imports use method call mode LOAD_ATTR,
matching CPython 3.14's behavior.
Split DCE into two phases: (1) within-block truncation after
terminal instructions (always runs), (2) whole-block elimination
for blocks only reachable via fall-through from terminal blocks
(runs after normalize_jumps when dead jump instructions exist).
Convert BUILD_TUPLE with size 0 to LOAD_CONST () during constant
folding, matching CPython's optimization for empty tuple literals.
- Remove __classcell__ from class dict after setting the cell value
- Add __classdictcell__ handling: set cell to class namespace dict,
  then remove from class dict
- Register __classdictcell__ identifier
- Use LoadClosure instead of LoadDeref for __classdictcell__ emission
- Reorder MakeFunctionFlag bits to match CPython
- Run ruff format on scripts
The __classdict__ cell addition (for classes with function defs)
and __classdictcell__ store caused cell initialization failures
in importlib. These require deeper VM changes to properly support
the cell variable lifecycle. Reverted for stability.
Use fixpoint iteration to properly determine block reachability:
only mark jump targets of already-reachable blocks, preventing
orphaned blocks from falsely marking their targets as reachable.
Also add a final DCE pass after assembly NOP removal to catch
dead code created by normalize_jumps.
When deciding whether to use plain LOAD_ATTR for attribute calls,
check if the name is imported in any enclosing scope (not just
the current scope). This handles the common pattern where a module
is imported at module level but used inside functions.
Set needs_classdict=true when a class scope contains function
definitions (def/async def), matching CPython 3.14 which always
creates a __classdict__ cell for PEP 649 support in such classes.
Store the __classdict__ cell reference as __classdictcell__ in
the class namespace using LoadClosure (which loads the cell
object itself, not the value inside). This matches CPython 3.14's
class body epilogue.
Revert the class cell cleanup changes from e6975f9 that cause
import failures when frozen module bytecode is stale. The original
behavior (not removing __classcell__ from class dict) is restored.
Restore the __classdict__ cell for classes with function
definitions and __classdictcell__ store in class body epilogue.
Previous failure was caused by stale .pyc cache files containing
bytecode from an intermediate MakeFunctionFlag reorder attempt,
not by these changes themselves.
Reorder discriminants: Defaults=0, KwOnlyDefaults=1, Annotations=2,
Closure=3, Annotate=4, TypeParams=5. This aligns the oparg values
with CPython 3.14's convention.

Note: after this change, stale .pyc cache files must be deleted
(find . -name '*.pyc' -delete) to avoid bytecode mismatch errors.
Override From/TryFrom for MakeFunctionFlag to use power-of-two
values (1,2,4,8,16,32) matching CPython's SET_FUNCTION_ATTRIBUTE
oparg encoding, instead of sequential discriminants (0,1,2,3,4,5).
- scripts/dis_dump.py: Dumps normalized bytecode for Python files,
  usable with both CPython and RustPython. Normalizes code object
  reprs, COMPARE_OP argrepr, and LOAD_FAST_BORROW opnames.
- scripts/compare_bytecode.py: Orchestrates CPython vs RustPython
  bytecode comparison across Lib/, reports match rate and diffs.

Current baseline: 176/1515 files match (11.6%).
@youknowone youknowone force-pushed the bytecode-parity-phase1 branch from 0c7fe3c to 9820993 Compare March 22, 2026 12:58
When all elements of a list literal are constants, convert
LOAD_CONST* + BUILD_LIST N to BUILD_LIST 0 + LOAD_CONST (tuple)
+ LIST_EXTEND 1, matching CPython's constant list optimization.
Generate superinstructions (LOAD_FAST_LOAD_FAST, STORE_FAST_STORE_FAST,
STORE_FAST_LOAD_FAST) at all optimization levels matching CPython.
Match CPython's threshold: constant list folding to BUILD_LIST 0
+ LOAD_CONST tuple + LIST_EXTEND 1 only applies when there are
3 or more constant elements. Smaller lists use element-by-element
LOAD_CONST + BUILD_LIST N.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant