Add global type mutex for thread-safe type operations #7416
youknowone wants to merge 2 commits into RustPython:main
Conversation
📝 Walkthrough
This PR introduces a global type mutex to serialize type mutation and version-tag assignment.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 3 passed
Force-pushed from d0cfe84 to 1daba38
📦 Library Dependencies
The following Lib/ modules were modified. Here are their dependencies:
- [ ] test: cpython/Lib/test/test_class.py (TODO: 16) — dependencies: (none); dependent tests: (no tests depend on class)
Force-pushed from 1865733 to da1d518
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
extra_tests/custom_text_test_runner.py (1)
391-396: ⚠️ Potential issue | 🟡 Minor — Inconsistent access pattern may cause AttributeError for plain functions.

Lines 392-394 still use the old `__func__.__dict__` access pattern while line 395 uses the new helper. For plain functions (without `__func__`), this will raise `AttributeError` before the helper is even called.

Proposed fix

```diff
 if self.test_types:
-    if "test_type" in getattr(
-        test, test._testMethodName
-    ).__func__.__dict__ and set([s.lower() for s in self.test_types]) == set(
+    if "test_type" in _get_method_dict(test) and set([s.lower() for s in self.test_types]) == set(
         [s.lower() for s in _get_method_dict(test)["test_type"]]
     ):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extra_tests/custom_text_test_runner.py` around lines 391 - 396, The conditional uses getattr(test, test._testMethodName).__func__.__dict__ which will raise AttributeError for plain functions; replace that access with the safe helper used on the next line by calling _get_method_dict(test) consistently: check for "test_type" in _get_method_dict(test) and compare set([s.lower() for s in self.test_types]) against set([s.lower() for s in _get_method_dict(test)["test_type"]]) so you never access __func__ directly (symbols: self.test_types, test._testMethodName, _get_method_dict, "test_type").
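The distinction behind this finding is easy to reproduce: methods reached through the class are bound and expose `__func__`, while callables attached directly to an instance stay plain functions. The `method_dict` helper below is a hypothetical stand-in for the runner's `_get_method_dict`, sketching the fallback it would need:

```python
class Case:
    def test_bound(self):
        pass

case = Case()
case.test_instance = lambda: None  # attached to the instance: never bound

# Bound methods wrap a function object; plain functions do not.
assert hasattr(case.test_bound, "__func__")
assert not hasattr(case.test_instance, "__func__")

def method_dict(test, name):
    # Hypothetical sketch of a _get_method_dict-style helper: fall back to
    # the callable itself when there is no __func__ to unwrap.
    fn = getattr(test, name)
    return getattr(fn, "__func__", fn).__dict__

assert method_dict(case, "test_bound") == {}    # no AttributeError either way
assert method_dict(case, "test_instance") == {}
```

Accessing `.__func__` unconditionally, as the old code does, raises on the second case before any `test_type` lookup happens.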
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)
7436: Keep the remaining attr specializers on one type snapshot.

These sites now start from `version_for_specialization()`, but the surrounding specializer still does later MRO/dict/descriptor inspection after that call. That leaves `LOAD_ATTR`/`STORE_ATTR` on a looser snapshot than the new `__getitem__`/`__init__` paths. I'd either switch these to an atomic lookup helper as well or revalidate the version right before publishing the specialized opcode/cache.

Also applies to: 7474-7474, 7694-7694, 7729-7729, 9121-9121
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` at line 7436, The attr specializers call cls.version_for_specialization() early but then perform additional MRO/dict/descriptor inspection, leaving LOAD_ATTR/STORE_ATTR snapshots looser than the new __getitem__/__init__ paths; update the attr-specializer flows (the sites calling version_for_specialization(), e.g. the code around version_for_specialization at the lines flagged and the logic that publishes the specialized opcode/cache for LOAD_ATTR/STORE_ATTR) to either perform the whole lookup atomically via a helper (similar to the __getitem__/__init__ path) or revalidate/refresh the class version immediately before publishing the specialized opcode/cache so the snapshot used for specialization is exact; locate references to version_for_specialization, the LOAD_ATTR/STORE_ATTR specialization code path, and the publish/emit-cache logic and ensure the version is checked/locked at the final publish point.
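The "revalidate before publishing" option can be sketched with an atomic version counter. `TypeVersion` here is a made-up stand-in for the real `tp_version_tag`, not RustPython's API; it only models the check-snapshot-then-recheck discipline:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for a type's version tag (0 = invalidated).
pub struct TypeVersion(AtomicU32);

impl TypeVersion {
    pub fn new(v: u32) -> Self {
        Self(AtomicU32::new(v))
    }
    /// Snapshot taken before any MRO/dict/descriptor inspection.
    pub fn snapshot(&self) -> u32 {
        self.0.load(Ordering::Acquire)
    }
    /// Revalidate right before publishing the specialized opcode/cache.
    pub fn still_valid(&self, snap: u32) -> bool {
        snap != 0 && self.0.load(Ordering::Acquire) == snap
    }
    pub fn bump(&self, v: u32) {
        self.0.store(v, Ordering::Release)
    }
}

fn main() {
    let v = TypeVersion::new(7);
    let snap = v.snapshot();
    // ... descriptor/dict inspection would happen here ...
    assert!(v.still_valid(snap)); // unchanged: safe to cache against `snap`
    v.bump(8); // a concurrent type mutation
    assert!(!v.still_valid(snap)); // changed: back off instead of caching
}
```

If the recheck fails, the specializer should fall back to the adaptive counter backoff rather than publish a cache entry keyed to a stale snapshot.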
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: a8a615eb-9d3f-4cd2-a1b8-841c4089a979
⛔ Files ignored due to path filters (1)
`Lib/test/test_class.py` is excluded by `!Lib/**`
📒 Files selected for processing (8)
- .cspell.json
- .github/workflows/cron-ci.yaml
- crates/vm/src/builtins/type.rs
- crates/vm/src/frame.rs
- crates/vm/src/stdlib/posix.rs
- crates/vm/src/vm/interpreter.rs
- crates/vm/src/vm/mod.rs
- extra_tests/custom_text_test_runner.py
```rust
Self::with_type_lock(vm, || {
    *zelf.bases.write() = bases;
    // Recursively update the mros of this class and all subclasses
    fn update_mro_recursively(cls: &PyType, vm: &VirtualMachine) -> PyResult<()> {
        let mut mro =
            PyType::resolve_mro(&cls.bases.read()).map_err(|msg| vm.new_type_error(msg))?;
        // Preserve self (mro[0]) when updating MRO
        mro.insert(0, cls.mro.read()[0].to_owned());
        *cls.mro.write() = mro;
        for subclass in cls.subclasses.write().iter() {
            let subclass = subclass.upgrade().unwrap();
            let subclass: &Py<PyType> = subclass.downcast_ref().unwrap();
            update_mro_recursively(subclass, vm)?;
        }
        Ok(())
    }
    update_mro_recursively(zelf, vm)?;
```
Validate the full __bases__ rewrite before mutating the live hierarchy.
zelf.bases is overwritten before resolve_mro() has succeeded for zelf and every descendant. If any recursive update_mro_recursively() call fails, this setter raises after leaving a partially applied base/MRO graph behind.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/builtins/type.rs` around lines 1427 - 1443, Compute and
validate the new bases/MROs for zelf and all descendants before mutating the
live structures: inside with_type_lock, do not immediately assign to zelf.bases;
instead build the candidate bases and run PyType::resolve_mro for zelf and every
subclass (the same logic as update_mro_recursively) into a temporary mapping of
PyType -> new_mro (preserving mro[0] where needed) and return any error if
resolution fails; only after all resolve_mro calls succeed, assign
*zelf.bases.write() = bases and then write each cls.mro from the prepared
mapping and proceed (this ensures update_mro_recursively-like validation happens
before any live mutation).
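The two-phase shape the prompt asks for — resolve everything into a staging map, mutate live state only after every resolution succeeds — can be sketched independently of the VM. `stage_mros` and its `"Bad"` failure case are invented stand-ins for `PyType::resolve_mro` and an inconsistent hierarchy:

```rust
use std::collections::BTreeMap;

/// Hypothetical validate-then-commit pass: compute every new MRO into a
/// staging map first; the caller writes bases and each class's mro only
/// after this returns Ok, so a failure leaves the hierarchy untouched.
pub fn stage_mros(classes: &[&str]) -> Result<BTreeMap<String, Vec<String>>, String> {
    let mut staged = BTreeMap::new();
    for cls in classes {
        // Stand-in for PyType::resolve_mro; "Bad" models a class whose
        // bases cannot be linearized.
        if *cls == "Bad" {
            return Err(format!("Cannot create a consistent MRO for {cls}"));
        }
        staged.insert(cls.to_string(), vec![cls.to_string(), "object".to_string()]);
    }
    Ok(staged) // commit phase starts only here
}

fn main() {
    assert!(stage_mros(&["A", "B"]).is_ok());
    // Resolution fails before any live mutation would have happened.
    assert!(stage_mros(&["A", "Bad"]).is_err());
}
```

The key property is that the error path returns before the commit loop ever runs, which is exactly what the current setter loses by writing `zelf.bases` up front.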
```diff
 Self::with_type_lock(vm, || {
     *zelf.bases.write() = bases;
     // ... update_mro_recursively as in the previous hunk ...
     update_mro_recursively(zelf, vm)?;

     // Invalidate inline caches
-    zelf.modified();
+    zelf.modified_inner();

     // TODO: do any old slots need to be cleaned up first?
     zelf.init_slots(&vm.ctx);
```
Rebasing needs to refresh all derived metadata, not just bases/mro.
Only zelf runs init_slots() after the recursive MRO rewrite, and self.base is never recomputed. Descendants can keep slot tables from the old MRO, while __base__ and base-chain walks still reflect the previous best base.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/builtins/type.rs` around lines 1427 - 1449, The MRO update only
rewrites bases/mro for zelf and leaves derived metadata (per-class slot tables
and __base__) stale on subclasses; modify update_mro_recursively (inside
with_type_lock) so that after computing and writing each class's mro you also
recompute and write its base/__base__ (the same logic used for the original
class), call that class's init_slots(&vm.ctx) to rebuild its slots, and call
modified_inner() to invalidate caches for that class before recursing into its
subclasses; ensure you perform these steps for zelf as well (not just at the top
level) so every descendant gets updated slot tables and base-chain info.
```rust
for subclass in cls.subclasses.write().iter() {
    let subclass = subclass.upgrade().unwrap();
    let subclass: &Py<PyType> = subclass.downcast_ref().unwrap();
    update_mro_recursively(subclass, vm)?;
```
Don't unwrap subclass weakrefs during the recursive walk.
subclasses is lazily cleaned, and this branch only appends new weakrefs, so upgrade() can legitimately return None here. Turning that into unwrap() makes a stale weakref crash __bases__ assignment instead of just being skipped.
🔧 Suggested fix

```diff
-            for subclass in cls.subclasses.write().iter() {
-                let subclass = subclass.upgrade().unwrap();
-                let subclass: &Py<PyType> = subclass.downcast_ref().unwrap();
-                update_mro_recursively(subclass, vm)?;
-            }
+            let subclasses: Vec<_> = {
+                let mut subclasses = cls.subclasses.write();
+                subclasses.retain(|weak| weak.upgrade().is_some());
+                subclasses.iter().filter_map(|weak| weak.upgrade()).collect()
+            };
+            for subclass in subclasses {
+                let subclass: &Py<PyType> = subclass.downcast_ref().unwrap();
+                update_mro_recursively(subclass, vm)?;
+            }
```

Also applies to: 1451-1459
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/builtins/type.rs` around lines 1436 - 1439, The loop over
cls.subclasses currently unwraps weak refs and downcasts (calling
upgrade().unwrap() and downcast_ref().unwrap()), which can panic on stale
weakrefs; instead, change the loop to skip entries that fail to upgrade or fail
the downcast by using conditional checks (e.g., if let Some(strong) =
subclass.upgrade() { if let Some(subclass_pytype) =
strong.downcast_ref::<Py<PyType>>() { update_mro_recursively(subclass_pytype,
vm)? } } ) so stale weakrefs are ignored rather than unwrapping and crashing;
apply the same non-unwrapping pattern to the other occurrences around
update_mro_recursively (the similar block at lines ~1451-1459).
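The `upgrade()`-may-return-`None` behavior is standard library semantics, so the skip-stale pattern can be shown with plain `std::rc::Weak` (the `live_subclasses` helper is illustrative, not RustPython code):

```rust
use std::rc::{Rc, Weak};

/// Walk a lazily-cleaned weakref list without unwrapping: upgrade()
/// yields None for dead entries, and filter_map drops them instead of
/// panicking the way upgrade().unwrap() would.
pub fn live_subclasses(weaks: &[Weak<String>]) -> Vec<Rc<String>> {
    weaks.iter().filter_map(Weak::upgrade).collect()
}

fn main() {
    let live = Rc::new("SubA".to_string());
    // The temporary Rc is dropped at the end of this statement,
    // leaving a stale weakref behind — exactly the situation a
    // lazily-cleaned subclass list can contain.
    let dead = Rc::downgrade(&Rc::new("SubB".to_string()));

    let weaks = vec![Rc::downgrade(&live), dead];
    let strong = live_subclasses(&weaks);
    assert_eq!(strong.len(), 1); // the dead entry is skipped, not a panic
}
```

The suggested fix above goes one step further and also `retain`s the list, so the stale entries are cleaned up during the walk rather than merely skipped.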
```rust
// Drop old value OUTSIDE the type lock to avoid deadlock:
// dropping may trigger weakref callbacks → method calls →
// LOAD_ATTR specialization → version_for_specialization → type lock.
let _prev_value = Self::with_type_lock(vm, || {
    // Invalidate inline caches before modifying attributes.
    // This ensures other threads see the version invalidation before
    // any attribute changes, preventing use-after-free of cached descriptors.
    zelf.modified_inner();

    if let PySetterValue::Assign(value) = value {
        Ok(zelf.attributes.write().insert(attr_name, value))
    } else {
        let prev_value = zelf.attributes.write().shift_remove(attr_name); // TODO: swap_remove applicable?
        if prev_value.is_none() {
            return Err(vm.new_attribute_error(format!(
                "type object '{}' has no attribute '{}'",
                zelf.name(),
                attr_name,
            )));
        }
        Ok(prev_value)
    }
})?;
```
Keep update_slot() inside the same type-lock transaction.
The dict mutation and version invalidation are published before the slot table is rewritten. Another thread can assign a fresh tp_version_tag or dispatch through the old slot in that gap, so the new version no longer covers the full type state.
🔧 Suggested fix

```diff
 let _prev_value = Self::with_type_lock(vm, || {
     // Invalidate inline caches before modifying attributes.
     // This ensures other threads see the version invalidation before
     // any attribute changes, preventing use-after-free of cached descriptors.
     zelf.modified_inner();
-    if let PySetterValue::Assign(value) = value {
-        Ok(zelf.attributes.write().insert(attr_name, value))
+    let prev_value = if let PySetterValue::Assign(value) = value {
+        zelf.attributes.write().insert(attr_name, value)
     } else {
         let prev_value = zelf.attributes.write().shift_remove(attr_name); // TODO: swap_remove applicable?
         if prev_value.is_none() {
             return Err(vm.new_attribute_error(format!(
                 "type object '{}' has no attribute '{}'",
@@
                 attr_name,
             )));
         }
-        Ok(prev_value)
-    }
+        prev_value
+    };
+
+    if attr_name.as_wtf8().starts_with("__") && attr_name.as_wtf8().ends_with("__") {
+        if assign {
+            zelf.update_slot::<true>(attr_name, &vm.ctx);
+        } else {
+            zelf.update_slot::<false>(attr_name, &vm.ctx);
+        }
+    }
+    Ok(prev_value)
 })?;
-
-if attr_name.as_wtf8().starts_with("__") && attr_name.as_wtf8().ends_with("__") {
-    if assign {
-        zelf.update_slot::<true>(attr_name, &vm.ctx);
-    } else {
-        zelf.update_slot::<false>(attr_name, &vm.ctx);
-    }
-}
 Ok(())
```

Also applies to: 2631-2637
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/builtins/type.rs` around lines 2607 - 2629, The attribute
mutation currently invalidates versions and mutates the attributes map inside
with_type_lock but applies slot table updates (update_slot) outside the same
lock; move the call to update_slot() into the same with_type_lock closure
(alongside modified_inner() and the attributes.write() insert/shift_remove) so
the tp_version_tag, inline-cache invalidation, dict mutation, and slot rewrite
occur atomically under the type lock, then return the previous value and drop it
outside the lock as before; ensure both assign and remove paths in the closure
call update_slot on the appropriate slot and preserve the existing error
handling (references: with_type_lock, modified_inner, attributes.write(),
update_slot).
Dropping values inside with_type_lock can trigger weakref callbacks, which may access attributes (LOAD_ATTR specialization) and re-acquire the non-reentrant type mutex, causing deadlock. Return old values from lock closures so they drop after lock release.
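That drop-ordering argument can be demonstrated in miniature: return the displaced value out of the closure, and its destructor — which may re-enter the lock — runs only after the guard is gone. `TYPE_MUTEX`, `with_type_lock`, and `Displaced` below are simplified stand-ins, not the VM's actual types:

```rust
use std::sync::Mutex;

static TYPE_MUTEX: Mutex<()> = Mutex::new(());

/// Simplified sketch of with_type_lock: the guard drops when the closure
/// returns, before the caller ever touches the returned value.
pub fn with_type_lock<R>(f: impl FnOnce() -> R) -> R {
    let _guard = TYPE_MUTEX.lock().unwrap();
    f()
}

/// Stand-in for a displaced attribute value whose Drop may run arbitrary
/// code (weakref callbacks) that re-acquires the non-reentrant type lock.
pub struct Displaced;

impl Drop for Displaced {
    fn drop(&mut self) {
        // Safe only because the lock was already released when
        // with_type_lock returned; dropping inside the closure would
        // self-deadlock on this non-reentrant mutex.
        let _reenter = TYPE_MUTEX.lock().unwrap();
    }
}

fn main() {
    // The old value escapes the closure and drops here, after release.
    let prev = with_type_lock(|| Some(Displaced));
    drop(prev);
    assert!(TYPE_MUTEX.lock().is_ok()); // lock is free again
}
```

Had `Displaced` been dropped inside the closure, the `lock()` in `Drop` would block forever on the guard still held by `with_type_lock`.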
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/vm/src/frame.rs (2)
7713-7744: ⚠️ Potential issue | 🔴 Critical — The metaclass descriptor check still has a version-skew window.

`mcl_attr` is inspected before `metaclass_version` is captured here. If another thread installs a metaclass data descriptor between those two steps, `LoadAttrClass*` can still be cached against the post-mutation metaclass version while bypassing the new descriptor precedence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` around lines 7713 - 7744, The metaclass descriptor check reads mcl_attr before capturing metaclass_version, leaving a race where a data descriptor could be installed between those steps; to fix, call mcl.version_for_specialization(_vm) and store metaclass_version (and handle the zero-version backoff early) before calling mcl.get_attr(attr_name), then only inspect the attribute's descr_set if metaclass_version is nonzero; reference mcl.version_for_specialization, metaclass_version, mcl.get_attr, mcl_attr, and the LoadAttrClass* specialization path when moving the attribute-inspection after the version capture and preserving the existing adaptive_counter_backoff behavior when version_for_specialization returns 0.
7433-7439: ⚠️ Potential issue | 🔴 Critical — Capture the type version from the same snapshot as these slot checks.

These blocks still read mutable type behavior (`getattro`/`setattro`, `__bool__`/`__len__`, `tp_new`/`tp_alloc`) before they capture the version they cache against. Under free-threading, another thread can mutate the type in between and make us publish a specialization under the new version while still relying on the old behavior, which can skip a newly-installed `__getattribute__`, `__setattr__`, `__bool__`/`__len__`, or `__new__`.

Also applies to: 7477-7490, 8455-8488, 8781-8794, 9106-9112, 9124-9136
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` around lines 7433 - 7439, The slot-check code reads mutable slot pointers (e.g., cls.slots.getattro.load() / cls.slots.setattro.load(), tp_new/tp_alloc, __bool__/__len__) before capturing the type version, which can race; fix by first taking a single snapshot of the type version (read and store the version tag field from cls once into a local, e.g., version_snapshot) and then read the slot pointers and compute is_default_* flags, finally publish the specialization tagged with that same version_snapshot; apply the same change to the equivalent blocks that check cls.slots.setattro, tp_new/tp_alloc, and __bool__/__len__ (the other occurrences noted at the listed ranges) so all slot-checks and the cached version come from the same snapshot.
♻️ Duplicate comments (5)
crates/vm/src/builtins/type.rs (5)
2614-2636: ⚠️ Potential issue | 🔴 Critical — Keep the slot rewrite in the same type-lock transaction.

Lines 2617-2636 publish the dict mutation and version invalidation, but lines 2638-2643 update the slot table after `type_mutex` is released. Another thread can observe the new attribute state, assign a fresh version tag, or dispatch through the old slot table in that window.

🔒 Suggested fix

```diff
 let _prev_value = Self::with_type_lock(vm, || {
     // Invalidate inline caches before modifying attributes.
     // This ensures other threads see the version invalidation before
     // any attribute changes, preventing use-after-free of cached descriptors.
     zelf.modified_inner();
-    if let PySetterValue::Assign(value) = value {
-        Ok(zelf.attributes.write().insert(attr_name, value))
+    let prev_value = if let PySetterValue::Assign(value) = value {
+        zelf.attributes.write().insert(attr_name, value)
     } else {
         let prev_value = zelf.attributes.write().shift_remove(attr_name); // TODO: swap_remove applicable?
         if prev_value.is_none() {
             return Err(vm.new_attribute_error(format!(
                 "type object '{}' has no attribute '{}'",
@@
                 attr_name,
             )));
         }
-        Ok(prev_value)
-    }
+        prev_value
+    };
+
+    if attr_name.as_wtf8().starts_with("__") && attr_name.as_wtf8().ends_with("__") {
+        if assign {
+            zelf.update_slot::<true>(attr_name, &vm.ctx);
+        } else {
+            zelf.update_slot::<false>(attr_name, &vm.ctx);
+        }
+    }
+    Ok(prev_value)
 })?;
-
-if attr_name.as_wtf8().starts_with("__") && attr_name.as_wtf8().ends_with("__") {
-    if assign {
-        zelf.update_slot::<true>(attr_name, &vm.ctx);
-    } else {
-        zelf.update_slot::<false>(attr_name, &vm.ctx);
-    }
-}
 Ok(())
```

Also applies to: 2638-2643
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 2614 - 2636, The dict mutation, version invalidation, and the slot-table rewrite must occur inside the same type lock to avoid a race; move the slot-table update logic into the closure passed to Self::with_type_lock so modified_inner(), the attributes.write().insert/shift_remove mutation (handling PySetterValue::Assign), and the slot table rewrite/update happen before the lock is released, but keep dropping the previous value outside the lock by returning it from the closure (as _prev_value) so its destructor runs after with_type_lock returns.
1434-1450: ⚠️ Potential issue | 🔴 Critical — Validate the full `__bases__` rewrite before touching live state.

Line 1435 overwrites `zelf.bases` before any `resolve_mro()` call succeeds. If `update_mro_recursively()` then fails for `zelf` or any descendant, this setter unwinds after partially mutating the hierarchy.
1437-1456: ⚠️ Potential issue | 🟠 Major — Recompute `base` and slot tables for every rebased class.

`update_mro_recursively()` only rewrites `mro`, and only `zelf` runs `init_slots()` on line 1456. Descendants keep slot tables derived from the old MRO, and `self.base`/`__base__` is never refreshed for any of the rewritten types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 1437 - 1456, update_mro_recursively currently only rewrites each type's mro, but doesn't refresh per-class base/slot state; update the function so that after computing and assigning mro it also recomputes and writes the class base/__base__ (e.g. set cls.base to the appropriate MRO entry) and calls cls.init_slots(&vm.ctx) and cls.modified_inner() for that class (instead of only calling init_slots/modified on zelf after recursion). Refer to update_mro_recursively, PyType::resolve_mro, cls.mro, cls.base, cls.subclasses, modified_inner, and init_slots to locate where to update base and reinitialize slots for each rewritten subclass.
1443-1446: ⚠️ Potential issue | 🟠 Major — Skip dead subclass weakrefs instead of unwrapping.

`subclasses` is lazily cleaned. Line 1444's `upgrade().unwrap()` can legitimately fail and turn `__bases__` assignment into a panic instead of just ignoring a stale weakref.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 1443 - 1446, The loop currently calls upgrade().unwrap() (and then downcast_ref().unwrap()) on entries in cls.subclasses, which can panic on dead weakrefs; change it to safely skip stale weakrefs by checking the Option/Result: replace the unwrap chain with a conditional that first does if let Some(strong) = subclass.upgrade() { if let Some(sub_pytype) = strong.downcast_ref::<Py<PyType>>() { update_mro_recursively(sub_pytype, vm)?; } } so dead weakrefs or unexpected types are ignored instead of causing a panic.
496-508: ⚠️ Potential issue | 🔴 Critical — Serialize the public version-tag path too.

`assign_version_tag()` still just forwards to `_inner`, and line 1138 still reaches it from `find_name_in_mro()` without `type_mutex`. That leaves the base-before-subclass tagging invariant racing with `modified_inner()`, so a subclass can still end up with a nonzero `tp_version_tag` after its base was reset to `0`.
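The invariant at stake can be modeled with a tiny arena of types: invalidation zeroes a type's tag and every descendant's, while assignment walks bases first. When both run under the same lock, a subclass can never hold a nonzero tag while its base sits at zero. `Ty`, `modified`, and `assign_tags` are illustrative stand-ins, not the VM's types:

```rust
/// Hypothetical model of a type with a version tag and subclass links
/// (indices into an arena instead of weakrefs, for simplicity).
pub struct Ty {
    pub tag: u32,
    pub subclasses: Vec<usize>,
}

/// Invalidation: reset this type's tag and, recursively, every descendant's.
pub fn modified(arena: &mut [Ty], i: usize) {
    arena[i].tag = 0;
    let subs = arena[i].subclasses.clone();
    for s in subs {
        modified(arena, s);
    }
}

/// Tag assignment: walk bases before subclasses, assigning fresh tags.
/// In the VM this must hold the same lock as `modified` to keep the
/// base-before-subclass ordering from interleaving with an invalidation.
pub fn assign_tags(arena: &mut [Ty], bases_first: &[usize], mut next: u32) {
    for &i in bases_first {
        if arena[i].tag == 0 {
            arena[i].tag = next;
            next += 1;
        }
    }
}

fn main() {
    let mut arena = vec![
        Ty { tag: 0, subclasses: vec![1] }, // base
        Ty { tag: 0, subclasses: vec![] },  // subclass
    ];
    assign_tags(&mut arena, &[0, 1], 10);
    assert_eq!((arena[0].tag, arena[1].tag), (10, 11));
    modified(&mut arena, 0); // invalidating the base also resets the subclass
    assert_eq!((arena[0].tag, arena[1].tag), (0, 0));
}
```

Without serialization, an `assign_tags` interleaved mid-`modified` could retag the subclass after only the base was reset, leaving exactly the nonzero-subclass/zero-base state the comment describes.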
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: c8c994ce-d646-4364-b6ea-ea561ef96c2b
📒 Files selected for processing (5)
- crates/vm/src/builtins/type.rs
- crates/vm/src/frame.rs
- crates/vm/src/stdlib/posix.rs
- crates/vm/src/vm/interpreter.rs
- crates/vm/src/vm/mod.rs
✅ Files skipped from review due to trivial changes (1)
- crates/vm/src/stdlib/posix.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/vm/src/vm/interpreter.rs
- crates/vm/src/vm/mod.rs
```rust
let _prev_values = Self::with_type_lock(vm, || {
    self.modified_inner();
    let mut attrs = self.attributes.write();
    let removed = if !vm.is_none(&value) {
        attrs.swap_remove(identifier!(vm, __annotations_cache__))
    } else {
        None
    };
    let prev = attrs.insert(identifier!(vm, __annotate_func__), value);
    (removed, prev)
});
```
Clear __annotations_cache__ even when __annotate__ becomes None.
Right now the cache is only cleared when the new value is callable. If the class already materialized __annotations__, assigning None here leaves __annotations_cache__ behind, so later reads keep returning the stale dict.
💡 Minimal fix

```diff
 let _prev_values = Self::with_type_lock(vm, || {
     self.modified_inner();
     let mut attrs = self.attributes.write();
-    let removed = if !vm.is_none(&value) {
-        attrs.swap_remove(identifier!(vm, __annotations_cache__))
-    } else {
-        None
-    };
+    let removed = attrs.swap_remove(identifier!(vm, __annotations_cache__));
     let prev = attrs.insert(identifier!(vm, __annotate_func__), value);
     (removed, prev)
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/builtins/type.rs` around lines 1607 - 1617, The annotations
cache cleanup should run unconditionally: inside the closure passed to
Self::with_type_lock (around modified_inner, attributes.write(), and the
attrs.insert(identifier!(vm, __annotate_func__), value) call), always remove any
existing identifier!(vm, __annotations_cache__) instead of only calling
attrs.swap_remove(...) when !vm.is_none(&value); i.e., drop the vm.is_none
conditional and unconditionally call attrs.swap_remove(identifier!(vm,
__annotations_cache__)) before inserting the new __annotate_func__ value so
assigning None will also clear the cached __annotations__.
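The staleness the comment warns about is a generic lazy-cache hazard and can be reproduced without the VM. `TypeAttrs` below is a hypothetical model of the `__annotate__`/`__annotations__` pair, not RustPython or CPython API:

```python
class TypeAttrs:
    """Hypothetical model: an annotate function plus a lazily
    materialized annotations cache."""

    def __init__(self, annotate):
        self._annotate = annotate
        self._cache = None

    def set_annotate(self, fn):
        # Clear unconditionally, even when fn is None — the fix the
        # review asks for. Clearing only when fn is callable leaves the
        # old materialized dict visible forever.
        self._cache = None
        self._annotate = fn

    def annotations(self):
        if self._cache is None:
            self._cache = self._annotate() if self._annotate else {}
        return self._cache

t = TypeAttrs(lambda: {"x": int})
assert t.annotations() == {"x": int}  # cache materialized

t.set_annotate(None)
assert t.annotations() == {}  # stale {"x": int} would survive without the fix
```

With the conditional clear, the second assertion fails: the first materialized dict keeps being returned even though the annotate function is gone.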
Introduce vm.state.type_mutex to serialize type mutation and version-tag assignment, matching CPython's free-threading model.