Skip to content

Bytecode parity phase 3#7514

Merged
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:bytecode-parity
Mar 28, 2026
Merged

Bytecode parity phase 3#7514
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:bytecode-parity

Conversation

@youknowone
Copy link
Copy Markdown
Member

@youknowone youknowone commented Mar 27, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Fixed potential crash when comparison operations encounter unavailable slots
    • Improved async generator yield value handling for better reliability
  • Performance

    • Enhanced compiler optimizations for function call arguments and loop iteration compilation

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 27, 2026

📝 Walkthrough

Walkthrough

This PR modifies bytecode compilation and runtime VM logic in RustPython. Changes include: refining class static attribute scanning with decorator awareness, relocating async-generator yield wrapping from runtime to compile-time, adjusting call argument construction and comprehension call protocols, filtering parameter cells from variable resolution, and adding safety checks for missing comparison slots.

Changes

Cohort / File(s) Summary
Async Generator Yield Handling
crates/codegen/src/compile.rs, crates/vm/src/frame.rs
Moved async-generator yield wrapping logic from runtime (frame.rs removed conditional wrapping) to compile-time (compile.rs now emits AsyncGenWrap before YieldValue). Simplified YieldValue instruction handling to always yield popped values without oparg-based decisions.
Class Static Attribute Scanning
crates/codegen/src/compile.rs
Enhanced class self.xxx = ... attribute scanning to use decorator-aware naming, skip methods decorated with @staticmethod/@classmethod, skip implicit classmethod-like methods (__init_subclass__, __class_getitem__), and target assignments to "self" in __new__ bodies.
Call Protocol and Iteration Optimization
crates/codegen/src/compile.rs
Reorganized call argument construction to explicitly compile positional arguments and assemble via BuildTuple before CallFunctionEx when no starred args present. Kept for-loop iteration optimization restricted to non-async loops. Simplified comprehension call protocol by omitting unconditional PushNull and changing Call { argc: 1 } to Call { argc: 0 }.
Runtime Safety and Variable Resolution
crates/vm/src/builtins/code.rs, crates/vm/src/protocol/object.rs
Updated PyCode::_varname_from_oparg to filter cellvars excluding "parameter cells" (entries also in varnames) for correct variable name resolution. Added safety check in PyObject::_cmp_inner to verify richcompare slot existence before invocation instead of unwrapping unconditionally.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Bytecode parity #7481: Modifies compile.rs class static-attribute collection and async/generator yield codegen behavior using similar decorator awareness and wrapping logic patterns.
  • Bytecode parity #7507: Adjusts compile.rs codegen for tuple-building, call emission, and async-generator yield/bytecode handling alongside iteration optimizations.
  • Bytecode parity #7504: Modifies compile.rs static-class-attribute scanning and async-generator yield codegen with overlapping decorator-checking and wrapping strategies.

Suggested reviewers

  • ShaharNaveh

Poem

🐰 A hop through the bytecode, async we go,
Class attributes bloom with decorator's glow,
Cells are filtered, comparisons checked with care,
Yield wraps compile-time—magic in the air! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Bytecode parity phase 3' is vague and generic. It uses non-descriptive terms without conveying specific information about the actual changes in the pull request, such as the compiler optimizations, async generator wrapping, or call argument construction improvements. Consider a more descriptive title that highlights the main technical changes, such as 'Optimize async generator codegen and call argument construction' or similar specifics from the changeset.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the bytecode-parity branch 3 times, most recently from 5e65683 to 57f44c6 Compare March 27, 2026 13:43
@youknowone youknowone marked this pull request as ready for review March 27, 2026 14:55
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/codegen/src/compile.rs (2)

4596-4611: ⚠️ Potential issue | 🟡 Minor

Move the __new__ special-case above the first-parameter guard.

Line 4602 still exits before the __new__ branch runs, so def __new__(*args, **kwargs): self.x = 1 will never contribute to __static_attributes__. If __new__ is meant to key off the literal self, that branch needs to run before looking for a positional parameter.

Suggested fix
-            let first_param = f
-                .parameters
-                .posonlyargs
-                .first()
-                .or(f.parameters.args.first())
-                .map(|p| &p.parameter.name);
-            let Some(self_name) = first_param else {
-                continue;
-            };
             // For __new__, first param is cls → scan for "self" instead
             // (CPython scans for the literal name "self" in __new__ bodies)
             if fname == "__new__" {
                 Self::scan_store_attrs(&f.body, "self", attrs);
-            } else {
-                Self::scan_store_attrs(&f.body, self_name.as_str(), attrs);
+                continue;
             }
+            let first_param = f
+                .parameters
+                .posonlyargs
+                .first()
+                .or(f.parameters.args.first())
+                .map(|p| &p.parameter.name);
+            let Some(self_name) = first_param else {
+                continue;
+            };
+            Self::scan_store_attrs(&f.body, self_name.as_str(), attrs);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 4596 - 4611, The guard that
extracts first_param (from f.parameters.posonlyargs or args) runs before
handling the __new__ special-case, so functions like def __new__(*args,
**kwargs): self.x = 1 are skipped; move the __new__ branch above the first_param
extraction/guard so you always call Self::scan_store_attrs(&f.body, "self",
attrs) when fname == "__new__". Specifically, check fname == "__new__" and call
Self::scan_store_attrs with the literal "self" before using
f.parameters.posonlyargs/.args and the first_param/self_name logic.

5474-5485: ⚠️ Potential issue | 🔴 Critical

This async-for loop transformation does not match CPython behavior and introduces a divergence.

CPython 3.13/3.14 does not rewrite list literals to tuples before GET_AITER in async-for loops. The list remains a list through the entire process, and when GET_AITER fails (e.g., for async for x in [1]: ...), the error message reports the actual type as 'list'.

RustPython's current implementation converts the list to a tuple, which causes:

  1. Bytecode to differ from CPython's
  2. Error messages to report 'tuple' instead of 'list' when async iteration fails

This rewrite should either be removed to match CPython's behavior, or if intentional, the rationale should be documented. If the goal is CPython parity for bytecode generation, this change needs to be reverted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 5474 - 5485, The code rewrites a
list literal into a tuple before async-for iteration (the ast::Expr::List match
that emits Instruction::BuildTuple), causing bytecode and error-message
divergence from CPython (GET_AITER should see a list); revert this
transformation so list literals are compiled as lists in async-for contexts by
removing the special-case that turns ast::Expr::List into a tuple (leave
compilation to self.compile_expression for the list expression and do not emit
Instruction::BuildTuple here), or add a clear comment if you intentionally keep
the behavior.
🧹 Nitpick comments (2)
crates/vm/src/builtins/code.rs (1)

1293-1302: Consider avoiding String allocations by using reference-based comparison.

Building a HashSet<String> with to_string() on every call creates heap allocations. Since PyStrInterned references are interned and can be compared directly, you can avoid these allocations:

♻️ Suggested optimization using pointer comparison
-        let varnames_set: std::collections::HashSet<String> =
-            self.code.varnames.iter().map(|s| s.to_string()).collect();
-        let nonparam_cellvars: Vec<_> = self
-            .code
-            .cellvars
-            .iter()
-            .filter(|s| !varnames_set.contains(&s.to_string()))
-            .collect();
+        let nonparam_cellvars: Vec<_> = self
+            .code
+            .cellvars
+            .iter()
+            .filter(|cv| !self.code.varnames.iter().any(|v| *v == **cv))
+            .collect();

Alternatively, if the cellvars/varnames lists are large enough that O(n²) becomes a concern, consider caching the nonparam_cellvars computation in the PyCode struct.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/builtins/code.rs` around lines 1293 - 1302, The current code
builds a HashSet<String> by calling to_string() on each element of
self.code.varnames which causes unnecessary heap allocations; change the
comparison to use references/borrowed interned strings instead (e.g., compare
PyStrInterned or &str values directly) so you avoid allocating in varnames_set
and when filtering self.code.cellvars for nonparam_cellvars, or alternatively
compute and cache nonparam_cellvars inside PyCode to avoid repeated O(n²) work;
update usages of varnames_set, nonparam_cellvars, self.code.varnames, and
self.code.cellvars accordingly to use reference-based comparisons rather than
String allocations.
crates/codegen/src/ir.rs (1)

2029-2038: Fix doc-comment association after inserting the new helper.

duplicate_end_returns docs are now attached to add_generator_stop_handler, which makes rustdoc misleading. Move the return-duplication docs back above fn duplicate_end_returns(...) and keep only StopIteration-handler docs above this helper.

Suggested doc-comment move
-/// Duplicate `LOAD_CONST None + RETURN_VALUE` for blocks that fall through
-/// to the final return block. Matches CPython's behavior of ensuring every
-/// code path that reaches the end of a function/module has its own explicit
-/// return instruction.
 /// Add a StopIteration → RuntimeError handler for generators/coroutines.
 /// CPython: flowgraph.c add_stopiteration_handler()
 /// Appends a handler block that wraps the entire generator body.
 fn add_generator_stop_handler(_flags: CodeFlags, _blocks: &mut Vec<Block>) {
@@
 }
 
+/// Duplicate `LOAD_CONST None + RETURN_VALUE` for blocks that fall through
+/// to the final return block. Matches CPython's behavior of ensuring every
+/// code path that reaches the end of a function/module has its own explicit
+/// return instruction.
 fn duplicate_end_returns(blocks: &mut [Block]) {

As per coding guidelines, "Do not delete or rewrite existing comments unless they are factually wrong or directly contradict the new code."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/ir.rs` around lines 2029 - 2038, The doc comment block
describing return-duplication was accidentally left above the new helper; move
the duplicate_end_returns doc comment back directly above the fn
duplicate_end_returns(...) declaration and leave only the StopIteration/handler
documentation immediately above fn add_generator_stop_handler(...), ensuring
each function has its correct doc-comment so rustdoc associates them with the
intended symbols.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@crates/codegen/src/compile.rs`:
- Around line 4596-4611: The guard that extracts first_param (from
f.parameters.posonlyargs or args) runs before handling the __new__ special-case,
so functions like def __new__(*args, **kwargs): self.x = 1 are skipped; move the
__new__ branch above the first_param extraction/guard so you always call
Self::scan_store_attrs(&f.body, "self", attrs) when fname == "__new__".
Specifically, check fname == "__new__" and call Self::scan_store_attrs with the
literal "self" before using f.parameters.posonlyargs/.args and the
first_param/self_name logic.
- Around line 5474-5485: The code rewrites a list literal into a tuple before
async-for iteration (the ast::Expr::List match that emits
Instruction::BuildTuple), causing bytecode and error-message divergence from
CPython (GET_AITER should see a list); revert this transformation so list
literals are compiled as lists in async-for contexts by removing the
special-case that turns ast::Expr::List into a tuple (leave compilation to
self.compile_expression for the list expression and do not emit
Instruction::BuildTuple here), or add a clear comment if you intentionally keep
the behavior.

---

Nitpick comments:
In `@crates/codegen/src/ir.rs`:
- Around line 2029-2038: The doc comment block describing return-duplication was
accidentally left above the new helper; move the duplicate_end_returns doc
comment back directly above the fn duplicate_end_returns(...) declaration and
leave only the StopIteration/handler documentation immediately above fn
add_generator_stop_handler(...), ensuring each function has its correct
doc-comment so rustdoc associates them with the intended symbols.

In `@crates/vm/src/builtins/code.rs`:
- Around line 1293-1302: The current code builds a HashSet<String> by calling
to_string() on each element of self.code.varnames which causes unnecessary heap
allocations; change the comparison to use references/borrowed interned strings
instead (e.g., compare PyStrInterned or &str values directly) so you avoid
allocating in varnames_set and when filtering self.code.cellvars for
nonparam_cellvars, or alternatively compute and cache nonparam_cellvars inside
PyCode to avoid repeated O(n²) work; update usages of varnames_set,
nonparam_cellvars, self.code.varnames, and self.code.cellvars accordingly to use
reference-based comparisons rather than String allocations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 4562e2f0-aaec-4585-aaed-fcbb04e40640

📥 Commits

Reviewing files that changed from the base of the PR and between 4107217 and 57f44c6.

⛔ Files ignored due to path filters (1)
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (5)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/protocol/object.rs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)

7919-7934: Please lock the new call lowering with a disassembly regression.

Both the explicit tuple materialization for CallFunctionEx and the comprehension CALL { argc: 0 } path are stack-shape-sensitive. A focused snapshot for a starred/**kwargs call and a comprehension invocation would make parity regressions much easier to catch.

As per coding guidelines, "When modifying bytecode instructions, perform a clean build by running: rm -r target/debug/build/rustpython-* && find . | grep -E "\.pyc$" | xargs rm -r".

Also applies to: 8292-8293

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 7919 - 7934, Add a disassembly
regression test that locks the new call lowering behavior for both the
starred/**kwargs path (CallFunctionEx lowering) and the comprehension invocation
path that emits CALL { argc: 0 }; specifically create focused snapshot tests
that exercise a starred/`**kwargs` call and a comprehension call so any
stack-shape-sensitive changes are caught. Locate the lowering logic around
compile_expression, emit_load_const, set_source_range and the
Instruction::BuildTuple / CallFunctionEx emission in compile.rs (also check the
similar spots near the referenced 8292-8293 change) and add tests that assert
the produced bytecode/disassembly matches the expected snapshot for both cases;
run the clean build steps described in the guidelines before committing the
snapshots.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 7919-7934: Add a disassembly regression test that locks the new
call lowering behavior for both the starred/**kwargs path (CallFunctionEx
lowering) and the comprehension invocation path that emits CALL { argc: 0 };
specifically create focused snapshot tests that exercise a starred/`**kwargs`
call and a comprehension call so any stack-shape-sensitive changes are caught.
Locate the lowering logic around compile_expression, emit_load_const,
set_source_range and the Instruction::BuildTuple / CallFunctionEx emission in
compile.rs (also check the similar spots near the referenced 8292-8293 change)
and add tests that assert the produced bytecode/disassembly matches the expected
snapshot for both cases; run the clean build steps described in the guidelines
before committing the snapshots.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 2b118189-25bd-4908-9a6a-f0b0c12eab42

📥 Commits

Reviewing files that changed from the base of the PR and between 57f44c6 and 40cab9c.

⛔ Files ignored due to path filters (1)
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (4)
  • crates/codegen/src/compile.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/protocol/object.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/vm/src/builtins/code.rs

@youknowone youknowone merged commit f7556b0 into RustPython:main Mar 28, 2026
15 checks passed
@youknowone youknowone deleted the bytecode-parity branch March 28, 2026 00:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant