Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
gh-131798: JIT: Optimize _POP_CALL_TWO_LOAD_CONST_INLINE_BORROW#134369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
tomasr8 commented May 20, 2025 • edited by bedevere-app bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-app bot
Uh oh!
There was an error while loading. Please reload this page.
f37b9c3 to 82e0a17ComparePython/bytecodes.c Outdated
| PyStackRef_CLOSE(pop1); | ||
| } | ||
| tier2 pure op(_POP_CALL_LOAD_CONST_INLINE_BORROW, (ptr/4, callable, null -- value)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the pure annotations from these please? They don't get optimized to anything so there's not really a use for them.
Fidget-Spinner left a comment • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. I think @brandtbucher brought this up before, but I think we should be making the uop buffer add-to rather than in-place. What I mean is that instead of adding a pop variant for every optimization we want to do:
... _POP_X_LOAD_CONST_INLINE_BORROW We should add:
... _POP_TOP _POP_TOP ... _LOAD_CONST_INLINE_BORROW This way:
- We don't have to add a new special pop uop for every single op. This saves on mental overhead, because a special pop variant of the uop for every uop we want to optimize is really tough to reason about.
- In the optimizer, we can just generalize it to
POP_TOPX times, which is easier to write code to optimize for, rather than add a new entry toremove_unneeded_uops. As you probably noticed already, it's really tedious to do this.
This would require allowing us to add to the uop buffer, rather than overwriting it in-place. It might be a bigger change, but if you're up for it, it would greatly simplify all future optimizations, and I'd be grateful for someone cleaning that part up. It would best be a separate PR.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
tomasr8 commented May 21, 2025
We actually discussed this w/ Brandt yesterday and I totally agree with that. Brandt also mentioned that it is ok to add this for now but I don't mind closing this and focusing on changing the uop buffer right away. Note that Brandt greatly simplified |
brandtbucher commented May 21, 2025
Yeah, 90% of the reason why I felt the urgency to move away from in-place modification was because the logic in the pass that removes matching pushes and pops was getting completely out-of-hand. That's not a concern anymore, so I think we can keep the momentum going on the current in-place optimizer work for the time being. |
Fidget-Spinner commented May 21, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Ok. Can we please make sure to move away from the in-place optimizer after this PR for any new pop tops load const inline X instructions? The pushes and pops are no longer an issue, but the number of new uops variants we're adding is getting out of hand. I have no clue what some of these uops do unless I specifically look up their emission in the optimizer. We don't need this many uops and fewer uops is generally a better thing (less maintainer burden, less human error ...). If we can generalize to just a few |
brandtbucher commented May 21, 2025
Yep, I agree. It also allows us to specialize individual pops for specific known types, immortal values, non-escaping pops, etc. After the sprint ends tomorrow, we can start with larger cleanup efforts (which I agree we need, in the tracer too). |
Fidget-Spinner commented May 21, 2025
Great! Feel free to dismiss my review then in the GH UI. |
3d71a8e to 47995d0Compare47995d0 to 90d38d7Comparebrandtbucher commented May 22, 2025
When I get back, I'll start working on making the optimizer operate on a copy of the stream. This is good for now though. Really cool to see entire calls start disappearing. |
d1ea8ed into python:mainUh oh!
There was an error while loading. Please reload this page.
brandtbucher commented May 22, 2025
Looks great. @tomasr8: natural follow-ups would be support for tuples of types in |
tomasr8 commented May 22, 2025
on it! |
Optimize
_POP_CALL_TWO_LOAD_CONST_INLINE_BORROWThis requires adding some additional ops:
_POP_CALL_ONE_LOAD_CONST_INLINE_BORROW_POP_CALL_LOAD_CONST_INLINE_BORROW_POP_THREE_POP_TWOWe can now optimize:
into just
_POP_CALL_LOAD_CONST_INLINE_BORROWi.e. we save two stack pushes and two stack pops.