Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
gh-114058: Foundations of the Tier2 redundancy eliminator#115085
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-114058: Foundations of the Tier2 redundancy eliminator #115085
Uh oh!
There was an error while loading. Please reload this page.
Conversation
Fidget-Spinner commented Feb 6, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Co-Authored-By: Mark Shannon <9448417+markshannon@users.noreply.github.com>
Co-Authored-By: Jules <57632293+JuliaPoo@users.noreply.github.com> Co-Authored-By: Guido van Rossum <gvanrossum@users.noreply.github.com>
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
markshannon left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks much closer to what I think we need. Thanks.
What's the API for _Py_UOpsSymType? It isn't clear what's internal and what's API.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
| _Py_UOpsSymType**stack_pointer; | ||
| _Py_UOpsSymType**stack; | ||
| _Py_UOpsSymType**locals; | ||
| } _Py_UOpsAbstractFrame; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use much the same layout as an actual frame and put the locals and stack at the end:
... _Py_UOpsSymType **stack_pointer; _Py_UOpsSymType *localsplus[1]}
Fidget-SpinnerFeb 7, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially did that, but this new layout is easier for inlining, because it actually models the state if all frames were to be inlined. So it makes things like calculating the new locals offset easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we aren't inlining yet, and we might choose a different approach. Making it a single struct would make memory management simpler.
Fidget-SpinnerFeb 7, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now making it a single struct would make memory management a little more complex, because my "frames" are just an array of static sized structs. Their localsplus is just accessing a giant pool. frame->prev is just index--. It's all rather clean. And I just need to manage memory once instead of for 7 giant frames.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, we won't be doing any allocations, just reusing a buffer on the thread-state. But's that future PR.
But we still need to manage the memory within that buffer, and monolithic frames will make that simpler.
This should be fine for now, though.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
markshannon commented Feb 6, 2024
How hard would it be to change the API to take |
Fidget-Spinner commented Feb 6, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Quite possible. I can just convert to the internal representation internally. (The representation is just to save space, and support unions in the future). The only thing is there are some "types" that cant be represented using pytype object. In that case I will special case them. |
markshannon commented Feb 7, 2024
It is probably best to add those cases to the API. |
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Python/optimizer_analysis.c Outdated
| #defineGETLOCAL(idx) ((ctx->frame->locals[idx])) | ||
| #defineREPLACE_OP(op, arg, oper) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like macros that assume context, and we rarely (maybe never) want to set the oparg or operand.
| #defineREPLACE_OP(op, arg, oper) \ | |
| #defineREPLACE_OP(INST, OP) (INST)->opcode = (OP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need operand once we start burning in _LOAD_CONST_INLINE and friends.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
| staticvoid | ||
| remove_unneeded_uops(_PyUOpInstruction*buffer, intbuffer_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has this changed, or has it just moved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not change it. It just moved.
Python/optimizer_analysis.c Outdated
| // The only valid error we can raise is MemoryError. | ||
| // Other times it's not really errors but things like not being able | ||
| // to fetch a function version because the function got deleted. | ||
| returnPyErr_Occurred() ? -1 : 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to check PyErr_Occurred()? uop_redundancy_eliminator should not be returning -1 unless an error has occurred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the old code, constant propagation could raise MemoryError.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
markshannon left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now.
…onGH-115085) --------- Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com> Co-authored-by: Jules <57632293+JuliaPoo@users.noreply.github.com> Co-authored-by: Guido van Rossum <gvanrossum@users.noreply.github.com>
This sets up the tier 2 optimizer's redundancy eliminator foundations.
It does the following:
It's basically a port of #114059 over using Mark's DSL. Constant propagation and a lot of type propagation rules torn out, for the sake of simpler code for now. We can easily add more later. I just want to get the foundations in.
I expect no speedups for this, as for simpler reviewing, most of the code is torn out. However, I believe that #114059 already shows this approach has potential for speedups.