gh-119109: improve functools.partial vectorcall with keywords #124584
Conversation
dg-pb commented Sep 26, 2024 • edited
rhettinger commented Sep 26, 2024
Perhaps @vstinner has the time and interest in looking at this.
dg-pb commented Sep 29, 2024
I think it is a good compromise between simplicity and performance now. One micro-optimization that I couldn't figure out how to do simply is pre-storing …

Not sure how much sense it makes yet, but I posted faster-cpython/ideas#699 in relation to this.

Ready for review now.
dg-pb commented Oct 17, 2024
Was wondering if it might be worth factoring out macros for private use.
serhiy-storchaka left a comment
LGTM. 👍
kumaraditya303 commented Jul 7, 2025
No refleaks:

❯ ./python.exe -m test -R 3:3 test_functools
Using random seed: 458260461
0:00:00 load avg: 1.72 Run 1 test sequentially in a single process
0:00:00 load avg: 1.72 [1/1] test_functools
beginning 6 repetitions. Showing number of leaks (. for 0 or less, X for 10 or more)
123:456
XX. ...
0:00:02 load avg: 1.66 [1/1] test_functools passed
== Tests result: SUCCESS ==
1 test OK.
Total duration: 2.1 sec
Total tests: run=321 skipped=1
Total test files: run=1/1
Result: SUCCESS
dg-pb commented Jul 7, 2025 • edited
I get the same on … And also the same if I remove all tests except … Maybe not related to this PR?

EDIT: I assumed this shows that there are refleaks? Or am I failing to interpret something? Never used …
kumaraditya303 commented Jul 8, 2025 • edited
The output I posted means that there are no leaks: -R is used to check for reference leaks, and 3:3 indicates the number of runs used to check for leaks.
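As a rough illustration of the principle behind -R (this is not the regrtest implementation; the exercise/leak_check helpers below are made up for the sketch, and sys.gettotalrefcount() only exists on debug builds of CPython):

```python
# Sketch: run the workload a few warm-up times, then compare total
# reference counts across repeated runs.  Consistently growing deltas
# would suggest a reference leak; roughly stable ones do not.
import sys
from functools import partial

def exercise():
    # The code path touched by this PR: vectorcall of a partial with keywords.
    p = partial(dict, a=1)
    for _ in range(1000):
        p(b=2)

def leak_check(func, warmups=3, runs=3):
    for _ in range(warmups):          # warm up caches, interned objects, etc.
        func()
    deltas = []
    for _ in range(runs):
        before = sys.gettotalrefcount()
        func()
        deltas.append(sys.gettotalrefcount() - before)
    return deltas

if hasattr(sys, "gettotalrefcount"):  # debug builds only
    print(leak_check(exercise))
```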
kumaraditya303 commented Jul 8, 2025
I think the resizing logic is unnecessary and adds extra complexity. Would you simplify it to what Serhiy suggested, or alternatively remove it altogether, @dg-pb?
dg-pb commented Jul 8, 2025 • edited
I don't think it was a simplification. He did propose an alternative and I incorporated it into the logic. Why I kept the old one:

    # only: (noveralloc > init_stack_size / 2)
    p = partial(a=1)
    f(a=2)
    # initial stack size: 3
    # used stack size: 1

Thus, it would be kicking in too often. 10% is, I think, a good number, given we are talking about the pool of 0.01% of cases. I feel that what is currently there provides necessary protection at minimal cost and is unlikely to cause any issues. But if you have a better rationale in mind, I am open to further amendments.

Things like recursion of … "Anything that can go wrong will go wrong." Thus, given enough time, it is very likely that more than one instance of issues with this will happen. Given it is pretty much for free, and the block can be easily removed if there is a need for it in the future, I am in favour of being "better safe than sorry" here.
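A small Python sketch of the comparison being argued here (illustrative only; the real check lives in the C vectorcall code, and treating noveralloc as "allocated minus used" is my reading of the snippet above):

```python
# Illustrative only -- not the C code from this PR.  It just shows why the
# proposed trigger `noveralloc > init_stack_size / 2` already fires for the
# trivial example quoted above (initial stack size 3, used stack size 1),
# i.e. why it would be kicking in too often.

def overallocation_exceeds_half(init_stack_size: int, used: int) -> bool:
    noveralloc = init_stack_size - used   # assumed meaning of "noveralloc"
    return noveralloc > init_stack_size / 2

print(overallocation_exceeds_half(3, 1))  # True: a one-keyword call would already resize
print(overallocation_exceeds_half(3, 2))  # False: only one slot wasted
```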
kumaraditya303 commented Jul 8, 2025
I don't have a better resizing strategy in mind atm, so I'll not block this PR on it now. I'll merge this now, thanks.
Merged commit f9932f5 into python:main.
…python#124584)
Co-authored-by: Kumar Aditya <[email protected]>
Co-authored-by: Serhiy Storchaka <[email protected]>
hawkinsp commented Jul 29, 2025
Is a 3.14 backport possible for this PR, given that it also fixes a free-threading race?
kumaraditya303 commented Jul 30, 2025
This is a performance improvement, so it cannot be backported. I haven't seen any real crashes from this, so adding a suppression should be fine for 3.14.
(Potentially closes #128050)

This IMO is the best approach to resolve the fallback "issue". It:

a) Eliminates the need for the fallback or any need to switch between implementations after initial construction.
b) Delivers performance benefits for vectorcall when partial has keywords.

Benchmark:
No penalty for calls without pto_kwds.
Non-negligible speed improvement for calls with pto_kwds: 27-55%.
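As a rough illustration of the two call shapes behind these numbers (this is not the benchmark that produced the 27-55% figure; f and the iteration count are made up), a minimal timeit comparison might look like:

```python
# Sketch: calling a partial with no pre-bound keywords vs. one whose
# pto_kwds is non-empty (the case this PR speeds up).  Absolute numbers
# will vary by machine and build.
import timeit
from functools import partial

def f(a=None, b=None, c=None):
    return a

no_kwds = partial(f, 1)       # no pto_kwds: path claimed to have no penalty
with_kwds = partial(f, a=1)   # pto_kwds present: the improved path

print("without pto_kwds:", timeit.timeit(lambda: no_kwds(2), number=1_000_000))
print("with pto_kwds:   ", timeit.timeit(lambda: with_kwds(b=2), number=1_000_000))
```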