Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays#133396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
iritkatriel commented May 4, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
markshannon left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the _PyBinopSpecializationDescr supposed to be extensible?
If so we'll need a means for extensions to free the structs they have allocated. Or is the intention that they are statically allocated?
If they aren't extensible, maybe the VM should allocate them and clean them up?
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
iritkatriel commented May 5, 2025
I assumed it's extensible and the extension manages it. The alternative is that it has a |
eendebakpt commented May 5, 2025
Are there any microbenchmarks that show this improves performance? |
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Misc/NEWS.d/next/Core_and_Builtins/2025-05-04-19-32-47.gh-issue-133395.VhWWEP.rst Outdated Show resolvedHide resolved
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Co-authored-by: Michael Droettboom <mdboom@gmail.com>
082dbf7 into python:mainUh oh!
There was an error while loading. Please reload this page.
| } | ||
| break; | ||
| } | ||
| /* _GUARD_BINARY_OP_EXTEND is not a viable micro-op for tier 2 because it uses the 'this_instr' variable */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, this change made it so that any traces containing BINARY_OP_EXTEND can't be JIT-compiled. Is there a way to rewrite this to be tier-2 friendly? Mutating inline caches doesn't work there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't have to NULL the cache, just change the opcode to BINARY_OP and hope nobody tries to use the pointer in the cache anymore. Will make it harder to debug if they do.
brandtbucher commented May 6, 2025
Also, based on the benchmarking results that were posted, it looks like the |
nascheme commented May 6, 2025
This seems to cause the refleaks unit test to fail for the test_datetime module. The leak seems to be related to the |
iritkatriel commented May 6, 2025
The crash should have been fixed. I'll check the leak. Re tier 2 - maybe there's a way but we can just deopt them when jitting if not. |
vstinner commented May 6, 2025
The follow code is enough to reprodue the issue: importarrayimportgcdeffunc(): a=array.array('B', b'hello world') # call array_binop_specialize() 3 times, once per loopprint("specialize") for_inrange(10): a[1] print("specialize") for_inrange(10): a[1] print("specialize") for_inrange(10): a[1] func() # array_subscr_free() is not calleddelfuncgc.collect()The specialization calls PyMem_Malloc() in array_binop_specialize(), but nothing calls PyMem_Free(): array_subscr_free() is not called. |
Eclips4 commented May 6, 2025
It seems that |
iritkatriel commented May 6, 2025
I don't think that's the issue. Free needs to happen when the code object is freed, and currently it's not. |
…ze BINARY_OP/SUBSCR, apply to arrays (python#133396)" This reverts commit 082dbf7.
iritkatriel commented May 6, 2025
I'm going to revert this so it doesn't delay the release today. It's not terribly important for this to be in 3.14. |
brandtbucher commented May 6, 2025
@iritkatriel I just ran JIT benchmarks on the head of this branch. Those three benchmarks are still broken, and it makes the JIT 1% slower (with individual benchmarks slowing down up to 25%, one benchmark using 8% more memory): So we should make sure to benchmark the JIT when this lands again, or find a new approach (I don't love having strong references in the inline caches, even just pointers to owned memory). |
…Y_OP/SUBSCR, apply to arrays (python#133396)
…ze BINARY_OP/SUBSCR, apply to arrays (python#133396)" (python#133498)
Benchmark results: https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20250502-3.14.0a7%2B-ec344eb/bm-20250502-azure-x86_64-iritkatriel-array_subscr-3.14.0a7%2B-ec344eb-pystats-vs-base.md
Specialization stats reflect the change, no overall improvement in perf.