Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
gh-128942: make arraymodule.c free-thread safe#128943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
tom-pytel commented Jan 17, 2025 • edited by bedevere-app bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-app bot
Uh oh!
There was an error while loading. Please reload this page.
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
tom-pytel commented Jan 17, 2025
Requesting a look from @ZeroIntensity. |
ZeroIntensity left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! At a glance, don't use goto with critical sections--it's messier and can be error-prone (IIRC there's problems with it compiling on MSVC too). Instead, create a wrapper function like this:
staticPyObject*array_something(arrayobject*self, PyObject*arg){PyObject*res; Py_BEGIN_CRITICAL_SECTION(self); res=array_something_lock_held(); // the actual implementation functionPy_END_CRITICAL_SECTION(); returnres}For functions that are defined with the Argument Clinic, you can use @critical_section to do this even faster. (See what I did for the ssl module.)
tom-pytel commented Jan 17, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Made requested changes, also made arrayiter safe. Also found a good old-fashioned non-free-threaded bug which probably goes back a long way - if try to Question remaining about need to lock bytes source object in |
erlend-aasland commented Jan 17, 2025
Could you separate that out from this PR and submit a dedicated PR for this bug? We might want to backport it. |
erlend-aasland commented Jan 17, 2025
Regarding the style nits: please make sure all new code (and in most cases, also changed code) follow PEP-7. |
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
tom-pytel commented Jan 18, 2025
Still would like to know if |
Misc/NEWS.d/next/Library/2025-01-17-13-53-32.gh-issue-128942.DxzaIg.rst Outdated Show resolvedHide resolved
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
ZeroIntensity left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're getting there! Now, we're overlocking things a bit. This isn't always the rule, but in general you only need to lock something if you want to avoid inconsistent state (making it a "critical section"!).
That, and keep in mind that things that have a _Py or Py prefix are generally thread safe themselves and will lock if they need to. You don't need to lock for every call to Python.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
kumaraditya303 commented Feb 27, 2025
❯ ./python -m test test_array -R 3:3Using random seed: 19191006670:00:00 load avg: 36.22 Run 1 test sequentially in a single process0:00:00 load avg: 36.22 [1/1] test_arraybeginning 6 repetitions. Showing number of leaks (. for 0 or less, X for 10 or more)123:456XX. ...0:00:16 load avg: 28.56 [1/1] test_array passed== Tests result: SUCCESS ==1 test OK.Total duration: 16.1 secTotal tests: run=886 skipped=1Total test files: run=1/1Result: SUCCESSNo refleaks. |
tom-pytel commented Feb 27, 2025
Thx for the onboarding with this @ZeroIntensity |
8ba0d7b into python:mainUh oh!
There was an error while loading. Please reload this page.
colesbury commented Feb 28, 2025
The PR regressed free threading performance substantially on the scimark benchmarks from pyperformance, so I'm going to revert the PR for now. We'll need to figure out a way of making |
…)" The change regressed performance on scimark benchmarks from the pyperformance benchmark suite. This reverts commit 8ba0d7b.
tom-pytel commented Feb 28, 2025
Will play with it checking perf along the way to see if a lesser standard of thread-safe can work (ignore data integrity and just make sure nothing crashes). |
colesbury commented Feb 28, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
I looked at it briefly under Linux's
We can consider using a similar strategy as We can also consider moving |
tom-pytel commented Feb 28, 2025
List deals with objects with their own |
colesbury commented Feb 28, 2025
I guess it'll be a bit simpler than So if we adapt the list strategy it might look like:
|
tom-pytel commented Feb 28, 2025
Simplest worst case scenario thing depending on performance could be a spinlock to make access to In any case, got some crude numbers for impact of just I have all weekend to play with this so will try various things and what you mentioned. |
tom-pytel commented Mar 2, 2025
Continued in #130771. |
This PR is a work in progress but it currently fixes the crashes mentioned in the related issue so I am sending it up now to elicit feedback to make sure I am on the right path. So far the changes have been made functional but as minimal as possible to facilitate review. The critical sections are fairly broad so optimization may follow if functionality is confirmed. The changes are mostly additions of critical sections to publicly visible methods and slots gotten from the two structures listed below (as well as a few internal functions):
The following public methods/slots have been modified directly:
The following have not been modified but depend for safety on the functions they call:
The following look safe as is:
And the following non-public utility functions have been made safe:
Here are a few questions needing guidance:
Changed clinic sig of
array_array_frombytes()so I could lock the original source object during op (maybe its a writeable bytearray). Want to prevent modification of data during the operation but not sure this will even do this so not sure is necessary or even desired. Should I remove this and leave just thePyObject_GetBuffer()for protection? Also for this added a forward declaration ofarray_array_frombytes()forarray_array_fromfile_impl()instead of moving stuff around for minimal impact.In
array_newthere may or may not be an initializer object, if there is not then theargstuple is used as a stand-in for the critical section lock. Is there a better way to do this?Misc:
array_richcomparewas incrementing refcount ofPy_TrueandPy_False, removed that as they are immortal.The following were larger / slightly more complicated functions to check so should double check
array_array_frombytes,array_richcompare,array_ass_subscrandarray_new.More work is still needed, like the array iterator and more testing and verification.
arraymodule is not free-thread safe. #128942