Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
gh-115999: Add partial free-thread specialization for BINARY_SUBSCR#127227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
corona10 commented Nov 24, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
corona10 commented Nov 24, 2024
I am going to write test codes at this week. |
| ht->_spec_cache.getitem=descriptor; | ||
| ht->_spec_cache.getitem_version=version; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should make to be thread-safe by using LOCK_TYPE(), but it will be handled with separated PRs.
1dacf5d to 1a901a3Comparecorona10 commented Nov 27, 2024
@mpage Would you like to take a look? |
mpage left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! I left one comment inline, but otherwise looks good to me. I kicked off benchmark runs for default/free-threaded builds and will post the results once they're finished.
Uh oh!
There was an error while loading. Please reload this page.
corona10 commented Nov 28, 2024
@mpage Done thanks! |
corona10 commented Nov 29, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
FYI, aarch64 linux CI failure is not related to this PR, #127332 will fix the issue. |
mpage commented Nov 30, 2024
I should have caught this last time, but I think we want to use the fast-path for list accesses in |
corona10 commented Nov 30, 2024
You mean that the fast-path that can avoid locking object? |
mpage commented Nov 30, 2024
Yep |
corona10 commented Nov 30, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
What about checking |
mpage commented Nov 30, 2024
Yes, those are needed for correctness. I think for now we probably want to have two slightly different code paths for the free-threaded and default builds. The default build should continue to use the existing logic to fetch the item from the list. The free-threaded builds should use |
corona10 commented Nov 30, 2024
Matt, I define a new |
Python/bytecodes.c Outdated
| Py_ssize_tindex= ((PyLongObject*)sub)->long_value.ob_digit[0]; | ||
| #ifdefPy_GIL_DISABLED | ||
| STAT_INC(BINARY_SUBSCR, hit); | ||
| PyObject*res_o=_PyList_GetItemRef(list, index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default build of list_get_item_ref is almost same as current implementation,
we might able to this implementation to the default build also.
Lines 352 to 361 in 328187c
| #else | |
| staticinlinePyObject* | |
| list_get_item_ref(PyListObject*op, Py_ssize_ti) | |
| { | |
| if (!valid_index(i, Py_SIZE(op))){ | |
| returnNULL; | |
| } | |
| returnPy_NewRef(PyList_GET_ITEM(op, i)); | |
| } | |
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would certainly be cleaner, but it might come with some performance cost in the default build.
mpage left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! I kicked off new benchmark runs and will post the results here once they're finished. They take a few hours to run and there are other runs queued ahead of them, so that will probably be sometime tomorrow.
There are a couple tests that should be enabled using requires_specialization_ft now that we're specializing BINARY_SUBSCR in free-threaded builds:
- test_opcache.TestRacesDoNotCrash.test_binary_subscr_list_int
- test_dis.DisTests.test_binary_subscr_specialize - This should probably be deleted. It's covered by the tests that you've added and we want to move the specialization tests out of
test_dis.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
mpage left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Most recent performance results look good:
- Neutral on the default build.
- ~1% improvement on free-threaded builds.
Include/internal/pycore_list.h Outdated
| PyAPI_FUNC(PyObject*) _PyList_Extend(PyListObject*, PyObject*); | ||
| externvoid_PyList_DebugMallocStats(FILE*out); | ||
| // _PyList_GetItemRef should be used only when the object is known to be a list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would also be worth mentioning that this does not raise errors whereas PyList_GetItemRef does.
e271340 into python:mainUh oh!
There was an error while loading. Please reload this page.
vstinner commented Dec 2, 2024
It seems like the change introduced a regression: #127521. |
--disable-gilbuilds #115999STORE_SUBSCR#127169 would be merged.PyDict_GetItemRefis thread-safe internally.CALLinstructions in free-threaded builds #127123 is merged, I should follow several strategies from here.