Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-112075: Iterating a dict shouldn't require locks#115108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
DinoV commented Feb 6, 2024 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
1266799 to 8d3080bCompare8d3080b to bf395f6Compare2b2e75a to 0dd1a06Compare
colesbury left a comment • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to avoid expensive atomic operations, like atomic compare exchange and atomic increments, as well as avoiding locking.
This means giving up some atomicity for list and dict iterators compared to the GIL behavior. We should still avoid crashes/memory corruption, but I think it's okay for concurrent calls to next(it)on the same iterator object to return the same object. These iterators are almost always used by only a single thread and the performance cost of making the next atomic is relatively high.
See #114843 for the list iterator changes.
Uh oh!
There was an error while loading. Please reload this page.
DinoV commented Feb 16, 2024
I was worried some crazy person might be using these things to distribute work across threads :P. I'm happy to make relax the guarantees and see how that goes. I suppose if that ever becomes an issue we can deal with it then :) |
0dd1a06 to 3c03084CompareObjects/dictobject.c Outdated
| #endif | ||
| #ifndefPy_GIL_DISABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #endif | |
| #ifndefPy_GIL_DISABLED | |
| #else/* Py_GIL_DISABLED */ |
Objects/dictobject.c Outdated
| returnNULL; | ||
| } | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #endif | |
| #endif/* Py_GIL_DISABLED */ |
colesbury left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few code style comments below
Objects/dictobject.c Outdated
| #ifdefPy_GIL_DISABLED | ||
| if (has_unique_reference(result)){ | ||
| #else | ||
| if (Py_REFCNT(result) ==1){ | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has_unique_reference already has special cases for Py_GIL_DISABLED and the default build:
| #ifdefPy_GIL_DISABLED | |
| if (has_unique_reference(result)){ | |
| #else | |
| if (Py_REFCNT(result) ==1){ | |
| #endif | |
| if (has_unique_reference(result)){ |
Objects/dictobject.c Outdated
| if (values==NULL) | ||
| goto concurrent_modification; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the preferred style is to always include braces in new C code
Objects/dictobject.c Outdated
| if (i >= used) | ||
| goto fail; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
Objects/dictobject.c Outdated
| // Even though we hold the lock here we may still lose a race against | ||
| // a lock-free iterator, therefore we may end up retrying our iteration. | ||
| retry: | ||
| start_pos=i=_Py_atomic_load_ssize_relaxed(&di->di_pos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use the wrappers from pycore_pyatomic_ft_wrappers.h to reduce the number of #ifdef statements.
Uh oh!
There was an error while loading. Please reload this page.
e2a13a2 to 0941e62Compare
colesbury left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, but I think it'd be better if acquire_key_value follows the -1/0 convention. The comment above acquire_key_value would need to be updated too.
| staticint | ||
| acquire_key_value(PyObject**key_loc, PyObject*value, PyObject**value_loc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here: -1 for error and 0 for success
0941e62 to a1d7718Compare
Makes iteration of a dict be lock free for the forward iteration case.
Handles races against the dict as well as allowing the iterator to be used from multiple threads simultaneously.
Includes some of the shared object marking from #115109
dictobjects thread-safe in--disable-gilbuilds #112075