Skip to content

Conversation

@yoney
Copy link
Contributor

@yoneyyoney commented Oct 21, 2025

In the fbthrift-pythonbenchmarks, we observed a performance regression when comparing Python 3.14t to 3.14, particularly during Thrift struct initialization and deserialization. The regression was linked to the use of the PySet_Add() API for populating set and frozenset objects, where a critical section is always applied—even when the set is uniquely referenced during the initialization phase. By skipping the critical section for uniquely referenced frozenset, we were able to reduce the regression.

In a micro-benchmark specifically targeting PySet_Add(), we observed a 19% performance improvement on Intel Broadwell systems and around 27% improvement on ARM/GH200 systems.

cc: @mpage@colesbury@Yhg1s

@yoneyyoney marked this pull request as ready for review October 21, 2025 23:58
Copy link
Member

@Yhg1sYhg1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves a new item: https://devguide.python.org/core-team/committing/#how-to-add-a-news-entry

It'd also be good to file an issue to describe the problem and the performance improvement this change gives, for future reference. The PR title should be updated to mention the issue.

@yoneyyoney changed the title Optimize PySet_Add() for uniquely referenced sets in free-threadinggh-140476: Optimize PySet_Add() for uniquely referenced sets in free-threadingOct 22, 2025
@yoneyyoney changed the title gh-140476: Optimize PySet_Add() for uniquely referenced sets in free-threadinggh-140476: Optimize PySet_Add() for frozenset in free-threadingOct 23, 2025
@yoney
Copy link
ContributorAuthor

I made a couple of changes and narrowed the optimisation down to frozenset only. This is because the API could be used for sets that are either global variables or members of another wrapper. In such cases, the object can be accessed by multiple threads, even if it is uniquely referenced. I was thinking about the global variables, but thanks to @kumaraditya303 for pointing out the scenario involving wrappers as well.

The global variable or wrapper scenario is also relevant for frozenset. However, API documentation specifies a special condition.

Also works with frozenset instances (like PyTuple_SetItem() it can be used to fill in the values of brand new frozensets before they are exposed to other code)

If the condition "before they are exposed to other code" is sufficient to proceed with his optimisation, then we still have some wins (19% Intel, 27% ARM benchmarked after changes). Otherwise, I am going to close this PR and look for alternative.

@colesbury@sergey-miryanov, I haven’t included the double _PyObject_IsUniquelyReferenced() to handle the cases where hashing could introduce another reference and pass it to another thread for frozenset. This is essentially exposing the frozenset to other code, which is what the API documentation addresses.

However, it might be a good idea to move the hashing outside of the critical_section in general, to minimize the time spent in the critical section. I can do it in a separate PR (or this) if that sounds reasonable to you.

@Yhg1s
Copy link
Member

Global variables are in fact not a concern, because those must be a separate reference (otherwise the set could be destroyed during PySet_Add, which can call arbitrary Python code that would delete the global). Other borrowed references are a concern, though (and not for the first time... Raah.)

sergey-miryanov
sergey-miryanov previously approved these changes Oct 24, 2025
@sergey-miryanovsergey-miryanov dismissed their stale reviewOctober 25, 2025 02:47

IIUC, we have changed the semantics of the PySet_Add. For the sets we now can call it if have more than one reference, is this by intention?

@sergey-miryanov
Copy link
Contributor

Oh, it is good. Sorry for the noise.

@yoneyyoney closed this Oct 28, 2025
@yoneyyoney deleted the ft_set_frozenset branch October 28, 2025 15:51
@yoneyyoney restored the ft_set_frozenset branch October 28, 2025 15:55
@yoney
Copy link
ContributorAuthor

Sorry, that was a mistake. I was trying to clean up branches.

@yoneyyoney reopened this Oct 28, 2025
Copy link
Contributor

@eendebakpteendebakpt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note unrelated to the PR: the frozenset caches the hash value, and calling PySet_Add after calculating the hash of a frozenset will result in an incorrect hash. We could add a check for this in the PyFrozenSet_Check path (personally I do not think it is worth it).

@kumaraditya303kumaraditya303 merged commit 298e907 into python:mainNov 11, 2025
46 checks passed
StanFromIreland pushed a commit to StanFromIreland/cpython that referenced this pull request Dec 6, 2025
…ng (python#140440) Avoids critical section in `PySet_Add` when adding items to newly created frozensets. Co-authored-by: Kumar Aditya <kumaraditya@python.org>
@yoneyyoney deleted the ft_set_frozenset branch January 27, 2026 17:07
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants

@yoney@Yhg1s@sergey-miryanov@colesbury@eendebakpt@kumaraditya303