Skip to content

Conversation

@eendebakpt
Copy link
Contributor

@eendebakpteendebakpt commented Oct 16, 2025

In the PR we untrack frozen tuples for the normal constructors. There are a few methods shared between the set and frozenset (for example set_intersection in setobject.c) where we have not added the untracking. (this is possible, but I am not sure this is worthwhile to do).

Here is a small script to test the idea:

importgcimporttimefromstatisticsimportmeannumber_of_iterations=20number_of_gc_iterations=50deltas= [] gc.disable() gc.collect() forkkinrange(number_of_iterations): t0=time.perf_counter() forjjinrange(number_of_gc_iterations): gc.collect() dt=time.perf_counter() -t0deltas.append(dt) print(f"time per collection: mean {1e3*mean(deltas) /number_of_iterations:.3f} [ms], min {1e3*min(deltas) /number_of_iterations:.3f} [ms]") sets= [frozenset([ii]) foriiinrange(10_000)] deltas= [] print("---") gc.disable() gc.collect() forkkinrange(number_of_iterations): t0=time.perf_counter() forjjinrange(number_of_gc_iterations): gc.collect() dt=time.perf_counter() -t0deltas.append(dt) print(f"time per collection: mean {1e3*mean(deltas) /number_of_iterations:.3f} [ms], min {1e3*min(deltas) /number_of_iterations:.3f} [ms]") #%% Show statistics of frozen containersgc.collect() defcandidate(obj): returnall(notgc.is_tracked(x) forxinobj) forimmutable_typein (tuple, frozenset): number_of_objects_tracked=0number_of_candidates=0number_of_immutable_candidates=0forobjingc.get_objects(): number_of_objects_tracked+=1iftype(obj) isimmutable_type: number_of_candidates+=1# print(f"{type(obj)} ={obj}")ifcandidate(obj): number_of_immutable_candidates+=1print(f"type {immutable_type}") print(f" {number_of_objects_tracked=}") print(f" {number_of_candidates=}") print(f" {number_of_immutable_candidates=}")

It measures the performance of garbage collection, and outputs some statistics for the numbers of frozen containers.

Main:

time per collection: mean 1.311 [ms], min 1.301 [ms] --- time per collection: mean 2.467 [ms], min 2.272 [ms] type <class 'tuple'> number_of_objects_tracked=18330 number_of_candidates=546 number_of_immutable_candidates=1 type <class 'frozenset'> number_of_objects_tracked=18330 number_of_candidates=10059 number_of_immutable_candidates=10057 

PR

time per collection: mean 1.285 [ms], min 1.251 [ms] --- time per collection: mean 1.424 [ms], min 1.396 [ms] type <class 'tuple'> number_of_objects_tracked=8273 number_of_candidates=546 number_of_immutable_candidates=6 type <class 'frozenset'> number_of_objects_tracked=8273 number_of_candidates=2 number_of_immutable_candidates=0 

Note: generative ai was used in creating the PR

Co-authored-by: Mikhail Efimov <efimov.mikhail@gmail.com>
@sergey-miryanov
Copy link
Contributor

sergey-miryanov commented Oct 17, 2025

Maybe it is worth to change tp_alloc for something like:

PyObject*PyFrozenSet_Alloc(PyTypeObject*type, Py_ssize_tnitems){PyObject*obj=PyType_GenericAlloc(type, nitems); if (obj==NULL){returnNULL} _PyFrozenSet_MaybeUntrack(obj); returnobj}

@eendebakpt
Copy link
ContributorAuthor

Maybe it is worth to change tp_alloc for something like:

The tp_alloc is used in make_new_set, which in turn is called by make_new_set. The last one is used set_intersection which modifies a frozenset. So adding _PyFrozenSet_MaybeUntrack to tp_alloc would mean we have to add a _PyFrozenSet_MaybeTrack to the end of set_intersection. This is a complication I do not want to tackle (certainly not in this PR).

Copy link
Contributor

@sergey-miryanovsergey-miryanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me.

@vstinner
Copy link
Member

Would it be possible to write tests in Python rather than in C?

@eendebakpt
Copy link
ContributorAuthor

Would it be possible to write tests in Python rather than in C?

I tried, but it is not easy. We have to expose PySet_Add (frozenset().add does not exist on the python side). I added pyset_add on the _testcapi module (with pyset_add just calling PySet_Add). But running this on a frozenset from the python side does not work: when calling _testcapi.pyset_add(frozen_set, item) there too many references to the frozen_set and PySet_Add will fail with an internal error here:

PyErr_BadInternalCall();

And when calling _testcapi.pyset_add(frozenset(), item) we do not have the frozenset available to test whether tracking has been enabled.

@sergey-miryanov
Copy link
Contributor

And when calling _testcapi.pyset_add(frozenset(), item) we do not have the frozenset available to test whether tracking has been enabled.

IIUC, if you return the first argument from pyset_add then you can test it on the python side.

@eendebakpt
Copy link
ContributorAuthor

And when calling _testcapi.pyset_add(frozenset(), item) we do not have the frozenset available to test whether tracking has been enabled.

IIUC, if you return the first argument from pyset_add then you can test it on the python side.

Ok, I gave it another try. The first attempt failed, but by using the vectorcall convention I can keep the reference count at 1 also from the Python side.

Co-authored-by: Victor Stinner <vstinner@python.org>
Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. With one last comment :-)

@eendebakpt
Copy link
ContributorAuthor

LGTM. With one last comment :-)

Nice. Give me one second to double check something.

Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

@eendebakpt@sergey-miryanov@vstinner@efimov-mikhail