
Do not track immutable tuples in PyTuple_Pack #139389

@sergey-miryanov

Description


Feature or enhancement

Proposal:

When PyTuple_Pack is called, all of the objects passed to it are already fully constructed. If we know that they are all immutable, we can skip tracking the new tuple in the GC, because the GC would untrack it eventually anyway.
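The "untrack eventually" behaviour can be observed from Python with `gc.is_tracked()`. The following is a minimal sketch, not part of the PR: the helper name `tracking_report` is hypothetical, and whether the tuple is reported as tracked before the collection depends on the CPython version and on which internal path created it.

```python
import gc


def tracking_report():
    """Return gc.is_tracked() results before and after a full collection."""
    # Build the tuples at run time so they are not compile-time constants.
    immutable_items = tuple([1, 2.5, "text"])
    with_list = tuple([1, []])

    report = {
        # Atomic objects such as ints are never tracked.
        "int": gc.is_tracked(1),
        # May be True or False depending on the CPython version.
        "immutable_tuple_before": gc.is_tracked(immutable_items),
    }

    # A full collection gives the GC a chance to untrack tuples
    # whose items cannot take part in reference cycles.
    gc.collect()

    report["immutable_tuple_after"] = gc.is_tracked(immutable_items)
    # A tuple holding a list can be part of a cycle, so it stays tracked.
    report["tuple_with_list_after"] = gc.is_tracked(with_list)
    return report


if __name__ == "__main__":
    for name, tracked in tracking_report().items():
        print(f"{name}: tracked={tracked}")
```

The proposal amounts to making the "before" state of such tuples match the "after" state when they are created through PyTuple_Pack, so the GC never has to visit them.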

I have a PR ready and benchmark results:

Geometric mean: 1.01x faster (Win11 x64, 11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz, 48d0d0d)

All benchmarks:

+--------------------------+----------+------------------------+
| Benchmark                | main     | tuples                 |
+==========================+==========+========================+
| async_generators         | 435 ms   | 430 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| asyncio_tcp              | 750 ms   | 756 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| asyncio_tcp_ssl          | 1.91 sec | 1.92 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| comprehensions           | 22.1 us  | 21.8 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| bench_mp_pool            | 104 ms   | 103 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| bench_thread_pool        | 1.29 ms  | 1.27 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| coroutines               | 28.2 ms  | 27.7 ms: 1.02x faster  |
+--------------------------+----------+------------------------+
| coverage                 | 88.5 ms  | 86.3 ms: 1.02x faster  |
+--------------------------+----------+------------------------+
| crypto_pyaes             | 90.1 ms  | 86.7 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| deepcopy                 | 310 us   | 307 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| deepcopy_memo            | 36.4 us  | 36.1 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| deltablue                | 5.19 ms  | 4.85 ms: 1.07x faster  |
+--------------------------+----------+------------------------+
| django_template          | 45.5 ms  | 45.8 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| docutils                 | 2.47 sec | 2.45 sec: 1.01x faster |
+--------------------------+----------+------------------------+
| dulwich_log              | 86.2 ms  | 86.9 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| fannkuch                 | 449 ms   | 441 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| float                    | 85.3 ms  | 82.5 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| create_gc_cycles         | 1.17 ms  | 1.17 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| gc_traversal             | 2.97 ms  | 2.88 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| generators               | 43.0 ms  | 41.6 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| genshi_text              | 28.9 ms  | 28.7 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| go                       | 160 ms   | 153 ms: 1.04x faster   |
+--------------------------+----------+------------------------+
| hexiom                   | 8.39 ms  | 8.13 ms: 1.03x faster  |
+--------------------------+----------+------------------------+
| json_dumps               | 8.62 ms  | 8.69 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| logging_format           | 12.5 us  | 12.2 us: 1.02x faster  |
+--------------------------+----------+------------------------+
| logging_silent           | 139 ns   | 140 ns: 1.01x slower   |
+--------------------------+----------+------------------------+
| logging_simple           | 11.3 us  | 11.1 us: 1.01x faster  |
+--------------------------+----------+------------------------+
| mako                     | 14.2 ms  | 14.4 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| mdp                      | 1.47 sec | 1.50 sec: 1.02x slower |
+--------------------------+----------+------------------------+
| meteor_contest           | 104 ms   | 102 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| nbody                    | 114 ms   | 113 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| pickle_pure_python       | 439 us   | 436 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| pprint_safe_repr         | 953 ms   | 916 ms: 1.04x faster   |
+--------------------------+----------+------------------------+
| pprint_pformat           | 1.95 sec | 1.88 sec: 1.04x faster |
+--------------------------+----------+------------------------+
| pyflate                  | 506 ms   | 492 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| python_startup           | 28.5 ms  | 27.4 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| python_startup_no_site   | 23.2 ms  | 22.2 ms: 1.05x faster  |
+--------------------------+----------+------------------------+
| raytrace                 | 361 ms   | 345 ms: 1.05x faster   |
+--------------------------+----------+------------------------+
| regex_compile            | 146 ms   | 146 ms: 1.01x faster   |
+--------------------------+----------+------------------------+
| regex_effbot             | 2.03 ms  | 2.02 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| regex_v8                 | 23.9 ms  | 22.7 ms: 1.06x faster  |
+--------------------------+----------+------------------------+
| richards                 | 66.1 ms  | 59.9 ms: 1.10x faster  |
+--------------------------+----------+------------------------+
| richards_super           | 71.6 ms  | 68.7 ms: 1.04x faster  |
+--------------------------+----------+------------------------+
| scimark_fft              | 300 ms   | 294 ms: 1.02x faster   |
+--------------------------+----------+------------------------+
| scimark_lu               | 135 ms   | 131 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| scimark_monte_carlo      | 83.3 ms  | 82.4 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| scimark_sor              | 157 ms   | 150 ms: 1.05x faster   |
+--------------------------+----------+------------------------+
| scimark_sparse_mat_mult  | 4.27 ms  | 4.35 ms: 1.02x slower  |
+--------------------------+----------+------------------------+
| spectral_norm            | 122 ms   | 118 ms: 1.03x faster   |
+--------------------------+----------+------------------------+
| sqlglot_optimize         | 60.7 ms  | 60.9 ms: 1.00x slower  |
+--------------------------+----------+------------------------+
| sympy_expand             | 501 ms   | 503 ms: 1.00x slower   |
+--------------------------+----------+------------------------+
| sympy_sum                | 143 ms   | 144 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| sympy_str                | 287 ms   | 292 ms: 1.02x slower   |
+--------------------------+----------+------------------------+
| telco                    | 7.26 ms  | 7.33 ms: 1.01x slower  |
+--------------------------+----------+------------------------+
| tomli_loads              | 2.23 sec | 2.25 sec: 1.01x slower |
+--------------------------+----------+------------------------+
| typing_runtime_protocols | 189 us   | 185 us: 1.02x faster   |
+--------------------------+----------+------------------------+
| unpack_sequence          | 65.4 ns  | 68.7 ns: 1.05x slower  |
+--------------------------+----------+------------------------+
| unpickle                 | 13.9 us  | 14.1 us: 1.01x slower  |
+--------------------------+----------+------------------------+
| unpickle_pure_python     | 303 us   | 300 us: 1.01x faster   |
+--------------------------+----------+------------------------+
| xml_etree_parse          | 130 ms   | 130 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| xml_etree_iterparse      | 107 ms   | 108 ms: 1.01x slower   |
+--------------------------+----------+------------------------+
| xml_etree_process        | 79.2 ms  | 78.6 ms: 1.01x faster  |
+--------------------------+----------+------------------------+
| Geometric mean           | (ref)    | 1.01x faster           |
+--------------------------+----------+------------------------+

Benchmark hidden because not significant (20): 2to3, chaos, deepcopy_reduce, genshi_xml, html5lib, json_loads, nqueens, pathlib, pickle, pickle_dict, pickle_list, pidigits, regex_dna, sqlglot_normalize, sqlglot_parse, sqlglot_transpile, sqlite_synth, sympy_integrate, unpickle_list, xml_etree_generate
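For readers unfamiliar with the summary row, the geometric mean is taken over per-benchmark speed ratios. The sketch below only illustrates the arithmetic on three rows copied from the table; it uses a hypothetical helper `speedup_geomean`, and because it covers a small subset it is not expected to reproduce the reported 1.01x.

```python
from statistics import geometric_mean

# (main, tuples) timings copied from three rows of the benchmark table.
# Units cancel in the ratio, so they are kept as plain numbers.
SAMPLE_TIMINGS = {
    "richards": (66.1, 59.9),         # ms
    "deltablue": (5.19, 4.85),        # ms
    "unpack_sequence": (65.4, 68.7),  # ns
}


def speedup_geomean(timings):
    """Geometric mean of main/tuples ratios; above 1.0 means the patched build is faster."""
    ratios = [main / patched for main, patched in timings.values()]
    return geometric_mean(ratios)


if __name__ == "__main__":
    print(f"{speedup_geomean(SAMPLE_TIMINGS):.2f}x")
```

A geometric mean is used rather than an arithmetic one so that a 1.05x slowdown and a 1.05x speedup cancel out.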

It doesn't hurt performance, and it can reduce the number of objects the GC has to check and untrack.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

Metadata


Assignees

No one assigned

    Labels

    interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)
