Conversation

@skirpichev (Member) commented Jul 17, 2025

@skirpichev (Member, Author)

benchmark: #136681 (comment)

@ZeroIntensity (Member)

Won't this break when called concurrently?

@skirpichev

This comment was marked as resolved.

@skirpichev deleted the ac-argsbuf/136681 branch on July 19, 2025 08:22
@skirpichev restored the ac-argsbuf/136681 branch on July 29, 2025 14:22
@skirpichev (Member, Author)

Hmm, @ZeroIntensity, the test seems to be fixed by combining static with _Thread_local. Could this work, or is it too naive an approach?
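To make the concern concrete, here is a rough Python analogy (not the C code from this PR): a single scratch buffer shared by all threads gets clobbered under concurrent calls, while a per-thread buffer does not. The names `run_workers`, `shared_buf` and `local` are hypothetical, and `threading.local` only loosely stands in for `_Thread_local`.

```python
import threading


def run_workers(use_thread_local):
    """Two threads each store their name in a scratch buffer, then read it back."""
    shared_buf = [None]          # one slot for every thread, like a plain static buffer
    local = threading.local()    # per-thread storage, loosely like _Thread_local
    barrier = threading.Barrier(2)
    seen = {}

    def worker(name):
        if use_thread_local:
            local.buf = [None]   # each thread gets its own slot
            buf = local.buf
        else:
            buf = shared_buf
        buf[0] = name            # "fill" the buffer
        barrier.wait()           # both threads have written before either reads
        seen[name] = buf[0]      # "use" the buffer

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


shared_result = run_workers(use_thread_local=False)
local_result = run_workers(use_thread_local=True)
print(shared_result)  # both threads read the same value, so one of them is wrong
print(local_result)   # each thread reads back its own name
```

The barrier forces the interleaving that a real race would only hit occasionally, which is also why passing CI does not rule the problem out.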

@ZeroIntensity (Member) commented Jul 29, 2025

Hm, you could try it and see what the test suite says. To my knowledge, thread local lookups are generally slower, so we might not see a speedup.

@skirpichev reopened this on Jul 29, 2025
@skirpichev (Member, Author) commented Jul 29, 2025

Well, CI tests pass, but that might be just an accident.

> thread local lookups are generally slower, so we might not see a speedup.

Here are my quick measurements with default configure arguments on a Linux box. (A free-threading build might change the picture.) The micro-benchmarks are for math.fmin(), comparing the case where only positional arguments are allowed (as on main) with the case where positional-or-keyword arguments are allowed.

In the main:

| Benchmark | posonly-ref | posorkw-ref |
|---|---|---|
| fmin(1.0, 2.0) | 169 ns | 188 ns: 1.11x slower |
| fmin(1.0, 2.0) x 2 times | 969 ns | 1.00 us: 1.03x slower |
| fmin(1.0, 2.0) x 10 times | 2.09 us | 2.27 us: 1.08x slower |
| fmin(1.0, 2.0) x 100 times | 14.8 us | 16.5 us: 1.12x slower |
| Geometric mean | (ref) | 1.09x slower |

With the patch:

| Benchmark | posonly-patch | posorkw-patch |
|---|---|---|
| fmin(1.0, 2.0) | 170 ns | 173 ns: 1.02x slower |
| fmin(1.0, 2.0) x 10 times | 2.06 us | 2.08 us: 1.01x slower |
| fmin(1.0, 2.0) x 100 times | 14.8 us | 15.0 us: 1.01x slower |
| Geometric mean | (ref) | 1.01x slower |

Benchmark hidden because not significant (1): fmin(1.0, 2.0) x 2 times
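As a sanity check on the "Geometric mean" rows, the slowdown can be recomputed from the rounded timings listed in the two tables. This is only a sketch: `main_ns`, `patch_ns` and `slowdown` are names made up here, pyperf itself works from unrounded samples, and the hidden "x 2 times" row is left out of the patched set because its timings are not shown.

```python
from statistics import geometric_mean

# (posonly, posorkw) timings in nanoseconds, copied from the tables
main_ns = [(169, 188), (969, 1000), (2090, 2270), (14800, 16500)]
# The "x 2 times" row is hidden as not significant, so it is omitted here.
patch_ns = [(170, 173), (2060, 2080), (14800, 15000)]


def slowdown(pairs):
    """Geometric mean of the posorkw/posonly timing ratios."""
    return geometric_mean([kw / pos for pos, kw in pairs])


main_slowdown = slowdown(main_ns)
patch_slowdown = slowdown(patch_ns)
print(f"main:  {main_slowdown:.2f}x slower")   # main:  1.09x slower
print(f"patch: {patch_slowdown:.2f}x slower")  # patch: 1.01x slower
```

Both values agree with the reported geometric means, so the patch reduces the keyword-capable overhead from roughly 9% to roughly 1% in these runs.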

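Details

    import pyperf
    from math import fmin

    def f(n):
        # Call fmin() n times inside a single benchmarked function
        for _ in range(n):
            fmin(1.0, 2.0)

    runner = pyperf.Runner()
    # Single direct call
    runner.bench_func('fmin(1.0, 2.0)', fmin, 1.0, 2.0)
    # Repeated calls through the loop helper
    for n in [2, 10, 100]:
        s = f'fmin(1.0, 2.0) x {n:3} times'
        runner.bench_func(s, f, n)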

@skirpichev deleted the ac-argsbuf/136681 branch on November 13, 2025 13:15