
Conversation

@Marius-Juston (Contributor) commented Apr 4, 2025

Benchmark for all zeros

| Benchmark | regex_decimal | non_regex_ends_fast_decimal |
|---|---|---|
| exact | 230 ns | 83.5 ns: 2.76x faster |
| No zeros 1 | 199 ns | 83.9 ns: 2.38x faster |
| No zeros 10 | 207 ns | 84.1 ns: 2.46x faster |
| No zeros 100 | 206 ns | 86.4 ns: 2.38x faster |
| No zeros 1000 | 201 ns | 94.1 ns: 2.13x faster |
| Mixed 1 | 204 ns | 169 ns: 1.21x faster |
| Mixed 10 | 200 ns | 170 ns: 1.18x faster |
| Mixed 100 | 208 ns | 175 ns: 1.19x faster |
| Mixed 1000 | 198 ns | 231 ns: 1.17x slower |
| Zero 1 | 231 ns | 150 ns: 1.54x faster |
| Zero 10 | 237 ns | 180 ns: 1.32x faster |
| Zero 100 | 287 ns | 185 ns: 1.55x faster |
| Zero 1000 | 783 ns | 245 ns: 3.20x faster |
| Geometric mean | (ref) | 1.72x faster |

Benchmark for exact half

| Benchmark | half_regex | half_non_regex_correct |
|---|---|---|
| exact | 176 ns | 97.1 ns: 1.81x faster |
| No half 1 | 178 ns | 124 ns: 1.44x faster |
| No half 10 | 177 ns | 125 ns: 1.42x faster |
| No half 100 | 177 ns | 125 ns: 1.42x faster |
| No half 1000 | 180 ns | 126 ns: 1.42x faster |
| Mixed 1 | 177 ns | 126 ns: 1.41x faster |
| Mixed 10 | 177 ns | 124 ns: 1.42x faster |
| Mixed 100 | 178 ns | 125 ns: 1.42x faster |
| Mixed 1000 | 177 ns | 127 ns: 1.39x faster |
| Half 1 | 232 ns | 230 ns: 1.01x faster |
| Half 10 | 235 ns | 269 ns: 1.14x slower |
| Half 100 | 288 ns | 268 ns: 1.08x faster |
| Half 1000 | 780 ns | 338 ns: 2.31x faster |
| Geometric mean | (ref) | 1.38x faster |

The benchmark script:

```python
import pyperf
from _pydecimal import Decimal  # pure-Python Decimal; ._int is the coefficient digit string

def setup_tests_zero():
    names = [('exact', Decimal('3.5')._int)]
    for i in [1, 10, 100, 1000]:
        names.append((f'No zeros {i}', Decimal('3.1' + '12' * i)._int))
    for i in [1, 10, 100, 1000]:
        names.append((f'Mixed {i}', Decimal('3.12' + '0' * i)._int))
    for i in [1, 10, 100, 1000]:
        names.append((f'Zero {i}', Decimal('3.1' + '0' * i)._int))
    return names

def setup_tests_half():
    names = [('exact', Decimal('3.5')._int)]
    for i in [1, 10, 100, 1000]:
        names.append((f'No half {i}', Decimal('3.1' + '12' * i)._int))
    for i in [1, 10, 100, 1000]:
        names.append((f'Mixed {i}', Decimal('3.5' + '0' * i)._int))
    for i in [1, 10, 100, 1000]:
        names.append((f'Half {i}', Decimal('3.15' + '0' * i)._int))
    return names

method = ''
benchmark_method = 'half'
benchmark = setup_tests_half if benchmark_method == 'half' else setup_tests_zero
func = _round_down if method == 'regex' else _is_all_zeros  # helper under test, defined elsewhere

runner = pyperf.Runner()
for n, v in benchmark():
    # print(func(v))
    runner.bench_func(n, func, v)
```
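The `_is_all_zeros` candidate referenced by the script is not pasted in the thread. A minimal sketch of what a non-regex replacement for the `0*$` check could look like, assuming it keeps the `(digits, prec)` signature of the regex helper (the PR's actual implementation may differ):

```python
def _is_all_zeros(digits, prec=0):
    # Hypothetical stand-in for re.compile('0*$').match(digits, prec):
    # True when every character of `digits` from index `prec` onward is '0'.
    # Stripping trailing zeros leaves an empty string only in that case, and
    # a bool is truthy/falsy compatible with the match-object-or-None that
    # the regex helper returns at its call sites.
    return not digits[prec:].rstrip('0')
```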

@eendebakpt (Contributor) commented Apr 4, 2025

I have some doubts this is worthwhile. The _all_zeros is indeed faster, but is that a bottleneck in actual computations? I would like to see at least a benchmark with a public method from the decimal module and a benchmark with huge numbers.
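For illustration only, a benchmark that goes through a public method of the pure-Python module could look roughly like the sketch below; the choice of `quantize`, the alias `pydec`, and the input sizes are assumptions, not taken from this thread:

```python
import pyperf
import _pydecimal as pydec  # benchmark the pure-Python implementation directly

# quantize() rounds to a fixed exponent, so it goes through the module's
# internal rounding helpers, which is where _all_zeros / _exact_half are used.
exp = pydec.Decimal('1.00')
runner = pyperf.Runner()
for i in (1, 10, 100, 1000):
    value = pydec.Decimal('3.1' + '0' * i)
    runner.bench_func(f'quantize {i}', value.quantize, exp)
```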

```python
_all_zeros = re.compile('0*$').match
_exact_half = re.compile('50*$').match
# Checks for regex 0*$
def _all_zeros(d_int, prec=0):
```
A reviewer (Contributor) commented on this snippet:

Do we need the prec argument?

@Marius-Juston (Author) replied:

Yes, we need it, since it gets called in:

```python
def _round_down(self, prec):
    """Also known as round-towards-0, truncate."""
    if _all_zeros(self._int, prec):
        return 0
    else:
        return -1
```

and

```python
def _round_half_up(self, prec):
    """Rounds 5 up (away from 0)"""
    if self._int[prec] in '56789':
        return 1
    elif _all_zeros(self._int, prec):
        return 0
    else:
        return -1
```

Both call sites need the argument.
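As a concrete illustration of what `prec` does (not part of the PR): the stdlib helpers are compiled-pattern `.match` functions, and `prec` is passed as the `pos` argument, so only the digits from index `prec` onward are examined:

```python
import re

_all_zeros = re.compile('0*$').match
_exact_half = re.compile('50*$').match

digits = '315000'                    # coefficient digits of Decimal('3.15000')
print(bool(_all_zeros(digits, 3)))   # True: only '000' from index 3 onward
print(bool(_all_zeros(digits, 2)))   # False: a '5' sits at index 2
print(bool(_exact_half(digits, 2)))  # True: '5' at index 2, zeros after it
```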

@Marius-Juston (Author) commented:

| Benchmark | regex_decimal | non_regex_ends_fast_decimal | non_regex_while |
|---|---|---|---|
| exact | 230 ns | 83.5 ns: 2.76x faster | 119 ns: 1.93x faster |
| No zeros 1 | 199 ns | 83.9 ns: 2.38x faster | 119 ns: 1.67x faster |
| No zeros 10 | 207 ns | 84.1 ns: 2.46x faster | 119 ns: 1.74x faster |
| No zeros 100 | 206 ns | 86.4 ns: 2.38x faster | 119 ns: 1.74x faster |
| No zeros 1000 | 201 ns | 94.1 ns: 2.13x faster | 127 ns: 1.59x faster |
| Mixed 1 | 204 ns | 169 ns: 1.21x faster | 119 ns: 1.72x faster |
| Mixed 10 | 200 ns | 170 ns: 1.18x faster | 119 ns: 1.68x faster |
| Mixed 100 | 208 ns | 175 ns: 1.19x faster | 120 ns: 1.74x faster |
| Mixed 1000 | 198 ns | 231 ns: 1.17x slower | 126 ns: 1.57x faster |
| Zero 1 | 231 ns | 150 ns: 1.54x faster | 120 ns: 1.92x faster |
| Zero 10 | 237 ns | 180 ns: 1.32x faster | 120 ns: 1.98x faster |
| Zero 100 | 287 ns | 185 ns: 1.55x faster | 120 ns: 2.40x faster |
| Zero 1000 | 783 ns | 245 ns: 3.20x faster | 126 ns: 6.20x faster |
| Geometric mean | (ref) | 1.72x faster | 1.97x faster |

With the new while loop (should have done this method first, lol).
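The while-loop variant itself is not pasted in the thread; a minimal sketch of the idea, assuming the same `(digits, prec)` signature (the PR's actual code may differ):

```python
def _all_zeros(digits, prec=0):
    # Hypothetical while-loop version of the '0*$' check: scan from index
    # `prec` and bail out at the first non-zero digit.
    i = prec
    n = len(digits)
    while i < n:
        if digits[i] != '0':
            return False
        i += 1
    return True
```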

@Marius-Juston (Author) commented:

| Benchmark | half_regex | half_non_regex_correct | half_non_regex_while |
|---|---|---|---|
| exact | 176 ns | 97.1 ns: 1.81x faster | 125 ns: 1.41x faster |
| No half 1 | 178 ns | 124 ns: 1.44x faster | 127 ns: 1.40x faster |
| No half 10 | 177 ns | 125 ns: 1.42x faster | 126 ns: 1.40x faster |
| No half 100 | 177 ns | 125 ns: 1.42x faster | 126 ns: 1.41x faster |
| No half 1000 | 180 ns | 126 ns: 1.42x faster | 128 ns: 1.40x faster |
| Mixed 1 | 232 ns | 230 ns: 1.01x faster | 126 ns: 1.84x faster |
| Mixed 10 | 235 ns | 269 ns: 1.14x slower | 126 ns: 1.87x faster |
| Mixed 100 | 288 ns | 268 ns: 1.08x faster | 126 ns: 2.28x faster |
| Mixed 1000 | 780 ns | 338 ns: 2.31x faster | 128 ns: 6.10x faster |
| Half 1 | 177 ns | 126 ns: 1.41x faster | 126 ns: 1.41x faster |
| Half 10 | 177 ns | 124 ns: 1.42x faster | 126 ns: 1.40x faster |
| Half 100 | 178 ns | 125 ns: 1.42x faster | 126 ns: 1.41x faster |
| Half 1000 | 177 ns | 127 ns: 1.39x faster | 128 ns: 1.38x faster |
| Geometric mean | (ref) | 1.38x faster | 1.70x faster |

With the new while-loop method for `_exact_half` as well.
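Again, the `_exact_half` variant is not shown here; a minimal sketch of a while-loop replacement for the `50*$` check, under the same assumptions:

```python
def _exact_half(digits, prec=0):
    # Hypothetical while-loop version of the '50*$' check: the digit at
    # `prec` must be '5' and everything after it must be '0'.
    if prec >= len(digits) or digits[prec] != '5':
        return False
    i = prec + 1
    n = len(digits)
    while i < n:
        if digits[i] != '0':
            return False
        i += 1
    return True
```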

@Marius-Juston (Author) commented:

> I have some doubts this is worthwhile. The _all_zeros is indeed faster, but is that a bottleneck in actual computations? I would like to see at least a benchmark with a public method from the decimal module and a benchmark with huge numbers.

This is probably not a bottleneck in actual computation (it's a nice and easy optimization, for sure), and I was planning to use public methods later for benchmarking (separating them this way made it simpler to optimize each helper individually). And yes, I was planning to test larger numbers with repeating patterns as well (though since prec sets the starting position of the match or the while loop, what matters is not really how large the number is but what the precision is).

These additional benchmarks will take me some time, though, since I will be very busy, so if you want to test it yourself, feel free!

@picnixz (Member) commented:

I don't think this change needs to be done. _pydecimal is only used as a fallback when the decimal C extension is not available. In addition, this is not really better from a readability PoV, so I would prefer keeping this as is.

@skirpichev (Member) commented:

> This is probably not a bottleneck in actual computation (it's a nice and easy optimization, for sure)

If so, it's hardly a "nice optimization". E.g. the `re` dependency is not removed.

BTW, what's your use case for the pure-Python decimal module? In most cases, people will just use the C-coded extension.

@Marius-Juston (Author) commented:

Fair enough. I don't actually have a use case for _pydecimal; I was just looking through files and functions to see if there was anything that interested me.
