Improve performance of uuid.* functions #128150

@picnixz

Feature or enhancement

The dedicated UUID constructors (e.g., uuid.uuid4()) generate bytes and pass them to the UUID constructor. However, the latter performs multiple redundant checks. We can bypass those checks since we are constructing the UUID object manually. Here are the benchmarks for a PGO build (no LTO) run with python -OO, using a dedicated UUID.from_int constructor:

    +----------------------------------------+---------+-----------------------+
    | Benchmark                              | ref     | final                 |
    +========================================+=========+=======================+
    | uuid1(node, None)                      | 1.20 us | 1.16 us: 1.04x faster |
    +----------------------------------------+---------+-----------------------+
    | uuid1(None, clock_seq)                 | 1.16 us | 1.14 us: 1.02x faster |
    +----------------------------------------+---------+-----------------------+
    | uuid3(NAMESPACE_DNS, os.urandom(16))   | 1.13 us | 809 ns: 1.40x faster  |
    +----------------------------------------+---------+-----------------------+
    | uuid3(NAMESPACE_DNS, os.urandom(1024)) | 2.08 us | 1.73 us: 1.20x faster |
    +----------------------------------------+---------+-----------------------+
    | uuid4()                                | 1.16 us | 885 ns: 1.31x faster  |
    +----------------------------------------+---------+-----------------------+
    | uuid5(NAMESPACE_DNS, os.urandom(16))   | 1.15 us | 832 ns: 1.39x faster  |
    +----------------------------------------+---------+-----------------------+
    | uuid5(NAMESPACE_DNS, os.urandom(1024)) | 1.57 us | 1.27 us: 1.24x faster |
    +----------------------------------------+---------+-----------------------+
    | uuid8()                                | 952 ns  | 694 ns: 1.37x faster  |
    +----------------------------------------+---------+-----------------------+
    | Geometric mean                         | (ref)   | 1.21x faster          |
    +----------------------------------------+---------+-----------------------+

    Benchmark hidden because not significant (1): uuid1()

    +----------------------------------------+---------+-----------------------+
    | Benchmark                              | ref     | final                 |
    +========================================+=========+=======================+
    | uuid3(NAMESPACE_DNS, os.urandom(16))   | 1.13 us | 809 ns: 1.40x faster  |
    +----------------------------------------+---------+-----------------------+
    | uuid3(NAMESPACE_DNS, os.urandom(1024)) | 2.08 us | 1.73 us: 1.20x faster |
    +----------------------------------------+---------+-----------------------+
    | uuid4()                                | 1.16 us | 885 ns: 1.31x faster  |
    +----------------------------------------+---------+-----------------------+
    | uuid5(NAMESPACE_DNS, os.urandom(16))   | 1.15 us | 832 ns: 1.39x faster  |
    +----------------------------------------+---------+-----------------------+
    | uuid5(NAMESPACE_DNS, os.urandom(1024)) | 1.57 us | 1.27 us: 1.24x faster |
    +----------------------------------------+---------+-----------------------+
    | uuid8()                                | 952 ns  | 694 ns: 1.37x faster  |
    +----------------------------------------+---------+-----------------------+
    | Geometric mean                         | (ref)   | 1.31x faster          |
    +----------------------------------------+---------+-----------------------+

    Ignored benchmarks (3) of ref.json: uuid1(), uuid1(None, clock_seq), uuid1(node, None)

The above benchmarks keep the constants as they are, since constant folding would remove the inefficiency of recomputing 1 << const every time. With a hardcoded 1 << const, the numbers are (almost) identical.
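As a rough illustration of the constant-folding remark above, the sketch below contrasts a shift written with two literal operands against one whose shift amount is looked up by name. The names `SHIFT`, `shift_by_literal` and `shift_by_name` are hypothetical and do not come from the uuid module, and whether the literal form is precomputed at compile time is a CPython implementation detail rather than a language guarantee.

```python
SHIFT = 48  # hypothetical module-level constant, resolved at run time


def shift_by_literal():
    # Both operands are literals, so the compiler is free to fold the
    # expression into a single precomputed constant.
    return 0xC000 << 48


def shift_by_name():
    # SHIFT is a global name, so the shift is evaluated on every call.
    return 0xC000 << SHIFT


if __name__ == "__main__":
    # Inspect what each function stores: constants vs. names to look up.
    print(shift_by_literal.__code__.co_consts)
    print(shift_by_name.__code__.co_names)
```

Both functions return the same value; only the amount of work done per call differs.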

I did not change the UUIDv1 generation because I observed that it would be slower in the plain uuid.uuid1() form. It would be about 50% faster when either the node or the clock sequence is given, but this is likely not the usual call form.
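The bypass described at the top of this issue can be sketched roughly as follows. This is a minimal illustration and not the code from the eventual PR: `from_int_unchecked` and `fast_uuid4` are hypothetical standalone helpers, and the sketch assumes that `uuid.UUID` keeps its value in the `int` and `is_safe` slots, as current CPython does.

```python
import os
import uuid

# Precomputed masks (hypothetical names): clear the version and variant
# bits, then set version 4 and the RFC 4122 variant.
_CLEAR_FLAGS_MASK = ~((0xF000 << 64) | (0xC000 << 48))
_VERSION_4_FLAGS = (4 << 76) | (0x8000 << 48)


def from_int_unchecked(value):
    """Build a UUID from a 128-bit integer without the argument checks.

    The caller is responsible for passing an integer in range(1 << 128).
    """
    self = object.__new__(uuid.UUID)
    # UUID instances are immutable, so go through object.__setattr__.
    object.__setattr__(self, "int", value)
    object.__setattr__(self, "is_safe", uuid.SafeUUID.unknown)
    return self


def fast_uuid4():
    """Generate a random (version 4) UUID using the unchecked path."""
    value = int.from_bytes(os.urandom(16), "big")
    value &= _CLEAR_FLAGS_MASK
    value |= _VERSION_4_FLAGS
    return from_int_unchecked(value)


if __name__ == "__main__":
    u = fast_uuid4()
    print(u, u.version, u.variant)
```

Since the helper skips validation entirely, it is only suitable for internal callers that already produce a well-formed 128-bit value.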

Benchmark script
    import os
    import random
    import uuid

    import pyperf

    if __name__ == '__main__':
        runner = pyperf.Runner()
        runner.bench_func('uuid1()', uuid.uuid1)
        node = random.getrandbits(48)
        runner.bench_func('uuid1(node, None)', uuid.uuid1, node)
        clock_seq = random.getrandbits(14)
        runner.bench_func('uuid1(None, clock_seq)', uuid.uuid1, None, clock_seq)
        ns = uuid.NAMESPACE_DNS
        runner.bench_func('uuid3(NAMESPACE_DNS, os.urandom(16))', uuid.uuid3, ns, os.urandom(16))
        runner.bench_func('uuid3(NAMESPACE_DNS, os.urandom(1024))', uuid.uuid3, ns, os.urandom(1024))
        runner.bench_func('uuid4()', uuid.uuid4)
        ns = uuid.NAMESPACE_DNS
        runner.bench_func('uuid5(NAMESPACE_DNS, os.urandom(16))', uuid.uuid5, ns, os.urandom(16))
        runner.bench_func('uuid5(NAMESPACE_DNS, os.urandom(1024))', uuid.uuid5, ns, os.urandom(1024))
        runner.bench_func('uuid8()', uuid.uuid8)

I'll submit a PR and we can decide what to keep and what to remove for maintainability purposes. Note that the uuid module has already been improved a lot performance-wise, especially in terms of import time, but I believe that constructing UUID objects via their dedicated functions can still be improved.

Labels

performance, stdlib, type-feature
