gh-119396: Optimize PyUnicode_FromFormat() UTF-8 decoder #119398

vstinner · 2024-05-22T12:49:25Z

Add unicode_decode_utf8_writer() to write characters directly into a _PyUnicodeWriter, avoiding the creation of a temporary string. Optimize PyUnicode_FromFormat() by using the new unicode_decode_utf8_writer().

Rename unicode_fromformat_write_cstr() to
unicode_fromformat_write_utf8().

Microbenchmark on the code:

    return PyUnicode_FromFormat(
        "%s %s %s %s %s.",
        "format", "multiple", "utf8", "short", "strings");

Result: 620 ns +- 8 ns -> 382 ns +- 2 ns: 1.62x faster.
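
To see what the benchmarked call produces without patching _testcapi, the same C function can be reached from Python through ctypes. This is only a sketch for inspecting the output, not the method used for the timings in this PR: it assumes CPython, and the ctypes call overhead makes it unsuitable for measuring the optimization. The name from_format is illustrative.

```python
import ctypes

# PyUnicode_FromFormat(const char *format, ...) returns a new str object.
from_format = ctypes.pythonapi.PyUnicode_FromFormat
from_format.restype = ctypes.py_object
# Declare only the fixed parameter; the remaining arguments are varargs.
from_format.argtypes = [ctypes.c_char_p]

# Each %s consumes a NUL-terminated UTF-8 C string (passed here as bytes).
result = from_format(
    b"%s %s %s %s %s.",
    b"format", b"multiple", b"utf8", b"short", b"strings",
)
print(result)  # format multiple utf8 short strings.
```

Each %s argument is decoded from UTF-8, which is the code path this change routes through unicode_decode_utf8_writer().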

Issue: [C API] Add an efficient public PyUnicodeWriter API #119182

Issue: Optimize _PyUnicodeWriter implementation #119396

vstinner · 2024-05-22T12:50:44Z

Benchmark:

diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index f99ebf0dde..0752b2b1d2 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -3312,6 +3312,14 @@ function_set_warning(PyObject *Py_UNUSED(module), PyObject *Py_UNUSED(args))
     Py_RETURN_NONE;
 }
 
+static PyObject *
+bench(PyObject *Py_UNUSED(module), PyObject *Py_UNUSED(args))
+{
+    return PyUnicode_FromFormat(
+        "%s %s %s %s %s.",
+        "format", "multiple", "utf8", "short", "strings");
+}
+
 static PyMethodDef TestMethods[] = {
     {"set_errno", set_errno, METH_VARARGS},
     {"test_config", test_config, METH_NOARGS},
@@ -3454,6 +3462,7 @@ static PyMethodDef TestMethods[] = {
     {"check_pyimport_addmodule", check_pyimport_addmodule, METH_VARARGS},
     {"test_weakref_capi", test_weakref_capi, METH_NOARGS},
     {"function_set_warning", function_set_warning, METH_NOARGS},
+    {"bench", bench, METH_NOARGS},
     {NULL, NULL} /* sentinel */
 };

Command:

./python -m venv env
env/bin/python -m pip install pyperf
env/bin/python -m pyperf timeit -s 'import _testcapi; func=_testcapi.bench' 'func()' -v -o ref.json

Result, Python built with gcc -O3:

620 ns +- 8 ns -> 382 ns +- 2 ns: 1.62x faster
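
As a quick check of the figure above, the speedup is the ratio of the reference mean to the new mean. A minimal sketch using only the two means reported here (the variable names are illustrative, and the standard deviations are ignored):

```python
ref_ns = 620.0   # mean before the change
new_ns = 382.0   # mean after the change

speedup = ref_ns / new_ns            # how many times faster
time_saved = 1.0 - new_ns / ref_ns   # fraction of the call time removed

print(f"{speedup:.2f}x faster")         # 1.62x faster
print(f"{time_saved:.0%} less time")    # 38% less time
```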

vstinner · 2024-05-22T14:40:59Z

Oh, there was a performance regression on b"abc".decode(): I fixed it.

Benchmark:

import pyperf
import _testcapi

runner = pyperf.Runner()

utf8 = b'abc'
runner.bench_func('abc', utf8.decode)

utf8 = 'abcé'.encode()
runner.bench_func('abc + UTF-8', utf8.decode)

utf8 = 'éabc'.encode()
runner.bench_func('UTF-8 + abc', utf8.decode)

utf8 = b'x' * (1024 * 1024)
runner.bench_func('ASCII 1 MiB', utf8.decode)

utf8 = ('x' * (1024 * 1024) + 'é').encode()
runner.bench_func('ASCII 1 MiB + UTF-8', utf8.decode)

utf8 = ('é' + 'x' * (1024 * 1024)).encode()
runner.bench_func('UTF-8 + ASCII 1 MiB', utf8.decode)

utf8 = ('€' + 'x' * (1024 * 1024)).encode()
runner.bench_func('UTF-8 euro + ASCII 1 MiB', utf8.decode)

Results, Python built with gcc -O3, with CPU isolation:

+---------------------+---------+-----------------------+
| Benchmark           | ref     | change                |
+=====================+=========+=======================+
| abc                 | 73.7 ns | 74.7 ns: 1.01x slower |
+---------------------+---------+-----------------------+
| abc + UTF-8         | 167 ns  | 172 ns: 1.03x slower  |
+---------------------+---------+-----------------------+
| ASCII 1 MiB         | 118 us  | 118 us: 1.00x faster  |
+---------------------+---------+-----------------------+
| ASCII 1 MiB + UTF-8 | 1.08 ms | 1.07 ms: 1.00x faster |
+---------------------+---------+-----------------------+
| UTF-8 + ASCII 1 MiB | 572 us  | 570 us: 1.00x faster  |
+---------------------+---------+-----------------------+
| Geometric mean      | (ref)   | 1.00x slower          |
+---------------------+---------+-----------------------+

Benchmark hidden because not significant (2): UTF-8 + abc, UTF-8 euro + ASCII 1 MiB

=> There is no significant impact on bytes.decode() performance (no slow down).
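
The per-row ratios behind that conclusion can be recomputed from the table. The sketch below uses the rounded values as printed, so it only approximates what pyperf derives from the raw samples; the dictionary layout is illustrative.

```python
# (ref, change) pairs copied from the table; units match within each row.
rows = {
    "abc": (73.7, 74.7),                  # ns
    "abc + UTF-8": (167.0, 172.0),        # ns
    "ASCII 1 MiB": (118.0, 118.0),        # us
    "ASCII 1 MiB + UTF-8": (1.08, 1.07),  # ms
    "UTF-8 + ASCII 1 MiB": (572.0, 570.0),  # us
}

# A ratio above 1.0 means the change is slower than the reference.
ratios = {name: change / ref for name, (ref, change) in rows.items()}

for name, ratio in ratios.items():
    print(f"{name:<20} {ratio:.2f}")

worst = max(ratios.values())
print(f"worst case: {worst - 1:.1%} slower")
```

The largest slowdown is on the shortest inputs, where fixed per-call overhead dominates the decoding work.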

vstinner · 2024-05-22T14:42:10Z

cc @serhiy-storchaka

serhiy-storchaka

LGTM.

Objects/unicodeobject.c

vstinner · 2024-05-22T19:22:36Z

I enabled automerge. Thanks for the review @serhiy-storchaka.

vstinner added the skip news label May 22, 2024

bedevere-appbot added the awaiting core review label May 22, 2024

bedevere-appbot mentioned this pull request May 22, 2024
[C API] Add an efficient public PyUnicodeWriter API #119182
Closed

vstinner force-pushed the utf8_writer branch from 6c8aedc to d3fe16f May 22, 2024 14:40

serhiy-storchaka approved these changes May 22, 2024

bedevere-appbot added awaiting merge and removed awaiting core review labels May 22, 2024

vstinner added 3 commits May 22, 2024 21:17

Fix unicode_decode_utf8() perf regression
99f4b13

Address review
d496db8

vstinner force-pushed the utf8_writer branch from d3fe16f to d496db8 May 22, 2024 19:19

vstinner enabled auto-merge (squash) May 22, 2024 19:20

vstinner disabled auto-merge May 22, 2024 20:45

vstinner enabled auto-merge (squash) May 22, 2024 20:45

vstinner changed the title from "gh-119182: Optimize PyUnicode_FromFormat() UTF-8 decoder" to "gh-119398: Optimize PyUnicode_FromFormat() UTF-8 decoder" May 22, 2024

vstinner changed the title from "gh-119398: Optimize PyUnicode_FromFormat() UTF-8 decoder" to "gh-119396: Optimize PyUnicode_FromFormat() UTF-8 decoder" May 22, 2024

bedevere-appbot mentioned this pull request May 22, 2024
Optimize _PyUnicodeWriter implementation #119396
Closed

vstinner merged commit 9b422fc into python:main May 22, 2024

vstinner deleted the utf8_writer branch May 22, 2024 21:05

bedevere-appbot removed the awaiting merge label May 22, 2024
