Conversation

@vstinner (Member) commented May 13, 2025

Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII().


📚 Documentation preview 📚: https://cpython-previews--133973.org.readthedocs.build/

@vstinner (Member, Author) commented:

JSON benchmark: #133832 (comment)

| Benchmark | ref | change |
|---|---|---|
| encode 100 booleans | 7.15 us | 6.54 us: 1.09x faster |
| encode 100 integers | 11.6 us | 11.7 us: 1.01x slower |
| encode 100 "ascii" strings | 13.4 us | 13.2 us: 1.02x faster |
| encode escaped string len=128 | 1.11 us | 1.10 us: 1.01x faster |
| encode 1000 booleans | 39.3 us | 32.9 us: 1.19x faster |
| encode Unicode string len=1000 | 4.93 us | 4.94 us: 1.00x slower |
| encode 10000 booleans | 343 us | 286 us: 1.20x faster |
| encode ascii string len=10000 | 28.5 us | 28.8 us: 1.01x slower |
| encode escaped string len=9984 | 38.7 us | 38.9 us: 1.00x slower |
| encode Unicode string len=10000 | 42.6 us | 42.4 us: 1.00x faster |
| Geometric mean | (ref) | 1.02x faster |

Benchmark hidden because not significant (11): encode 100 floats, encode ascii string len=100, encode Unicode string len=100, encode 1000 integers, encode 1000 floats, encode 1000 "ascii" strings, encode ascii string len=1000, encode escaped string len=896, encode 10000 integers, encode 10000 floats, encode 10000 "ascii" strings

Encoding booleans up to 1.20x faster is interesting given that these strings are very short: "true" (4 characters) and "false" (5 characters).
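The effect can be poked at from pure Python with the stdlib `json` and `timeit` modules (a rough sketch, not the pyperf benchmark above; absolute numbers depend on the build):

```python
import json
import timeit

# The JSON encoder emits booleans as the short constant strings
# "true" (4 chars) and "false" (5 chars), the case that benefits
# most from avoiding per-byte UTF-8 validation in the writer.
assert json.dumps([True, False]) == '[true, false]'

data = [True, False] * 500  # 1000 booleans
seconds = timeit.timeit(lambda: json.dumps(data), number=1000)
print(f"encode 1000 booleans, 1000 rounds: {seconds:.3f} s")
```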

@vstinner (Member, Author) commented:

The PyUnicodeWriter_WriteASCII() function is faster than PyUnicodeWriter_WriteUTF8(), but has undefined behavior if the input string contains non-ASCII characters.

@serhiy-storchaka: What do you think of this function?

@vstinner (Member, Author) commented:

cc @ZeroIntensity

@ZeroIntensity (Member) left a comment:

Some nits

@serhiy-storchaka (Member) commented:

Well, we had _PyUnicodeWriter_WriteASCIIString for reasons.

But unicode_decode_utf8_writer is already optimized for ASCII. Can it be optimized even more? In theory, it can be made almost as fast as _PyUnicodeWriter_WriteASCIIString.

We can add private _PyUnicodeWriter_WriteASCII for now, to avoid regression in JSON encode, and then try to squeeze nanoseconds from PyUnicodeWriter_WriteUTF8. If we fail, we can add public PyUnicodeWriter_WriteASCII.

@vstinner (Member, Author) commented:

> But unicode_decode_utf8_writer is already optimized for ASCII. Can it be optimized even more?

I don't think that it can become as fast as, or faster than, a function which takes an ASCII string as argument. If we know that the input string is ASCII, there is no need to scan it for non-ASCII characters, and we can take the fast path.

You're right that the UTF-8 decoder is already highly optimized.

@vstinner (Member, Author) commented:

In short:

  • PyUnicodeWriter_WriteUTF8() calls ascii_decode() which is an efficient ASCII decoder.
  • PyUnicodeWriter_WriteASCII() calls memcpy().

It's hard to beat memcpy() performance!
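The trade-off can be modeled in pure Python (an illustrative model only, not the C implementation: the UTF-8 path must validate every byte, while the ASCII path trusts the caller and just copies):

```python
class WriterModel:
    """Illustrative Python model of the two write paths (not the real C code)."""

    def __init__(self):
        self.parts = []

    def write_utf8(self, data: bytes):
        # Must scan every byte: non-ASCII sequences need real UTF-8 decoding.
        self.parts.append(data.decode('utf-8'))

    def write_ascii(self, data: bytes):
        # The caller promises pure ASCII, so there is no validation: in C
        # this is a plain memcpy() into the writer's buffer (undefined
        # behavior if the promise is broken).
        self.parts.append(data.decode('latin-1'))  # never validates

    def finish(self) -> str:
        return ''.join(self.parts)

w = WriterModel()
w.write_ascii(b'true')
w.write_utf8('héllo'.encode())
assert w.finish() == 'truehéllo'
```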

@serhiy-storchaka (Member) commented:

Yes, although it was close, at least for moderately large strings. Could it be optimized even more? I don't know.

But the decision about PyUnicodeWriter_WriteASCII should be made by the C API Working Group. I'm not sure of my own opinion yet. This API is unsafe.

@vstinner (Member, Author) commented:

I created the capi-workgroup/decisions#65 issue.

@vstinner (Member, Author) commented:

Benchmark:

    write_utf8 size=10:      Mean +- std dev: 153 ns +- 1 ns
    write_utf8 size=100:     Mean +- std dev: 174 ns +- 1 ns
    write_utf8 size=1,000:   Mean +- std dev: 279 ns +- 0 ns
    write_utf8 size=10,000:  Mean +- std dev: 1.36 us +- 0.00 us
    write_ascii size=10:     Mean +- std dev: 141 ns +- 0 ns
    write_ascii size=100:    Mean +- std dev: 149 ns +- 0 ns
    write_ascii size=1,000:  Mean +- std dev: 176 ns +- 3 ns
    write_ascii size=10,000: Mean +- std dev: 690 ns +- 8 ns

On long strings (10,000 bytes), PyUnicodeWriter_WriteASCII() is up to 2x faster (1.36 us => 690 ns) than PyUnicodeWriter_WriteUTF8().

Details:

    from _testcapi import PyUnicodeWriter
    import pyperf

    range_100 = range(100)

    def bench_write_utf8(text, size):
        writer = PyUnicodeWriter(0)
        for _ in range_100:
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)
            writer.write_utf8(text, size)

    def bench_write_ascii(text, size):
        writer = PyUnicodeWriter(0)
        for _ in range_100:
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)
            writer.write_ascii(text, size)

    runner = pyperf.Runner()
    sizes = (10, 100, 1_000, 10_000)
    for size in sizes:
        text = b'x' * size
        runner.bench_func(f'write_utf8 size={size:,}', bench_write_utf8,
                          text, size, inner_loops=1_000)
    for size in sizes:
        text = b'x' * size
        runner.bench_func(f'write_ascii size={size:,}', bench_write_ascii,
                          text, size, inner_loops=1_000)

@encukou (Member) commented:

Do we know where the bottleneck is for long strings?
Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

@vstinner (Member, Author) commented:

> Do we know where the bottleneck is for long strings?

WriteUTF8() has to check for non-ASCII characters: this check has a cost. That's the bottleneck.

> Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

Maybe, I don't know if it would be faster.
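For context, the check in question is essentially a scan for any byte with the high bit set, which CPython performs word-at-a-time. A rough Python model of the idea (illustrative only; `first_nonascii` and the fixed 8-byte word size are assumptions of this sketch, not CPython's exact code):

```python
ASCII_MASK = 0x8080808080808080  # high bit of each byte in a 64-bit word

def first_nonascii(data: bytes) -> int:
    """Return the index of the first non-ASCII byte, or len(data) if none."""
    i = 0
    # Fast path: test 8 bytes per iteration with a single mask check.
    while i + 8 <= len(data):
        word = int.from_bytes(data[i:i + 8], 'little')
        if word & ASCII_MASK:
            break  # some byte in this word has the high bit set
        i += 8
    # Slow path: finish byte by byte.
    while i < len(data) and data[i] < 0x80:
        i += 1
    return i

assert first_nonascii(b'x' * 20) == 20
assert first_nonascii('héllo'.encode()) == 1
```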

@vstinner (Member, Author) commented:

> Would it make sense to have a version of find_first_nonascii that checks and copies in the same loop?

I tried, but failed, to modify the code to copy while reading (while checking that the string is ASCII-only). The code is quite complicated.
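For the record, the suggested combined check-and-copy loop could look roughly like this in a Python model (a sketch of the idea only; `copy_while_ascii` is a hypothetical name, and the real code in unicode_decode_utf8_writer is considerably more involved):

```python
ASCII_MASK = 0x8080808080808080  # high bit of each byte in a 64-bit word

def copy_while_ascii(dst: bytearray, src: bytes) -> int:
    """Append the leading ASCII prefix of src to dst in one pass.

    Returns the number of bytes copied; stops at the first non-ASCII
    byte, which would then need the full UTF-8 decoder.
    """
    copied = 0
    for chunk_start in range(0, len(src), 8):
        chunk = src[chunk_start:chunk_start + 8]
        word = int.from_bytes(chunk, 'little')
        if word & ASCII_MASK:
            # Non-ASCII byte inside this chunk: copy only its ASCII prefix.
            for b in chunk:
                if b >= 0x80:
                    return copied
                dst.append(b)
                copied += 1
            return copied
        dst.extend(chunk)  # whole chunk is ASCII: copy without re-checking
        copied += len(chunk)
    return copied
```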

vstinner and others added 3 commits May 15, 2025 21:41
@picnixz (Member) left a comment:

I'm happy to have this function public. I always preferred using the faster versions of the writer API when I hardcoded strings, but they were private.

@ZeroIntensity (Member) left a comment:

Sorry for the late review, LGTM as well.

@vstinner (Member, Author) commented:

> I created the capi-workgroup/decisions#65 issue.

The C API Working Group voted in favor of adding the function.

@vstinner enabled auto-merge (squash) May 29, 2025 14:40
@vstinner merged commit f49a07b into python:main May 29, 2025
39 checks passed
@vstinner deleted the write_ascii branch May 29, 2025 14:54
vstinner added a commit to vstinner/cpython that referenced this pull request May 31, 2025
…3973) Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII(). Unrelated change to please the linter: remove an unused import in test_ctypes. Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]> (cherry picked from commit f49a07b)
@bedevere-app (bot) commented:

GH-134974 is a backport of this pull request to the 3.14 branch.

vstinner added a commit to vstinner/cpython that referenced this pull request Jun 2, 2025
…3973) Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII(). Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]> (cherry picked from commit f49a07b)
vstinner added a commit that referenced this pull request Jun 9, 2025
…#134974) gh-133968: Add PyUnicodeWriter_WriteASCII() function (#133973) Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII(). (cherry picked from commit f49a07b) Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]>
Pranjal095 pushed a commit to Pranjal095/cpython that referenced this pull request Jul 12, 2025
…3973) Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII(). Unrelated change to please the linter: remove an unused import in test_ctypes. Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]>
taegyunkim pushed a commit to taegyunkim/cpython that referenced this pull request Aug 4, 2025
…3973) Replace most PyUnicodeWriter_WriteUTF8() calls with PyUnicodeWriter_WriteASCII(). Unrelated change to please the linter: remove an unused import in test_ctypes. Co-authored-by: Peter Bierma <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]>


5 participants: @vstinner, @serhiy-storchaka, @encukou, @picnixz, @ZeroIntensity