gh-119451: Fix a potential denial of service in http.client #119454
Conversation
serhiy-storchaka commented May 23, 2024 • edited by bedevere-app bot
Reading the whole body of the HTTP response could cause OOM if the Content-Length value is too large, even if the server does not send a large amount of data. Now the HTTP client reads large data in chunks, so the amount of memory consumed is proportional to the amount of data actually sent.
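To make the approach concrete, here is a rough sketch of the idea (an illustration only, not the actual patch; the function name, the 64 KiB starting chunk size, and the loop shape are assumptions):

```python
def read_body(fp, content_length):
    """Read up to content_length bytes without preallocating that much memory."""
    data = b""
    chunk = 64 * 1024                              # assumed initial chunk size
    while len(data) < content_length:
        # Grow the chunk geometrically, but never request more than remains.
        chunk = min(chunk * 2, content_length - len(data))
        piece = fp.read(chunk)
        if not piece:                              # server sent less than promised
            break
        data += piece
    return data
```

If a hostile server advertises a Content-Length of, say, 10**12 but sends only a few bytes, the loop stops at the short read instead of allocating the advertised size up front (the real patch raises IncompleteRead in that case, as the review snippet below shows).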
gpshead commented May 24, 2024
I've marked this as a draft for now, as discussion on the security response team list is not yet complete. (We'll summarize that in a public issue once it has settled.)
encukou commented Jan 27, 2025
See #119514 (comment) for the results of the PSRT discussion.
Lib/http/client.py Outdated
```python
if len(data) < cursize:
    raise IncompleteRead(data, amt - len(data))
delta = min(cursize, amt - cursize)
data += self.fp.read(cursize)
```
When multiple 1 MB chunks have to be read and joined, io.BytesIO would consume less memory and do the job significantly faster.
Here's a result of a simple benchmark on Python 3.14.0:
```
Benchmarking with 2 iterations and chunk size 1048576 bytes...
Concatenation (+=): Peak Memory = 4.00 MB, Time = 0.0026 seconds
BytesIO: Peak Memory = 3.00 MB, Time = 0.0011 seconds

Benchmarking with 10 iterations and chunk size 1048576 bytes...
Concatenation (+=): Peak Memory = 20.00 MB, Time = 0.0316 seconds
BytesIO: Peak Memory = 11.13 MB, Time = 0.0040 seconds
```

The benchmarking script:
```python
import io
import time
import tracemalloc


def benchmark_concatenation(n, chunk_size):
    tracemalloc.start()
    start_time = time.time()
    data = b""
    for i in range(n):
        data += bytes([i % 256]) * chunk_size
    end_time = time.time()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak, end_time - start_time


def benchmark_bytesio(n, chunk_size):
    tracemalloc.start()
    start_time = time.time()
    buffer = io.BytesIO()
    for i in range(n):
        buffer.write(bytes([i % 256]) * chunk_size)
    # getvalue() creates a copy of the buffer content
    result = buffer.getvalue()
    end_time = time.time()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak, end_time - start_time


def main(n):
    chunk_size = 1024 * 1024  # 1 MB
    print(f"Benchmarking with {n} iterations and chunk size {chunk_size} bytes...")
    peak_concat, time_concat = benchmark_concatenation(n, chunk_size)
    print(
        f"Concatenation (+=): Peak Memory = {peak_concat / 1024 / 1024:.2f} MB, "
        f"Time = {time_concat:.4f} seconds"
    )
    peak_bio, time_bio = benchmark_bytesio(n, chunk_size)
    print(
        f"BytesIO: Peak Memory = {peak_bio / 1024 / 1024:.2f} MB, "
        f"Time = {time_bio:.4f} seconds"
    )


if __name__ == "__main__":
    while True:
        main(int(input("Enter n: ")))
```
Your benchmark reads data in chunks of constant size. This PR reads it in chunks of geometrically increasing size.
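To make the difference concrete, a small illustrative calculation (assuming a doubling scheme starting at 64 KiB with no upper cap; the actual constants in the patch may differ):

```python
# Number of read-and-concatenate steps needed to receive 1 GiB.
amt = 1 << 30

# Constant 1 MiB chunks, as in the benchmark above.
constant_steps = amt // (1 << 20)      # 1024 steps

# Geometrically growing chunks: 128 KiB, 256 KiB, 512 KiB, ...
steps, chunk, total = 0, 1 << 16, 0
while total < amt:
    chunk = min(chunk * 2, amt - total)
    total += chunk
    steps += 1

print(constant_steps, steps)           # 1024 vs 14
```

With geometric growth there are only a handful of concatenations, so a constant-chunk benchmark is not representative of what the patch does.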
That's true. Here's the changed script with its results:
```
Benchmarking with 2 iterations and initial chunk size 1048576 bytes...
Concatenation (+=): Peak Memory = 6.00 MB, Time = 0.0044 seconds
BytesIO: Peak Memory = 5.00 MB, Time = 0.0041 seconds

Benchmarking with 10 iterations and initial chunk size 1048576 bytes...
Concatenation (+=): Peak Memory = 2046.00 MB, Time = 4.5761 seconds
BytesIO: Peak Memory = 1535.00 MB, Time = 1.7045 seconds
```

The updated benchmarking script:
```python
import io
import time
import tracemalloc


def benchmark_concatenation(n, initial_chunk_size):
    tracemalloc.start()
    start_time = time.time()
    data = b""
    for i in range(n):
        data += bytes([i % 256]) * (initial_chunk_size * (2 ** i))
    end_time = time.time()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak, end_time - start_time


def benchmark_bytesio(n, initial_chunk_size):
    tracemalloc.start()
    start_time = time.time()
    buffer = io.BytesIO()
    for i in range(n):
        buffer.write(bytes([i % 256]) * (initial_chunk_size * (2 ** i)))
    # getvalue() creates a copy of the buffer content
    result = buffer.getvalue()
    end_time = time.time()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak, end_time - start_time


def main(n):
    chunk_size = 1024 * 1024  # 1 MB
    print(f"Benchmarking with {n} iterations and initial chunk size {chunk_size} bytes...")
    peak_concat, time_concat = benchmark_concatenation(n, chunk_size)
    print(
        f"Concatenation (+=): Peak Memory = {peak_concat / 1024 / 1024:.2f} MB, "
        f"Time = {time_concat:.4f} seconds"
    )
    peak_bio, time_bio = benchmark_bytesio(n, chunk_size)
    print(
        f"BytesIO: Peak Memory = {peak_bio / 1024 / 1024:.2f} MB, "
        f"Time = {time_bio:.4f} seconds"
    )


if __name__ == "__main__":
    while True:
        main(int(input("Enter n: ")))
```
I see now why BytesIO consumes 25% less memory. If you add n new bytes to a buffer of size n, bytes concatenation has to allocate a new buffer of size 2*n -- 4*n bytes in total -- whereas BytesIO can reallocate the original buffer -- 3*n bytes in total. It also benefits from a CPython-specific optimization: BytesIO.getvalue() just returns the internal bytes buffer without allocating a new bytes object for the result.
The time difference probably follows from this, although it is not important when reading from the network. The peak memory consumption, however, is important.
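Spelling that accounting out as a rough model (ignoring allocator overhead and BytesIO over-allocation; purely illustrative):

```python
n = 1 << 20  # grow an n-byte buffer by another n bytes

# bytes concatenation: the old n-byte object, the new n-byte chunk and the
# freshly built 2*n-byte result all exist before the old objects are freed.
peak_concat = n + n + 2 * n        # 4*n

# BytesIO: the n-byte chunk plus an internal buffer that can be resized
# in place to 2*n in the best case, so no separate 2*n copy is needed.
peak_bytesio = n + 2 * n           # 3*n

print(1 - peak_bytesio / peak_concat)  # 0.25, i.e. about 25% less memory
```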
Thanks @serhiy-storchaka for the PR 🌮🎉. I'm working now to backport this PR to: 3.10, 3.11, 3.12, 3.13, 3.14.
GH-142138 is a backport of this pull request to the 3.14 branch.
GH-142139 is a backport of this pull request to the 3.13 branch.
GH-142140 is a backport of this pull request to the 3.12 branch.
GH-142141 is a backport of this pull request to the 3.11 branch.
GH-142142 is a backport of this pull request to the 3.10 branch.
bedevere-bot commented Dec 1, 2025
illia-v commented Dec 1, 2025
I see, thanks for explaining and fixing the current issue!