Skip to content

Conversation

@cmaloney
Copy link
Contributor

@cmaloneycmaloney commented Jul 4, 2024

goal: Fix buildbot failure https://github.com/python/cpython/pull/120755

Sometimes a large file is truncated (test_largefile). While estimated_size is used as a estimate (the read will get all the bytes in the file when it is wrong), that it is much larger than the actual size of data can result in a significant over allocation and sometimes lead to a MemoryError / running out of memory.

This brings the C implementation to match the Python _pyio implementation.

cc: @vstinner

I've been unable to reproduce the failure locally so far by running ./build/python -bb -E -Wd -m test -r -w -uall test_largefile. My suspicion is that the AMD64 box has limited memory. Working to try and test peak memory usage in the test / if that went down

Sometimes a large file is truncated (test_largefile). While estimated_size is used as a estimate (the read will stil get the number of bytes in the file), that it is much larger than the actual size of data can result in a significant over allocation and sometimes lead to a MemoryError / running out of memory. This brings the C implementation to match the Python _pyio implementation.
@cmaloneycmaloney changed the title gh-120754: Update size_estimated in C truncategh-120754: Update estimated_size in C truncateJul 4, 2024
@cmaloney
Copy link
ContributorAuthor

Validated memory usage decrease by running under https://www.gnu.org/software/time/

/usr/bin/time -v ./python -bb -E -Wd -m test -r -w -uall test_largefile -vvv 

Maximum resident set size (kbytes):2464692 (main) -> 24532 (this pr)

Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vstinnervstinner enabled auto-merge (squash) July 4, 2024 09:32
@vstinnervstinner merged commit 06a1c3f into python:mainJul 4, 2024
@cmaloneycmaloney deleted the cmaloney/fix_largefile_amd64 branch July 4, 2024 17:07
noahbkim pushed a commit to hudson-trading/cpython that referenced this pull request Jul 11, 2024
Sometimes a large file is truncated (test_largefile). While estimated_size is used as a estimate (the read will stil get the number of bytes in the file), that it is much larger than the actual size of data can result in a significant over allocation and sometimes lead to a MemoryError / running out of memory. This brings the C implementation to match the Python _pyio implementation.
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024
Sometimes a large file is truncated (test_largefile). While estimated_size is used as a estimate (the read will stil get the number of bytes in the file), that it is much larger than the actual size of data can result in a significant over allocation and sometimes lead to a MemoryError / running out of memory. This brings the C implementation to match the Python _pyio implementation.
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

@cmaloney@vstinner