@morotti (Contributor) commented May 30, 2024

Hello,

This came out of other work looking into I/O and buffering optimizations (see the GitHub ticket).

I am proposing to increase the default buffer size of shutil.copyfileobj() to 256k.
It was set to 16k in the 1990s.
It was raised to 64k in 2019. The discussion at the time mentioned another 5% improvement from raising it to 128k, but settled on a very conservative setting.

It's 2024 now, and I think the value should be revisited to match modern hardware. I am measuring a 0-15% performance improvement when raising it to 256k on various types of disk. There is no downside as far as I can tell.

This function is only intended for sequential copying of full files (or file-like objects), which is the typical use case that benefits from larger operations.
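For anyone who wants to try this on their own hardware, here is a minimal sketch of how such a comparison could be run. It is not the benchmark behind the numbers above: the helper names, the 8 MiB file size, and the list of buffer sizes are assumptions chosen for illustration, and a single run is heavily influenced by the OS page cache.

```python
import os
import shutil
import tempfile
import time


def copy_with_buffer(src_path, dst_path, length):
    """Copy src_path to dst_path sequentially, `length` bytes per read.

    Returns the elapsed wall-clock time in seconds (hypothetical helper).
    """
    start = time.perf_counter()
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        # The third argument is the buffer size used for each read/write.
        shutil.copyfileobj(fsrc, fdst, length)
    return time.perf_counter() - start


def compare_buffer_sizes(size_bytes=8 * 1024 * 1024,
                         lengths=(16 * 1024, 64 * 1024, 256 * 1024)):
    """Time one copy of a temporary file per buffer size.

    The sizes are illustrative assumptions; results vary with disk type
    and caching, so repeat runs before drawing conclusions.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(size_bytes))
        for length in lengths:
            dst = os.path.join(tmp, f"dst_{length}.bin")
            results[length] = copy_with_buffer(src, dst, length)
            # Sanity check: the copy must be complete whatever the buffer size.
            assert os.path.getsize(dst) == size_bytes
    return results


if __name__ == "__main__":
    for length, seconds in compare_buffer_sizes().items():
        print(f"{length // 1024:>4}k buffer: {seconds * 1000:.1f} ms")
```

A fair measurement would use files larger than RAM or drop caches between runs, and would take the best of several repetitions.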

For reference, I came across this function while profiling pip, which uses it to copy files when installing Python packages.

@morotti (Contributor, Author) commented:

I think we can skip the NEWS entry. It's not significant.

@gpshead (Member) left a comment:

Please check the box in GitHub to allow others to push changes to your PR branch.

This needs a NEWS entry.

Misc/NEWS.d/next/Library/2024-10-03-05-00-25.gh-issue-117151.Prdw_W.rst:

The default buffer size used by :func:`shutil.copyfileobj` has been increased from 64k to 256k on non-Windows platforms. It was already larger on Windows. 
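Code that needs the same chunk size on every Python version and platform can pass it explicitly instead of relying on the default. The following is a small sketch under that assumption; `copy_stream` and `BUF_256K` are hypothetical names, and `shutil.COPY_BUFSIZE` is an undocumented module attribute, so it is read defensively with `getattr`.

```python
import io
import shutil

# 256k, the non-Windows default described in the NEWS entry above.
BUF_256K = 256 * 1024


def copy_stream(fsrc, fdst, length=BUF_256K):
    """Copy fsrc to fdst with an explicit buffer size (hypothetical helper).

    Passing `length` pins the chunk size regardless of the interpreter's
    built-in default.
    """
    shutil.copyfileobj(fsrc, fdst, length)


if __name__ == "__main__":
    # COPY_BUFSIZE is an implementation detail and may be absent or change.
    print("interpreter default:", getattr(shutil, "COPY_BUFSIZE", "unknown"))

    src = io.BytesIO(b"x" * (BUF_256K + 3))  # slightly more than one chunk
    dst = io.BytesIO()
    copy_stream(src, dst)
    print("copied bytes:", len(dst.getvalue()))
```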

@bedevere-app commented:

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@morotti (Contributor, Author) commented:

I have made the requested changes; please review again

Thanks for reading the PR. I added a NEWS entry.

I don't see a checkbox to allow maintainers to edit, maybe because the branch lives in an organization repo rather than a personal fork.

@bedevere-app commented:

Thanks for making the requested changes!

@gpshead: please review the changes made to this pull request.

@bedevere-app (bot) requested a review from gpshead on October 3, 2024, 11:59
@gpshead merged commit 6efd95c into python:main on Oct 4, 2024
@morotti deleted the shutil-copy-buf-size branch on February 5, 2025, 11:39
3 participants: @morotti, @gpshead, @rmmancom