
Conversation

@methane (Member) commented Sep 1, 2024

Windows:

The current buffer size has been 8 KiB since multiprocessing was introduced.
That seems small for modern Python workloads.

e711caf#diff-2c54a007d7fe1d9ac5ca008fe2d054394c39a4f521eea2cb580101a284d7b7ecR28

macOS/BSD:

They use a 64 KiB buffer for pipes. The current 16-page (256 KiB) buffer causes a ~10% slowdown compared to 64 KiB on an M1 Mac.

Linux:

I don't have a Linux machine with 16k/64k pages. But when I change the pipe buffer size via fcntl, a 256 KiB buffer doesn't provide a notable performance benefit.

64 KiB seems like a good default buffer size.
If it is not suitable, users can try another size by changing `multiprocessing.connection.BUFSIZE`.
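As a rough sketch of that tuning knob: `BUFSIZE` is an internal constant of `multiprocessing.connection`, not a documented public API, so overriding it is an at-your-own-risk experiment rather than a supported configuration.

```python
import multiprocessing.connection as mpc

# BUFSIZE is an internal constant used when multiprocessing copies data
# to and from pipe connections. It is not a documented public API, so
# treat rebinding it as an at-your-own-risk tuning knob; override it
# before creating any Pipe/Connection objects.
mpc.BUFSIZE = 64 * 1024  # try a 64 KiB buffer
```

Because it is module-level state, the override affects every connection created afterwards in that process.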

@methane (Member, Author) commented:

Previous discussion:
#121315 (comment)

quick bench code:
https://gist.github.com/methane/a6cb799704a11f2bb2f64b16c0b830cc

64k vs 256k quick bench:

```
$ time python hello.py  # 64k
real 0m25.999s  user 0m9.753s  sys 0m29.687s
real 0m25.964s  user 0m9.553s  sys 0m29.854s
$ time python hello.py  # 256k
real 0m25.927s  user 0m9.421s  sys 0m30.108s
real 0m25.933s  user 0m9.330s  sys 0m29.958s
```
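The linked gist contains the actual benchmark; as a minimal sketch of the same kind of measurement, the following pushes a fixed amount of data through an `os.pipe()` from a writer thread and times it (all names here are illustrative, not from the gist):

```python
import os
import threading
import time

def write_all(fd, data):
    # os.write() may write fewer bytes than requested for writes larger
    # than the pipe buffer, so loop until everything is written.
    view = memoryview(data)
    while view:
        n = os.write(fd, view)
        view = view[n:]

def pipe_throughput(total=64 * 1024 * 1024, chunk=64 * 1024):
    """Push `total` bytes through an os.pipe() and time it."""
    r, w = os.pipe()
    payload = b"x" * chunk

    def writer():
        for _ in range(total // chunk):
            write_all(w, payload)
        os.close(w)  # signal EOF to the reader

    t = threading.Thread(target=writer)
    t0 = time.perf_counter()
    t.start()
    received = 0
    while True:
        data = os.read(r, chunk)
        if not data:  # EOF: writer closed its end
            break
        received += len(data)
    t.join()
    os.close(r)
    return received, time.perf_counter() - t0
```

Varying `chunk` (and, on Linux, the pipe buffer size via fcntl) lets you compare configurations the way the 64k vs 256k runs above do.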

How to change the pipe buffer size:

```diff
diff --git a/Lib/multiprocessing/connection.py b/Lib/multiprocessing/connection.py
index ede5596e4fc..9c1bfb93728 100644
--- a/Lib/multiprocessing/connection.py
+++ b/Lib/multiprocessing/connection.py
@@ -544,7 +544,10 @@ def Pipe(duplex=True):
         c1 = Connection(s1.detach())
         c2 = Connection(s2.detach())
     else:
+        import fcntl
         fd1, fd2 = os.pipe()
+        fcntl.fcntl(fd1, fcntl.F_SETPIPE_SZ, 256*1024)
+        fcntl.fcntl(fd2, fcntl.F_SETPIPE_SZ, 256*1024)
         c1 = Connection(fd1, writable=False)
         c2 = Connection(fd2, readable=False)
```

I confirmed that this setting works by checking the `n = len(chunk)` size.

So I don't think a Linux-only pipe optimization is worthwhile.
A static 64 KiB size seems good enough.

@picnixz (Member) left a comment:

A really small nitpick, but otherwise LGTM.

@vstinner (Member) left a comment:

LGTM

@methane (Member, Author) commented:

I tested this PR on Windows, but there was no speedup. That is because on Windows all of the remaining data can be read at once via this function:

`def _get_more_data(self, ov, maxsize):`



4 participants: @methane, @gpshead, @vstinner, @picnixz