Skip to content

Conversation

@gronke
Copy link
Contributor

@gronkegronke commented Sep 21, 2017

Initializing _uuid_generate_time in Lib/uuid.py was found to be slow on FreeBSD systems with many CPU cores. This changes optimizes code paths for Darwin and FreeBSD and delays the initialization of the _uuid_generate_time ctypes implementation to time of it's first use (for instance when calling uuid.uuid1()).

https://bugs.python.org/issue5885

@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again to your contribution and we look forward to looking at it!

@gronke
Copy link
ContributorAuthor

gronke commented Sep 21, 2017

user@freebsd11 /tmp/test % ls -al total 9 drwxr-xr-x 2 fnord wheel 2 Sep 21 11:14 . drwxrwxrwt 14 root wheel 23 Sep 21 11:09 .. user@freebsd11 /tmp/test % /usr/bin/time -l python3.6 -c "import uuid" 1.96 real 0.48 user 1.47 sys 17540 maximum resident set size 4 average shared memory size 4 average unshared data size 128 average unshared stack size 4331 page reclaims 0 page faults 0 swaps 12 block input operations 1 block output operations 0 messages sent 0 messages received 0 signals received 33 voluntary context switches 22 involuntary context switches user@freebsd11 /tmp/test % fetch https://raw.githubusercontent.com/igalic/cpython/730040ee331f50f666b6109d5ed9545e9df00bea/Lib/uuid.py uuid.py 100% of 24 kB 2882 kBps 00m00s user@freebsd11 /tmp/test % /usr/bin/time -l python3.6 -c "import uuid" 0.04 real 0.02 user 0.02 sys 8496 maximum resident set size 4 average shared memory size 4 average unshared data size 128 average unshared stack size 1342 page reclaims 0 page faults 0 swaps 4 block input operations 1 block output operations 0 messages sent 0 messages received 0 signals received 2 voluntary context switches 1 involuntary context switches user@freebsd11 /tmp/test % sysctl hw.model hw.machine hw.ncpu hw.model: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz hw.machine: amd64 hw.ncpu: 32 

https://asciinema.org/a/UG8xwIPSME3HB8Y5LTERT6mPC

@gronke
Copy link
ContributorAuthor

@the-knights-who-say-ni the CLA is signed by @igalic and me.

@igalic
Copy link
Contributor

added a NEWS entry now too

Lib/uuid.py Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant: (true expression) is True

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to be explicit that this is a boolean value. Think that's much easier to read.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

especially considering that nothing in this module is explicitly typed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it’s not the python way. _dont_do_something = _detect_thing() is legible.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to adopt suggestions for better naming. After thinking back and forward (pairing with @igalic) we ended with this solution. Personally I'd rather make an exception to the naming pattern for boolean values than complicating the logic here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't notice that the function is actually called _is_this_thing()… fixed according to @merwok's comment.

@igalic
Copy link
Contributor

igalic commented Sep 22, 2017

can someone please explain why the news check is failing?
(i've now regenerated it with blurb add)

@gronke
Copy link
ContributorAuthor

gronke commented Sep 22, 2017

I wanted to compare the performance impact duplicating the ctypes imports:

importtimeimportctypesimportctypes.utildefmeasure(cb, iterations): start=time.time() foriinrange(iterations): cb() end=time.time() return (end-start) defdo(): importctypesimportctypes.utildefdont(): passdefrun(iterations): print("Iterations:", iterations) print("A", measure(do, iterations)) print("B", measure(dont, iterations)) run(1000000) run(100000000)
# python3.6 measure-import-time.py Iterations: 1000000 A 0.5856528282165527 B 0.08484673500061035 Iterations: 100000000 A 58.5674729347229 B 8.484308242797852

Is this something we should optimize for?

Copy link
Member

@merwokmerwok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why news check is failing. Maybe next push will cause a new check.

Lib/uuid.py Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: parens not needed

Lib/uuid.py Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read correctly, this line is not needed. Declaring global is only needed to assign to a name, not to read from it.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is actually required. I've tried to remove it, but then the function call fails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! The function sets the global at line 508, that’s why.

Lib/uuid.py Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s a big except that could hide developer errors, not just catch the exact AttributeError (or other) that you’re interested in here.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I didn't expect the Spanish Inquisition!. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

@gronke
Copy link
ContributorAuthor

@merwok your requested changes were addressed / commented. Thanks for the review!

@gronke
Copy link
ContributorAuthor

I didn't expect the Spanish Inquisition!

@bedevere-bot
Copy link

Nobody expects the Spanish Inquisition!

@merwok: please review the changes made to this pull request.

@merwok
Copy link
Member

Alright I started a build on custom buildbots: http://buildbot.python.org/all/waterfall?category=custom.stable&category=custom.unstable

I don’t feel qualified to merge this; maybe @methane would like to take it?

@serhiy-storchaka
Copy link
Member

Tests are failed on Windows: http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%20custom/builds/20/steps/test/logs/stdio

====================================================================== ERROR: test_windll_getnode (test.test_uuid.TestInternals) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\buildbot.python.org\custom.kloth-win64\build\lib\test\test_uuid.py", line 552, in test_windll_getnode node = uuid._windll_getnode() File "C:\buildbot.python.org\custom.kloth-win64\build\lib\uuid.py", line 573, in _windll_getnode if _UuidCreate(_buffer) == 0: TypeError: 'NoneType' object is not callable ---------------------------------------------------------------------- 

Why import ctypes was moved into functions? This is contrary PEP 8.

@merwok
Copy link
Member

That’s why I pinged Inada-san: there was a recent python-dev discussion about improving startup times, with the pros and cons of function-level imports.

@gronke
Copy link
ContributorAuthor

gronke commented Sep 23, 2017

Why import ctypes was moved into functions? This is contrary PEP 8.

The import was in a try/except block and I thought it was more elegant to let the existing exception handling do the job. I did test the performance impact of importing ctypes multiple times on a system that successfully imported it once before - results may vary when no ctypes package is available. After thinking about it again, I'd now rather load it once when importing uuid and directly set _no_uuid_generate_time_lookup in case the import fails. That should be quick enough and not have an impact on the runtime performance.

Tests are failed on Windows

Working on it. Downloading Visual Studio :bowtie:

@gronke
Copy link
ContributorAuthor

@serhiy-storchaka I've fixed the Windows problem in the commit that caused it and moved the imports to the module level in a separate one. I had to move the code a bit so that the import works for the whole module.

igalicand others added 4 commits September 24, 2017 13:52
optimizes uuid ctypes initialization for BSD and Darwin
_uuid_generate_time is lazy-loaded on first use to speed up initialization of applications depending on other uuid features
@serhiy-storchakaserhiy-storchaka self-assigned this Sep 26, 2017
@serhiy-storchakaserhiy-storchaka added the performance Performance or resource usage label Sep 26, 2017
@gronke
Copy link
ContributorAuthor

@serhiy-storchaka are there any changes required or can the awaiting changes label be removed?

@methane
Copy link
Member

Is this PR relating to bpo-5885? It is "execution time", not about "import time".

Anyway, #3795 (split uuid1 only code into _uuid1.py) seems more clean.

@gronke
Copy link
ContributorAuthor

@methane so you're suggesting to break the code path for uuid1() to a separate file? Personally
I would not prefer to break code in multiple files, but we can group uuid1 related code together by moving it further to the end of the file.

So this PR aims to fix a bug. Since the uuid code is used across various versions of Python, the choice was to fix it with small changes.

@methane
Copy link
Member

Would you discuss on https://bugs.python.org/issue11063 ?
I don't have strong opinion and both pull requests seems OK.

@vstinner
Copy link
Member

I suggest to close this PR in favor of @pitrou's PR #3796 which is more complete (add new tests, add a C extension module).

@pitrou already proposed the same thing in bpo-11063: "PR #3684 seems mostly a subset of what I'm proposing."
https://bugs.python.org/issue11063#msg303230

@igalic
Copy link
Contributor

👍

@vstinner
Copy link
Member

I close this PR. Let's focus reviews on @pitrou's PR #3796.

@serhiy-storchakaserhiy-storchaka removed their assignment Dec 6, 2018
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performancePerformance or resource usage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants

@gronke@the-knights-who-say-ni@igalic@bedevere-bot@merwok@serhiy-storchaka@methane@vstinner@brettcannon@Mariatta