Skip to content

Conversation

@Fidget-Spinner
Copy link
Member

@Fidget-SpinnerFidget-Spinner commented Jun 4, 2022

Fidget-Spinnerand others added 2 commits June 4, 2022 20:13
…nGH-93383) Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com> Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
@Fidget-SpinnerFidget-Spinner changed the title [3.11] Cache result of PyCode_GetCode in codeobject (GH-93383)[3.11] gh-93382: Cache result of PyCode_GetCode in codeobject (GH-93383)Jun 4, 2022
@Fidget-Spinner
Copy link
MemberAuthor

Fidget-Spinner commented Jun 4, 2022

@pablogsal and @nedbat

This PR improves coverage performance on 3.11 by 5-7%. Using Ned's benchmarks

# 3.11 branch at 1497d7fdefff8207b8ccde82e96b6f471416c284 Median for bm_sudoku.py, python3.11, cov=none: 10.476s Median for bm_sudoku.py, python3.11, cov=6.4.1: 69.124s Median for bm_spectral_norm.py, python3.11, cov=none: 9.072s Median for bm_spectral_norm.py, python3.11, cov=6.4.1: 72.143s # 3.11 co_code_cached Median for bm_sudoku.py, python3.11, cov=none: 10.363s Median for bm_sudoku.py, python3.11, cov=6.4.1: 64.726s Median for bm_spectral_norm.py, python3.11, cov=none: 9.325s Median for bm_spectral_norm.py, python3.11, cov=6.4.1: 69.713s 

An observation: bm_spectral_norm improved less than bm_sudoku because the size of its co_code is smaller. So the cost of getting a new one every time isn't as high. I think real-world code sizes are more likely to be like bm_sudoku or larger. So I'd guestimate we will see ~10% improvement in real-world code running coverage in 3.11.

@pablogsal
Copy link
Member

Unfortunately this means that whatever is making Python 3.11 slower with coverage, unfortunately is not only this :(

Great investigation @Fidget-Spinner and thanks for working on this ♥️

I will try to review ASAP but it would be great if @markshannon@iritkatriel@brandtbucher or @ericsnowcurrently can take a look.

Copy link
Member

@brandtbucherbrandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on mobile right now (probably can't get to a computer today), but here are a few thoughts based on a first look:

@nedbat
Copy link
Member

I also ran the benchmarks using #93493:

covprojpython3.10python3.11gh934933.11 vs 3.10gh93493 vs 3.10
nonebug1339.py0.193 s0.155 s0.143 s0.8030.743
nonebm_sudoku.py10.686 s10.393 s11.867 s0.9731.111
nonebm_spectral_norm.py16.051 s10.940 s10.987 s0.6820.684
6.4.1bug1339.py0.439 s0.842 s0.771 s1.9181.757
6.4.1bm_sudoku.py30.148 s61.392 s61.606 s2.0362.043
6.4.1bm_spectral_norm.py40.672 s79.562 s73.221 s1.9561.800

It's a slight improvement, but isn't solving the problem.

@Fidget-Spinner
Copy link
MemberAuthor

Fidget-Spinner commented Jun 5, 2022

It's a slight improvement, but isn't solving the problem.

Yeah I'm aware. I sent this PR in because 10% improvement on macrobenchmarks is still something! Even cProfile slowed down by 60% when profiling code in 3.11. My hunch is that accessing the full PyFrameObject is signifcantly more expensive now in 3.11. However, I can't fix that because it's part of the tracing Py_tracefunc C API.

@brandtbucher
Copy link
Member

brandtbucher commented Jun 8, 2022

I also ran the benchmarks using #93493:

It's a slight improvement, but isn't solving the problem.

Hm, it looks like in some cases this actually makes things a bit slower (perhaps due to the extra memory consumption)?

Maybe we should pause this PR until @markshannon has finished reworking the line number calculations, which seem to be the bulk of the issue at this point. Or at least see if it slows down pyperformance at all before merging?

@Fidget-Spinner
Copy link
MemberAuthor

Fidget-Spinner commented Jun 9, 2022

I also ran the benchmarks using #93493:

It's a slight improvement, but isn't solving the problem.

Hm, it looks like in some cases this actually makes things a bit slower (perhaps due to the extra memory consumption)?

Maybe we should pause this PR until @markshannon has finished reworking the line number calculations, which seem to be the bulk of the issue at this point. Or at least see if it slows down pyperformance at all before merging?

When I benchmarked on the main/3.12 branch, it didn't make anything slower #93383 (comment). However, I agree on waiting for a while.

When benchmarking with Ned's benchmarks, I saw a slowdown in bm_sudoku but no slowdown in bm_spectral_norm. I'm not sure what's up with that.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Co-Authored-By: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
@Fidget-Spinner
Copy link
MemberAuthor

I have made the requested changes; please review again Mark.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Copy link
Contributor

@kumaraditya303kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pablogsalpablogsal merged commit 852b4d4 into python:3.11Jun 23, 2022
@Fidget-SpinnerFidget-Spinner deleted the co_code_cached_backport branch June 24, 2022 17:54
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants

@Fidget-Spinner@pablogsal@nedbat@brandtbucher@bedevere-bot@markshannon@kumaraditya303