Quick test to see if removing the tier2 interpreter speeds up tier1 #117908
Conversation
gvanrossum commented Apr 15, 2024 • edited
… new configure flag
gvanrossum commented Apr 15, 2024
The failure on WASI of …
gvanrossum commented Apr 15, 2024 • edited
Benchmark results are in:
Talked it over with @brandtbucher. We found: …
gvanrossum commented Apr 16, 2024
I've fixed the two issues we discovered and am firing off a new benchmark run, only for Win64, without pystats. Since I didn't move the point where the branch forks off main, I'm not sure if this will produce the right results, but it's the thing I'm most curious about. Honestly, the …
gvanrossum commented Apr 16, 2024
Here are the Windows benchmark results. It is now only 1% slower than base -- but still, surprisingly, slower. According to the time plot, both richards and richards_super are 9% slower. What's up with that??? This occurs only on Windows; it was present on the last benchmark too (before I disabled the no-optimization pragma).
mdboom commented Apr 16, 2024
Those Windows results are puzzling, but we know there is just generally more noise in the Windows results. Maybe this is a fool's errand, but we could try running them again to get a sense of the range of noise in the results?
gvanrossum commented Apr 16, 2024
Hm, both runs showed similar results (in particular, both Richards being much slower). Mark's hunch is that by making the function larger, we disabled some compiler optimization that was actually a pessimization. While I'm still curious if there's a smoking gun in Richards, I am leaning towards not merging this PR.
mdboom commented Apr 16, 2024
Yes, a bit frustratingly, it's not a clear win.
brandtbucher commented Apr 16, 2024
I personally still think that it makes sense to make the tier two interpreter a build-time option (like the JIT) instead of a run-time option. Even if the perf impact is negligible, realistically the only people who are using tier two are building their own interpreter anyway, and it simplifies a lot of tricky edge cases when testing tier two (for example, with subprocesses not picking up the environment variable or command-line flag). I'm not sure that we have a good argument for shipping all of tier two when pretty much nobody will be using it.
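[Editor's note: as a rough, standalone illustration of the trade-off being discussed here. The macro `WITH_TIER2`, the `ENABLE_TIER2` environment variable, and the toy loop are assumptions made for the sketch, not CPython source. The point is that a run-time switch leaves a branch and counter update in the hot loop even when disabled, while a build-time switch lets the whole hook compile away.]

```c
/* Toy sketch of a run-time vs. build-time switch for an optional
 * optimizer hook.  Names are hypothetical; this is not CPython code.
 * Build with -DWITH_TIER2 to keep the hook, or without it to have the
 * call site vanish from the "interpreter" loop entirely. */
#include <stdio.h>
#include <stdlib.h>

#ifdef WITH_TIER2
/* Run-time opt-in, e.g. via an environment variable or command-line flag. */
static int tier2_enabled(void)
{
    return getenv("ENABLE_TIER2") != NULL;
}

static void maybe_enter_tier2(int *counter)
{
    /* Even when disabled at run time, this branch and the counter
     * update still sit in the hot loop. */
    if (tier2_enabled() && ++*counter > 16) {
        printf("hot spot detected; would enter tier two here\n");
        *counter = 0;
    }
}
#else
/* Build-time opt-out: the hook is a no-op, so the tier-one loop
 * carries no extra work at all. */
#define maybe_enter_tier2(counter) ((void)(counter))
#endif

int main(void)
{
    int counter = 0;
    for (int i = 0; i < 100; i++) {   /* stand-in for the eval loop */
        maybe_enter_tier2(&counter);
    }
    return 0;
}
```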
gvanrossum commented Apr 16, 2024
Agreed. Then maybe I can put more stuff inside …
gvanrossum commented Apr 16, 2024
If we were to go ahead with that idea, the … I'm not sure if I care to do all that work, honestly.
gvanrossum commented Apr 17, 2024 • edited
Latest Windows benchmark: 0% faster. Richards is no longer special. (What changed: this no longer updates the bitmask/counter for ….)
gvanrossum commented Apr 17, 2024
I think I'd like to rethink this after gh-116970 (moving the Tier 2 interpreter out into its own function). I'd like to almost completely skip the …
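[Editor's note: to make the idea behind gh-116970 concrete, here is a minimal, self-contained sketch. The opcode names and the fast-path helper are invented for illustration and are not the real ceval structure; the relevant property is that the secondary loop lives in its own function, so the primary dispatch function stays small and easier for the C compiler to optimize.]

```c
/* Standalone sketch (assumptions, not the gh-116970 patch): keep a
 * second interpreter loop in a separate function instead of inlining
 * it into the primary dispatch loop. */
#include <stdio.h>

enum { OP_ADD, OP_PRINT, OP_ENTER_FAST, OP_HALT };

/* The "tier two" work lives in its own function, kept out of run(). */
static int run_fast_path(int acc)
{
    /* Imagine a specialized/uop loop here; trivial for the sketch. */
    return acc * 2;
}

static void run(const int *code)
{
    int acc = 0;
    for (const int *pc = code; ; pc++) {
        switch (*pc) {
        case OP_ADD:        acc += 1; break;
        case OP_PRINT:      printf("%d\n", acc); break;
        case OP_ENTER_FAST: acc = run_fast_path(acc); break;  /* hand off */
        case OP_HALT:       return;
        }
    }
}

int main(void)
{
    const int program[] = { OP_ADD, OP_ADD, OP_ENTER_FAST, OP_PRINT, OP_HALT };
    run(program);
    return 0;
}
```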
Because this is just a quick test, I used a hack to define the crucial env var, _Py_NOTIER2, by hard-coding it in Makefile.pre.in. If this shows enough of an improvement for Tier 1 that we'd like to have this option, that hack should be replaced with a new configure option. (Adding configure options looks like black magic to me, and IIRC requires using a Docker image to build, so I'm putting that off until after we've seen benchmark results.)
Because of that hack, this fails all the JIT tests. I'll just be ignoring that.
There's also a weird test failure in the WASI build, which I'll also ignore unless we decide to move ahead with this.
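[Editor's note: for readers unfamiliar with the kind of hack described in the last comment, here is a hypothetical sketch of the general technique. Only the _Py_NOTIER2 spelling comes from the description; the guarded body is invented, not the actual ceval code touched by this PR.]

```c
/* Hypothetical sketch of gating code with a macro injected through the
 * build's compiler flags, e.g. by hard-coding "-D_Py_NOTIER2" into the
 * CFLAGS assembled by Makefile.pre.in. */
#include <stdio.h>

static void eval_frame_stub(void)
{
#ifndef _Py_NOTIER2
    /* Default build: the tier-one loop would check warm-up counters
     * here and be able to jump into the tier-two interpreter. */
    printf("tier two machinery present\n");
#else
    /* With -D_Py_NOTIER2, none of that code is compiled, so the
     * tier-one function shrinks and may optimize better. */
    printf("tier two compiled out\n");
#endif
}

int main(void)
{
    eval_frame_stub();
    return 0;
}
```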