Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
gh-107674: Improve performance of sys.settrace#117133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
gaogaotiantian commented Mar 21, 2024 • edited by bedevere-app bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-app bot
Uh oh!
There was an error while loading. Please reload this page.
gaogaotiantian commented Mar 27, 2024
Hi @markshannon , could you take a look at this PR when you have some time? It's almost identical to the optimization that's merged and reverted a while ago. The only difference is mentioned above. Thanks! |
brandtbucher commented Mar 29, 2024
I ran some benchmarks on this branch, and it doesn't seem that the "coverage" benchmark (currently our slowest vs. 3.12) is affected at all by this change. Is that expected? |
gaogaotiantian commented Mar 29, 2024
I believe coveragepy uses CTracer by default and the key improvement in this PR is to not instrument code while tracing. If the tracer is done completely in C, I don't believe there will be a significant change - because no Python code is executed during tracing. |
gaogaotiantian commented Apr 15, 2024
Hi @markshannon, I just fixed the conflict for the generated file. The test failure does not seem relevant. Anything you need me to do on this PR? This is basically the original PR that's reverted. |
gvanrossum commented Apr 15, 2024
I happened to look here and I've restarted the failed test -- it looks like a timing-sensitive test, so hopefully that will fix it. If it keeps failing with the same error ( |
gaogaotiantian commented Apr 15, 2024
It seemed like the test passed. I would be surprised if the changes to |
gaogaotiantian commented Apr 20, 2024
I have no idea why the test is failing, seems unrelated. |
markshannon commented Apr 26, 2024
I don't see the need for changes to
Or am I missing something? |
markshannon commented Apr 26, 2024
Now that I've read the original PR, it makes more sense why you want to change I'd be inclined to leave them alone though. For trace functions implemented in C, modifying |
gaogaotiantian commented Apr 26, 2024
The ability to not instrument the code object while tracing is a very significant improvement for Yes, this has no imapct on C functions, but that's for pure C trace functions. It does not have impact on our coverage benchmark because we only used the basic features. Even for coveragepy, there could be calls to a Python function inside the C tracer - which will benefit from this change. And coveragepy is not the only tool out in the market that uses When monitoring is off, there will be one extra branch in the common path ( And if that's still something bothers, we can do some smart circuit shorts to make it exactly the same in non-monitoring case: if (eval_breaker!=version){DEOPT_IF(tstate->tracing==0|| (eval_breaker&_PY_EVAL_EVENTS_MASK)) }That way, in the most common path, the code is exactly the same - most of the time without the monitoring |
markshannon commented Apr 26, 2024
Turning tracing on and off is going to be bad for performance no matter what. I'm not against the idea of not instrumenting tracing code, but I really don't want to negatively impact if (eval_breaker!=version){DEOPT_IF(tstate->tracing==0|| (eval_breaker&_PY_EVAL_EVENTS_MASK)) }adds bulk to the code generated by the JIT significantly. If |
gaogaotiantian commented Apr 26, 2024
It's not only about turning tracing on and off, it could be code crossing the tracing boundary. defgetname(o): returno.namedeftrace(*args): name=getname(o) returntracedeff(): name=getname(o) sys.settrace(trace) for_inrange(10): f()In the example above, If JIT is a concern, then I suppose we can put all the logic in |
markshannon commented Apr 29, 2024
Why will it be deinstrumented? |
markshannon commented Apr 29, 2024
I feel this is getting too complex for this one scenario. Perhaps we should allows tools to explicitly state which functions should not be instrumented? |
gaogaotiantian commented Apr 29, 2024
Ah I was wrong about the re-instrumenting. Okay so let's dial it back a little. Leave the I know The key thing I want to preserve here is to avoid instrumenting the code object when it's only used by tracing functions, that could save plenty of time. |
gaogaotiantian commented May 2, 2024
Okay I'm not sure if we have the time to merge this is - this now has conflicts with tier2 opcodes. There's an assertion in If that's still too much before the freeze, I'll live with only avoiding unnecessary calls to intrumented line callback - no changes to Basically I want to improve this in 3.13 and anything is better than nothing. |
markshannon commented May 3, 2024
I've fixed the assert. The assert was checking "is the instrumentation up to date?" Now it checks "is the instrumentation up to date, or are we tracing?". Which is what we want, I believe. |
gaogaotiantian commented May 3, 2024
Yes, changing the assertion would do as well. I'm not entirely familiar with tier2, but for |
markshannon commented May 3, 2024
For tier 2, Are you OK with me merging this now? |
gaogaotiantian commented May 3, 2024
Then yes I believe the code is in mergable state. Thanks! |
* Check tracing in RESUME_CHECK * Only change to RESUME_CHECK if not tracing
This PR is reverted in #116178 because of a refleak failure. The test was proved to be wrong(not useful anymore?) with
mainand was removed in #116687. So this PR is back!There is a slight change compared to the original PR thanks to @brandtbucher - in
RESUME_CHECK, we should not DEOPT when eval_breaker is different than version if we are tracing. If we are not tracing, that's fine, but if we are tracing, we shoud only check the events, not the code version - otherwise we will just specialize and deopt thatRESUMEinstruction everytime we execute it.This introduces an extra branch check in
RESUME_CHECK, which I think is fine (a check is introduced toRESUMEtoo). There are some techniques that we can use to slightly optimize the fast path (non-tracing) by checking theeval_breaker == versionfirst then check tracing. I don't think that helps a lot and it would make code a bit harder to read, but I can try it if that's desired.