add fast inference tutorial #1948
Conversation
yiheng-wang-nv commented Feb 28, 2025 • edited
Signed-off-by: Yiheng Wang <[email protected]>
Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.
for more information, see https://pre-commit.ci
ericspod commented Mar 2, 2025
This addresses #1865 I assume.
Signed-off-by: Yiheng Wang <[email protected]>
…g-nv/tutorials into add-infer-accelerate-tutorial
Signed-off-by: Yiheng Wang <[email protected]>
yiheng-wang-nv commented Mar 7, 2025
Signed-off-by: Yiheng Wang <[email protected]>
…g-nv/tutorials into add-infer-accelerate-tutorial
yiheng-wang-nv commented Mar 8, 2025
Signed-off-by: Yiheng Wang <[email protected]>
acceleration/fast_inference_tutorial/fast_inference_tutorial.ipynb (outdated)
@@ -0,0 +1,635 @@
{
KumoLiu commented Mar 19, 2025 • edited
Do you think we can also include the .nii.gz benchmark result in the notebook, since the original data is in .nii.gz format?
Hi @KumoLiu, thanks for the suggestion. .nii.gz files have to be decompressed on the CPU, so using GDS may not provide acceleration. I added a section that introduces the limitations of each feature; could you help review the updates? Thanks!
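The constraint described in this reply can be sketched as a small helper. This is hypothetical illustration code (`gds_applicable` is not a MONAI or cuFile API): GPUDirect Storage streams raw bytes directly into GPU memory, but gzip-compressed formats such as `.nii.gz` must first be decompressed on the CPU, so GDS cannot accelerate reading them.

```python
from pathlib import Path


def gds_applicable(path):
    """Hypothetical helper: return True if a file could plausibly benefit
    from GPUDirect Storage. Compressed .nii.gz files are decompressed on
    the CPU first, so GDS offers no acceleration for them."""
    return not "".join(Path(path).suffixes).endswith(".nii.gz")


print(gds_applicable("image.nii"))     # uncompressed NIfTI: GDS can help
print(gds_applicable("image.nii.gz"))  # compressed: CPU decompression dominates
```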
…pynb Co-authored-by: YunLiu <[email protected]> Signed-off-by: Yiheng Wang <[email protected]>
Signed-off-by: Yiheng Wang <[email protected]>
58d80e3 to 1dccf63
Nic-Ma left a comment
Thanks for adding the detailed tutorial, it overall looks good to me.
Do you plan to add the INT8/INT4 quantization in this PR or a separate PR later?
Thanks.
yiheng-wang-nv commented Mar 26, 2025
Hi @Nic-Ma, thanks for the suggestion. I think we can consider adding quantization in a separate PR. Before adding it, it may need some time to:
Nic-Ma commented Mar 26, 2025
Plan sounds good to me. Thanks.
```bash
for benchmark_type in "original" "trt" "trt_gpu_transforms" "trt_gds_gpu_transforms"; do
    python run_benchmark.py --benchmark_type "$benchmark_type"
done
```
You could instead put this into a cell with %%bash at the top to allow users to run the commands, or you could do it in Python more directly for those who don't have bash:
```python
for benchmark_type in ("original", "trt", "trt_gpu_transforms", "trt_gds_gpu_transforms"):
    !python run_benchmark.py --benchmark_type {benchmark_type}
```
You should also state here that the script contains the same code as what's in this notebook and that running it will generate a CSV with the results for each type, but that if users want to run the benchmark here in the notebook, they can run the following cell with the commented lines uncommented.
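A dry-run sketch of the Python alternative suggested in this review (assuming a `run_benchmark.py` script that accepts a `--benchmark_type` flag, as in the tutorial; this version only builds and prints the commands rather than executing them, so it works outside a notebook too):

```python
import shlex

# The four configurations benchmarked by the tutorial's loop.
benchmark_types = ("original", "trt", "trt_gpu_transforms", "trt_gds_gpu_transforms")

# Build the equivalent shell commands; in a notebook each could be run with
# IPython's `!{cmd}` syntax or `subprocess.run(shlex.split(cmd), check=True)`.
commands = [
    f"python run_benchmark.py --benchmark_type {shlex.quote(t)}"
    for t in benchmark_types
]

for cmd in commands:
    print(cmd)
```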
ericspod commented Mar 28, 2025
I've looked at the tutorial and it all looks good to me; however, I am wondering about what the results show. It seems to me that GDS has the most impact, so the example is simply IO bound, and using TRT or not has little effect. This is good for demonstrating how to overcome such issues, but the model is so small that inference itself barely registers in the benchmarks you're showing. If you used a much larger model with many more parameters, the actual inference time would be significant. Since the inference results aren't evaluated, you could just use a randomly initialised model so you don't need to load pre-trained weights. Thoughts?
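The IO-bound argument in this comment is essentially Amdahl's law; a toy calculation with made-up timings (illustrative numbers only, not from the tutorial's benchmarks) shows why speeding up a small model barely moves the end-to-end number when data loading dominates:

```python
# Illustrative (not measured) per-sample timings for an IO-bound benchmark:
io_time = 0.90      # data loading + host-to-device transfer, seconds
infer_time = 0.10   # forward pass of a small model, seconds

total = io_time + infer_time
infer_fraction = infer_time / total

# Halving inference time (e.g. via TensorRT) barely changes the total,
# while cutting the IO cost (e.g. via GDS) dominates the overall speedup.
speedup_fast_model = total / (io_time + infer_time / 2)
speedup_fast_io = total / (io_time / 3 + infer_time)

print(f"inference share: {infer_fraction:.0%}")
print(f"2x faster model -> {speedup_fast_model:.2f}x overall")
print(f"3x faster IO    -> {speedup_fast_io:.2f}x overall")
```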
yiheng-wang-nv commented Apr 11, 2025
Thanks @ericspod for the suggestions. I will use a more suitable model to show these features and then update the PR.
ericspod commented Jun 27, 2025
Hi @yiheng-wang-nv, we'd like to get this tutorial through; is there any progress on using a different model to better demonstrate the speedup? Thanks!
yiheng-wang-nv commented Aug 25, 2025
Hi @ericspod, thanks for the notice. Sorry for the late reply; I will make some updates later.
Part of #1865.
Description
A few sentences describing the changes proposed in this pull request.
Checks
./figure folder
./runner.sh -t <path to .ipynb file>