
@bowang007
Collaborator

Description

This PR adds a simple example of using the Accelerate library for data-parallel inference.

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified


@github-actions bot left a comment

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/examples/distributed_inference/data_parallel_gpt2.py 2024-05-02 00:29:27.054073+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/distributed_inference/data_parallel_gpt2.py 2024-05-02 00:31:18.785078+00:00
@@ -13,12 +13,26 @@
 distributed_state = PartialState()
 model = GPT2LMHeadModel.from_pretrained("gpt2").eval().to(distributed_state.device)
-model.forward = torch.compile(model.forward, backend="torch_tensorrt", options={"truncate_long_and_double": True, "enabled_precisions":{torch.float16}, "debug": True}, dynamic=False,)
+model.forward = torch.compile(
+    model.forward,
+    backend="torch_tensorrt",
+    options={
+        "truncate_long_and_double": True,
+        "enabled_precisions":{torch.float16},
+        "debug": True,
+    },
+    dynamic=False,
+)
 with distributed_state.split_between_processes([input_id1, input_id2]) as prompt:
     cur_input = torch.clone(prompt[0]).to(distributed_state.device)
-    gen_tokens = model.generate(cur_input, do_sample=True, temperature=0.9, max_length=100,)
+    gen_tokens = model.generate(
+        cur_input,
+        do_sample=True,
+        temperature=0.9,
+        max_length=100,
+    )
     gen_text = tokenizer.batch_decode(gen_tokens)[0]
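
The script in this diff uses `split_between_processes` to hand each process its own share of the prompt list. As a rough illustration of that idea only, here is a pure-Python sketch of contiguous sharding by rank. The helper `split_for_rank` and the sample prompts are hypothetical, and this is an approximation of the concept, not Accelerate's implementation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_for_rank(items: Sequence[T], rank: int, world_size: int) -> List[T]:
    """Return the contiguous slice of `items` owned by process `rank`.

    Hypothetical helper for illustration; not part of Accelerate.
    """
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    base, extra = divmod(len(items), world_size)
    # The first `extra` ranks take one additional item each.
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return list(items[start:end])


# Three made-up prompts shared between two processes.
prompts = ["prompt 0", "prompt 1", "prompt 2"]
shards = [split_for_rank(prompts, rank, 2) for rank in range(2)]
print(shards)
```

With two inputs and two processes, as in the example script, each rank would receive exactly one prompt, which is why the script reads `prompt[0]`.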

@bowang007 changed the title from "feat: data parallel inference sample" to "feat: data parallel inference examples" on May 2, 2024
Collaborator

@narendasan left a comment

  1. Need a requirements.txt.
  2. Annotate the script with a description of what's happening (see https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/torch_compile_advanced_usage.py).
  3. Add a reference to index.rst so that it gets rendered in the docs:
    tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion

@github-actions bot added the documentation label on May 7, 2024
Collaborator

@narendasan left a comment

LGTM

@bowang007 merged commit db24b3b into main on May 17, 2024
@HolyWu
Contributor

@bowang007 You didn't properly clean up the merge conflicts, so db24b3b left <<<<<<< HEAD, ======= and >>>>>>> dfbf6ea84 (feat: data parallel inference sample) in docsrc/index.rst.
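
Leftover markers like these can be caught before merging with a small check. Below is a minimal sketch, assuming the markers sit at the start of a line as Git writes them. The function `find_conflict_markers` and the sample text (including the `entry_a` and `entry_b` placeholders) are hypothetical and are not taken from the repository.

```python
from typing import List, Tuple


def find_conflict_markers(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for leftover Git conflict markers.

    A bare "=======" line is reported only inside a "<<<<<<<" block,
    because reStructuredText also uses runs of "=" as heading underlines.
    """
    hits: List[Tuple[int, str]] = []
    in_conflict = False
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("<<<<<<< "):
            in_conflict = True
            hits.append((number, line))
        elif line.startswith(">>>>>>> "):
            in_conflict = False
            hits.append((number, line))
        elif in_conflict and line == "=======":
            hits.append((number, line))
    return hits


# Made-up snippet shaped like an index.rst with an unresolved conflict.
sample = "\n".join(
    [
        "Tutorials",
        "=========",
        "<<<<<<< HEAD",
        "   entry_a",
        "=======",
        "   entry_b",
        ">>>>>>> dfbf6ea84 (feat: data parallel inference sample)",
    ]
)
markers = find_conflict_markers(sample)
for number, line in markers:
    print(f"line {number}: {line}")
```

The heading underline on line 2 of the sample is not reported, since it falls outside any conflict block.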

bowang007 added a commit that referenced this pull request May 17, 2024
peri044 pushed a commit that referenced this pull request May 21, 2024
laikhtewari pushed a commit that referenced this pull request May 24, 2024

Labels

cla signed, documentation (Improvements or additions to documentation)
