Pull requests: vllm-project/vllm
Pull requests list
[MoE][Refactor 1/N] Separate Online Quantization
#30627 opened Dec 13, 2025 by robertgshaw2-redhat
5 tasks
[docker] Restructure Dockerfile for more efficient and cache-friendly builds
Labels: ci/build, documentation, ready
#30626 opened Dec 13, 2025 by amrmahdi
fix: prevent reasoning output when enable_thinking is false
Labels: frontend
#30625 opened Dec 13, 2025 by llsj14
5 tasks
[CI/Build] Ignore max transformers version skipping for initialization tests
Labels: ready
#30619 opened Dec 13, 2025 by Isotr0py
1 of 5 tasks
[BugFix][Hybrid] Fix prefill chunk incorrectly including draft tokens
Labels: v1
#30618 opened Dec 13, 2025 by peakcrosser7
3 of 5 tasks
[Docs] Add FlashInfer environment variables to env_vars documentation
Labels: documentation
#30616 opened Dec 13, 2025 by majiayu000
2 tasks done
[Feature] Default EPLB num_redundant_experts to minimum valid value
#30614 opened Dec 13, 2025 by majiayu000
2 tasks done
[Bugfix] Add validation for tool requests when tool_parser is unavailable
Labels: frontend
#30613 opened Dec 13, 2025 by majiayu000
2 tasks done
[Chore] Remove redundant RequestPrompt
Labels: frontend, ready
#30612 opened Dec 13, 2025 by DarkLight1337
5 tasks
[ROCm][Perf] Replace cat to bmm's inplace write when aiter enabled
Labels: rocm, v1
#30611 opened Dec 13, 2025 by ganyi1996ppo
5 tasks
[FixBug]fix gpt-oss v1/completions response bug
Labels: frontend, gpt-oss, tool-calling
#30608 opened Dec 13, 2025 by princepride
3 of 5 tasks
[Bugfix] Improve DCP error hint in cp_utils
Labels: v1
#30607 opened Dec 13, 2025 by jliu9515
3 of 5 tasks
[Bugfix] Fix ScalarType NanRepr enum comparisons
#30605 opened Dec 13, 2025 by NoonePauseferg
3 of 5 tasks
[Quantization] Pass QuantizationArgs to compress-tensors schema's get_min_capability
#30602 opened Dec 13, 2025 by Isotr0py
3 of 5 tasks
[LoRA] Set default MXFP4 LoRA backend to Marlin
#30598 opened Dec 13, 2025 by xyang16
5 tasks
[Bugfix][benchmarks] Fix input token calculation for rerank benchmark metrics
Labels: performance
#30596 opened Dec 13, 2025 by Flink-ddd
[docs][fix] Update Arm CPU vLLM wheel installation docs
Labels: documentation
#30594 opened Dec 13, 2025 by fadara01
5 tasks
[Misc] Improve error messages for unsupported types and parameters
Labels: kv-connector, nvidia, performance
#30593 opened Dec 13, 2025 by BlankRH
3 of 5 tasks
scheduler: cap prefill token admission under backlog to reduce tail latency
Labels: v1
#30592 opened Dec 13, 2025 by Benjamindaoson
5 tasks
Fix edge case Mistral tool parser
Labels: frontend, tool-calling
#30588 opened Dec 13, 2025 by joa-stdn
[Bugfix] Record request stats when request is aborted by client
Labels: v1
#30587 opened Dec 13, 2025 by pooyadavoodi