Pull requests: ggml-org/llama.cpp
Pull requests list
Build CUDA architectures 120 and 121 by default (RTX5000 and GB10)
Labels: ggml, Nvidia GPU
#17970 opened Dec 12, 2025 by DaAwesomeP
mtmd, llama: add GLM4V vision-language model support
Labels: examples, ggml, model, Nvidia GPU, python
#17967 opened Dec 12, 2025 by eelbaz
CUDA: experimental native mxfp4 support for blackwell [WIP]
Labels: ggml, Nvidia GPU
[DRAFT] CUDA: Improve performance via less synchronizations between token
Labels: ggml, Nvidia GPU
Fix too stringent check on CUDA "fast copy" (can_be_transposed) condition and extend with one more case
Labels: ggml, Nvidia GPU, testing
#17759 opened Dec 4, 2025 by bssrdf
model : add ASR support for LFM2-Audio-1.5B
Labels: examples, ggml, Nvidia GPU, python, testing
Add Support for Microsoft Phi-3.5 Vision Instruct Models
Labels: Apple Metal, Ascend NPU, build, devops, documentation, examples, ggml, Nvidia GPU, python, script, server, testing, Vulkan
Feature/kimi linear support
Labels: ggml, model, Nvidia GPU, python
#17592 opened Nov 29, 2025 by cacaview
Add PagedAttention support (experimental, CUDA only)
Labels: examples, ggml, model, Nvidia GPU, server
#17579 opened Nov 28, 2025 by ericcurtin • Draft
HIP: Add RDNA3 WMMA support to MMF
Labels: ggml, Nvidia GPU
#17495 opened Nov 25, 2025 by unverbraucht
mtmd: Add DeepSeekOCR Support
Labels: examples, ggml, model, Nvidia GPU, python
#17400 opened Nov 20, 2025 by sfallah
ggml : enhance rel-pos and window ops with CUDA support
Labels: ggml, Nvidia GPU, testing
#17383 opened Nov 19, 2025 by bluebread
CUDA: add CONV_3D operator support
Labels: documentation, ggml, Nvidia GPU
#17255 opened Nov 14, 2025 by YaelGitAccount
CUDA & CPU: support F32 kernel type for CONV_TRANSPOSE_2D
Labels: ggml, Nvidia GPU, testing
#17094 opened Nov 8, 2025 by AgainstEntropy
sampling : add support for backend sampling
Labels: Apple Metal, build, examples, ggml, Nvidia GPU, python, server, testing
#17004 opened Nov 4, 2025 by danbev
17 of 25 tasks
Mamba2 SSD
Labels: Apple Metal, examples, ggml, model, Nvidia GPU, testing
#16982 opened Nov 3, 2025 by gabe-l-hart • Draft
CUDA: add implicit conv3d
Labels: ggml, Nvidia GPU, testing
#16948 opened Nov 2, 2025 by bssrdf
Enable CUDA graphs for embed gemma 300m
Labels: ggml, Nvidia GPU
#16844 opened Oct 29, 2025 by ArshM17-NV
Add basic support for MXFP6_MOE quantization
Labels: examples, ggml, Nvidia GPU, python
#16777 opened Oct 26, 2025 by horasal
Implement and use cuda graph plans
Labels: ggml, Nvidia GPU
#16548 opened Oct 13, 2025 by wishstudio
Add hipblasLt implementation for batched gemm to improve performance for CDNA3 only
Labels: ggml, Nvidia GPU
#16457 opened Oct 7, 2025 by peizhang56
ggml-cuda: Vulkan direct conv 2D ported to CUDA
Labels: ggml, Nvidia GPU
#16088 opened Sep 18, 2025 by etasnadi
Deterministic inference mode (CUDA): RMSNorm, MatMul, Attention, KV-cache
Labels: documentation, ggml, Nvidia GPU, script, testing
--numa mirror: mirror model weights to every Numa node in the system
Labels: Apple Metal
CUDA: print CUDART_VERSION on init
Labels: ggml, Nvidia GPU
#15853 opened Sep 7, 2025 by JohannesGaessler