ggml-org/llama.cpp (Public)


Pull requests: ggml-org/llama.cpp

Labels 86 Milestones 0


43 Open 519 Closed


Pull requests list

Build CUDA architectures 120 and 121 by default (RTX5000 and GB10)
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#17970 opened Dec 12, 2025 by DaAwesomeP

mtmd, llama: add GLM4V vision-language model support
Labels: examples; ggml; model (Model specific); Nvidia GPU; python (python script changes)
#17967 opened Dec 12, 2025 by eelbaz

CUDA: experimental native mxfp4 support for blackwell [WIP]
Labels: ggml, Nvidia GPU
#17906 opened Dec 10, 2025 by am17an • Draft • 1 of 2 tasks

[DRAFT] CUDA: Improve performance via less synchronizations between token
Labels: ggml, Nvidia GPU
#17795 opened Dec 5, 2025 by aendk • Draft

Fix too stringent check on CUDA "fast copy" (can_be_transposed) condition and extend with one more case
Labels: ggml; Nvidia GPU; testing (Everything test related)
#17759 opened Dec 4, 2025 by bssrdf

model : add ASR support for LFM2-Audio-1.5B
Labels: examples, ggml, Nvidia GPU, python, testing
#17694 opened Dec 2, 2025 by tdakhran • Draft

Add Support for Microsoft Phi-3.5 Vision Instruct Models
Labels: Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)); Ascend NPU (issues specific to Ascend NPUs); build (Compilation issues); devops (improvements to build systems and github actions); documentation (Improvements or additions to documentation); examples; ggml; Nvidia GPU; python; script (Script related); server; testing; Vulkan (Issues specific to the Vulkan backend)
#17687 opened Dec 2, 2025 by z-manoj • Draft

Feature/kimi linear support
Labels: ggml, model, Nvidia GPU, python
#17592 opened Nov 29, 2025 by cacaview

Add PagedAttention support (experimental, CUDA only)
Labels: examples, ggml, model, Nvidia GPU, server
#17579 opened Nov 28, 2025 by ericcurtin • Draft

HIP: Add RDNA3 WMMA support to MMF
Labels: ggml, Nvidia GPU
#17495 opened Nov 25, 2025 by unverbraucht

mtmd: Add DeepSeekOCR Support
Labels: examples, ggml, model, Nvidia GPU, python
#17400 opened Nov 20, 2025 by sfallah

ggml : enhance rel-pos and window ops with CUDA support
Labels: ggml, Nvidia GPU, testing
#17383 opened Nov 19, 2025 by bluebread

CUDA: add CONV_3D operator support
Labels: documentation, ggml, Nvidia GPU
#17255 opened Nov 14, 2025 by YaelGitAccount

CUDA & CPU: support F32 kernel type for CONV_TRANSPOSE_2D
Labels: ggml, Nvidia GPU, testing
#17094 opened Nov 8, 2025 by AgainstEntropy

sampling : add support for backend sampling
Labels: Apple Metal, build, examples, ggml, Nvidia GPU, python, server, testing
#17004 opened Nov 4, 2025 by danbev • 17 of 25 tasks

Mamba2 SSD
Labels: Apple Metal, examples, ggml, model, Nvidia GPU, testing
#16982 opened Nov 3, 2025 by gabe-l-hart • Draft

CUDA: add implicit conv3d
Labels: ggml, Nvidia GPU, testing
#16948 opened Nov 2, 2025 by bssrdf

Enable CUDA graphs for embed gemma 300m
Labels: ggml, Nvidia GPU
#16844 opened Oct 29, 2025 by ArshM17-NV

Add basic support for MXFP6_MOE quantization
Labels: examples, ggml, Nvidia GPU, python
#16777 opened Oct 26, 2025 by horasal

Implement and use cuda graph plans
Labels: ggml, Nvidia GPU
#16548 opened Oct 13, 2025 by wishstudio

Add hipblasLt implementation for batched gemm to improve performance for CDNA3 only
Labels: ggml, Nvidia GPU
#16457 opened Oct 7, 2025 by peizhang56

ggml-cuda: Vulkan direct conv 2D ported to CUDA
Labels: ggml, Nvidia GPU
#16088 opened Sep 18, 2025 by etasnadi

Deterministic inference mode (CUDA): RMSNorm, MatMul, Attention, KV-cache
Labels: documentation, ggml, Nvidia GPU, script, testing
#16016 opened Sep 15, 2025 by creatorrr • Draft

--numa mirror: mirror model weights to every Numa node in the system
Labels: Apple Metal; Ascend NPU; devops; examples; ggml; IBM zDNN (issues specific to IBM zDNN Accelerator); Nvidia GPU; OpenCL (Issues specific to the OpenCL backend); python; SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language); testing; Vulkan
#16000 opened Sep 15, 2025 by dbsanfte • Draft

CUDA: print CUDART_VERSION on init
Labels: ggml, Nvidia GPU
#15853 opened Sep 7, 2025 by JohannesGaessler
