Pull requests: ggml-org/llama.cpp

616 Open 8,074 Closed

Pull requests list

Modern Bert Support
Labels: model (Model specific), python (python script changes)
#15641 opened Aug 28, 2025 by ryan-mangeno

llama : add llama_batch_ext
Labels: android (Issues specific to Android), examples, python (python script changes), server
#11875 opened Feb 14, 2025 by ngxson

sampling : add support for backend sampling
Labels: Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)), build (Compilation issues), examples, ggml (changes relating to the ggml tensor library for machine learning), Nvidia GPU (Issues specific to Nvidia GPUs), python (python script changes), server, testing (Everything test related)
#17004 opened Nov 4, 2025 by danbev
17 of 25 tasks

llama: Attempt to add ModernBert
Labels: model (Model specific), python (python script changes)
#14014 opened Jun 4, 2025 by huydt84

add FP8 support to gguf/llama:
Labels: build (Compilation issues), examples, ggml (changes relating to the ggml tensor library for machine learning), script (Script related), Tensor Encoding Scheme (https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes), testing (Everything test related)
#10055 opened Oct 26, 2024 by Djip007 • Draft
1 of 3 tasks

mtmd: Add DeepSeekOCR Support
Labels: examples, ggml (changes relating to the ggml tensor library for machine learning), model (Model specific), Nvidia GPU (Issues specific to Nvidia GPUs), python (python script changes)
#17400 opened Nov 20, 2025 by sfallah

Implement SparseK Attention mechanism — new GGML operator with CPU backend (GPU planned next)
Labels: ggml (changes relating to the ggml tensor library for machine learning), python (python script changes), testing (Everything test related)
#16817 opened Oct 28, 2025 by yael-works

[Research] Steering vectors
Labels: research 🔬
#1472 opened May 16, 2023 by SlyEcho • Draft

tool: add convertation of text/parquet to custom format
Labels: build (Compilation issues), examples
#14622 opened Jul 10, 2025 by lexasub

model : add LLADA 2.0 diffusion support
Labels: examples, model (Model specific), python (python script changes)
#17454 opened Nov 23, 2025 by wsbagnsv1 • Draft

Feature/kimi linear support
Labels: ggml (changes relating to the ggml tensor library for machine learning), model (Model specific), Nvidia GPU (Issues specific to Nvidia GPUs), python (python script changes)
#17592 opened Nov 29, 2025 by cacaview

imatrix: calculate activation-based statistics for new format (GGUF) imatrices
Labels: examples
#14891 opened Jul 26, 2025 by EAddario

Implementation of a sequence repetition penalty sampler
Labels: enhancement (New feature or request), generation quality (Quality of model output), need feedback (Testing and feedback with results are needed)
#2593 opened Aug 12, 2023 by KerfuffleV2 • Draft

llama : second attempt to refactor vision API
Labels: examples, python (python script changes), server
#11292 opened Jan 18, 2025 by ngxson • Draft
1 of 5 tasks

llama-cli: add support for reasoning
Labels: examples
#16603 opened Oct 16, 2025 by bandoti

WIP: Add model merge example
Labels: demo (Demonstrate some concept or idea, not intended to be merged), help wanted (Needs help from the community)
#5741 opened Feb 26, 2024 by ngxson • Draft

cuda : Add conv2d Implicit GEMM
Labels: ggml (changes relating to the ggml tensor library for machine learning), Nvidia GPU (Issues specific to Nvidia GPUs), testing (Everything test related)
#15805 opened Sep 4, 2025 by bssrdf

[MPI] Add support for per-node options, thread counts, and layer allocations
Labels: build (Compilation issues), examples, ggml (changes relating to the ggml tensor library for machine learning), server
#3334 opened Sep 26, 2023 by AutonomicPerfectionist • Draft
2 of 5 tasks

Update gpt2 preprocess and add deepseek coder preprocess

#4070 opened Nov 14, 2023 by DOGEwbx

Implement llama-pull tool
Labels: examples
#16423 opened Oct 4, 2025 by ericcurtin

Generic Chat templating code with text/json file based config; main chat updated to drive its in-prefix, in-suffix and reverse-prompt from same; chat-apply-template equivalent c-api to allow use by other codes also
Labels: enhancement (New feature or request), Review Complexity : Medium (Generally require more time to grok but manageable by beginner to medium expertise level)
#6834 opened Apr 22, 2024 by hanishkvc • Draft

[Review] Merge PowerInfer with llama.cpp mainline

#4543 opened Dec 20, 2023 by chsasank • Draft

support MiniCPM-V-2
Labels: demo (Demonstrate some concept or idea, not intended to be merged), enhancement (New feature or request), examples, python (python script changes), Review Complexity : High (Generally require indepth knowledge of LLMs or GPUs)
#6919 opened Apr 26, 2024 by Achazwl

Layer skipping/self-speculation demo
Labels: demo (Demonstrate some concept or idea, not intended to be merged), research 🔬
#3565 opened Oct 10, 2023 by KerfuffleV2 • Draft

Server: enable lookup decoding
Labels: enhancement (New feature or request), examples, Review Complexity : Medium (Generally require more time to grok but manageable by beginner to medium expertise level)
#6828 opened Apr 22, 2024 by JohannesGaessler
