Pull requests: ggml-org/llama.cpp
ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations
#17977 opened Dec 12, 2025 by ngdxzy
common : add llama-completion to completion-bash executables
#17976 opened Dec 12, 2025 by CISC
common : skip model validation when --completion-bash is requested
#17975 opened Dec 12, 2025 by CISC
llama_context: synchronize before reallocating output buffer
#17974 opened Dec 12, 2025 by jeffbolznv
cmake: correct scope - link ws2_32 for MinGW/w64devkit builds in cpp-httplib
#17972 opened Dec 12, 2025 by gustrd
Build CUDA architectures 120 and 121 by default (RTX5000 and GB10) [ggml, Nvidia GPU]
#17970 opened Dec 12, 2025 by DaAwesomeP
mtmd: fix GLM4V vision encoder 2D RoPE implementation [examples, ggml, model, Nvidia GPU, python]
#17967 opened Dec 12, 2025 by eelbaz
server: support global section of presets [examples, server]
#17959 opened Dec 12, 2025 by ngxson
server: add encoder-decoder model support (T5, BART, MADLAD) [examples, server]
#17956 opened Dec 12, 2025 by Turee
scripts: add script to compare logprobs of llama.cpp against other frameworks [python, script]
#17947 opened Dec 11, 2025 by ngxson
vulkan: Add perf logger mode with concurrency [ggml, Vulkan]
#17944 opened Dec 11, 2025 by jeffbolznv
vulkan: support get_rows for i32 [ggml, Vulkan]
#17941 opened Dec 11, 2025 by jeffbolznv
CANN: CONV_TRANSPOSE_1D operator: supporting the cases where (op->src[0]->ne[0] - 1) > 255 [Ascend NPU, ggml]
#17934 opened Dec 11, 2025 by Intellouis
Webui: Disable attachment button and model selector button when prompt textbox is disabled. [examples, server]
#17925 opened Dec 11, 2025 by dariusjlukas
Gigachat 3 tool parser and tests [testing]
#17924 opened Dec 11, 2025 by Mishusha
ggml-hexagon: gelu operation [ggml]
#17921 opened Dec 10, 2025 by joeldushouyu (Draft)
Restore clip's cb() to its rightful glory - extract common debugging elements in llama [examples]
#17914 opened Dec 10, 2025 by pwilkin
Make LlamaData utility functions static in llama-run [examples]
#17913 opened Dec 10, 2025 by rauletorresc
server: fix crash when batch > ubatch with embeddings (#12836) [examples, server]
#17912 opened Dec 10, 2025 by yifant-code