Pull requests: auroralabs-loci/llama.cpp
Pull requests list
UPSTREAM PR #20019: gguf: add big-endian magic "FUGG" for explicit endianness detection
#1218
opened Mar 2, 2026 by
loci-dev
UPSTREAM PR #20009: server: add Qwen3-Reranker instruction support
#1217
opened Mar 2, 2026 by
loci-dev
UPSTREAM PR #20023: gguf-py: add type validation to GGUFWriter.add_key_value
#1216
opened Mar 2, 2026 by
loci-dev
UPSTREAM PR #12727: Update llama-quant.cpp llama_tensor_get_type with DeepSeek friendly modifications
#1215
opened Mar 2, 2026 by
loci-dev
UPSTREAM PR #19999: cuda: fix grid.y overflow in non-contiguous dequantize/convert kernels
#1214
opened Mar 1, 2026 by
loci-dev
UPSTREAM PR #19976: vulkan: improve partial offloading performance on AMD
#1213
opened Mar 1, 2026 by
loci-dev
UPSTREAM PR #19966: cuda: fix ggml_cuda_cpy crash on partial GPU offload
#1211
opened Feb 28, 2026 by
loci-dev
UPSTREAM PR #19850: ggml-webgpu: Support non-contiguous src0 and overlapping src0/src1 in binary ops
#1209
opened Feb 28, 2026 by
loci-dev
UPSTREAM PR #19770: quantize : fail-early on missing imatrix; refactor + optimize
#1208
opened Feb 28, 2026 by
loci-dev
UPSTREAM PR #19841: server : add chat truncation to keep chat going
#1207
opened Feb 27, 2026 by
loci-dev
5 of 6 tasks
UPSTREAM PR #19873: Mirroring /v1/responses to /responses
#1206
opened Feb 27, 2026 by
loci-dev
UPSTREAM PR #19904: ggml : simplify amx tile config init
#1205
opened Feb 26, 2026 by
loci-dev
UPSTREAM PR #19867: ggml : fix AMX and improve alignment checks
#1204
opened Feb 25, 2026 by
loci-dev
UPSTREAM PR #19797: common : add more aliases for sampler CLI params
#1203
opened Feb 25, 2026 by
loci-dev
UPSTREAM PR #19796: Add model metadata loading from huggingface for use with tests requiring real model data
#1201
opened Feb 23, 2026 by
loci-dev
UPSTREAM PR #19819: hexagon refactor all Ops to use local context struct
#1200
opened Feb 23, 2026 by
loci-dev
UPSTREAM PR #19791: tools : add learning-cache tool for persistent latent context
#1199
opened Feb 22, 2026 by
loci-dev
7 of 10 tasks
UPSTREAM PR #19785: jinja: correct stats for tojson and string filters
#1198
opened Feb 22, 2026 by
loci-dev
UPSTREAM PR #19773: server : merge contiguous Responses input items into a single assistant message
#1196
opened Feb 21, 2026 by
loci-dev
UPSTREAM PR #19750: feat: Ultra-Low-Bit Quantization Kernels (Q1_5_K, Q2_K_S)
#1195
opened Feb 21, 2026 by
loci-dev
UPSTREAM PR #19769: WIP: ggml : add NVFP4 quantization type support
#1194
opened Feb 21, 2026 by
loci-dev