Releases: ggml-org/llama.cpp
b7541
b07cda6 CANN: implement the SSM_CONV operator (#17737)
- CANN: implement SSM_CONV operator
Co-authored-by: Aleksei Lobanov, zeromarblectm@gmail.com
Co-authored-by: Sujin Kang, waterjin326@gmail.com
- CANN: remove custom error limit for SSM_CONV
- CANN: merge SSM_CONV tensor shape/strides into one line
Co-authored-by: Sujin Kang, waterjin326@gmail.com
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
- sha256:8c79a9b226de4b3cacfd1f83d24f962d0773be79f1e7b75c6af4ded7e32ae1d6 (373 MB, 2025-12-26T01:54:41Z)
- sha256:f96935e7e385e3b2d0189239077c10fe8fd7e95690fea4afec455b1b6c7e3f18 (384 MB, 2025-12-26T01:54:49Z)
- sha256:832d4fbfb992f1c58cde7117b6321cadd04b3604c90be8dfd079cd0e35022c65 (41.9 MB, 2025-12-26T01:54:56Z)
- sha256:a2c520b3c8b01596c0f7621b8fcf0561322ae045867bc9b2bc8d11af1d66e02d (45.9 MB, 2025-12-26T01:54:58Z)
- sha256:6efb324760781f214f9693df53c139b41989e53ffb27cca3ce9dec2c6d8a2fe2 (41.9 MB, 2025-12-26T01:55:00Z)
- sha256:acaa6e21fbc8c10ceef14d5ddb833c7ec236d35b242f25efaafd71c668c63d0e (45.9 MB, 2025-12-26T01:55:01Z)
- sha256:4731cce7822c30bbae794f4af00a3b751513b07d5a56389fe2687fcc0be3f147 (15.9 MB, 2025-12-26T01:55:02Z)
- sha256:e140540c34d0918ab4ec9b011d1c5062dd38e28465cca51443752775f85382ad (41 MB, 2025-12-26T01:55:03Z)
- sha256:3482a8ea14f28b2a4199cefb2586a66d3452f1dbcea6bb02f8f17626841393f4 (21.3 MB, 2025-12-26T01:55:05Z)
- sha256:3b7a1335f38fe8f1e5b66668a1c54e86d16a731d98445ed7abb45b1cb4cc4beb (33.1 MB, 2025-12-26T01:55:06Z)
- 2025-12-26T01:12:04Z
- 2025-12-26T01:12:04Z
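The assets of this release are listed with `sha256:` digests, so a downloaded file can be checked against its listed value before use. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_listed_digest` are hypothetical, and the file names of the assets are not shown on this page, so the path in the usage note is a placeholder.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_digest(path, listed):
    """Compare a file against a digest given as 'sha256:<hex>' or as bare hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return sha256_of(path) == expected
```

Usage would look like `matches_listed_digest("path/to/downloaded-asset", "sha256:...")` with the digest copied from the list above; a `False` result suggests a corrupted or mismatched download.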
b7540
85c40c9 ggml-cuda: fix regex for arch list (#18371)
- ggml-cuda: fix regex for arch list
- make regex exact
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7539
83b3b1c cuda: optimize cumsum cub path (#18362)
- cuda: optimize cumsum cub path
- remove heavy perf test
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7538
b0fb0f0 ggml-cuda: fix blackwell native builds (#18361)
- ggml-cuda: fix blackwell native builds
Replace 12x in native architectures by 12xa
- replace for GGML_NATIVE=OFF too
- only replace for native
- remove 120f-virtual for default compilation
Co-authored-by: Aman Gupta
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7531
54132f1 model : support for LlamaBidirectionalModel architecture (#18220)
- model: llama-embed-nemotron
- minor: python lint
- changed arch-name
- templated llm_build_llama to be used for both llama and llama-embed arch
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7530
2a9ea20 vulkan: fix command buffer corruption in ggml_backend_vk_event_wait (#18302)
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7529
ce7a6dc CANN : refactor ACL graph cache (#17752)
Move the graph property checking code into methods of LRU cache.
Signed-off-by: Wang Weixuan wangweixvan@gmail.com
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7527
7f459c9 vulkan: use fewer FA rows for small cache runs (#18280)
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7526
cf2ffc0 CANN: Uses yarn_ramp cache in ROPE (#17725)
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
Assets 22
b7525
10355dc common: add LLAMA_ARG_OVERRIDE_TENSOR env var for -ot arg (#18267)
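This release lets the `-ot` argument be supplied through the `LLAMA_ARG_OVERRIDE_TENSOR` environment variable. The general pattern of an environment-variable fallback for a command-line option can be sketched as below. This is an illustration in Python, not llama.cpp's own argument parser: the function name `resolve_override_tensor` is hypothetical, the long option name and the `exps=CPU` value are only example inputs, and the rule that an explicit flag takes precedence over the variable is an assumption of this sketch that the release note does not state.

```python
import argparse
import os

ENV_NAME = "LLAMA_ARG_OVERRIDE_TENSOR"  # name taken from the release title


def resolve_override_tensor(argv, environ=None):
    """Return the -ot value from argv, falling back to the environment variable.

    Assumption of this sketch: a flag given on the command line wins over the
    environment variable; None is returned when neither is set.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(add_help=False)
    # The environment value becomes the default, so it only applies
    # when the flag is absent from argv.
    parser.add_argument("-ot", "--override-tensor", default=environ.get(ENV_NAME))
    args, _unknown = parser.parse_known_args(argv)
    return args.override_tensor
```

With this sketch, `resolve_override_tensor([], {"LLAMA_ARG_OVERRIDE_TENSOR": "exps=CPU"})` yields the value from the environment, while passing `["-ot", "..."]` in `argv` yields the command-line value instead.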
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: