Releases · ggml-org/llama.cpp · GitHub
12 Oct 06:36
12 Oct 06:19
12 Oct 06:16
11 Oct 19:50
11 Oct 15:48
11 Oct 15:14
11 Oct 11:19
10 Oct 19:40
10 Oct 15:13
10 Oct 15:00
b6739
41aac5c
This commit was created on GitHub.com and signed with GitHub’s verified signature.
ggml : Fix FP16 ELU positive branch (#16519)
Co-authored-by: Aaron <shelhamer.aaron@gmail.com>
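The release note does not say what was wrong with the positive branch, so the following is only a reference sketch of ELU itself (with alpha assumed to be 1), written in plain `float` rather than FP16. The name `elu_ref` is hypothetical and is not a ggml function.

```cpp
#include <cmath>

// Reference ELU, assuming alpha = 1:
//   positive inputs pass through unchanged (the "positive branch"),
//   non-positive inputs map to exp(x) - 1.
// elu_ref is a hypothetical helper name used only for illustration.
inline float elu_ref(float x) {
    return x > 0.0f ? x : std::expm1(x);
}
```

A reference like this can be useful for comparing the output of a reduced-precision kernel against the expected value on both branches.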
Assets 15
- sha256:8c79a9b226de4b3cacfd1f83d24f962d0773be79f1e7b75c6af4ded7e32ae1d6, 373 MB, 2025-10-12T06:36:28Z
- sha256:65c713572a91c3fb61a9c61a485dadf2166d5721f2f7f1f03fb0575c4bf892ff, 10.4 MB, 2025-10-12T06:36:38Z
- sha256:b9944fc0a3ccddc417252b311f43c3a154f5809a44c1725436e66fdd75c5306e, 26.9 MB, 2025-10-12T06:36:39Z
- sha256:0152c4b9223e9297bcd0b1f50db7963a389243e3090ee48e2acf6ccdb8de120f, 25.5 MB, 2025-10-12T06:36:41Z
- sha256:498748cac6e9b61d5b1148c6e5b40ae92257f50d2ba37423dc8cfe87e17e9f11, 12.4 MB, 2025-10-12T06:36:42Z
- sha256:a4a10dc6afb984ec72bdb31094de01c4dac75d14c1e43558793d1ae4138c1b87, 10.6 MB, 2025-10-12T06:36:43Z
- sha256:d9a21b7b6ce2062f33db891fe74b37c26247b240c0988f167673ef8b87c85b86, 13.6 MB, 2025-10-12T06:36:43Z
- sha256:b7c102f80ceccaddb006c949b370b6c82bdb229b6dc9ffd9be9b5ee9838c381c, 161 MB, 2025-10-12T06:36:45Z
- sha256:c282f53a094c486a98c1dfe2c8d2753005420fada7449b57f5c37c292569f6fe, 321 MB, 2025-10-12T06:36:50Z
- sha256:7aac1a71ab3c1c26df40a2c0085950580c6a389c6678846caa4226fb898dbb66, 11 MB, 2025-10-12T06:36:58Z
- 2025-10-12T05:25:37Z
- 2025-10-12T05:25:37Z
b6738
a2fba89
hparams : add check for layer index in is_recurrent (#16511)
This commit adds a check in the is_recurrent method to ensure that the provided layer index is within the valid range. The motivation is to prevent potential out-of-bounds access and to stay consistent with other methods in the class that perform similar checks, such as is_swa.
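As a rough illustration of the kind of range check described above, here is a minimal sketch. The struct `hparams_sketch`, its fields, and the choice to throw `std::out_of_range` are all assumptions made for this example; the actual llama.cpp code may store the flags differently and report the error in another way.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the real hparams type.
struct hparams_sketch {
    uint32_t n_layer = 0;               // number of layers in the model
    std::vector<bool> recurrent_layer;  // one flag per layer (assumed layout)

    // Reject an out-of-range layer index before touching the per-layer data.
    bool is_recurrent(uint32_t il) const {
        if (il >= n_layer || il >= recurrent_layer.size()) {
            throw std::out_of_range("layer index out of range");
        }
        return recurrent_layer[il];
    }
};
```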
Assets 15
b6737
20cc625
ggml: Correct SVE implementation in ggml_vec_dot_f16_unroll (#16518)
The previous SVE implementation for `ggml_vec_dot_f16_unroll` contained a bug due to a copy-paste error. The wrong variable was used in an FMA instruction, leading to incorrect results. This commit corrects the variable usage and improves the clarity of the code by renaming variables to avoid confusion.
Co-authored-by: Aaron <shelhamer.aaron@gmail.com>
Assets 15
b6736
11f0af5
CUDA: faster tile FA, add oob checks, more HSs (#16492)
Assets 15
b6735
a3cb047
metal : fix mul-mm condition + fix mul-mv permuted kernels (#16494)
Assets 15
b6733
31d0ff1
server / ranking : add sorting and management of top_n (#16403)
* server / ranking : add sorting and management of top_n
* Keep backward compatibility: if no top_n is given, all results are returned

Here is a script to test it:
```script
URL=${1:-https://127.0.0.1:8181}
curl "$URL/v1/rerank" -H "Content-Type: application/json" \
 -d '{ "model": "M", "query": "What is the recipe to make bread ?",
 "return_text" : true,
 "texts" : true,
 "top_n": 6,
 "documents": [
 "voici la recette pour faire du pain, il faut de la farine de l eau et du levain et du sel",
 "it is a bear",
 "bread recipe : floor, water, yest, salt",
 "The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.",
 "here is the ingedients to bake bread : 500g floor, 350g water, 120g fresh refresh yest, 15g salt",
 "recipe to make cookies : floor, eggs, water, chocolat",
 "here is the recipe to make bread : 500g floor, 350g water, 120g fresh refresh yest, 15g salt",
 "il fait tres beau aujourd hui",
 "je n ai pas faim, je ne veux pas manger",
 "je suis a paris"
 ] }' | jq
```
* use resize() instead of for(...)
* simplify top_n init since there is no need to return an error

Result of the test run:
    ./tests.sh unit/test_rerank.py -v -x
    ==================================================== test session starts =====================================================
    platform linux -- Python 3.12.3, pytest-8.3.5, pluggy-1.6.0 -- /home/yann/dev/yann/llama.cpp/tools/server/tests/test/bin/python3
    cachedir: .pytest_cache
    rootdir: /home/yann/dev/yann/llama.cpp/tools/server/tests
    configfile: pytest.ini
    plugins: anyio-4.11.0
    collected 8 items
    unit/test_rerank.py::test_rerank PASSED [ 12%]
    unit/test_rerank.py::test_rerank_tei_format PASSED [ 25%]
    unit/test_rerank.py::test_invalid_rerank_req[documents0] PASSED [ 37%]
    unit/test_rerank.py::test_invalid_rerank_req[None] PASSED [ 50%]
    unit/test_rerank.py::test_invalid_rerank_req[123] PASSED [ 62%]
    unit/test_rerank.py::test_invalid_rerank_req[documents3] PASSED [ 75%]
    unit/test_rerank.py::test_rerank_usage[Machine learning is-A machine-Learning is-19] PASSED [ 87%]
    unit/test_rerank.py::test_rerank_usage[Which city?-Machine learning is -Paris, capitale de la-26] PASSED [100%]
    ===================================================== 8 passed in 4.31s ======================================================
* add rerank top_n unit test

Here is the result:
    ./tests.sh unit/test_rerank.py -v -x
    =================================================================== test session starts ===================================================================
    platform linux -- Python 3.12.3, pytest-8.3.5, pluggy-1.6.0 -- /home/yann/dev/yann/llama.cpp/tools/server/tests/test/bin/python3
    cachedir: .pytest_cache
    rootdir: /home/yann/dev/yann/llama.cpp/tools/server/tests
    configfile: pytest.ini
    plugins: anyio-4.11.0
    collected 16 items
    unit/test_rerank.py::test_rerank PASSED [ 6%]
    unit/test_rerank.py::test_rerank_tei_format PASSED [ 12%]
    unit/test_rerank.py::test_invalid_rerank_req[documents0] PASSED [ 18%]
    unit/test_rerank.py::test_invalid_rerank_req[None] PASSED [ 25%]
    unit/test_rerank.py::test_invalid_rerank_req[123] PASSED [ 31%]
    unit/test_rerank.py::test_invalid_rerank_req[documents3] PASSED [ 37%]
    unit/test_rerank.py::test_rerank_usage[Machine learning is-A machine-Learning is-19] PASSED [ 43%]
    unit/test_rerank.py::test_rerank_usage[Which city?-Machine learning is -Paris, capitale de la-26] PASSED [ 50%]
    unit/test_rerank.py::test_rerank_top_n[None-4] PASSED [ 56%]
    unit/test_rerank.py::test_rerank_top_n[2-2] PASSED [ 62%]
    unit/test_rerank.py::test_rerank_top_n[4-4] PASSED [ 68%]
    unit/test_rerank.py::test_rerank_top_n[99-4] PASSED [ 75%]
    unit/test_rerank.py::test_rerank_tei_top_n[None-4] PASSED [ 81%]
    unit/test_rerank.py::test_rerank_tei_top_n[2-2] PASSED [ 87%]
    unit/test_rerank.py::test_rerank_tei_top_n[4-4] PASSED [ 93%]
    unit/test_rerank.py::test_rerank_tei_top_n[99-4] PASSED [100%]
    =================================================================== 16 passed in 8.84s ===================================================================
* editor config check fix
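The sort-then-truncate behavior that the commit message and the `test_rerank_top_n` cases describe could look roughly like the sketch below. The type `rerank_result`, the function `apply_top_n`, and the convention that `top_n == 0` means "not specified" are assumptions for this example and are not taken from the server source.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result record: original document index plus relevance score.
struct rerank_result {
    int   index;
    float score;
};

// Sort by descending score, then keep at most top_n entries.
// Assumption: top_n == 0 stands for "top_n not given", so everything is returned.
std::vector<rerank_result> apply_top_n(std::vector<rerank_result> results, std::size_t top_n) {
    std::sort(results.begin(), results.end(),
              [](const rerank_result & a, const rerank_result & b) {
                  return a.score > b.score;
              });
    if (top_n > 0 && top_n < results.size()) {
        results.resize(top_n);  // a single resize() instead of a removal loop
    }
    return results;
}
```

With four documents, this mirrors the parametrized cases in the test output: no `top_n` yields 4 results, `top_n` of 2 yields 2, and `top_n` of 99 is clamped to 4.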
Assets 15
b6732
97870e6
cuda : avoid initializing unused devices (#16510)
Assets 15
b6730
e60f01d
server : fix division by zero when reporting stats (#16501)
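The release note does not say which statistic was affected. As a generic sketch of the guard such a fix usually involves, the hypothetical helper below reports a rate of zero when no time has elapsed instead of dividing by it; the name `tokens_per_second` and its parameters are assumptions for this example.

```cpp
// Hypothetical helper: tokens per second from a token count and elapsed milliseconds.
// Returns 0.0 when the elapsed time is not positive, so the division is never
// performed with a zero (or negative) denominator.
inline double tokens_per_second(int n_tokens, double t_ms) {
    if (t_ms <= 0.0) {
        return 0.0;
    }
    return 1e3 * n_tokens / t_ms;
}
```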
Assets 15
b6729
81086cd
vocab : mark EOT token for Granite models (#16499)
* vocab : mark EOT token for Granite models
* sampling : fallback to EOS when EOT is not found
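The fallback named in the second bullet can be pictured with a small sketch. The alias `token_id`, the sentinel `TOKEN_NULL` (assumed to be -1 for "not defined"), and the function `end_token` are hypothetical names for this example, not the identifiers used in the sampling code.

```cpp
#include <cstdint>

using token_id = int32_t;            // hypothetical token id type
constexpr token_id TOKEN_NULL = -1;  // assumed sentinel for "token not defined"

// Prefer the end-of-turn token; fall back to end-of-sequence when EOT is missing.
inline token_id end_token(token_id eot, token_id eos) {
    return eot != TOKEN_NULL ? eot : eos;
}
```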
Assets 15
b6728
68ee98a
server : return HTTP 400 if prompt exceeds context length (#16486)
In streaming mode, when the prompt exceeds the context length, the server returns an HTTP 200 status code with a JSON error in the body. This is confusing and inconsistent with all other inference engines, which return an HTTP 4xx error in this case. This patch makes the server return HTTP 400 in such cases.
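The status decision described above amounts to checking the prompt length before any streamed output begins. The sketch below assumes a strict "greater than" comparison and uses a hypothetical function name; the server's real check and its exact boundary condition may differ.

```cpp
#include <cstddef>

// Hypothetical helper: pick the HTTP status for a request based on prompt size.
// Assumption: "exceeds" means strictly more prompt tokens than the context holds.
inline int status_for_prompt(std::size_t n_prompt_tokens, std::size_t n_ctx) {
    return n_prompt_tokens > n_ctx ? 400 : 200;
}
```

Deciding the status before the first streamed chunk matters because the HTTP status line cannot be changed once the response body has started.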
Assets 15