Release GPTQModel v2.1.0 · ModelCloud/GPTQModel · GitHub
GPTQModel v2.1.0
37d4b2b
This commit was created on GitHub.com and signed with GitHub's verified signature.
What's Changed
✨ New QQQ quantization method and inference support!
✨ New Google Gemma 3 day-zero model support.
✨ New Alibaba Ovis 2 VL model support.
✨ New AMD Instella day-zero model support.
✨ New GSM8K Platinum and MMLU-Pro benchmarking support.
✨ PEFT LoRA training with GPTQModel is now 30%+ faster on all GPU and IPEX devices.
✨ Auto-detect MoE modules not activated during quantization due to insufficient calibration data.
✨ ROCm setup.py compat fixes.
✨ Optimum and PEFT compat fixes.
✨ Fixed PEFT bfloat16 training.
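The MoE auto-detection item above (see the "add fwd counter" and "fix skip moe modules when fwd count is 0" entries below) can be illustrated with a minimal sketch. This is not GPTQModel's implementation: the `Expert` class, the `route` function, and `find_inactive_experts` are hypothetical names invented for illustration, and the modulo router stands in for a learned gate. The idea shown is only that each expert counts its forward calls during calibration, and any expert whose count stays at zero saw no calibration data.

```python
class Expert:
    """Hypothetical stand-in for one MoE expert; counts forward calls."""

    def __init__(self, name):
        self.name = name
        self.fwd_count = 0

    def __call__(self, x):
        self.fwd_count += 1
        return x


def route(token_id, num_experts):
    # Assumption: a deterministic toy router. Real MoE layers use a learned gate.
    return token_id % num_experts


def find_inactive_experts(experts, calibration_tokens):
    """Run the calibration tokens and return names of experts never called."""
    for tok in calibration_tokens:
        experts[route(tok, len(experts))](tok)
    return [e.name for e in experts if e.fwd_count == 0]


experts = [Expert(f"expert_{i}") for i in range(4)]
# This small calibration set only ever reaches experts 0 and 1.
inactive = find_inactive_experts(experts, [0, 4, 8, 1, 5])
print(inactive)
```

Experts reported as inactive have no activation statistics to quantize against, so a quantizer would likely need to skip them or be given more calibration data.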
- auto enable flash_attn only when flash-attn was installed by @CSY-ModelCloud in #1372
- Fix rocm compat by @Qubitium in #1373
- fix unnecessary mkdir by @CSY-ModelCloud in #1374
- add test_kernel_output_xpu.py by @CSY-ModelCloud in #1382
- clean test_kernel_output_xpu.py by @CSY-ModelCloud in #1383
- remove xpu support of triton kernel by @Qubitium in #1384
- [MODEL] Add instella support by @LRL-ModelCloud in #1385
- Fix optimum/peft trainer integration by @CSY-ModelCloud in #1381
- rename peft test file by @CSY-ModelCloud in #1387
- [CI] fix wandb was not installed & update test_olora_finetuning_xpu.py by @CSY-ModelCloud in #1388
- Add lm-eval GSM8k Platinum by @Qubitium in #1394
- Remove cuda kernel by @Qubitium in #1396
- fix exllama kernels not compiled by @Qubitium in #1397
- update tests by @Qubitium in #1398
- make the kernel output validation more robust by @Qubitium in #1399
- speed up ci by @Qubitium in #1400
- add fwd counter by @yuchiwang in #1389
- allow triton and ipex to inherit torch kernel and use torch for train… by @Qubitium in #1401
- fix skip moe modules when fwd count is 0 by @Qubitium in #1404
- fix ipex linear post init for finetune by @jiqing-feng in #1406
- fix optimum compat by @Qubitium in #1408
- [Feature] Add mmlupro API by @CL-ModelCloud in #1405
- add training callback by @CSY-ModelCloud in #1409
- Fix bf16 training by @Qubitium in #1410
- fix bf16 forward for triton by @Qubitium in #1411
- Add QQQ by @Qubitium in #1402
- make IPEX or any kernel that uses Torch for Training to auto switch v… by @Qubitium in #1412
- [CI] xpu inference test by @CL-ModelCloud in #1380
- [FIX] qqq with eora by @ZX-ModelCloud in #1415
- [FIX] device error by @ZX-ModelCloud in #1417
- make quant linear expose internal buffers by @Qubitium in #1418
- Fix bfloat16 kernels by @Qubitium in #1420
- fix qqq bfloat16 forward by @Qubitium in #1423
- Fix ci10 by @Qubitium in #1424
- fix marlin bf16 compat by @Qubitium in #1427
- [CI] no need reinstall requirements by @CSY-ModelCloud in #1426
- [FIX] dynamic save error by @ZX-ModelCloud in #1428
- [FIX] super().post_init() calling order by @ZX-ModelCloud in #1431
- fix bitblas choose IPEX in cuda env by @CSY-ModelCloud in #1432
- Fix exllama is not packable by @Qubitium in #1433
- disable exllama for training by @Qubitium in #1435
- remove TritonV2QuantLinear for xpu test by @CSY-ModelCloud in #1436
- [MODEL] add gemma3 support by @LRL-ModelCloud in #1434
- fix the error when downloading models using modelscope by @mushenL in #1437
- Add QQQ Rotation by @ZX-ModelCloud in #1425
- fix no __init__.py by @CSY-ModelCloud in #1438
- Fix hadamard import by @Qubitium in #1441
- Eora final by @nbasyl in #1440
- triton is not validated for ipex by @Qubitium in #1445
- Fix exllama adapter by @Qubitium in #1446
- fix rocm compile by @Qubitium in #1447
- [FIX] Correctly obtain the submodule's device by @ZX-ModelCloud in #1448
- fix rocm not compatible with exllama v2 and eora kernel by @Qubitium in #1449
- revert overflow code by @Qubitium in #1450
- add kernel dtype support and add full float16 vs bfloat16 kernel testing by @Qubitium in #1452
- [MODEL] add Ovis2 support and bug fix by @Fusionplay in #1454
- add unit test for ovis2 by @CSY-ModelCloud in #1456
New Contributors
- @yuchiwang made their first contribution in #1389
- @mushenL made their first contribution in #1437
- @nbasyl made their first contribution in #1440
- @Fusionplay made their first contribution in #1454
Full Changelog: v2.0.0...v2.1.0