Artifact Evaluation for ANT [MICRO'22] and OliVe [ISCA'23]
Citation
If you use ANT or OliVe in your research, please cite the corresponding paper:
@inproceedings{guo2022ant,
title={ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization},
author={Guo, Cong and Zhang, Chen and Leng, Jingwen and Liu, Zihan and Yang, Fan and Liu, Yunxin and Guo, Minyi and Zhu, Yuhao},
booktitle={2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)},
pages={1414--1433},
year={2022},
organization={IEEE}
}
@article{guo2023olive,
author = {Guo, Cong and Tang, Jiaming and Hu, Weiming and Leng, Jingwen and Zhang, Chen and Yang, Fan and Liu, Yunxin and Guo, Minyi and Zhu, Yuhao},
title = {OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization},
year = {2023},
eprint = {arXiv:2304.07493},
doi = {10.1145/3579371.3589038},
}
This repository contains the source code for reproducing the experiments in two papers: "ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization" (MICRO'22) and "OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization" (ISCA'23).
Project Structure
ant_quantization contains the ANT framework, implemented in PyTorch.
olive_quantization contains the OliVe framework, implemented in PyTorch.
ant_simulator contains the performance and energy evaluation of ANT.