Codes

  • Medusa. Medusa: an easy-to-use framework that accelerates LLM generation with multiple lightweight decoding heads (a minimal sketch follows this list).
    Key technique: LLM, speculative decoding.

  • MaxK-GNN. Implementation of “MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training” (the MaxK operator is sketched after this list).
    Key technique: Graph neural network, GPU kernel design, MaxK nonlinearity.

  • LoT. Implementation of “Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate” (a training-step sketch follows this list).
    Key technique: Generalization, regularization.

  • LinGCN. Implementation of “LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference.”
    Key technique: Privacy-preserving machine learning, efficient private inference, machine learning as a service, homomorphic encryption, non-linear pruning, ST-GCN.

  • AutoReP. Implementation of the AutoReP ReLU replacement algorithm on CIFAR-10, CIFAR-100, TinyImageNet, and ImageNet (a replacement sketch follows this list).
    Key technique: L0 replacement algorithm, polynomial activation function, private inference.

  • Accel-GCN. Implementation of an SpMM kernel for GCNs on GPUs (the degree-sorting step is sketched after this list).
    Key technique: degree sorting, block-level partition, combined warp.

  • PASNet (non-FPGA). Implementation of the PASNet algorithm (without FPGA acceleration), with evaluation code for PASNet-A (ResNet-18 backbone), PASNet-B and PASNet-C (ResNet-50 backbone), and PASNet-D (MobileNetV2 backbone) on ImageNet.
    Key technique: Neural architecture search, two-party computation, private inference.

  • SparseGNN. Implementation of different sparsification algorithms for GNNs, such as sparse training, ADMM, and SLR (an ADMM sketch follows this list).
    Key technique: Sparse training, ADMM, SLR, GNN training.
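
Below are a few minimal, self-contained sketches of the techniques named above. They are illustrations under stated assumptions, not excerpts from the repositories.

Medusa: a sketch of the multiple-lightweight-decoding-heads idea, in which each extra head drafts the token a few positions ahead from the backbone's last hidden state, and the base model then verifies the drafted tokens in one pass. The residual-MLP head shape mirrors how Medusa describes its heads; class names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class MedusaStyleHead(nn.Module):
    """One lightweight head: a residual MLP followed by an LM projection."""
    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        self.act = nn.SiLU()
        self.lm_head = nn.Linear(hidden_dim, vocab_size, bias=False)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Residual block keeps the head cheap relative to the backbone.
        return self.lm_head(hidden + self.act(self.proj(hidden)))

class MultiHeadDrafter(nn.Module):
    """Head i drafts the token at offset i+1 from the same hidden state."""
    def __init__(self, hidden_dim: int, vocab_size: int, num_heads: int = 4):
        super().__init__()
        self.heads = nn.ModuleList(
            MedusaStyleHead(hidden_dim, vocab_size) for _ in range(num_heads)
        )

    def forward(self, last_hidden: torch.Tensor) -> torch.Tensor:
        # last_hidden: (batch, hidden_dim) from the frozen base LLM.
        # Returns (batch, num_heads, vocab_size) draft logits; the base model
        # verifies the drafted tokens in a single batched forward pass.
        return torch.stack([head(last_hidden) for head in self.heads], dim=1)
```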
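
MaxK-GNN: the MaxK nonlinearity in plain PyTorch. It keeps only the k largest entries of each feature row and zeroes the rest, so the following aggregation can operate on sparse features; the repository implements this with hand-written GPU kernels, and this sketch only shows the operator's semantics.

```python
import torch

def maxk(x: torch.Tensor, k: int) -> torch.Tensor:
    """Keep the k largest entries per row, zero everything else."""
    topk_vals, topk_idx = x.topk(k, dim=-1)
    out = torch.zeros_like(x)
    out.scatter_(-1, topk_idx, topk_vals)
    return out

x = torch.randn(4, 8)
print(maxk(x, k=2))  # each row now has at most 2 nonzeros
```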
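
LoT: a loose reading of learning-from-teaching regularization, where a student imitates the main (teacher) model and the teacher's objective adds a term favoring predictions that are easy to imitate. The KL coupling, update order, and the weight `alpha` are my assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def lot_step(teacher, student, x, y, opt_t, opt_s, alpha=0.1):
    # 1) Student learns to imitate the teacher's current predictions.
    with torch.no_grad():
        t_logits_detached = teacher(x)
    s_logits = student(x)
    imitation = F.kl_div(
        F.log_softmax(s_logits, dim=-1),
        F.softmax(t_logits_detached, dim=-1),
        reduction="batchmean",
    )
    opt_s.zero_grad(); imitation.backward(); opt_s.step()

    # 2) Teacher minimizes task loss plus a term rewarding imitability.
    t_logits = teacher(x)
    with torch.no_grad():
        s_logits_detached = student(x)
    easy_to_imitate = F.kl_div(
        F.log_softmax(t_logits, dim=-1),
        F.softmax(s_logits_detached, dim=-1),
        reduction="batchmean",
    )
    loss_t = F.cross_entropy(t_logits, y) + alpha * easy_to_imitate
    opt_t.zero_grad(); loss_t.backward(); opt_t.step()
```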
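
AutoReP: a hedged sketch of ReLU replacement. Each activation site holds a trainable gate that interpolates between ReLU and a cheap polynomial surrogate; penalizing the gate (in the L0 spirit of the key-technique line) pushes most sites to the polynomial, which is far cheaper under private inference. The surrogate coefficients and the penalty form are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ReplaceableReLU(nn.Module):
    def __init__(self, a=0.25, b=0.5, c=0.0):
        super().__init__()
        # Trainable logit for the soft choice between ReLU and polynomial.
        self.gate_logit = nn.Parameter(torch.zeros(1))
        self.a, self.b, self.c = a, b, c  # x^2-style surrogate coefficients

    def forward(self, x):
        g = torch.sigmoid(self.gate_logit)   # ~1: keep ReLU, ~0: polynomial
        poly = self.a * x * x + self.b * x + self.c
        return g * torch.relu(x) + (1 - g) * poly

    def relu_penalty(self):
        # Added to the training loss: penalizing the gate drives sites
        # toward the private-inference-friendly polynomial.
        return torch.sigmoid(self.gate_logit)
```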
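
Accel-GCN: the degree-sorting preprocessing step in NumPy. Sorting CSR rows by degree groups rows with similar work, so a block-level partition over the sorted rows suffers less load imbalance; the SpMM itself lives in the repository's CUDA kernels and is not reproduced here.

```python
import numpy as np

def degree_sorted_csr(indptr, indices, data):
    """Reorder CSR rows by descending degree; returns the permuted CSR arrays."""
    degrees = np.diff(indptr)            # nonzeros per row
    order = np.argsort(-degrees)         # descending degree
    new_indptr = np.zeros_like(indptr)
    new_indptr[1:] = np.cumsum(degrees[order])
    new_indices = np.concatenate([indices[indptr[r]:indptr[r + 1]] for r in order])
    new_data = np.concatenate([data[indptr[r]:indptr[r + 1]] for r in order])
    return new_indptr, new_indices, new_data, order  # `order` maps new -> old rows
```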
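
SparseGNN: a sketch of ADMM-based sparsification, which alternates an SGD step on the task loss plus a quadratic penalty tying the weights W to an auxiliary variable Z with a periodic projection of Z onto the sparsity constraint (magnitude top-k). The penalty coefficient rho and the projection granularity are illustrative choices.

```python
import torch

def project_sparse(w: torch.Tensor, keep: int) -> torch.Tensor:
    """Euclidean projection onto {at most `keep` nonzeros}: magnitude top-k."""
    flat = w.flatten()
    idx = flat.abs().topk(keep).indices
    out = torch.zeros_like(flat)
    out[idx] = flat[idx]
    return out.view_as(w)

def admm_penalty(W, Z, U, rho=1e-3):
    # Augmented-Lagrangian term added to the task loss at every SGD step.
    return (rho / 2) * (W - Z + U).pow(2).sum()

# Auxiliary/dual updates, performed every few epochs:
#   Z = project_sparse(W.detach() + U, keep=k)
#   U = U + W.detach() - Z
```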