VSAG is a vector indexing library used for similarity search. The indexing algorithm allows users to search through various sizes of vector sets, especially those that cannot fit in memory. The library also provides methods for generating parameters based on vector dimensions and data scale, allowing developers to use it without understanding the algorithm’s principles. VSAG is written in C++ and provides a Python wrapper package called pyvsag.
Performance
SINDI, the VSAG algorithm for sparse vector search, substantially outperforms previous state-of-the-art (SOTA) solutions. In our internal tests on a 40-core Intel(R) Xeon(R) Silver 4210R CPU, VSAG's QPS exceeds that of the previous SOTA algorithm, Zilliz, by 166% on the sparse-full (8M) dataset at 98% recall. Because the official ann-benchmarks sparse track runs on a different machine (an Azure Standard D8lds v5 VM), we plan to submit our results under the official benchmark environment soon to formally validate this performance gain.
sparse-full-inner-product
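To make the task behind this benchmark concrete, here is a minimal sketch of exact (brute-force) maximum inner product search over sparse vectors. It is not SINDI and does not use the VSAG API. It only shows the ground truth that an approximate sparse index is measured against when recall is reported. The names `SparseVector`, `sparse_inner_product`, and `exact_top_k`, as well as the toy data, are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical representation: dimension index -> non-zero weight.
SparseVector = Dict[int, float]


def sparse_inner_product(a: SparseVector, b: SparseVector) -> float:
    """Inner product of two sparse vectors; only shared dimensions contribute."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(weight * b[dim] for dim, weight in a.items() if dim in b)


def exact_top_k(
    query: SparseVector, base: List[SparseVector], k: int
) -> List[Tuple[int, float]]:
    """Return the k (id, score) pairs with the largest inner product, best first."""
    scored = ((i, sparse_inner_product(query, vec)) for i, vec in enumerate(base))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy data set of four sparse vectors (illustrative values only).
base = [
    {0: 1.0, 3: 2.0},
    {1: 0.5, 3: 1.0},
    {2: 4.0},
    {0: 2.0, 1: 2.0},
]
query = {0: 1.0, 3: 1.0}

top2 = exact_top_k(query, base, k=2)
print(top2)
```

A scan like this touches every base vector for every query, which is why it serves as a correctness reference rather than a serving strategy at the 8M scale mentioned above.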
On dense vectors, VSAG also outperforms the previous state-of-the-art (SOTA) by a clear margin. Specifically, according to the ann-benchmarks results on the GIST dataset at 90% recall, VSAG's QPS exceeds that of the previous SOTA algorithm, Glass, by over 100%, and that of the baseline algorithm, HNSWLIB, by over 300%.
The ann-benchmarks test runs on an r6i.16xlarge machine on AWS with --parallelism 31, single-CPU, and hyperthreading disabled.
The result is as follows:
gist-960-euclidean
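For readers unfamiliar with the metrics quoted in this section, the following sketch shows one common way to compute recall@k and how a statement such as "QPS exceeds X by 166%" translates into a speedup factor. The function names and the sample ID lists are hypothetical. The sketch assumes that "exceeds by P%" means a relative gain of P% over the baseline, and it does not reproduce the ann-benchmarks code.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int], k: int) -> float:
    """Fraction of the true top-k neighbors that appear in the approximate top-k."""
    truth = set(exact_ids[:k])
    found = truth.intersection(approx_ids[:k])
    return len(found) / k


def qps_gain_percent(new_qps: float, baseline_qps: float) -> float:
    """Relative throughput gain in percent; 100.0 means twice the baseline."""
    return (new_qps - baseline_qps) / baseline_qps * 100.0


def speedup_from_gain(gain_percent: float) -> float:
    """Convert a relative gain in percent into a multiplicative speedup."""
    return 1.0 + gain_percent / 100.0


# Made-up result lists: the approximate search misses one true neighbor (id 5).
exact = [7, 2, 9, 4, 1, 8, 3, 6, 0, 5]
approx = [7, 2, 9, 4, 1, 8, 3, 6, 0, 11]
recall = recall_at_k(approx, exact, k=10)

# Gains quoted in this section, read as relative improvements.
sparse_speedup = speedup_from_gain(166.0)   # roughly 2.66x the baseline QPS
glass_speedup = speedup_from_gain(100.0)    # at least 2x ("over 100%")
hnswlib_speedup = speedup_from_gain(300.0)  # at least 4x ("over 300%")
print(recall, sparse_speedup, glass_speedup, hnswlib_speedup)
```

QPS comparisons are only meaningful at a fixed recall level, which is why each figure above is quoted together with its recall target.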
Getting Started
Integrate with CMake
# CMakeLists.txt
# FetchContent_MakeAvailable requires CMake 3.14 or newer
cmake_minimum_required (VERSION 3.14)
project (myproject)
set (CMAKE_CXX_STANDARD 11)

# download and compile vsag
include (FetchContent)
FetchContent_Declare (
  vsag
  GIT_REPOSITORY https://github.com/antgroup/vsag
  GIT_TAG main
)
FetchContent_MakeAvailable (vsag)

# compile executable and link to vsag
add_executable (vsag-cmake-example src/main.cpp)
target_include_directories (vsag-cmake-example PRIVATE ${vsag_SOURCE_DIR}/include)
target_link_libraries (vsag-cmake-example PRIVATE vsag)

# add dependency
add_dependencies (vsag-cmake-example vsag)
Examples
Python and C++ examples are currently provided; see the examples directory for details.
If your system uses VSAG, then feel free to make a pull request to add it to the list.
How to Contribute
Although VSAG was initially developed by the Vector Database Team at Ant Group, it's the work of
the community, and contributions are always welcome!
See CONTRIBUTING for ways to get started.
Community
Thrive together in the VSAG community with users and developers from all around the world.
introduce pluggable product quantization (known as PQ) in datacell
v0.16 (ETA: May 2025)
support NEON instruction acceleration on ARM platforms
support using GPU to accelerate index building
provide an optimizer that supports optimizing search parameters by recall or latency
v0.17 (ETA: Jun. 2025)
support AMX instruction acceleration on Intel CPUs
support attributes stored in vector index
support graph structure compression
Our Publications
VSAG: An Optimized Search Framework for Graph-based Approximate Nearest Neighbor Search [VLDB (industry), 2025] Xiaoyao Zhong, Haotian Li, Jiabao Jin, Mingyu Yang, Deming Chu, Xiangyu Wang, Zhitao Shen, Wei Jia, George Gu, Yi Xie, Xuemin Lin, Heng Tao Shen, Jingkuan Song, Peng Cheng PDF | DOI
Effective and General Distance Computation for Approximate Nearest Neighbor Search [ICDE, 2025] Mingyu Yang, Wentao Li, Jiabao Jin, Xiaoyao Zhong, Xiangyu Wang, Zhitao Shen, Wei Jia, Wei Wang PDF | DOI
SINDI: an Efficient Index for Approximate Maximum Inner Product Search on Sparse Vectors [arXiv, 2025] Ruoxuan Li, Xiaoyao Zhong, Jiabao Jin, Peng Cheng, Wangze Ni, Lei Chen, Zhitao Shen, Wei Jia, Xiangyu Wang, Xuemin Lin, Heng Tao Shen, Jingkuan Song PDF
EnhanceGraph: A Continuously Enhanced Graph-based Index for High-dimensional Approximate Nearest Neighbor Search [arXiv, 2025] Xiaoyao Zhong, Jiabao Jin, Peng Cheng, Mingyu Yang, Lei Chen, Haoyang Li, Zhitao Shen, Xuemin Lin, Heng Tao Shen, Jingkuan Song PDF
Fast High-dimensional Approximate Nearest Neighbor Search with Efficient Index Time and Space [arXiv, 2025] Mingyu Yang, Wentao Li, Wei Wang PDF
Reference
VSAG referenced the following works during its implementation:
RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search [SIGMOD, 2024]
Jianyang Gao, Cheng Long PDF | DOI | CODE
Quasi-succinct Indices [WSDM, 2013]
Sebastiano Vigna PDF | DOI