GaussianGrasper


GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping
Yuhang Zheng1,2, Xiangyu Chen2, Yupeng Zheng4, Songen Gu5, Runyi Yang6, Bu Jin4, Pengfei Li5, Chengliang Zhong5, Zengmao Wang10, Lina Liu8, Chao Yang7, Dawei Wang3, Zhen Chen2, Xiaoxiao Long3,9*, Meiqing Wang1*
(*Indicates Corresponding Author)
1 Beihang University, 2 EncoSmart, 3 HKU, 4 CASIA, 5 Tsinghua AIR, 6 Imperial College London, 7 Shanghai AI Laboratory, 8 China Mobile Research Institute, 9 LightIllusions, 10 Wuhan University
Contributions
- We introduce GaussianGrasper, a robot manipulation system built on a 3D Gaussian field endowed with consistent open-vocabulary semantics and accurate geometry, supporting open-world manipulation tasks guided by language instructions.
- We propose Efficient Feature Distillation (EFD), which uses contrastive learning to distill CLIP features efficiently and augments the feature field with a SAM segmentation prior, addressing computational expense and boundary ambiguity.
- We propose a normal-guided grasp module that uses rendered surface normals to select the best grasp pose from the generated grasp pose candidates.
- We demonstrate GaussianGrasper's capability on language-guided manipulation tasks across multiple real-world tabletop scenes with common objects.
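To make open-vocabulary querying over a language-embedded Gaussian field concrete, here is a minimal sketch. It assumes each Gaussian carries a feature vector in the same space as a text embedding and selects Gaussians by cosine similarity. The function names, the toy 3-dimensional vectors, and the `threshold` value are all hypothetical illustrations; the actual system queries CLIP-space features, and its exact relevancy scoring is not described on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def query_gaussians(gaussian_features, text_embedding, threshold=0.8):
    """Return indices of Gaussians whose feature is close to the text embedding.

    `threshold` is an assumed cutoff chosen for this toy example only.
    """
    return [
        i
        for i, feat in enumerate(gaussian_features)
        if cosine_similarity(feat, text_embedding) >= threshold
    ]


# Toy stand-ins for per-Gaussian features and a text embedding.
features = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
text = [1.0, 0.0, 0.0]

selected = query_gaussians(features, text)
print(selected)
```

The selected indices identify the Gaussians belonging to the queried object, which is the region a grasp generator would then be asked to cover.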
Pipeline of GaussianGrasper

The architecture of our proposed method.
(a) The overall pipeline. We scan multi-view RGB-D images for initialization and reconstruct a 3D Gaussian field via feature distillation and geometry reconstruction. Given a language instruction, we then locate the target object via open-vocabulary querying, and a pre-trained grasping model generates grasp pose candidates for it. Finally, a normal-guided module uses surface normals to filter out infeasible candidates and select the best grasp pose.
(b) EFD in detail. We use contrastive learning to constrain the rendered latent feature L and sample only a few pixels, whose features are recovered to the CLIP space via an MLP. The recovered features are then used to compute a distillation loss against the CLIP features.
(c) The normal-guided grasp module, which uses force-closure theory to filter out infeasible grasp poses.
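The normal-guided filtering step can be illustrated with the classic antipodal test for a two-finger grasp: under Coulomb friction with coefficient `mu`, the line joining the two contacts should lie inside the friction cone (half-angle `atan(mu)`) around the inward surface normal at each contact. The sketch below is an assumption about how such a filter could look, not the authors' implementation; the candidate format, the value `mu=0.5`, and the tie-breaking score (total misalignment angle) are hypothetical.

```python
import math


def _unit(v):
    """Normalize a 3D vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return [x / n for x in v]


def _angle(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


def misalignment(p1, n1, p2, n2):
    """Angles between the grasp axis and the inward normal at each contact.

    n1 and n2 are outward surface normals at contact points p1 and p2.
    """
    axis = _unit([b - a for a, b in zip(p1, p2)])  # points from p1 to p2
    reverse_axis = [-x for x in axis]
    inward1 = [-x for x in _unit(n1)]
    inward2 = [-x for x in _unit(n2)]
    return _angle(axis, inward1), _angle(reverse_axis, inward2)


def is_antipodal(p1, n1, p2, n2, mu):
    """True if the grasp axis lies inside both friction cones."""
    half_angle = math.atan(mu)
    a1, a2 = misalignment(p1, n1, p2, n2)
    return a1 <= half_angle and a2 <= half_angle


def select_grasp(candidates, mu=0.5):
    """Drop infeasible candidates; return the best-aligned one, or None."""
    feasible = [
        c for c in candidates
        if is_antipodal(c["p1"], c["n1"], c["p2"], c["n2"], mu)
    ]
    if not feasible:
        return None
    return min(
        feasible,
        key=lambda c: sum(misalignment(c["p1"], c["n1"], c["p2"], c["n2"])),
    )


# Hypothetical candidates: contacts on opposite sides of an object.
candidates = [
    {"name": "tilted", "p1": [-1, 0, 0], "n1": [-1, 1, 0],
     "p2": [1, 0, 0], "n2": [1, 0, 0]},
    {"name": "slight", "p1": [-1, 0, 0], "n1": [-1, 0.2, 0],
     "p2": [1, 0, 0], "n2": [1, 0, 0]},
    {"name": "flat", "p1": [-1, 0, 0], "n1": [-1, 0, 0],
     "p2": [1, 0, 0], "n2": [1, 0, 0]},
]

best = select_grasp(candidates)
print(best["name"])
```

In a real system the normals would come from the rendered normal map at the predicted contact points, and the candidates from the pre-trained grasp generator; here both are hand-written toy values.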
Qualitative Results

Baselines: (1) LERF with extra depth supervision; (2) SAM + CLIP
3D Gaussian Field Reconstruction
* LERF is trained with extra depth supervision
Language-guided Grasp
Scene Update
1. Language-guided pick-and-place
2. Scan new scene
3. Update the scene
4. Continuous grasp on the same object
Acknowledgement
We would like to thank the DRL group of CASIA for providing equipment support. We would also like to thank Prof. Qichao Zhang and Prof. Haoran Li for their valuable and insightful suggestions.
BibTeX
@article{zheng2024gaussiangrasper,
title={GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping},
author={Zheng, Yuhang and Chen, Xiangyu and Zheng, Yupeng and Gu, Songen and Yang, Runyi and Jin, Bu and Li, Pengfei and Zhong, Chengliang and Wang, Zengmao and Liu, Lina and others},
journal={arXiv preprint arXiv:2403.09637},
year={2024}}