I received my BA in Mathematics and MS in Computer Science from Stanford University. At 22, I joined Tesla Autopilot’s AI team, hired by Andrej Karpathy after my resume was reviewed by Elon Musk. At 21, I shipped generative speech models at NVIDIA, co-created a PhD-level Stanford class on generative adversarial networks, and lectured on my research.
My past work in AI has spanned theoretical and applied topics in 3D world models, self-driving, speech generation, multi-task learning, self-supervised learning, computer vision, reinforcement learning, and natural language processing.
I went on the Robopapers Podcast to discuss EgoZero, the technical philosophy of robot learning from humans, and what it takes to get real-world autonomy.
I helped train and scale CSM’s Implicit Shape Foundation Model, a 3D-native latent diffusion-Transformer that produces 3D occupancy fields from a single image in under a minute, as well as its pre- and post-training data infrastructure. My work helped CSM land #1 on the 3D Arena Leaderboard and close initial enterprise contracts.
Implicit Shape Foundation Model
CSM Neural Radiance Field Foundation Model March 2024
I helped train and scale CSM’s Neural Radiance Field (NeRF) Foundation Model, a 2D-native NeRF model that reconstructs 3D occupancy fields from a single image in under a minute. This model plugs into a slow refinement process based on Poole et al., 2022 and grew CSM to over 300k users. My work was presented at NVIDIA GTC 2024.
Tesla Full Self-Driving June 2022
I trained and shipped models to the v9.2, v10.1, v10.2, and v10.3 releases of Tesla’s Full Self-Driving (FSD). My work was also featured in Tesla’s 2021 AI Day event.
Tesla FSD vision at AI Day 2021
NVIDIA Neural Modules September 2020
I trained and shipped lightweight speech vocoders to the v1.0.0 release of NVIDIA’s Neural Modules (NeMo), an open-source speech AI toolkit for developers that has since expanded more broadly into conversational AI and LLMs.
EgoZero: Robot Learning from Smart Glasses
Vincent Liu, Ademi Adeniji, Haotian Zhan, Raunaq Bhirangi, and 2 more authors
@misc{liu2025egozerorobotlearningsmart,
  title={EgoZero: Robot Learning from Smart Glasses},
  author={Liu, Vincent and Adeniji, Ademi and Zhan, Haotian and Bhirangi, Raunaq and Abbeel, Pieter and Pinto, Lerrel},
  archiveprefix={arXiv},
  year={2025},
  eprint={2505.20290},
  primaryclass={cs.RO},
  url={https://arxiv.org/abs/2505.20290}
}
Feel the Force: Contact-Driven Learning from Humans
Ademi Adeniji, Zhuoran Chen, Vincent Liu, Venkatesh Pattabiraman, and 4 more authors
@misc{adeniji2025feelforcecontactdrivenlearning,
  title={Feel the Force: Contact-Driven Learning from Humans},
  author={Adeniji, Ademi and Chen, Zhuoran and Liu, Vincent and Pattabiraman, Venkatesh and Bhirangi, Raunaq and Haldar, Siddhant and Abbeel, Pieter and Pinto, Lerrel},
  archiveprefix={arXiv},
  year={2025},
  eprint={2506.01944},
  primaryclass={cs.RO},
  url={https://arxiv.org/abs/2506.01944}
}
DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision
Alex Tamkin, Gaurab Banerjee, Mohamed Owda, Vincent Liu, and 2 more authors
In Advances in Neural Information Processing Systems, 2022
@inproceedings{NEURIPS2022_fa73aca7,
  author={Tamkin, Alex and Banerjee, Gaurab and Owda, Mohamed and Liu, Vincent and Rammoorthy, Shashank and Goodman, Noah},
  booktitle={Advances in Neural Information Processing Systems},
  editor={Koyejo, S. and Mohamed, S. and Agarwal, A. and Belgrave, D. and Cho, K. and Oh, A.},
  pages={38358--38372},
  publisher={Curran Associates, Inc.},
  title={DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision},
  url={https://proceedings.neurips.cc/paper_files/paper/2022/file/fa73aca7b2af724fafbd4852957cd3e0-Paper-Datasets_and_Benchmarks.pdf},
  volume={35},
  year={2022}
}
Multivariate Group Robustness
Jupinder Parmar, Vincent Liu, and Tatsu Hashimoto
@misc{parmar2022multivariategrouprobustness,
  author={Parmar, Jupinder and Liu, Vincent and Hashimoto, Tatsu},
  title={Multivariate Group Robustness},
  year={2022},
  url={https://drive.google.com/file/d/1Qsd-REhXLNFzmJc83MN5pq4ZuaTH5K_v/view?usp=sharing}
}
End-to-end Speech Synthesis with Generative Adversarial Networks
Recurrent Control Nets for Deep Reinforcement Learning
Vincent Liu, Ademi Adeniji, Nate Lee, Jason Zhao, and Mario Srouji
In Stanford Undergraduate Research Journal, 2019
@article{liu2019recurrentcontrolnetsdeep,
  title={Recurrent Control Nets for Deep Reinforcement Learning},
  author={Liu, Vincent and Adeniji, Ademi and Lee, Nate and Zhao, Jason and Srouji, Mario},
  journal={Stanford Undergraduate Research Journal},
  year={2019},
  eprint={1901.01994},
  archiveprefix={arXiv},
  primaryclass={cs.LG},
  url={https://arxiv.org/abs/1901.01994}
}
Volumetric Semantic Segmentation of Glioblastoma Tumors from MRI Studies
Ademi Adeniji and Vincent Liu
@misc{adeniji2019volumetricsegmentationtumor,
  author={Adeniji, Ademi and Liu, Vincent},
  title={Volumetric Semantic Segmentation of Glioblastoma Tumors from MRI Studies},
  year={2019},
  url={https://drive.google.com/file/d/11tmPV9PguRXKn-ZWsSea41p-Edo-U3DB/view}
}
Sequence-to-Sequence Generative Argumentative Dialogue Systems with Self-Attention
Ademi Adeniji, Nate Lee, and Vincent Liu
@misc{adeniji2019sequence,
  author={Adeniji, Ademi and Lee, Nate and Liu, Vincent},
  title={Sequence-to-Sequence Generative Argumentative Dialogue Systems with Self-Attention},
  year={2019},
  url={https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1194/reports/custom/15844523.pdf}
}
CS 236G: Generative Adversarial Networks March 2021
I helped create and teach CS 236G: Generative Adversarial Networks, which also became a Coursera course. I designed advanced material based on papers such as Pix2PixHD, GauGAN, SRGAN, BigGAN, and MUNIT. I created workshops (1, 2) covering topics such as mode collapse, spectral normalization, orthogonal initialization, mixed precision, distributed training, and PyTorch best practices, and gave a guest lecture on adversarial methods in speech synthesis.
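To give a flavor of one workshop topic: spectral normalization stabilizes GAN discriminators by rescaling a weight matrix so its largest singular value is 1, usually estimated with power iteration rather than a full SVD. A minimal NumPy sketch of the idea (the function name, seed, and iteration count are illustrative choices, not course code):

```python
import numpy as np

def spectral_normalize(W, n_iters=50, eps=1e-12):
    """Divide W by an estimate of its largest singular value (power iteration)."""
    u = np.random.default_rng(0).standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + eps   # right singular vector estimate
        u = W @ v
        u /= np.linalg.norm(u) + eps   # left singular vector estimate
    sigma = u @ W @ v                  # largest singular value estimate
    return W / sigma

W = np.array([[3.0, 0.0], [0.0, 1.0]])
W_sn = spectral_normalize(W)           # spectral norm of W_sn is ~1
```

In practice (e.g. in PyTorch-style implementations), the `u` vector is persisted across training steps so a single power iteration per step suffices.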
If any of my work sounds interesting, please drop me an email at vincent.liu15@gmail.com!