Daniel Ho
I’m the Director of Evaluation at 1X Technologies. My goal is to deploy generalist machines that grow from experience and correct their own mistakes. I’m building World Models and large-scale evaluation pipelines towards this mission.
Previously, I worked on robotics, perception, and machine learning as a Senior Software Engineer at Waymo and Everyday Robots / X, The Moonshot Factory (formerly Google[X]).
My research has focused on learning algorithms and representation learning to generalize ML model understanding in robotics, computer vision, and self-driving.
- Trained end-to-end perception models deployed on Waymo cars for rare semantic event understanding and robustness to sensor dropout.
- Built RetinaGAN, a sim-to-real domain adaptation approach using perception consistency grounding for door opening and object grasping.
- Worked on the SayCan end-to-end manipulation modeling effort, focusing on sim-to-real transfer and imitation / reinforcement learning training.
- Solved door opening across novel buildings and conference rooms with Task Consistency Loss, the Variational Information Bottleneck, and interpretability techniques; this work was featured in WIRED.
- Invented Population Based Augmentation
its<firstname><lastname> at gmail dotcom
Publications
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Large language models can encode a wealth of semantic knowledge about the world. Such knowledge could in principle be extremely useful …
Michael Ahn*, Anthony Brohan*, Noah Brown*, Yevgen Chebotar*, Omar Cortes*, Byron David*, Chelsea Finn*, Keerthana Gopalakrishnan*, Karol Hausman*, Alex Herzog*, Daniel Ho*, Jasmine Hsu*, Julian Ibarz*, Brian Ichter*, Alex Irpan*, Eric Jang*, Rosario Jauregui Ruano*, Kyle Jeffrey*, Sally Jesmonth*, Nikhil Joshi*, Ryan Julian*, Dmitry Kalashnikov*, Yuheng Kuang*, Kuang-Huei Lee*, Sergey Levine*, Yao Lu*, Linda Luu*, Carolina Parada*, Peter Pastor*, Jornell Quiambao*, Kanishka Rao*, Jarek Rettinghouse*, Diego Reyes*, Pierre Sermanet*, Nicolas Sievers*, Clayton Tan*, Alexander Toshev*, Vincent Vanhoucke*, Fei Xia*, Ted Xiao*, Peng Xu*, Sichun Xu*, Mengyuan Yan*
Bayesian Imitation Learning for End-to-End Mobile Manipulation
We take a Bayesian approach to imitation learning from multiple sensor inputs and apply to the task of opening office doors with a mobile manipulator. We show that using the Variational Information Bottleneck to regularize convolutional neural networks improves generalization to held-out domains, reduces the sim-to-real gap in a sensor-agnostic manner, and provides useful estimates of model uncertainty. In a real-world office environment, we achieve 96% task success.
Yuqing Du, Daniel Ho, Alexander A. Alemi, Eric Jang, Mohi Khansari
Practical Imitation Learning in the Real World via Task Consistency Loss
Task Consistency Loss (TCL) is a self-supervised loss that encourages sim and real alignment both at the feature and action-prediction levels, building on top of RetinaGAN. We teach a mobile manipulator to autonomously approach a door, turn the handle to open the door, and enter the room. The imitation learning policy performs control from RGB and depth images and generalizes to doors not encountered in training data. We achieve 72% success across sixteen seen and unseen scenes using only ~16.2 hours of teleoperated demonstrations in sim and real.
Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang
SimGAN: Hybrid Simulator Identification for Domain Adaptation via Adversarial Reinforcement Learning
SimGAN tackles domain adaptation by identifying a hybrid physics simulator to match simulated trajectories to those in the target domain. It uses a learned discriminative loss to address the limitations associated with manual loss design. Our hybrid simulator combines neural networks and traditional physics simulation to balance expressiveness and generalizability, and alleviates the need for a carefully selected parameter set in System ID.
Yifeng Jiang, Tingnan Zhang, Daniel Ho, Yunfei Bai, C. Karen Liu, Sergey Levine, Jie Tan
COCOI: Contact-aware Online Context Inference for Generalizable Non-planar Pushing
Deep reinforcement learning (RL) has shown great potential in solving robot manipulation tasks. However, existing RL policies have limited adaptability to environments with diverse dynamics properties, which is pivotal in solving many contact-rich manipulation tasks. We propose Contact-aware Online COntext Inference (COCOI), a deep RL method that encodes a context embedding of dynamics properties online using contact-rich interactions. We study this method based on a novel and challenging non-planar pushing task.
Zhuo Xu, Wenhao Yu, Alexander Herzog, Wenlong Lu, Chuyuan Fu, Masayoshi Tomizuka, Yunfei Bai, C. Karen Liu, Daniel Ho
RetinaGAN: An Object-aware Approach to Sim-to-Real Transfer
RetinaGAN is a generative adversarial network approach to adapt simulated images to realistic ones with object-detection consistency. Trained unsupervised without task loss dependencies, it preserves general object structure and texture in adapted images. On three real-world tasks (grasping, pushing, and door opening), RetinaGAN improves performance for RL-based object instance grasping and continues to be effective even in the limited-data regime.
Daniel Ho, Kanishka Rao, Zhuo Xu, Eric Jang, Mohi Khansari, Yunfei Bai
Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules
Population Based Augmentation (PBA) quickly and efficiently learns a state-of-the-art approach to augmenting data for neural network training. PBA matches the previous best result on CIFAR and SVHN but uses one thousand times less compute, enabling researchers and practitioners to effectively learn new augmentation policies using a single workstation GPU. You can use PBA broadly to improve deep learning performance on image recognition tasks.
Daniel Ho, Eric Liang, Ion Stoica, Pieter Abbeel, Xi Chen
Improvements to context based self-supervised learning
We use patch-based arrangement context to improve self-supervised learning, addressing chromatic aberration, spatial skew, and mid-level feature neglect. Combined, our methods yield top scores on all standard self-supervised benchmarks.
T. Nathan Mundhenk, Daniel Ho, Barry Y. Chen



