Haonan Chen1, Cheng Zhu1, Yunzhu Li2, Katherine Driggs-Campbell1
1University of Illinois Urbana-Champaign, 2Columbia University
- Robot Arm:
  - UR5-CB3 or UR5e (with RTDE Interface)
  - Alternative: Kinova Gen 3
- Sensors:
  - 2× Intel RealSense D415
  - USB-C cables and mounting hardware
- Control Interfaces:
  - 3Dconnexion SpaceMouse (teleoperation)
  - GELLO Controller (teleoperation)
- Custom Components:
  - 3D Printed Hammer and Nail
To install a tool on the UR5e, we provide two types of fast tool changers:

- 3D-Printed Clipper
  - Requires a connector to attach tools to the Clipper.
  - Example: The hammer linked above already includes the connector.

  The upper Clipper is attached to the Clipper base with one M4×16 screw and one M4 nut. A Clipper gasket is provided to place between the UR5e robot and the Clipper. If you use the gasket, fasten the Clipper with four M6×30 screws; without the gasket, four M6×24 screws work as well. Four M6 screw gaskets are used in both cases.
- 3D-Printed Mounter
  - Suitable for both 3D-printed and standard tools.
  - Secured using one or two 3D-printed screws.

  To connect the Mounter to the UR5e robot, use four M6×12 screws. Four M6 screw gaskets are used here to ensure a tight connection.
We recommend using Mambaforge over the standard Anaconda distribution for a faster installation process. Set up the environment as follows:
- Install the necessary dependencies:

  ```bash
  sudo apt install -y libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf libglm-dev
  ```
- Clone the repository:

  ```bash
  git clone --recursive https://github.com/Tool-as-Interface/Tool_as_Interface.git
  cd Tool_as_Interface/
  git clone https://github.com/xinyu1205/recognize-anything.git third_party/Grounded-Segment-Anything/recognize-anything
  ```
- Update `mamba`, then create and activate the environment:

  ```bash
  mamba install mamba=1.5.1 -n base -c conda-forge
  mamba env create -f conda_environment_real.yml
  mamba activate ti
  ```
- Install packages:

  ```bash
  # Grounded SAM
  export AM_I_DOCKER=False
  export BUILD_WITH_CUDA=True
  export CUDA_HOME=/usr/local/cuda-11.8
  pip install https://artifactory.kinovaapps.com:443/artifactory/generic-public/kortex/API/2.6.0/kortex_api-2.6.0.post3-py3-none-any.whl
  pip install --no-build-isolation -e third_party/Grounded-Segment-Anything/GroundingDINO
  pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.0.0/index.html
  pip install third_party/Grounded-Segment-Anything/grounded-sam-osx/transformer_utils

  # FoundationPose
  CONDA_ENV_PATH=$(conda info --base)/envs/$(basename "$CONDA_PREFIX")
  EIGEN_PATH="$CONDA_ENV_PATH/include/eigen3"
  export CMAKE_PREFIX_PATH="$CMAKE_PREFIX_PATH:$EIGEN_PATH"
  cd third_party/FoundationPose
  CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.10/site-packages/pybind11/share/cmake/pybind11 bash build_all_conda.sh
  cd ../..
  ```
- Download the checkpoints:

  ```bash
  bash setup_downloads.sh
  ```
To ensure everything is installed correctly, run the following commands:
```bash
python -c 'import torch; print(torch.__version__); print(torch.cuda.is_available())'
python -c 'import torchvision; print(torchvision.__version__)'
python -c "from groundingdino.util.inference import Model; from segment_anything import sam_model_registry, SamPredictor"
```

Create the data folder:

```bash
mkdir -p data
```
To obtain the dataset, download the corresponding zip file from the following link:

- hammer_human dataset

Once downloaded, extract the contents into the data folder. The directory structure should be:

```
data/
└── hammer_human/
```
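As a quick sanity check, a short script like the following can confirm that the dataset landed where the training code expects it. This is an illustrative snippet, not part of the repository:

```python
from pathlib import Path

# Illustrative sanity check (not part of the repository): verify that the
# hammer_human dataset was extracted into the data/ folder.
dataset_dir = Path("data") / "hammer_human"

if not dataset_dir.is_dir():
    raise FileNotFoundError(
        f"Expected dataset at {dataset_dir}; download and extract "
        "hammer_human into the data/ folder first."
    )

# Print the top-level contents so you can confirm the extraction looks sane.
for entry in sorted(dataset_dir.iterdir()):
    print(entry)
```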
Activate the conda environment and log in to wandb (if you haven't already):

```bash
conda activate ti
wandb login
```

Adjust the arguments in the `calibrate_extrinsics` function located in `ti/real_world/multi_realsense.py` to calibrate multiple cameras. The calibration is performed using a 400×300 mm ChArUco board with a checker size of 40 mm (DICT_4X4).
The `robot_base_in_world` parameter is manually measured and tuned in `ti/real_world/multi_realsense.py`.
Run the following command to perform the calibration:
```bash
python ti/real_world/multi_realsense.py
```
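For reference, the sketch below shows one way a ChArUco board of this kind can be detected with OpenCV's aruco module to recover the board pose in a camera frame. It is not the repository's `calibrate_extrinsics` implementation; the square counts, marker length, and the pre-4.7 `cv2.aruco` API are assumptions to adapt to your printed board and OpenCV version.

```python
import cv2

# Minimal ChArUco pose sketch (assumes opencv-contrib-python with the
# pre-4.7 cv2.aruco API; board geometry below is an assumption).
SQUARES_X, SQUARES_Y = 10, 7         # assumed square counts for a ~400x300 mm board
SQUARE_LEN, MARKER_LEN = 0.04, 0.03  # meters; 40 mm checker, marker size assumed

DICTIONARY = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
BOARD = cv2.aruco.CharucoBoard_create(SQUARES_X, SQUARES_Y, SQUARE_LEN, MARKER_LEN, DICTIONARY)

def board_pose(image, camera_matrix, dist_coeffs):
    """Return (rvec, tvec) of the board in the camera frame, or None if not found."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, DICTIONARY)
    if ids is None:
        return None
    num, ch_corners, ch_ids = cv2.aruco.interpolateCornersCharuco(corners, ids, gray, BOARD)
    if num is None or num < 4:
        return None
    ok, rvec, tvec = cv2.aruco.estimatePoseCharucoBoard(
        ch_corners, ch_ids, BOARD, camera_matrix, dist_coeffs, None, None)
    return (rvec, tvec) if ok else None

# With the board fixed in the scene, each camera's extrinsics follow by inverting
# its camera-to-board transform and composing with the board's pose in the world.
```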
Start the demonstration collection script; the same key controls apply to both human play collection and robot demonstration collection:

- Press `C` to start recording.
- Press `S` to stop recording.
- Press `Backspace` to delete the most recent recording (confirm with `y`/`n`).
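For illustration, the sketch below shows how key handling like this could be wired up with pynput. The actual handling lives inside the collection scripts and may use a different mechanism; this is only a schematic example.

```python
from pynput import keyboard

# Schematic key handling for an episode-recording loop (illustrative only;
# the repository's collection scripts implement their own handling).
def on_press(key):
    try:
        if key.char and key.char.lower() == "c":
            print("start recording a new episode")
        elif key.char and key.char.lower() == "s":
            print("stop the current recording")
    except AttributeError:
        # Special (non-character) keys such as Backspace land here.
        if key == keyboard.Key.backspace:
            print("delete the most recent recording (confirm with y/n)")

listener = keyboard.Listener(on_press=on_press)
listener.start()
listener.join()  # keep the sketch alive and listening
```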
Run the following script to collect human video demonstrations:
```bash
python scripts/collect_human_video.py
```

Once the human play data has been collected, process the raw data using:
```bash
python scripts/preprocess_human_play.py
```

If you want to use GELLO, calibrate the GELLO offset using the following script:
```bash
python ti/devices/gello_software/scripts/gello_get_offset.py
```

After calibration, update the YAML configuration file:
`ti/devices/gello_software/gello.yaml`

For details on performing the calibration, refer to:
`ti/devices/gello_software/README.md`

Once the calibration is complete, update the argument in `scripts/demo_real_ur5e.py` to select either the SpaceMouse or GELLO, then run the following command to collect robot demonstration data:
```bash
python scripts/demo_real_ur5e.py
```

To launch training, run:
```bash
python scripts/train_diffusion_policy.py \
  --config-name=train_diffusion_policy.yaml
```
Modify `eval_real_ur5e_human.py` or `eval_real_ur5e.py`, then launch the evaluation script:

- Press `C` to start evaluation (handing control over to the policy).
- Press `S` to stop the current episode.
For `eval_real_ur5e_human.py`, we assume that the tool and the end-effector (EEF) are rigidly attached. Therefore, the tool pose estimation only needs to be performed once; see the sketch after the key control below.
- Press `T` once to estimate the tool's pose in the EEF frame.
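To make the rigid-attachment reasoning concrete, here is a small sketch (hypothetical names, plain NumPy, not the repository's API) of why a single estimate suffices: the tool-in-EEF transform stays constant, so the tool pose in the robot base frame at any time is just a composition with the current EEF pose.

```python
import numpy as np

# Hypothetical illustration of the rigid-attachment assumption: the tool-in-EEF
# transform is estimated once (the `T` keypress) and then reused every step.
def tool_pose_in_base(T_eef_in_base: np.ndarray, T_tool_in_eef: np.ndarray) -> np.ndarray:
    """Compose 4x4 homogeneous transforms: base <- EEF <- tool."""
    return T_eef_in_base @ T_tool_in_eef

T_tool_in_eef = np.eye(4)
T_tool_in_eef[:3, 3] = [0.0, 0.0, 0.12]  # made-up offset: tool tip 12 cm along the EEF z-axis

T_eef_in_base = np.eye(4)                # in practice this comes from the robot's state
print(tool_pose_in_base(T_eef_in_base, T_tool_in_eef))
```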
```bash
python eval_real_ur5e_human.py
python eval_real_ur5e.py
```

If you encounter the following error:
```
AttributeError: module 'collections' has no attribute 'MutableMapping'
```

Resolve it by installing the correct version of protobuf:
```bash
pip install protobuf==3.20.1
```

- Policy training implementation is adapted from Diffusion Policy.

