[Demo video: 3D_animals.mp4]
This repository contains the unified codebase for several projects on articulated 3D animal reconstruction and motion generation, including:
- MagicPony: Learning Articulated 3D Animals in the Wild (CVPR 2023)
  - a category-specific single-image 3D animal reconstruction model
- Learning the 3D Fauna of the Web (CVPR 2024)
  - a pan-category single-image 3D quadruped reconstruction model
- Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos (ECCV 2024)
  - an articulated 3D animal motion generative model
For installation instructions, see INSTALL.md.
We adopt the hybrid SDF-mesh representation from DMTet to represent the 3D shapes of the animals. It uses tetrahedral grids to extract meshes from the underlying SDF representation.
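For intuition, this extraction step is essentially marching tetrahedra: every tetrahedron edge whose endpoints have opposite SDF signs contributes a surface vertex, placed by linear interpolation. The sketch below is illustrative only, not the repo's DMTet code; `verts`, `sdf`, and `tets` are assumed inputs, and the per-tet triangulation that turns these vertices into faces is omitted.

```python
# Minimal sketch of the marching-tetrahedra idea behind DMTet's SDF-to-mesh
# step. Illustrative only; the repo uses its own DMTet implementation.
import numpy as np

def zero_crossings(verts, sdf, tets):
    """Return one interpolated surface point per sign-changing tet edge."""
    # The 6 edges of a tetrahedron, as pairs of local vertex indices.
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    points = []
    for tet in tets:
        for a, b in edges:
            ia, ib = tet[a], tet[b]
            sa, sb = sdf[ia], sdf[ib]
            if sa * sb < 0:         # SDF changes sign along this edge
                t = sa / (sa - sb)  # zero crossing by linear interpolation
                points.append((1 - t) * verts[ia] + t * verts[ib])
    return np.asarray(points)
```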
Download the pre-computed tetrahedral grids:
```
cd data/tets
sh download_tets.sh
```

Download the preprocessed datasets for each project using the download scripts provided in data/. All datasets should be downloaded into the same directory as the download script, for example:
```
cd data/magicpony
sh download_horse_combined.sh
```

See the notes below for the details of each dataset.
The pretrained models can be downloaded using the scripts provided in results/. All pretrained models should be downloaded into the same directory as the download script, for example:
```
cd results/magicpony
sh download_pretrained_horse.sh
```

Once the data is prepared, both training and inference of all models can be executed using a single command:
```
python run.py --config-name CONFIG_NAME
```

or, for training with DDP using multiple GPUs:
```
accelerate launch --multi_gpu run.py --config-name CONFIG_NAME
```

CONFIG_NAME can be any of the configs specified in config/, e.g., test_magicpony_horse or train_magicpony_horse.
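For reference, `--config-name` is the standard Hydra flag for selecting a YAML file from the config directory. The sketch below shows a minimal Hydra-style entry point; this is an assumption about how run.py is structured, not the repo's actual code.

```python
# Illustrative sketch of a Hydra-style entry point; the actual run.py may differ.
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="config", config_name="train_magicpony_horse", version_base=None)
def main(cfg: DictConfig) -> None:
    # `--config-name X` on the command line selects config/X.yaml
    # instead of the default named above.
    print(OmegaConf.to_yaml(cfg))  # the resolved config: dataset paths, model flags, etc.
    # ... build dataset, model, and trainer from cfg ...

if __name__ == "__main__":
    main()
```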
The simplest use case is to test the pretrained models on test images. To do this, use the configs in config/ that start with test_*. Open the config files to check the details, including the path of the test images.
Note that only the RGB images are required during testing. The DINO features are not required. The mask images are only required if you wish to finetune the texture with higher precision for visualization (see below).
When run with the default test configs, the command automatically saves some basic visualizations, including the reconstructed views and 3D meshes. For more advanced and customized visualizations, use visualization/visualize_results.py as explained below.
See the instructions for each specific model below.
We provide some scripts that we used to generate the visualizations on our project pages (MagicPony, 3D-Fauna, Ponymation). To render such visualizations, simply run the following command with the proper test config, e.g.:
```
python visualization/visualize_results.py --config-name test_magicpony_horse
```

For 3D-Fauna, use visualize_results_fauna.py instead:
```
python visualization/visualize_results_fauna.py --config-name test_fauna
```

Check the `#Visualization` section in the config files for specific visualization configurations.
The visualization script supports the following `render_modes`, which can be specified in the config:

- `input_view`: image rendered from the input viewpoint of the reconstructed textured mesh, shading map, gray shape visualization.
- `other_views`: images rendered from 12 viewpoints rotating around the vertical axis of the reconstructed textured mesh, gray shape visualization.
- `rotation`: video rendered from continuously rotating viewpoints around the vertical axis of the reconstructed textured mesh, gray shape visualization (see the camera-path sketch after this list).
- `animation` (only supported for quadrupeds): two videos rendered from a side viewpoint and from continuously rotating viewpoints of the reconstructed textured mesh, animated by interpolating a sequence of pre-configured articulation parameters. `arti_param_dir` can be set to `./visualization/animation_params`, which contains a sequence of pre-computed keyframe articulation parameters.
- `canonicalization` (only supported for quadrupeds): video of the reconstructed textured mesh morphing from the input pose to a pre-configured canonical pose.
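For intuition, the rotating viewpoints in `rotation` and `other_views` amount to moving a camera along a circle around the vertical axis at a fixed elevation. A minimal sketch with hypothetical names (not the repo's API):

```python
# Hedged sketch of a rotating camera path; names and conventions are illustrative.
import numpy as np

def rotating_cameras(num_frames=60, radius=2.5, elevation_deg=10.0):
    """Yield camera positions on a circle around the vertical (y) axis."""
    elev = np.deg2rad(elevation_deg)
    for i in range(num_frames):
        azim = 2 * np.pi * i / num_frames
        x = radius * np.cos(elev) * np.sin(azim)
        y = radius * np.sin(elev)
        z = radius * np.cos(elev) * np.cos(azim)
        yield np.array([x, y, z])  # each position looks at the object at the origin
```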
To enable texture finetuning at test time, set `finetune_texture: true` in the config, and (optionally) adjust the number of finetuning iterations `finetune_iters` and the learning rate `finetune_lr`.

For more precise texture optimization, provide instance masks in the same folder as the test images, named `*_mask.png`. Otherwise, background pixels may be pasted onto the object if the shape predictions are not perfectly aligned.
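Conceptually, this finetuning keeps the predicted shape and pose fixed and optimizes only the texture against the input image, restricted to the foreground. A rough sketch, where `texture_net`, `render_fn`, `image`, and `mask` are hypothetical stand-ins for the repo's actual components:

```python
# Rough sketch of test-time texture finetuning; all names are hypothetical.
import torch

def finetune_texture(texture_net, render_fn, image, mask,
                     finetune_iters=100, finetune_lr=1e-3):
    """Optimize texture parameters only; shape and pose stay frozen."""
    optim = torch.optim.Adam(texture_net.parameters(), lr=finetune_lr)
    for _ in range(finetune_iters):
        rendered = render_fn(texture_net)  # re-render with the current texture
        # Restrict the photometric loss to foreground pixels, so background
        # colors are not pasted onto the object when the predicted shape is
        # slightly misaligned with the image.
        loss = ((rendered - image) ** 2 * mask).mean()
        optim.zero_grad()
        loss.backward()
        optim.step()
```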
MagicPony learns a category-specific model for single-image articulated 3D reconstruction of an animal species.
We trained MagicPony models on image collections of horses, giraffes, zebras, cows, and birds. The data download scripts in data/magicpony provide access to the following preprocessed datasets:
- `horse_videos` and `bird_videos` were released by DOVE.
- `horse_combined` consists of `horse_videos` and additional images selected from the Weizmann Horse Database, PASCAL, and Horse-10.
- `giraffe_coco`, `zebra_coco`, and `cow_coco` are filtered subsets of the COCO dataset.
To train MagicPony on the provided horse dataset or bird dataset from scratch, simply use the training configs: train_magicpony_horse or train_magicpony_bird, e.g.:
```
python run.py --config-name train_magicpony_horse
```

For multi-GPU training, use the accelerate launch command, e.g.:
```
accelerate launch --multi_gpu run.py --config-name train_magicpony_horse
```

To train on the provided giraffe, zebra, or cow datasets, which are much smaller, please finetune from a pretrained horse model using the finetuning configs: finetune_magicpony_giraffe, finetune_magicpony_zebra, or finetune_magicpony_cow.
3D-Fauna learns a pan-category model for single-image articulated 3D reconstruction of any quadruped species.
The Fauna Dataset, which can be downloaded via the script data/fauna/download_fauna_dataset.sh, consists of video frames and images sourced from the Internet, as well as images from DOVE, APT-36K, Animal3D, and Animals-with-Attributes.
To train 3D-Fauna on the Fauna Dataset, simply run:
```
python run.py --config-name train_fauna
```

Ponymation learns a generative model of articulated 3D motions of an animal species.
The dataset can be downloaded via the script data/ponymation/download_ponymation_dataset.sh, and includes video data of horses, cows, giraffes, and zebras.
Ponymation is trained in two stages. In the first stage, we pretrain a 3D reconstruction model that takes in a sequence of frames and reconstructs a sequence of articulated 3D shapes of the animal. This stage can be initiated using the stage 1 config train_ponymation_horse_stage1:
```
python run.py --config-name train_ponymation_horse_stage1
```

After this video reconstruction model is pretrained, we train a generative model of the articulated 3D motions in the second stage, using the stage 2 config train_ponymation_horse_stage2:

```
python run.py --config-name train_ponymation_horse_stage2
```
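For a rough idea of what the second stage learns, the sketch below shows a generic sequence VAE over articulation parameters: encode a motion sequence into a latent, then decode a motion back out. This is an illustrative stand-in, not Ponymation's actual architecture; all dimensions and module choices are made up.

```python
# Schematic sequence VAE over articulation parameters; NOT Ponymation's
# actual model, which is described in the paper.
import torch
import torch.nn as nn

class MotionVAE(nn.Module):
    def __init__(self, arti_dim=60, latent_dim=64, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(arti_dim, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent_dim)
        self.to_logvar = nn.Linear(hidden, latent_dim)
        self.decoder = nn.GRU(latent_dim, hidden, batch_first=True)
        self.to_arti = nn.Linear(hidden, arti_dim)

    def forward(self, arti_seq):  # (batch, frames, arti_dim)
        _, h = self.encoder(arti_seq)  # summarize the motion sequence
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
        z_seq = z.unsqueeze(1).repeat(1, arti_seq.size(1), 1)  # one latent per frame
        out, _ = self.decoder(z_seq)
        return self.to_arti(out), mu, logvar  # reconstructed motion + posterior
```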
This repo also contains code to reconstruct 4D animal videos, i.e., a sequence of articulated 3D shapes of an animal in a video, using 3D-Fauna as the backbone. The code works on animal video datasets processed by the Animal Video Processing repo. To run:
```
python run_4d_reconstruction.py
```

If you use this repository or find the papers useful for your research, please consider citing the following publications, as well as the original publications of the datasets used:
```
@InProceedings{wu2023magicpony,
  title     = {{MagicPony}: Learning Articulated 3D Animals in the Wild},
  author    = {Wu, Shangzhe and Li, Ruining and Jakab, Tomas and Rupprecht, Christian and Vedaldi, Andrea},
  booktitle = {CVPR},
  year      = {2023}
}

@InProceedings{li2024fauna,
  title     = {Learning the 3D Fauna of the Web},
  author    = {Li, Zizhang and Litvak, Dor and Li, Ruining and Zhang, Yunzhi and Jakab, Tomas and Rupprecht, Christian and Wu, Shangzhe and Vedaldi, Andrea and Wu, Jiajun},
  booktitle = {CVPR},
  year      = {2024}
}

@InProceedings{sun2024ponymation,
  title     = {{Ponymation}: Learning Articulated 3D Animal Motions from Unlabeled Online Videos},
  author    = {Sun, Keqiang and Litvak, Dor and Zhang, Yunzhi and Li, Hongsheng and Wu, Jiajun and Wu, Shangzhe},
  booktitle = {ECCV},
  year      = {2024}
}
```
TODO:
- Ponymation dataset update
- Data processing script
- Metrics evaluation script