Cross Modal Distillation for Supervision Transfer

Saurabh Gupta, Judy Hoffman, Jitendra Malik

This code base lets you use the RGB-D object detection models from the arXiv tech report Cross Modal Distillation for Supervision Transfer (arXiv:1507.00448).

License

This code base is built on Fast R-CNN. The license for Fast R-CNN can be found in LICENSE_fast_rcnn.

Citing

If you find this code base and these models useful in your research, please consider citing an appropriate subset of the following papers:

@article{gupta2015cross,
  title={Cross Modal Distillation for Supervision Transfer},
  author={Gupta, Saurabh and Hoffman, Judy and Malik, Jitendra},
  journal={arXiv preprint arXiv:1507.00448},
  year={2015}
}
@incollection{gupta2014learning,
  title={Learning rich features from RGB-D images for object detection and segmentation},
  author={Gupta, Saurabh and Girshick, Ross and Arbel{\'a}ez, Pablo and Malik, Jitendra},
  booktitle={Computer Vision--ECCV 2014},
  pages={345--360},
  year={2014},
  publisher={Springer}
}
@article{girshick15fastrcnn,
    Author = {Ross Girshick},
    Title = {Fast R-CNN},
    Journal = {arXiv preprint arXiv:1504.08083},
    Year = {2015}
}
@phdthesis{Hariharan:EECS-2015-193,
    Author = {Hariharan, Bharath},
    Title = {Beyond Bounding Boxes: Precise Localization of Objects in Images},
    School = {EECS Department, University of California, Berkeley},
    Year = {2015},
    Month = {Aug},
    URL = {https://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.html},
    Number = {UCB/EECS-2015-193}
}

Requirements: software

Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
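
A forgotten or still-commented flag is easy to miss, so a quick check of Makefile.config before building may save a rebuild. The following is a minimal sketch, not part of this code base; the function name `python_layer_enabled` is hypothetical.

```python
import re

# Matches an uncommented "WITH_PYTHON_LAYER := 1" line. A leading "#" means the
# option is still commented out, so such a line does not match.
_FLAG = re.compile(r"^[ \t]*WITH_PYTHON_LAYER[ \t]*:=[ \t]*1[ \t]*(#.*)?$",
                   re.MULTILINE)


def python_layer_enabled(makefile_config_text):
    """Return True if the text has an uncommented WITH_PYTHON_LAYER := 1 line."""
    return _FLAG.search(makefile_config_text) is not None
```

To use it, read the contents of your Makefile.config into a string and pass that string to `python_layer_enabled`.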

Python packages you might not have: cython, python-opencv, easydict
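
To see which of these packages are missing, you can try importing each one. This is a minimal sketch rather than a script shipped with the code base; `missing_modules` is a hypothetical helper, and the import names below (which differ from the package names) are listed in the comment.

```python
import importlib

# Import names for the packages above: cython imports as "Cython",
# python-opencv as "cv2", and easydict as "easydict".
REQUIRED_MODULES = ["Cython", "cv2", "easydict"]


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when all three packages are importable.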

Requirements: hardware

For training smaller networks (CaffeNet, VGG_CNN_M_1024), a good GPU (e.g., Titan, K20, K40, ...) with at least 3 GB of memory suffices.
For training with VGG16, you'll need a K40 (~11 GB of memory).

Installation (sufficient for the demo)

Clone the repository

# Clone the Python code and switch to the sds-distillation branch
git clone https://github.com/s-gupta/fast-rcnn.git
cd fast-rcnn
git checkout sds-distillation

We'll call the directory that you cloned Fast R-CNN into FRCN_ROOT.

Clone Caffe with roi_pooling_layers:

cd $FRCN_ROOT
git clone https://github.com/rbgirshick/caffe-fast-rcnn.git caffe-fast-rcnn
cd caffe-fast-rcnn
# caffe-fast-rcnn needs to be on the fast-rcnn branch (or equivalent detached state).
git checkout fast-rcnn

Clone Caffe with SDS layers:

cd $FRCN_ROOT
git clone https://github.com/bharath272/caffe-sds.git caffe-sds

Build the Cython modules
```
cd $FRCN_ROOT/lib
make
```

Build Caffe and pycaffe, for both caffe-fast-rcnn and caffe-sds:

cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
#   https://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe, have all of the requirements installed,
# and have your Makefile.config in place (with WITH_PYTHON_LAYER := 1),
# then simply run:
make -j8 && make pycaffe
cd $FRCN_ROOT/caffe-sds
# Now follow the Caffe installation instructions here:
#   https://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe, have all of the requirements installed,
# and have your Makefile.config in place (with WITH_PYTHON_LAYER := 1),
# then simply run:
make -j8 && make pycaffe
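
Because two separate Caffe trees have to be built, it can help to confirm that pycaffe exists in both before moving on. The sketch below assumes the stock Caffe Makefile layout, in which `make pycaffe` writes python/caffe/_caffe.so; the helper names are hypothetical and not part of this code base.

```python
import os


def pycaffe_built(caffe_dir):
    """Return True if the compiled pycaffe extension exists under caffe_dir.

    Assumption: the stock Caffe Makefile is used, so `make pycaffe` produces
    python/caffe/_caffe.so inside the Caffe checkout.
    """
    return os.path.isfile(os.path.join(caffe_dir, "python", "caffe", "_caffe.so"))


def unbuilt_caffes(frcn_root, names=("caffe-fast-rcnn", "caffe-sds")):
    """List the Caffe checkouts under frcn_root that still lack pycaffe."""
    return [n for n in names if not pycaffe_built(os.path.join(frcn_root, n))]
```

An empty list from `unbuilt_caffes(frcn_root)` suggests both builds completed. It does not verify that Python layer support was compiled in.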

Download models and data

Download the NYUD2 data

cd $FRCN_ROOT
./data/scripts/fetch_nyud2_data.sh

Download the NYUD2 MCG boxes

cd $FRCN_ROOT
./data/scripts/fetch_nyud2_mcg_boxes.sh

Download the NYUD2 Superpixels

cd $FRCN_ROOT
./data/scripts/fetch_nyud2_sp.sh

Download the ImageNet and Supervision Transfer Models

cd $FRCN_ROOT
./data/scripts/fetch_init_models.sh

Fetch the NYUD2 Object Detector Models

cd $FRCN_ROOT
./output/scripts/fetch_nyud2_detectors.sh

Fetch the NYUD2 SDS Models

cd $FRCN_ROOT
./output/scripts/fetch_nyud2_sds.sh
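
The six download steps above all follow the same pattern, so they can be run in one pass. This is a minimal sketch under the assumption that the scripts are executable and are meant to be run from FRCN_ROOT, as in the commands above; `fetch_all` and its `runner` parameter are hypothetical names introduced for illustration.

```python
import subprocess

# Script paths exactly as listed above, relative to FRCN_ROOT.
FETCH_SCRIPTS = [
    "data/scripts/fetch_nyud2_data.sh",
    "data/scripts/fetch_nyud2_mcg_boxes.sh",
    "data/scripts/fetch_nyud2_sp.sh",
    "data/scripts/fetch_init_models.sh",
    "output/scripts/fetch_nyud2_detectors.sh",
    "output/scripts/fetch_nyud2_sds.sh",
]


def fetch_all(frcn_root, scripts=FETCH_SCRIPTS, runner=subprocess.check_call):
    """Run each fetch script from frcn_root, stopping at the first failure."""
    for script in scripts:
        # cwd mirrors the `cd $FRCN_ROOT` step; "./" matches the commands above.
        runner(["./" + script], cwd=frcn_root)
```

With the default runner, `subprocess.check_call` raises `CalledProcessError` as soon as a script exits with a non-zero status, so a failed download is not silently skipped.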

Usage

See experiments/sds_test_pretrained_models.sh and experiments/sds_train_models.sh for how to use the pretrained SDS models and how to train SDS models yourself. Likewise, see experiments/test_pretrained_models.sh and experiments/train_models.sh for how to use the pretrained models and how to train your own.
The code base currently depends on two separate versions of Caffe: one with the roi_pooling layer (caffe-fast-rcnn) and one with the SDS layers (caffe-sds). These two versions have not been merged yet. When running the detector, point _init_paths.py at caffe-fast-rcnn; when running the SDS model on the detections, point it at caffe-sds. I am sorry for the inconvenience and will merge the two soon.
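
The switch between the two Caffe versions comes down to which pycaffe directory is placed on `sys.path` before `import caffe` runs. The contents of _init_paths.py are not shown here, so the following is only an illustrative sketch: `CAFFE_DIRS`, `caffe_python_dir`, and `add_path` are hypothetical names, and the `python` subdirectory is assumed from the standard Caffe layout.

```python
import os.path as osp
import sys

# Directory names from the installation steps. Which one is active is the only
# thing that changes between the detector and the SDS model.
CAFFE_DIRS = {"detector": "caffe-fast-rcnn", "sds": "caffe-sds"}


def caffe_python_dir(frcn_root, mode):
    """Return the pycaffe directory for mode ("detector" or "sds")."""
    if mode not in CAFFE_DIRS:
        raise ValueError("mode must be one of %s" % sorted(CAFFE_DIRS))
    # Assumption: pycaffe lives in the "python" subdirectory of each checkout.
    return osp.join(frcn_root, CAFFE_DIRS[mode], "python")


def add_path(path):
    """Put path at the front of sys.path unless it is already present."""
    if path not in sys.path:
        sys.path.insert(0, path)
```

For example, `add_path(caffe_python_dir(frcn_root, "sds"))` would be called before the first `import caffe`. Python caches imported modules, so a single process should not expect to switch between the two versions after `caffe` has been imported; run the detector and the SDS model as separate processes.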
