Planning and Syllabus
| Event | Date | Room | Description | Material* |
|---|---|---|---|---|
| Lecture 1 | Monday Nov 18 | B12D i-51 (64) | Introduction: Research field. Neuroscience, computer vision and machine learning background. Modern deep learning. About this course. | [slides] |
| Lecture 2 | Wednesday Nov 20 | B12D i-60 (44) | Visual representation: Global/local visual descriptors, dense/sparse representation, local feature detectors. Encoding/pooling, vocabularies, bag-of-words. VLAD*, Fisher vectors*, embeddings*. HMAX. | [slides] |
| Lecture 3 | Monday Nov 26 | Jersey | Local features and spatial matching: Derivatives, scale space and scale selection. Edges, blobs, corners/junctions. Dense optical flow* / sparse feature tracking*. Wide-baseline matching. Geometric models, RANSAC, Hough transform. Fast spatial matching*. | [slides] |
| Lecture 4 | Wednesday Nov 28 | B12D i-58 (44) | Codebooks and kernels: Geometry/appearance matching. Bag-of-words. k-means clustering, hierarchical*, approximate*, vocabulary tree*. Soft assignment, max pooling*. Match kernels, Hamming embedding, ASMK*. Pyramid matching, spatial pyramids. Hough pyramids*. | [slides] |
| Lecture 5 | Monday Dec 2 | B12D i-51 (64) | Learning: Binary classification. Perceptron, support vector machines, logistic regression. Gradient descent, regularization, loss functions, unified model. Multi-class classification. Linear regression*, basis functions. Neural networks, activation functions. | [slides] |
| Lecture 6 | Wednesday Dec 4 | B12D i-51 (64) | Differentiation: Stochastic gradient descent. Numerical gradient approximation. Function decomposition, chain rule, analytical gradient computation, back-propagation. Chaining, splitting and sharing. Common forward and backward flow patterns. Dynamic automatic differentiation. | [slides] [pynet] |
| Lecture 7 | Wednesday Dec 11 | B12D i-59 (44) | Convolution and network architectures: Convolution, cross-correlation, linearity, equivariance, weight sharing. Feature maps, matrix multiplication, 1x1 convolution. Padded*, strided*, dilated* convolution. Pooling and invariance. Convolutional networks: LeNet-5, AlexNet, ZFNet*, VGG, NiN*, GoogLeNet. | [slides] |
| Milestone | Sunday Dec 15 | | Oral presentation: Selection of papers to present. | [instructions] |
| Lecture 8 | Monday Dec 16 | B12D i-60 (44) | Optimization and deeper architectures: Optimizers: momentum, RMSprop, Adam, second-order*. Initialization: Gaussian matrices, unit variance, orthogonal*, data-dependent*. Normalization: input, batch, layer*, weight*. Deeper networks: residual, identity mappings*, stochastic depth*, densely connected. | [slides] |
| Lecture 9 | Wednesday Dec 18 | B12D i-60 (44) | Object detection: Background: Viola and Jones, DPM, ISM, ESS*, object proposals, non-maximum suppression. Two-stage: R-CNN, SPP, Fast/Faster R-CNN, RPN. Bounding box regression*. Part-based*: R-FCN, spatial transformers, deformable convolution. Upsampling: FCN, feature pyramids. One-stage: OverFeat, YOLO, SSD*, RetinaNet, focal loss. | [slides] |
| Lecture 10 | Monday Jan 6 | B12D i-59 (44) | Retrieval: Local vs. global descriptors for visual search. Pooling from CNN representations: MAC, R-MAC, SPoC*, CroW*. Manifold learning, siamese and triplet architectures. Fine-tuning with contrastive or triplet loss, learning to rank. Graph-based methods, diffusion, unsupervised fine-tuning. | [slides] |
| Evaluation 1 | Monday Jan 13 | B12D i-59 (44) | Written exam | |
| Evaluation 2 | Monday Jan 20 | B12D i-59 (44) | Oral presentations | [instructions] |
*All material licensed CC BY-SA 4.0
Prerequisites
Basic knowledge of linear algebra, calculus, probability, signal processing, machine learning, and Python.
Related courses
Computer vision
Computer Vision, Cornell University (Huttenlocher), 2008.
Computer Vision, University of Washington (Seitz and Szeliski), 2008.
Object Recognition and Scene Understanding, MIT (Torralba), 2008.
Learning-based Methods in Vision, Carnegie Mellon University (Efros), 2012.
Computer Vision, University of California Berkeley (Malik), 2015.
Computer Vision, University of Texas (Grauman), 2018.
Computer Vision, New York University (Fergus), 2019.
Computer Vision, University of Illinois (Lazebnik), 2019.
Advances in Computer Vision, MIT (Freeman, Torralba and Isola), 2019.
Computer Vision, Aachen University (Leibe), 2019.
Deep learning
Topics Course on Deep Learning, UC Berkeley (Bruna), 2016.
Deep Learning, New York University (LeCun), 2017.
Deep Learning, IDIAP (Fleuret), 2018.
Introduction to Deep Learning, University of Illinois (Lazebnik), 2018.
Deep Learning, Université Paris-Saclay (Grisel and Ollion), 2018.
Deep Learning in Computer Vision, University of Toronto (Fidler), 2018.
Convolutional Neural Networks for Visual Recognition, Stanford University (Li), 2019.
Deep Learning, University of Amsterdam (Gavves), 2019.
Deep Learning: Do-It-Yourself! Hands-on tour to deep learning, ENS Paris (Lelarge et al.), 2019.
Practical Deep Learning for Coders, fast.ai, 2019.