Tiramisu combines DenseNet and U-Net for high-performance semantic segmentation. In this repository, we attempt to replicate the authors' results on the CamVid dataset.
Tiramisu adopts the U-Net design, with downsampling, bottleneck, and upsampling paths joined by skip connections. It replaces the convolution and max pooling layers with Dense blocks from the DenseNet architecture. Dense blocks connect each layer to the feature maps of earlier layers, much like the residual connections in ResNet, except that they concatenate prior feature maps rather than summing them.
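A little channel arithmetic shows what concatenation does to the width of the feature-map stack inside a Dense block. This is only an illustrative sketch: the function name `dense_block_channels` and the example values (48 input channels, growth rate 16, 4 layers) are assumptions chosen for the example, not values read from this repository's code.

```python
def dense_block_channels(in_channels, growth_rate, n_layers):
    """Return the input width seen by each layer and the final stack width.

    Each layer emits `growth_rate` new feature maps, which are concatenated
    (not summed) with everything that came before it.
    """
    layer_inputs = []
    channels = in_channels
    for _ in range(n_layers):
        layer_inputs.append(channels)  # this layer sees all prior feature maps
        channels += growth_rate        # concatenation widens the stack
    return layer_inputs, channels


# Illustrative values only (assumed, not taken from the repository).
inputs, out = dense_block_channels(in_channels=48, growth_rate=16, n_layers=4)
print(inputs)  # [48, 64, 80, 96]
print(out)     # 112
```

With a ResNet-style sum the width would stay at 48 throughout; with concatenation it grows linearly with the number of layers in the block.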
Layers
FCDenseNet103
Authors' Results
Our Results
FCDenseNet67
We trained for 670 epochs on 224x224 crops, followed by 100 epochs of fine-tuning on full-size images. The authors report a "global accuracy" of 90.8 for FC-DenseNet67 on CamVid, compared to our 86.8. If we exclude the 'background' class, our accuracy increases to ~89%. We think the authors did this, but haven't confirmed.
| Dataset | Loss | Accuracy (%) |
| --- | --- | --- |
| Validation | 0.209 | 92.5 |
| Test | 0.435 | 86.8 |
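To make the effect of dropping a class from the accuracy figure concrete, here is a minimal sketch of pixel-wise global accuracy with an optional ignored label. The function name `global_accuracy`, the `ignore_label` parameter, and the use of label 11 for 'background' are assumptions for illustration. Whether the authors computed their number this way is unconfirmed.

```python
def global_accuracy(pred, target, ignore_label=None):
    """Fraction of pixels whose predicted label matches the target.

    `pred` and `target` are flat sequences of integer class labels.
    Pixels whose target equals `ignore_label` are left out of both the
    numerator and the denominator.
    """
    correct = 0
    counted = 0
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        counted += 1
        correct += int(p == t)
    return correct / counted if counted else 0.0


# Toy example: label 11 stands in for the 'background' class (an assumption).
target = [0, 0, 1, 1, 2, 11, 11, 11]
pred = [0, 0, 1, 2, 2, 0, 1, 11]

print(global_accuracy(pred, target))                   # 0.625
print(global_accuracy(pred, target, ignore_label=11))  # 0.8
```

In this toy case most of the errors fall on the ignored class, so excluding it raises the score, which is the same direction as the shift from 86.8 to ~89% described above.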
FCDenseNet103
We trained for 874 epochs, followed by 50 epochs of fine-tuning.
| Dataset | Loss | Accuracy (%) |
| --- | --- | --- |
| Validation | 0.178 | 92.8 |
| Test | 0.441 | 86.6 |
Predictions
Training
Hyperparameters
WeightInitialization = HeUniform
Optimizer = RMSProp
LR = .001 with exponential decay of 0.995 after each epoch
Data Augmentation = Random Crops, Vertical Flips
Early stopping on the ValidationSet based on IoU or MeanAccuracy, with patience of 100 (50 during fine-tuning)
WeightDecay = .0001
Finetune with full-size images, LR = .0001
Dropout = 0.2
BatchNorm "we use current batch stats at training, validation, and test time"
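Two of the settings above, the per-epoch exponential learning-rate decay and patience-based early stopping, can be sketched in plain Python. This is a framework-free illustration under stated assumptions: `learning_rate` and `EarlyStopping` are hypothetical helpers, not this repository's training code, and the sketch assumes a higher score (IoU or MeanAccuracy) is better.

```python
def learning_rate(epoch, base_lr=1e-3, decay=0.995):
    """Learning rate after `epoch` completed epochs of exponential decay."""
    return base_lr * decay ** epoch


class EarlyStopping:
    """Signal a stop once the score has not improved for `patience` checks."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("-inf")
        self.bad_checks = 0

    def step(self, score):
        """Record a validation score; return True if training should stop."""
        if score > self.best:
            self.best = score
            self.bad_checks = 0   # improvement resets the counter
        else:
            self.bad_checks += 1  # no improvement on this check
        return self.bad_checks >= self.patience


print(learning_rate(0))  # 0.001

# Toy run with a small patience so the stop is visible; the list above
# uses a patience of 100 (50 during fine-tuning).
stopper = EarlyStopping(patience=2)
for epoch, score in enumerate([0.50, 0.60, 0.55, 0.50]):
    if stopper.step(score):
        print("stop at epoch", epoch)  # stop at epoch 3
        break
```

For the fine-tuning phase, the same helper would be called with `base_lr=1e-4` to match the lower learning rate listed above.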