PyTorch implementation of *The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation*. Based on the very nice https://github.com/bfortuner/pytorch_tiramisu .
Tiramisu combines DenseNet and U-Net for high-performance semantic segmentation. In this repository, we attempt to replicate the authors' results on the CamVid dataset.
- Train a SWAG model:

```bash
python train.py --data_path [data_path] --model FCDenseNet67 --loss cross_entropy --optimizer SGD --lr_init 1e-2 --batch_size 4 --ft_start 750 --ft_batch_size 1 --epochs 1000 --swa --swa_start=850 --swa_lr=1e-3 --dir [dir]
```

- Train an SGD model:

```bash
python train.py --data_path [data_path] --model FCDenseNet67 --loss cross_entropy --optimizer SGD --lr_init 1e-2 --batch_size 4 --ft_start 750 --ft_batch_size 1 --epochs 1000 --dir [dir]
```

- Run MC Dropout at test time (a sketch of the test-time averaging follows these commands):

```bash
python eval_ensemble.py --data_path [data_path] --batch_size 4 --method Dropout --loss cross_entropy --N 50 --file [sgd_checkpoint] --save_path [output.npz]
```

- Run SWAG at test time:

```bash
python eval_ensemble.py --data_path [data_path] --batch_size 4 --method SWAG --scale=0.5 --loss cross_entropy --N 50 --file [swag_checkpoint] --save_path [output.npz]
```
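For reference, MC Dropout at test time means keeping the dropout layers active during evaluation and averaging the softmax outputs of several stochastic forward passes (here `--N 50`). Below is a minimal sketch of that idea with placeholder names; it is an illustration of the technique, not the code in `eval_ensemble.py`:

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=50):
    """Average softmax predictions over n_samples stochastic forward passes.

    Dropout layers are kept in train mode so each pass samples a different
    dropout mask; everything else (e.g. BatchNorm) stays in eval mode.
    """
    model.eval()
    for m in model.modules():
        if isinstance(m, (nn.Dropout, nn.Dropout2d)):
            m.train()
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0)  # (batch, classes, H, W)
```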
Download
Specs
- Training: 367 frames
- Validation: 101 frames
- Test set: 233 frames
- Dimensions: 360x480
- Classes: 11 (+1 background)
Tiramisu adopts the U-Net design with downsampling, bottleneck, and upsampling paths plus skip connections, but replaces the convolution and max-pooling layers with dense blocks from the DenseNet architecture. Dense blocks are similar to the residual connections in ResNet, except that each layer concatenates, rather than sums, the prior feature maps.
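As a rough illustration of that concatenation behaviour, here is a minimal dense block sketch in PyTorch; the BN-ReLU-Conv composition follows DenseNet, but this is a simplified stand-in rather than the exact blocks defined in this repository:

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """BN -> ReLU -> 3x3 Conv, producing `growth_rate` new feature maps."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(torch.relu(self.bn(x)))

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all previous feature maps."""
    def __init__(self, in_channels, growth_rate, n_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(n_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # concatenate, not sum
            features.append(out)
        return torch.cat(features, dim=1)
```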
FCDenseNet67
Note that these results are not in the paper because we have not been able to fully reproduce the RMSProp or SGD results reported in the One Hundred Layers Tiramisu paper; we suspect this is due to subtle architectural differences. As such, our results here are only a demonstration that SWAG can be applied to other problems.
Method | Accuracy (%) | mIoU (%) | ECE |
---|---|---|---|
SGD | 91.06 | 64.59 | 0.09144 |
SWA | 90.88 | 63.26 | 0.09179 |
Dropout | 90.08 | 62.32 | 0.08925 |
SWAG | 91.01 | 63.32 | 0.09144 |
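ECE above is the expected calibration error over per-pixel predictions (lower is better). A minimal sketch of one common way to compute it, assuming flattened arrays of confidences and per-pixel correctness; the function name and binning here are illustrative and not necessarily what produced the numbers above:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=20):
    """Bin predictions by confidence and average |accuracy - confidence|
    per bin, weighted by the fraction of predictions in each bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()   # empirical accuracy in this bin
            conf = confidences[mask].mean()  # mean confidence in this bin
            ece += mask.mean() * abs(acc - conf)
    return ece
```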
Hyperparameters
- Weight initialization = He uniform
- Optimizer = SGD
- Data augmentation = random crops, horizontal flips
- Weight decay = 0.0001
- Fine-tuning = full-size images, LR = 0.0001
- Dropout = 0.2
- BatchNorm = current batch statistics at training, validation, and test time, quoting the paper: "we use current batch stats at training, validation, and test time" (see the sketch below)
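One way to get that BatchNorm behaviour in PyTorch is to switch only the BatchNorm layers back to train mode after calling `model.eval()`. The sketch below is an assumption about how this could be done, not necessarily how this repository implements it:

```python
import torch.nn as nn

def bn_use_batch_stats(model):
    """Keep BatchNorm layers in train mode so they normalize with the
    statistics of the current batch instead of the stored running averages.

    Note: in train mode the running averages are still updated on each
    forward pass, which is usually acceptable for this kind of evaluation.
    """
    model.eval()  # everything else (e.g. dropout) stays in eval mode
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            module.train()
    return model
```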