
One Hundred Layers Tiramisu

PyTorch implementation of The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation. Based on the very nice https://github.com/bfortuner/pytorch_tiramisu .

Tiramisu combines DenseNet and U-Net for high-performance semantic segmentation. In this repository, we attempt to replicate the authors' results on the CamVid dataset.

Commands

  • train SWAG model

python train.py --data_path [data_path] --model FCDenseNet67 --loss cross_entropy --optimizer SGD --lr_init 1e-2 --batch_size 4 --ft_start 750 --ft_batch_size 1 --epochs 1000 --swa --swa_start=850 --swa_lr=1e-3 --dir [dir]

  • train SGD model

python train.py --data_path [data_path] --model FCDenseNet67 --loss cross_entropy --optimizer SGD --lr_init 1e-2 --batch_size 4 --ft_start 750 --ft_batch_size 1 --epochs 1000 --dir [dir]

  • run MC Dropout at test time

python eval_ensemble.py --data_path [data_path] --batch_size 4 --method Dropout --loss cross_entropy --N 50 --file [sgd_checkpoint] --save_path [output.npz]

  • run SWAG at test time (see the sketch after these commands for inspecting the saved .npz)

python eval_ensemble.py --data_path [data_path] --batch_size 4 --method SWAG --scale=0.5 --loss cross_entropy --N 50 --file [swag_checkpoint] --save_path [output.npz]
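
Both eval commands write their averaged predictions to the .npz file given by --save_path. Below is a minimal sketch of inspecting such a file afterwards; the key names and array shapes are assumptions for illustration, not necessarily what eval_ensemble.py actually writes.

```python
import numpy as np

# Hypothetical keys and shapes -- check what eval_ensemble.py actually saves.
data = np.load("output.npz")
probs = data["predictions"]   # assumed (N, H, W, num_classes) class probabilities
targets = data["targets"]     # assumed (N, H, W) integer labels, 11 = void/background

preds = probs.argmax(axis=-1)
valid = targets != 11         # ignore the void class, as is standard for CamVid
pixel_acc = (preds[valid] == targets[valid]).mean()
print(f"pixel accuracy: {pixel_acc:.4f}")
```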

Dataset

Download

Specs

  • Training: 367 frames
  • Validation: 101 frames
  • Test: 233 frames
  • Dimensions: 360x480
  • Classes: 11 (+1 background)

Architecture

Tiramisu adopts the U-Net design with downsampling, bottleneck, and upsampling paths and skip connections. It replaces the plain convolutional blocks with dense blocks from the DenseNet architecture. Dense blocks contain shortcut connections like in ResNet, except they concatenate, rather than sum, prior feature maps.
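
A minimal PyTorch sketch of that concatenation pattern (illustrative growth rate and layer counts; not the repository's exact FCDenseNet implementation):

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """BN -> ReLU -> 3x3 conv -> dropout, producing `growth_rate` new feature maps."""
    def __init__(self, in_channels, growth_rate, p_drop=0.2):
        super().__init__()
        self.block = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False),
            nn.Dropout2d(p_drop),
        )

    def forward(self, x):
        return self.block(x)

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all previous feature maps (not their sum)."""
    def __init__(self, in_channels, growth_rate, n_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(n_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        # Output all accumulated feature maps, concatenated along the channel axis.
        return torch.cat(features, dim=1)
```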

Our Best Results

FCDenseNet67

Note that these results are not in the paper because we have not been able to fully reproduce the RMSProp or SGD results reported in the One Hundred Layers Tiramisu paper. We suspect this is due to subtle architectural differences. As such, our results here serve only as a demonstration that SWAG can be applied to other problems.

Method    Accuracy (%)  mIoU (%)  ECE
SGD       91.06         64.59     0.09144
SWA       90.88         63.26     0.09179
Dropout   90.08         62.32     0.08925
SWAG      91.01         63.32     0.09144
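
ECE above is the expected calibration error computed over pixels. A generic sketch of how it can be estimated from per-pixel class probabilities, assuming equal-width confidence bins (the bin count is illustrative, not necessarily what our evaluation uses):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=20):
    """Generic per-pixel ECE: bin pixels by max predicted probability and
    average |accuracy - confidence| weighted by the fraction of pixels per bin.
    probs: (num_pixels, num_classes), labels: (num_pixels,)."""
    confidences = probs.max(axis=-1)
    predictions = probs.argmax(axis=-1)
    accuracies = predictions == labels

    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
    return ece
```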

Training

Hyperparameters

  • Weight initialization = HeUniform
  • Optimizer = SGD
  • Data augmentation = random crops, horizontal flips (applied identically to image and mask; see the sketch after this list)
  • Weight decay = 0.0001
  • Finetuning = full-size images, LR = 0.0001
  • Dropout = 0.2
  • BatchNorm = "we use current batch stats at training, validation, and test time"
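
The random crops and horizontal flips have to be applied with the same parameters to an image and its label mask. A minimal sketch of one way to do that with torchvision's functional API (crop size and flip probability here are illustrative, not necessarily the values used in train.py):

```python
import random
from torchvision import transforms
import torchvision.transforms.functional as TF

def joint_transform(image, mask, crop_size=(224, 224), p_flip=0.5):
    """Apply the same random crop and horizontal flip to an image and its label mask."""
    # Sample one set of crop coordinates and reuse it for both tensors.
    i, j, h, w = transforms.RandomCrop.get_params(image, output_size=crop_size)
    image = TF.crop(image, i, j, h, w)
    mask = TF.crop(mask, i, j, h, w)
    # Flip both, or neither.
    if random.random() < p_flip:
        image = TF.hflip(image)
        mask = TF.hflip(mask)
    return image, mask
```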

References and Links