This project provides a clean implementation of YOLOv3 in TensorFlow 2.0 beta following the best practices.
- TensorFlow 2.0
- `yolov3` with pre-trained weights
- `yolov3-tiny` with pre-trained weights
- Inference example
- Transfer learning example
- Eager mode training with `tf.GradientTape`
- Graph mode training with `model.fit`
- Functional model with `tf.keras.layers`
- Input pipeline using `tf.data`
- Vectorized transformations
- GPU accelerated
- Fully integrated with `absl-py` from abseil.io
- Clean implementation
- Following the best practices
- MIT License
pip install -r requirements.txt
conda env create -f environment.yml
conda activate yolov3-tf2
# yolov3
wget https://pjreddie.com/media/files/yolov3.weights -O data/yolov3.weights
python convert.py
# yolov3-tiny
wget https://pjreddie.com/media/files/yolov3-tiny.weights -O data/yolov3-tiny.weights
python convert.py --weights ./data/yolov3-tiny.weights --output ./checkpoints/yolov3-tiny.tf --tiny
# yolov3
python detect.py --image ./data/meme.jpg
# yolov3-tiny
python detect.py --weights ./checkpoints/yolov3-tiny.tf --tiny --image ./data/street.jpg
You need to generate a TFRecord dataset following the TensorFlow Object Detection API format.
For example, you can use Microsoft VoTT to generate such a dataset. You can also use this script to create the PASCAL VOC dataset.
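As a rough illustration of what one record in such a file might contain, the sketch below builds a single `tf.train.Example` using feature keys from the TensorFlow Object Detection API convention. The helper name `make_example` and the sample values are made up for illustration, and whether `train.py` reads exactly this set of keys is an assumption to check against the dataset loader.

```python
import tensorflow as tf


def _bytes(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))


def _floats(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))


def _ints(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def make_example(jpeg_bytes, width, height, boxes, names, labels):
    """Build one tf.train.Example (hypothetical helper).

    boxes: list of (xmin, ymin, xmax, ymax) in pixels.
    Coordinates are stored normalized to [0, 1], as the
    Object Detection API convention expects.
    """
    feature = {
        'image/height': _ints([height]),
        'image/width': _ints([width]),
        'image/encoded': _bytes([jpeg_bytes]),
        'image/format': _bytes([b'jpeg']),
        'image/object/bbox/xmin': _floats([b[0] / width for b in boxes]),
        'image/object/bbox/ymin': _floats([b[1] / height for b in boxes]),
        'image/object/bbox/xmax': _floats([b[2] / width for b in boxes]),
        'image/object/bbox/ymax': _floats([b[3] / height for b in boxes]),
        'image/object/class/text': _bytes([n.encode('utf8') for n in names]),
        'image/object/class/label': _ints(labels),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))
```

Each example can then be serialized with `SerializeToString()` and written through `tf.io.TFRecordWriter`.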
python train.py --batch_size 8 --dataset ~/Data/voc2012.tfrecord --val_dataset ~/Data/voc2012_val.tfrecord --epochs 100 --mode eager_tf --transfer fine_tune
python train.py --batch_size 8 --dataset ~/Data/voc2012.tfrecord --val_dataset ~/Data/voc2012_val.tfrecord --epochs 100 --mode fit --transfer none
python train.py --batch_size 8 --dataset ~/Data/voc2012.tfrecord --val_dataset ~/Data/voc2012_val.tfrecord --epochs 100 --mode fit --transfer no_output
python train.py --batch_size 8 --dataset ~/Data/voc2012.tfrecord --val_dataset ~/Data/voc2012_val.tfrecord --epochs 10 --mode eager_fit --transfer fine_tune --weights ./checkpoints/yolov3-tiny.tf --tiny
- Eager execution is a great addition for existing TensorFlow experts.
- Not very easy to use without some intermediate understanding of TensorFlow graphs.
- It is annoying when you accidentally use incompatible features like `tensor.shape[0]` or some sort of Python control flow that works fine in eager mode, but totally breaks down when you try to compile the model to graph.
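A small sketch of the two pitfalls named above, assuming TensorFlow 2.x. The names `dynamic_batch_size`, `clip_negative`, and `traced_static_shapes` are made up for illustration. With an unknown batch dimension, `x.shape[0]` is `None` while the graph is traced, whereas `tf.shape(x)[0]` is resolved at run time; element-wise Python branching can often be replaced by a vectorized op such as `tf.where`.

```python
import tensorflow as tf

# Collects what x.shape[0] looked like while the graph was being traced.
traced_static_shapes = []


@tf.function(input_signature=[tf.TensorSpec(shape=[None, 4], dtype=tf.float32)])
def dynamic_batch_size(x):
    # Python side effect: this line runs only during tracing, not per call.
    traced_static_shapes.append(x.shape[0])
    # tf.shape is a graph op, so it yields the real batch size at run time.
    return tf.shape(x)[0]


@tf.function
def clip_negative(x):
    # Vectorized replacement for a per-element Python `if value < 0`.
    return tf.where(x < 0, tf.zeros_like(x), x)
```

Calling `dynamic_batch_size` with batches of different sizes returns the actual size each time, while `traced_static_shapes` only ever records the unknown static dimension seen at trace time.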
- Extremely useful for debugging purposes; you can set breakpoints anywhere.
- You can compile all the Keras fitting functionalities with gradient tape using the `run_eagerly` argument in `model.compile`.
- From my limited testing, GradientTape is definitely a bit slower than the normal graph mode. So I recommend eager GradientTape for debugging and graph mode for real training.
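To make the GradientTape notes above concrete, here is a minimal sketch of a custom training step next to the `run_eagerly` route. The tiny dense model and random data are stand-ins chosen for illustration, not the YOLOv3 model or its loss.

```python
import tensorflow as tf

# Stand-in model and data (hypothetical, for illustration only).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.05)
loss_fn = tf.keras.losses.MeanSquaredError()
x = tf.random.normal([8, 3])
y = tf.reduce_sum(x, axis=1, keepdims=True)


def train_step(inputs, targets):
    # Plain Python function: a debugger breakpoint works on any line here.
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        loss = loss_fn(targets, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


first_loss = float(train_step(x, y))
for _ in range(20):
    last_loss = float(train_step(x, y))

# The Keras route: model.fit runs step by step in eager mode.
model.compile(optimizer='sgd', loss='mse', run_eagerly=True)
history = model.fit(x, y, epochs=1, verbose=0)
```

Dropping `run_eagerly=True` lets `model.fit` use graph mode again, which matches the recommendation above for real training.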
- `@tf.function` is very cool. It's like an in-between version of eager and graph.
- You can step through the function by disabling `tf.function` and then gain performance when you enable it in production.
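One way to get the on/off behavior described above is to keep the logic in a plain Python function and wrap it conditionally. This is a sketch; the `USE_GRAPH` switch and the `scaled_sum` function are hypothetical names.

```python
import tensorflow as tf

USE_GRAPH = True  # hypothetical switch; set to False to step through in a debugger


def scaled_sum(a, b):
    # Ordinary eager code: breakpoints work when called directly.
    return tf.reduce_sum(a * 2.0 + b)


# Same function, compiled to a graph only when performance is wanted.
fast_scaled_sum = tf.function(scaled_sum) if USE_GRAPH else scaled_sum
```

Both callables compute the same value, so the switch can be flipped without touching the function body.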
- Absolutely amazing. If you don't know already, absl.py is officially used by internal projects at Google.
- It standardizes the application interface for Python and many other languages. After using it within Google, I was so excited to hear that Abseil was going open source.
- It includes many decades of best practices learned from building large, scalable applications. I literally have nothing bad to say about it and strongly recommend absl.py to everybody.
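For readers who have not used absl.py, the sketch below defines three flags modeled on the `convert.py` options and parses a command line. It uses a private `flags.FlagValues()` instance so the snippet can be run repeatedly; a real script would more typically define flags on the global `flags.FLAGS` and start with `app.run(main)`. The function name `parse_convert_flags` is made up for illustration.

```python
from absl import flags


def parse_convert_flags(argv):
    """Parse convert.py-style flags from argv (argv[0] is the program name)."""
    fv = flags.FlagValues()  # private registry, separate from flags.FLAGS
    flags.DEFINE_string('weights', './data/yolov3.weights',
                        'path to weights file', flag_values=fv)
    flags.DEFINE_string('output', './checkpoints/yolov3.tf',
                        'path to output', flag_values=fv)
    flags.DEFINE_boolean('tiny', False,
                         'yolov3 or yolov3-tiny', flag_values=fv)
    fv(argv)  # parses the flags; unknown flags raise an error
    return fv
```

After parsing, values are read as attributes, for example `fv.tiny` or `fv.weights`. Boolean flags also accept the negated form, such as `--notiny`.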
- Loading the pre-trained Darknet weights is very hard with the pure functional API because the layer ordering is different in tf.keras and Darknet.
- The clean solution here is creating sub-models in Keras.
- Keras is not able to save nested models in h5 format properly; TensorFlow Checkpoint is recommended since it's officially supported by TensorFlow.
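To give a feel for why the ordering matters, here is a sketch of reading one convolutional block from a Darknet weights stream with NumPy. The layout is an assumption based on the common Darknet format: a 20-byte header, then per block either batch-norm values in the order beta, gamma, mean, variance or a bias vector, followed by the kernel stored as (out, in, height, width). The function names are made up and this is not necessarily how `convert.py` is written.

```python
import numpy as np


def read_header(f):
    # Assumed 20-byte header, read here as five int32 values.
    return np.frombuffer(f.read(20), dtype=np.int32)


def read_conv_block(f, filters, in_channels, size, batch_norm=True):
    """Read one conv block and return (kernel, extra) in Keras order."""
    def read(count):
        return np.frombuffer(f.read(4 * count), dtype=np.float32, count=count)

    if batch_norm:
        # Darknet order: beta, gamma, mean, variance.
        beta, gamma, mean, variance = read(4 * filters).reshape(4, filters)
        # Keras BatchNormalization order: gamma, beta, mean, variance.
        extra = [gamma, beta, mean, variance]
    else:
        extra = [read(filters)]  # plain bias vector

    kernel = read(filters * in_channels * size * size)
    # Darknet (out, in, h, w) -> Keras (h, w, in, out).
    kernel = kernel.reshape(filters, in_channels, size, size).transpose(2, 3, 1, 0)
    return kernel, extra
```

The returned arrays could then be passed to the matching layers' `set_weights`, which is easier to do when each part of the network is its own sub-model.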
- Batch normalization in tf.keras doesn't work very well for transfer learning.
- There are many articles and GitHub Issues all over the Internet.
- I used a simple hack to make it work nicer on transfer learning with small batches.
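The hack itself is not spelled out here. One possible shape for it, offered purely as an assumption, is a `BatchNormalization` subclass that forces inference mode whenever the layer is frozen, so small batches cannot disturb the pre-trained moving statistics. The class name `FrozenAwareBatchNorm` is made up, and recent TensorFlow versions may already behave this way for frozen layers, which would make the override redundant there.

```python
import tensorflow as tf


class FrozenAwareBatchNorm(tf.keras.layers.BatchNormalization):
    """Batch norm that ignores training=True while the layer is frozen."""

    def call(self, inputs, training=None):
        if not self.trainable:
            # Frozen layer: always normalize with the stored moving statistics.
            training = False
        return super().call(inputs, training=training)
```

With `layer.trainable = False`, calling the layer with `training=True` leaves its moving mean and variance untouched.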
convert.py:
--output: path to output
(default: './checkpoints/yolov3.tf')
--[no]tiny: yolov3 or yolov3-tiny
(default: 'false')
--weights: path to weights file
(default: './data/yolov3.weights')
detect.py:
--classes: path to classes file
(default: './data/coco.names')
--image: path to input image
(default: './data/girl.png')
--output: path to output image
(default: './output.jpg')
--[no]tiny: yolov3 or yolov3-tiny
(default: 'false')
--weights: path to weights file
(default: './checkpoints/yolov3.tf')
train.py:
--batch_size: batch size
(default: '8')
(an integer)
--classes: path to classes file
(default: './data/coco.names')
--dataset: path to dataset
(default: '')
--epochs: number of epochs
(default: '2')
(an integer)
--learning_rate: learning rate
(default: '0.001')
(a number)
--mode: <fit|eager_fit|eager_tf>: fit: model.fit, eager_fit: model.fit(run_eagerly=True), eager_tf: custom GradientTape
(default: 'fit')
--size: image size
(default: '416')
(an integer)
--[no]tiny: yolov3 or yolov3-tiny
(default: 'false')
--transfer: <none|darknet|no_output|frozen|fine_tune>: none: Training from scratch, darknet: Transfer darknet, no_output: Transfer all but output, frozen: Transfer and freeze all, fine_tune: Transfer all and freeze darknet only
(default: 'none')
--val_dataset: path to validation dataset
(default: '')
--weights: path to weights file
(default: './checkpoints/yolov3.tf')
It is pretty much impossible to implement this from the YOLOv3 paper alone. I had to reference the official repo (very hard to understand) and many unofficial ones (many minor issues) to piece together the complete picture.
Implementation | Remarks |
---|---|
https://github.com/pjreddie/darknet | Official YOLOv3 implementation |
https://github.com/AlexeyAB | Explanations of parameters |
https://github.com/qqwweee/keras-yolo3 | Models, loss functions |
https://github.com/YunYang1994/tensorflow-yolov3 | Data transformations, loss functions |
https://github.com/ayooshkathuria/pytorch-yolo-v3 | Models |
https://github.com/broadinstitute/keras-resnet | Batch normalization fix |