diff --git a/README.md b/README.md
index 463cfc2..112b403 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,12 @@
# Deformable Convolutional Networks
-The major contributors of this repository include [Yuwen Xiong](https://github.com/Orpine), [Haozhi Qi](https://github.com/Oh233), [Guodong Zhang](https://github.com/gd-zhang), [Yi Li](https://github.com/liyi14), [Jifeng Dai](https://github.com/daijifeng001), [Bin Xiao](https://github.com/leoxiaobin) and [Yichen Wei](https://github.com/YichenWei).
+The major contributors of this repository include [Yuwen Xiong](https://github.com/Orpine), [Haozhi Qi](https://github.com/Oh233), [Guodong Zhang](https://github.com/gd-zhang), [Yi Li](https://github.com/liyi14), [Jifeng Dai](https://github.com/daijifeng001), [Bin Xiao](https://github.com/leoxiaobin), [Han Hu](https://github.com/ancientmooner) and [Yichen Wei](https://github.com/YichenWei).
-## Disclaimer
-
-This is an official implementation for [Deformable Convolutional Networks](https://arxiv.org/abs/1703.06211) (Deformable ConvNets). It is worth noticing that:
- * The original implementation is based on our internal Caffe version on Windows. There are slight differences in the final accuracy and running time due to the plenty details in platform switch.
- * The code is tested on official [MXNet@(commit 62ecb60)](https://github.com/dmlc/mxnet/tree/62ecb60) with the extra operators for Deformable ConvNets.
- * We trained our model based on the ImageNet pre-trained [ResNet-v1-101](https://github.com/KaimingHe/deep-residual-networks) using a [model converter](https://github.com/dmlc/mxnet/tree/430ea7bfbbda67d993996d81c7fd44d3a20ef846/tools/caffe_converter). The converted model produces slightly lower accuracy (Top-1 Error on ImageNet val: 24.0% v.s. 23.6%).
- * By now it only contains Deformable ConvNets with R-FCN. Deformable ConvNets with DeepLab will be released soon.
- * This repository used code from [MXNet rcnn example](https://github.com/dmlc/mxnet/tree/master/example/rcnn) and [mx-rfcn](https://github.com/giorking/mx-rfcn).
## Introduction
-
**Deformable ConvNets** is initially described in an [arxiv tech report](https://arxiv.org/abs/1703.06211).
**R-FCN** is initially described in a [NIPS 2016 paper](https://arxiv.org/abs/1605.06409).
@@ -25,7 +16,15 @@ This is an official implementation for [Deformable Convolutional Networks](https
+## Disclaimer
+
+This is an official implementation for [Deformable Convolutional Networks](https://arxiv.org/abs/1703.06211) (Deformable ConvNets) based on MXNet. It is worth noting that:
+ * The original implementation is based on our internal Caffe version on Windows. There are slight differences in the final accuracy and running time due to the many details involved in the platform switch.
+ * The code is tested on the official [MXNet@(commit 62ecb60)](https://github.com/dmlc/mxnet/tree/62ecb60) with the extra operators for Deformable ConvNets.
+ * We trained our model based on the ImageNet pre-trained [ResNet-v1-101](https://github.com/KaimingHe/deep-residual-networks) using a [model converter](https://github.com/dmlc/mxnet/tree/430ea7bfbbda67d993996d81c7fd44d3a20ef846/tools/caffe_converter). The converted model produces slightly lower accuracy (Top-1 Error on ImageNet val: 24.0% vs. 23.6%).
+ * This repository uses code from the [MXNet rcnn example](https://github.com/dmlc/mxnet/tree/master/example/rcnn) and [mx-rfcn](https://github.com/giorking/mx-rfcn).
+
## License
© Microsoft, 2017. Licensed under an Apache-2.0 license.
@@ -61,21 +60,34 @@ If you find Deformable ConvNets useful in your research, please consider citing:
|---------------------------------|---------------|---------------|------|---------|---------|-------|-------|-------|
| R-FCN, ResNet-v1-101 | coco trainval | coco test-dev | 32.1 | 54.3 | 33.8 | 12.8 | 34.9 | 46.1 |
| Deformable R-FCN, ResNet-v1-101 | coco trainval | coco test-dev | 35.7 | 56.8 | 38.3 | 15.2 | 38.8 | 51.5 |
+| Faster R-CNN (2fc), ResNet-v1-101 | coco trainval | coco test-dev | 30.3 | 52.1 | 31.4 | 9.9 | 32.2 | 47.4 |
+| Deformable Faster R-CNN (2fc), ResNet-v1-101 | coco trainval | coco test-dev | 35.0 | 55.0 | 38.3 | 14.3 | 37.7 | 52.0 |
+
+
+
+| | training data | testing data | mIoU | time |
+|-----------------------------------|----------------------------|----------------|------|-------|
+| DeepLab, ResNet-v1-101 | Cityscapes train | Cityscapes val | 70.3 | 0.51s |
+| Deformable DeepLab, ResNet-v1-101 | Cityscapes train | Cityscapes val | 75.2 | 0.52s |
+| DeepLab, ResNet-v1-101 | VOC 12 train (augmented) | VOC 12 val | 70.7 | 0.08s |
+| Deformable DeepLab, ResNet-v1-101 | VOC 12 train (augmented) | VOC 12 val | 75.9 | 0.08s |
*Running time is counted on a single Maxwell Titan X GPU (mini-batch size is 1 in inference).*
## Requirements: Software
-1. MXNet from [offical repository](https://github.com/dmlc/mxnet). We tested our code on [MXNet@(commit 62ecb60)](https://github.com/dmlc/mxnet/tree/62ecb60). Due to the rapid development of MXNet, it is recommended to checkout this version if you have any problems. We may maintain this repository periodically if MXNet adds important feature in future release.
+1. MXNet from [the official repository](https://github.com/dmlc/mxnet). We tested our code on [MXNet@(commit 62ecb60)](https://github.com/dmlc/mxnet/tree/62ecb60). Due to the rapid development of MXNet, it is recommended to check out this version if you encounter any issues. We may maintain this repository periodically if MXNet adds important features in future releases.
+
+2. Python 2.7. We recommend using Anaconda2.
-2. Python packages might missing: cython, opencv-python >= 3.2.0, easydict. If `pip` is set up on your system, those packages should be able to be fetched and installed by running
+3. The following Python packages may be missing: Cython, opencv-python >= 3.2.0, easydict. If `pip` is set up on your system, they can be fetched and installed by running
```
pip install Cython
pip install opencv-python==3.2.0.6
pip install easydict==1.6
```
-3. For Windows users, Visual Studio 2015 is needed to compile cython module.
+4. For Windows users, Visual Studio 2015 is needed to compile the Cython modules.
## Requirements: Hardware
@@ -91,18 +103,26 @@ git clone https://github.com/msracver/Deformable-ConvNets.git
2. For Windows users, run ``cmd .\init.bat``. For Linux users, run `sh ./init.sh`. The scripts will build the cython modules automatically and create some folders.
3. Copy operators in `./rfcn/operator_cxx` to `$(YOUR_MXNET_FOLDER)/src/operator/contrib` and recompile MXNet.
4. Please install MXNet following the official guide of MXNet. For advanced users, you may put your Python package into `./external/mxnet/$(YOUR_MXNET_PACKAGE)`, and modify `MXNET_VERSION` in `./experiments/rfcn/cfgs/*.yaml` to `$(YOUR_MXNET_PACKAGE)`. Thus you can switch among different versions of MXNet quickly.
+5. For DeepLab, we use the augmented VOC 2012 dataset. The augmented annotations are provided by the [SBD](http://home.bharathh.info/pubs/codes/SBD/download.html) dataset. For convenience, we provide the converted PNG annotations and the lists of train/val images; please download them from [OneDrive](https://1drv.ms/u/s!Am-5JzdW2XHzhqMRhVImMI1jRrsxDg).
+## Demo & Deformable Model
-## Demo
+We provide trained deformable ConvNet models, including the deformable R-FCN & Faster R-CNN models trained on COCO trainval, and the deformable DeepLab model trained on Cityscapes train.
-1. To use the demo with our trained model (on COCO trainval), please download the model manually from [OneDrive](https://1drv.ms/u/s!AoN7vygOjLIQqmE7XqFVLbeZDfVN), and put it under folder `model/`.
+1. To use the demo with our pre-trained deformable models, please download them manually from [OneDrive](https://1drv.ms/u/s!Am-5JzdW2XHzhqMSjehIcCgAhvEAHw), and put them under the folder `model/`.
Make sure it looks like this:
```
./model/rfcn_dcn_coco-0000.params
./model/rfcn_coco-0000.params
+ ./model/rcnn_dcn_coco-0000.params
+ ./model/rcnn_coco-0000.params
+ ./model/deeplab_dcn_cityscapes-0000.params
+ ./model/deeplab_cityscapes-0000.params
+ ./model/deform_conv-0000.params
+ ./model/deform_psroi-0000.params
```
-2. To run the demo, run
+2. To run the R-FCN demo, run
```
python ./rfcn/demo.py
```
@@ -110,15 +130,25 @@ git clone https://github.com/msracver/Deformable-ConvNets.git
```
python ./rfcn/demo.py --rfcn_only
```
-
-
-
-We will release the visualizaiton tool which visualizes the deformation effects soon.
+3. To run the DeepLab demo, run
+ ```
+ python ./deeplab/demo.py
+ ```
+ By default it runs Deformable DeepLab and gives several prediction results; to run DeepLab only, use
+ ```
+ python ./deeplab/demo.py --deeplab_only
+ ```
+4. To visualize the offsets of deformable convolution and deformable psroipooling, run
+ ```
+ python ./rfcn/deform_conv_demo.py
+ python ./rfcn/deform_psroi_demo.py
+ ```
## Preparation for Training & Testing
-1. Please download COCO and VOC 2007+2012 dataset, and make sure it looks like this:
+For R-FCN/Faster R-CNN\:
+1. Please download the COCO and VOC 2007+2012 datasets, and make sure the folder structure looks like this:
```
./data/coco/
@@ -131,10 +161,30 @@ We will release the visualizaiton tool which visualizes the deformation effects
./model/pretrained_model/resnet_v1_101-0000.params
```
+For DeepLab\:
+1. Please download the Cityscapes and VOC 2012 datasets, and make sure the folder structure looks like this:
+
+ ```
+ ./data/cityscapes/
+ ./data/VOCdevkit/VOC2012/
+ ```
+2. Please download the augmented VOC 2012 annotations/image lists, and put the augmented annotations and the augmented train/val lists into the following two folders, respectively:
+
+ ```
+ ./data/VOCdevkit/VOC2012/SegmentationClass/
+ ./data/VOCdevkit/VOC2012/ImageSets/Main/
+ ```
+
+3. Please download the ImageNet-pretrained ResNet-v1-101 model manually from [OneDrive](https://1drv.ms/u/s!Am-5JzdW2XHzhqMEtxf1Ciym8uZ8sg), and put it under the folder `./model`. Make sure it looks like this:
+ ```
+ ./model/pretrained_model/resnet_v1_101-0000.params
+ ```
## Usage
-1. All of our experiment settings (GPU #, dataset, etc.) are kept in yaml files at folder `./experiments/rfcn/cfgs`.
-2. Four config files have been provided so far, namely, R-FCN for COCO/VOC and Deformable R-FCN for COCO/VOC, respectively. We use 8 and 4 GPUs to train models on COCO and on VOC, respectively.
+1. All of our experiment settings (GPU #, dataset, etc.) are kept in yaml config files under the folders `./experiments/rfcn/cfgs`, `./experiments/faster_rcnn/cfgs` and `./experiments/deeplab/cfgs/`.
+2. Config files have been provided so far for R-FCN and Deformable R-FCN on COCO/VOC, Faster R-CNN (2fc) and Deformable Faster R-CNN (2fc) on COCO/VOC, and DeepLab and Deformable DeepLab on Cityscapes/VOC. For R-FCN, we use 8 GPUs to train models on COCO and 4 GPUs on VOC. For DeepLab, we use 4 GPUs for all experiments.
+
3. To perform experiments, run the Python scripts with the corresponding config file as input. For example, to train and test Deformable ConvNets on COCO with ResNet-v1-101, use the following command
```
python experiments\rfcn\rfcn_end2end_train_test.py --cfg experiments\rfcn\cfgs\resnet_v1_101_coco_trainval_rfcn_dcn_end2end_ohem.yaml
@@ -144,11 +194,35 @@ We will release the visualizaiton tool which visualizes the deformation effects
## Misc.
-MXNet build without CuDNN is recommended.
-
Code has been tested under:
- Ubuntu 14.04 with a Maxwell Titan X GPU and Intel Xeon CPU E5-2620 v2 @ 2.10GHz
- Windows Server 2012 R2 with 8 K40 GPUs and Intel Xeon CPU E5-2650 v2 @ 2.60GHz
- Windows Server 2012 R2 with 4 Pascal Titan X GPUs and Intel Xeon CPU E5-2650 v4 @ 2.30GHz
+## FAQ
+
+Q: It says `AttributeError: 'module' object has no attribute 'DeformableConvolution'`.
+
+A: This happens because either
+ - you forgot to copy the operators to your MXNet folder,
+ - you copied them to the wrong path,
+ - you forgot to recompile MXNet afterwards,
+ - or you installed the wrong MXNet.
+
+ Please print `mxnet.__path__` to make sure you are using the correct MXNet.
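+
+ For example:
+ ```
+ import mxnet
+ print(mxnet.__path__)  # should point to the MXNet copy recompiled with the Deformable ConvNets operators
+ ```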
+
+
+Q: I encounter a `segmentation fault` at the beginning.
+
+A: A compatibility issue has been identified between MXNet and opencv-python 3.0+. We suggest that you always `import cv2` before `import mxnet` in the entry script.
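+
+ A minimal entry-script header following this suggestion:
+ ```
+ import cv2           # import OpenCV before MXNet to avoid the crash
+ import mxnet as mx
+ ```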
+
+
+Q: I find that the training speed becomes slower when training for a long time.
+
+A: This problem has been identified with MXNet on Windows, so we recommend running this program on Linux. If you encounter it, you can also stop and resume the training process to regain the original speed.
+
+
+Q: Can you share your Caffe implementation?
+
+A: For several reasons (the code is based on an old, internal Caffe; porting it to the public Caffe needs extra work; time constraints; etc.), we do not plan to release our Caffe code. Since the current MXNet convolution implementation is very similar to Caffe's (almost the same), it is easy to port to Caffe by yourself; the core CUDA code can be kept unchanged. Anyone who wishes to do so is welcome to make a pull request.
diff --git a/deeplab/_init_paths.py b/deeplab/_init_paths.py
new file mode 100644
index 0000000..5e9b023
--- /dev/null
+++ b/deeplab/_init_paths.py
@@ -0,0 +1,19 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import os.path as osp
+import sys
+
+def add_path(path):
+ if path not in sys.path:
+ sys.path.insert(0, path)
+
+this_dir = osp.dirname(__file__)
+
+lib_path = osp.join(this_dir, '..', 'lib')
+add_path(lib_path)
diff --git a/deeplab/config/__init__.py b/deeplab/config/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/deeplab/config/config.py b/deeplab/config/config.py
new file mode 100644
index 0000000..cae1c8d
--- /dev/null
+++ b/deeplab/config/config.py
@@ -0,0 +1,96 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import yaml
+import numpy as np
+from easydict import EasyDict as edict
+
+config = edict()
+
+config.MXNET_VERSION = ''
+config.output_path = ''
+config.symbol = ''
+config.gpus = ''
+config.CLASS_AGNOSTIC = True
+config.SCALES = [(360, 600)] # first is scale (the shorter side); second is max size
+
+# default training
+config.default = edict()
+config.default.frequent = 1000
+config.default.kvstore = 'device'
+
+# network related params
+config.network = edict()
+config.network.pretrained = '../model/pretrained_model/resnet_v1-101'
+config.network.pretrained_epoch = 0
+config.network.PIXEL_MEANS = np.array([103.06, 115.90, 123.15])
+config.network.IMAGE_STRIDE = 0
+config.network.FIXED_PARAMS = ['conv1', 'bn_conv1', 'res2', 'bn2', 'gamma', 'beta']
+
+# dataset related params
+config.dataset = edict()
+config.dataset.dataset = 'cityscapes'
+config.dataset.image_set = 'leftImg8bit_train'
+config.dataset.test_image_set = 'leftImg8bit_val'
+config.dataset.root_path = '../data'
+config.dataset.dataset_path = '../data/cityscapes'
+config.dataset.NUM_CLASSES = 19
+config.dataset.annotation_prefix = 'gtFine'
+
+config.TRAIN = edict()
+config.TRAIN.lr = 0
+config.TRAIN.lr_step = ''
+config.TRAIN.warmup = False
+config.TRAIN.warmup_lr = 0
+config.TRAIN.warmup_step = 0
+config.TRAIN.momentum = 0.9
+config.TRAIN.wd = 0.0005
+config.TRAIN.begin_epoch = 0
+config.TRAIN.end_epoch = 0
+config.TRAIN.model_prefix = 'deeplab'
+
+# whether resume training
+config.TRAIN.RESUME = False
+# whether flip image
+config.TRAIN.FLIP = True
+# whether shuffle image
+config.TRAIN.SHUFFLE = True
+# whether use OHEM
+config.TRAIN.ENABLE_OHEM = False
+# size of images for each device, 2 for rcnn, 1 for rpn and e2e
+config.TRAIN.BATCH_IMAGES = 1
+
+config.TEST = edict()
+# size of images for each device
+config.TEST.BATCH_IMAGES = 1
+
+# Test Model Epoch
+config.TEST.test_epoch = 0
+
+def update_config(config_file):
+ exp_config = None
+ with open(config_file) as f:
+ exp_config = edict(yaml.load(f))
+ for k, v in exp_config.items():
+ if k in config:
+ if isinstance(v, dict):
+ if k == 'TRAIN':
+ if 'BBOX_WEIGHTS' in v:
+ v['BBOX_WEIGHTS'] = np.array(v['BBOX_WEIGHTS'])
+ elif k == 'network':
+ if 'PIXEL_MEANS' in v:
+ v['PIXEL_MEANS'] = np.array(v['PIXEL_MEANS'])
+ for vk, vv in v.items():
+ config[k][vk] = vv
+ else:
+ if k == 'SCALES':
+ config[k][0] = (tuple(v))
+ else:
+ config[k] = v
+ else:
+ raise ValueError("key must exist in config.py")
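+
+# A minimal usage sketch (the yaml path below is hypothetical):
+#   update_config('../experiments/deeplab/cfgs/deeplab_cityscapes.yaml')
+#   print(config.gpus, config.TRAIN.lr, config.dataset.NUM_CLASSES)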
diff --git a/deeplab/core/DataParallelExecutorGroup.py b/deeplab/core/DataParallelExecutorGroup.py
new file mode 100644
index 0000000..15c8469
--- /dev/null
+++ b/deeplab/core/DataParallelExecutorGroup.py
@@ -0,0 +1,603 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import logging
+import numpy as np
+
+from mxnet import context as ctx
+from mxnet import ndarray as nd
+from mxnet.io import DataDesc
+from mxnet.executor_manager import _split_input_slice
+
+def _load_general(data, targets, major_axis):
+ """Load a list of arrays into a list of arrays specified by slices"""
+ for d_src, d_targets in zip(data, targets):
+ if isinstance(d_targets, nd.NDArray):
+ d_src.copyto(d_targets)
+ elif isinstance(d_src, (list, tuple)):
+ for src, dst in zip(d_src, d_targets):
+ src.copyto(dst)
+ else:
+ raise NotImplementedError
+
+
+def _load_data(batch, targets, major_axis):
+ """Load data into sliced arrays"""
+ _load_general(batch.data, targets, major_axis)
+
+
+def _load_label(batch, targets, major_axis):
+ """Load label into sliced arrays"""
+ _load_general(batch.label, targets, major_axis)
+
+
+def _merge_multi_context(outputs, major_axis):
+ """Merge outputs that lives on multiple context into one, so that they look
+ like living on one context.
+ """
+ rets = []
+ for tensors, axis in zip(outputs, major_axis):
+ if axis >= 0:
+ rets.append(nd.concatenate(tensors, axis=axis, always_copy=False))
+ else:
+            # a negative axis means that there is no batch_size axis, and all the
+ # results should be the same on each device. We simply take the
+ # first one, without checking they are actually the same
+ rets.append(tensors[0])
+ return rets
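+
+# For example, two per-device outputs of shape (1, 19, 768, 1024) concatenated
+# along axis 0 merge into a single (2, 19, 768, 1024) NDArray; with a negative
+# axis the first device's output is returned as-is.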
+
+
+
+class DataParallelExecutorGroup(object):
+ """DataParallelExecutorGroup is a group of executors that lives on a group of devices.
+ This is a helper class used to implement data parallelization. Each mini-batch will
+ be split and run on the devices.
+
+ Parameters
+ ----------
+ symbol : Symbol
+ The common symbolic computation graph for all executors.
+ contexts : list
+ A list of contexts.
+ workload : list
+        If not `None`, could be a list of numbers that specify the workload to be assigned
+        to different contexts. Larger numbers indicate heavier workloads.
+ data_shapes : list
+        Should be a list of (name, shape) tuples, for the shapes of data. Note the order is
+        important and should be the same as the order in which the `DataIter` provides the data.
+ label_shapes : list
+        Should be a list of (name, shape) tuples, for the shapes of label. Note the order is
+        important and should be the same as the order in which the `DataIter` provides the label.
+ param_names : list
+ A list of strings, indicating the names of parameters (e.g. weights, filters, etc.)
+ in the computation graph.
+ for_training : bool
+        Indicate whether the executors should be bound for training. When not doing training,
+ the memory for gradients will not be allocated.
+ inputs_need_grad : bool
+ Indicate whether the gradients for the input data should be computed. This is currently
+ not used. It will be useful for implementing composition of modules.
+ shared_group : DataParallelExecutorGroup
+        Default is `None`. This is used in bucketing. When not `None`, it should be an executor
+        group corresponding to a different bucket. In other words, it will correspond to a different
+        symbol but with the same set of parameters (e.g. unrolled RNNs with different lengths).
+        In this case, much memory will be shared.
+ logger : Logger
+ Default is `logging`.
+ fixed_param_names: list of str
+        Indicate parameters to be fixed during training. Parameters in this list will neither
+        allocate space for gradients nor take part in gradient calculation.
+ grad_req : str, list of str, dict of str to str
+ Requirement for gradient accumulation. Can be 'write', 'add', or 'null'
+ (default to 'write').
+ Can be specified globally (str) or for each argument (list, dict).
+ """
+ def __init__(self, symbol, contexts, workload, data_shapes, label_shapes, param_names,
+ for_training, inputs_need_grad, shared_group=None, logger=logging,
+ fixed_param_names=None, grad_req='write', state_names=None):
+ self.param_names = param_names
+ self.arg_names = symbol.list_arguments()
+ self.aux_names = symbol.list_auxiliary_states()
+
+ self.symbol = symbol
+ self.contexts = contexts
+ self.workload = workload
+
+ self.for_training = for_training
+ self.inputs_need_grad = inputs_need_grad
+
+ self.logger = logger
+ #In the future we should have a better way to profile memory per device (haibin)
+ # self._total_exec_bytes = 0
+ self.fixed_param_names = fixed_param_names
+ if self.fixed_param_names is None:
+ self.fixed_param_names = []
+
+ self.state_names = state_names
+ if self.state_names is None:
+ self.state_names = []
+
+ if not for_training:
+ grad_req = 'null'
+
+ # data_shapes = [x if isinstance(x, DataDesc) else DataDesc(*x) for x in data_shapes]
+ # if label_shapes is not None:
+ # label_shapes = [x if isinstance(x, DataDesc) else DataDesc(*x) for x in label_shapes]
+
+ data_names = [x.name for x in data_shapes[0]]
+
+ if isinstance(grad_req, str):
+ self.grad_req = {}
+ for k in self.arg_names:
+ if k in self.param_names:
+ self.grad_req[k] = 'null' if k in self.fixed_param_names else grad_req
+ elif k in data_names:
+ self.grad_req[k] = grad_req if self.inputs_need_grad else 'null'
+ else:
+ self.grad_req[k] = 'null'
+ elif isinstance(grad_req, (list, tuple)):
+ assert len(grad_req) == len(self.arg_names)
+ self.grad_req = dict(zip(self.arg_names, grad_req))
+ elif isinstance(grad_req, dict):
+ self.grad_req = {}
+ for k in self.arg_names:
+ if k in self.param_names:
+ self.grad_req[k] = 'null' if k in self.fixed_param_names else 'write'
+ elif k in data_names:
+ self.grad_req[k] = 'write' if self.inputs_need_grad else 'null'
+ else:
+ self.grad_req[k] = 'null'
+ self.grad_req.update(grad_req)
+ else:
+ raise ValueError("grad_req must be one of str, list, tuple, or dict.")
+
+ if shared_group is not None:
+ self.shared_data_arrays = shared_group.shared_data_arrays
+ else:
+ self.shared_data_arrays = [{} for _ in contexts]
+
+ # initialize some instance variables
+ self.batch_size = len(data_shapes)
+ self.slices = None
+ self.execs = []
+ self._default_execs = None
+ self.data_arrays = None
+ self.label_arrays = None
+ self.param_arrays = None
+ self.state_arrays = None
+ self.grad_arrays = None
+ self.aux_arrays = None
+ self.input_grad_arrays = None
+
+ self.data_shapes = None
+ self.label_shapes = None
+ self.data_layouts = None
+ self.label_layouts = None
+ self.output_layouts = [DataDesc.get_batch_axis(self.symbol[name].attr('__layout__'))
+ for name in self.symbol.list_outputs()]
+ self.bind_exec(data_shapes, label_shapes, shared_group)
+
+ def decide_slices(self, data_shapes):
+ """Decide the slices for each context according to the workload.
+
+ Parameters
+ ----------
+ data_shapes : list
+ list of (name, shape) specifying the shapes for the input data or label.
+ """
+ assert len(data_shapes) > 0
+ major_axis = [DataDesc.get_batch_axis(x.layout) for x in data_shapes]
+
+ for (name, shape), axis in zip(data_shapes, major_axis):
+ if axis == -1:
+ continue
+
+ batch_size = shape[axis]
+ if self.batch_size is not None:
+ assert batch_size == self.batch_size, ("all data must have the same batch size: "
+ + ("batch_size = %d, but " % self.batch_size)
+ + ("%s has shape %s" % (name, shape)))
+ else:
+ self.batch_size = batch_size
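+
+        # e.g. a batch size of 4 with workload [1, 1] over two contexts yields
+        # slice(0, 2) and slice(2, 4)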
+ self.slices = _split_input_slice(self.batch_size, self.workload)
+
+ return major_axis
+
+ def _collect_arrays(self):
+ """Collect internal arrays from executors."""
+ # convenient data structures
+ # self.data_arrays = [[(self.slices[i], e.arg_dict[name]) for i, e in enumerate(self.execs)]
+ # for name, _ in self.data_shapes]
+ self.data_arrays = [[e.arg_dict[name] for name, _ in self.data_shapes[0]] for e in self.execs]
+
+ self.state_arrays = [[e.arg_dict[name] for e in self.execs]
+ for name in self.state_names]
+
+ if self.label_shapes is not None:
+ # self.label_arrays = [[(self.slices[i], e.arg_dict[name])
+ # for i, e in enumerate(self.execs)]
+ # for name, _ in self.label_shapes]
+ self.label_arrays = [[e.arg_dict[name] for name, _ in self.label_shapes[0]] for e in self.execs]
+ else:
+ self.label_arrays = None
+
+ self.param_arrays = [[exec_.arg_arrays[i] for exec_ in self.execs]
+ for i, name in enumerate(self.arg_names)
+ if name in self.param_names]
+ if self.for_training:
+ self.grad_arrays = [[exec_.grad_arrays[i] for exec_ in self.execs]
+ for i, name in enumerate(self.arg_names)
+ if name in self.param_names]
+ else:
+ self.grad_arrays = None
+
+ data_names = [x[0] for x in self.data_shapes]
+ if self.inputs_need_grad:
+ self.input_grad_arrays = [[exec_.grad_arrays[i] for exec_ in self.execs]
+ for i, name in enumerate(self.arg_names)
+ if name in data_names]
+ else:
+ self.input_grad_arrays = None
+
+ self.aux_arrays = [[exec_.aux_arrays[i] for exec_ in self.execs]
+ for i in range(len(self.aux_names))]
+
+ def bind_exec(self, data_shapes, label_shapes, shared_group=None, reshape=False):
+ """Bind executors on their respective devices.
+
+ Parameters
+ ----------
+ data_shapes : list
+ label_shapes : list
+ shared_group : DataParallelExecutorGroup
+ reshape : bool
+ """
+ assert reshape or not self.execs
+ # self.batch_size = None
+
+ # calculate workload and bind executors
+ # self.data_layouts = self.decide_slices(data_shapes)
+ # if label_shapes is not None:
+ # # call it to make sure labels has the same batch size as data
+ # self.label_layouts = self.decide_slices(label_shapes)
+
+ for i in range(len(self.contexts)):
+ # data_shapes_i = self._sliced_shape(data_shapes, i, self.data_layouts)
+ data_shapes_i = data_shapes[i]
+ if label_shapes is not None:
+ label_shapes_i = label_shapes[i]
+ # label_shapes_i = self._sliced_shape(label_shapes, i, self.label_layouts)
+ else:
+ label_shapes_i = []
+
+ if reshape:
+ self.execs[i] = self._default_execs[i].reshape(
+ allow_up_sizing=True, **dict(data_shapes_i + label_shapes_i))
+ else:
+ self.execs.append(self._bind_ith_exec(i, data_shapes_i, label_shapes_i,
+ shared_group))
+
+ self.data_shapes = data_shapes
+ self.label_shapes = label_shapes
+ self._collect_arrays()
+
+ def reshape(self, data_shapes, label_shapes):
+ """Reshape executors.
+
+ Parameters
+ ----------
+ data_shapes : list
+ label_shapes : list
+ """
+ if self._default_execs is None:
+ self._default_execs = [i for i in self.execs]
+ for i in range(len(self.contexts)):
+ self.execs[i] = self._default_execs[i].reshape(
+ allow_up_sizing=True, **dict(data_shapes[i] + (label_shapes[i] if label_shapes is not None else []))
+ )
+ self.data_shapes = data_shapes
+ self.label_shapes = label_shapes
+ self._collect_arrays()
+
+
+ def set_params(self, arg_params, aux_params):
+ """Assign, i.e. copy parameters to all the executors.
+
+ Parameters
+ ----------
+ arg_params : dict
+ A dictionary of name to `NDArray` parameter mapping.
+ aux_params : dict
+ A dictionary of name to `NDArray` auxiliary variable mapping.
+ """
+ for exec_ in self.execs:
+ exec_.copy_params_from(arg_params, aux_params)
+
+ def get_params(self, arg_params, aux_params):
+ """ Copy data from each executor to `arg_params` and `aux_params`.
+
+ Parameters
+ ----------
+        arg_params : dict of str to NDArray
+            target parameter arrays
+        aux_params : dict of str to NDArray
+            target aux arrays
+
+ Notes
+ -----
+        - This function will update the NDArrays in arg_params and aux_params in place.
+ """
+ for name, block in zip(self.param_names, self.param_arrays):
+ weight = sum(w.copyto(ctx.cpu()) for w in block) / len(block)
+ weight.astype(arg_params[name].dtype).copyto(arg_params[name])
+ for name, block in zip(self.aux_names, self.aux_arrays):
+ weight = sum(w.copyto(ctx.cpu()) for w in block) / len(block)
+ weight.astype(aux_params[name].dtype).copyto(aux_params[name])
+
+ def forward(self, data_batch, is_train=None):
+ """Split `data_batch` according to workload and run forward on each devices.
+
+ Parameters
+ ----------
+ data_batch : DataBatch
+ Or could be any object implementing similar interface.
+ is_train : bool
+            The hint for the backend, indicating whether we are in the training phase.
+            Default is `None`, in which case the value of `self.for_training` is used.
+ """
+ _load_data(data_batch, self.data_arrays, self.data_layouts)
+ if is_train is None:
+ is_train = self.for_training
+
+ if self.label_arrays is not None:
+ assert not is_train or data_batch.label
+ if data_batch.label:
+ _load_label(data_batch, self.label_arrays, self.label_layouts)
+
+ for exec_ in self.execs:
+ exec_.forward(is_train=is_train)
+
+ def get_outputs(self, merge_multi_context=True):
+ """Get outputs of the previous forward computation.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the outputs
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from
+            a single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ outputs = [[exec_.outputs[i] for exec_ in self.execs]
+ for i in range(len(self.execs[0].outputs))]
+ if merge_multi_context:
+ outputs = _merge_multi_context(outputs, self.output_layouts)
+ return outputs
+
+ def get_states(self, merge_multi_context=True):
+ """Get states from all devices
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the states
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from
+            a single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert not merge_multi_context, \
+ "merge_multi_context=True is not supported for get_states yet."
+ return self.state_arrays
+
+ def set_states(self, states=None, value=None):
+ """Set value for states. Only one of states & value can be specified.
+
+ Parameters
+ ----------
+ states : list of list of NDArrays
+ source states arrays formatted like [[state1_dev1, state1_dev2],
+ [state2_dev1, state2_dev2]].
+ value : number
+ a single scalar value for all state arrays.
+ """
+ if states is not None:
+ assert value is None, "Only one of states & value can be specified."
+ _load_general(states, self.state_arrays, (0,)*len(states))
+ else:
+ assert value is not None, "At least one of states & value must be specified."
+ assert states is None, "Only one of states & value can be specified."
+ for d_dst in self.state_arrays:
+ for dst in d_dst:
+ dst[:] = value
+
+ def get_input_grads(self, merge_multi_context=True):
+ """Get the gradients with respect to the inputs of the module.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the outputs
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from
+            a single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[grad1, grad2]`. Otherwise, it
+ is like `[[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert self.inputs_need_grad
+ if merge_multi_context:
+ return _merge_multi_context(self.input_grad_arrays, self.data_layouts)
+ return self.input_grad_arrays
+
+ def backward(self, out_grads=None):
+ """Run backward on all devices. A backward should be called after
+ a call to the forward function. Backward cannot be called unless
+ `self.for_training` is `True`.
+
+ Parameters
+ ----------
+ out_grads : NDArray or list of NDArray, optional
+ Gradient on the outputs to be propagated back.
+ This parameter is only needed when bind is called
+ on outputs that are not a loss function.
+ """
+ assert self.for_training, 're-bind with for_training=True to run backward'
+ if out_grads is None:
+ out_grads = []
+
+ # for i, (exec_, islice) in enumerate(zip(self.execs, self.slices)):
+ for i, exec_ in enumerate(self.execs):
+ out_grads_slice = []
+ exec_.backward(out_grads=out_grads_slice)
+
+ def update_metric(self, eval_metric, labels):
+ """Accumulate the performance according to `eval_metric` on all devices.
+
+ Parameters
+ ----------
+ eval_metric : EvalMetric
+ The metric used for evaluation.
+ labels : list of NDArray
+ Typically comes from `label` of a `DataBatch`.
+ """
+        for texec, labels_ in zip(self.execs, labels):
+            eval_metric.update(labels_, texec.outputs)
+
+ def _bind_ith_exec(self, i, data_shapes, label_shapes, shared_group):
+ """Internal utility function to bind the i-th executor.
+ """
+ shared_exec = None if shared_group is None else shared_group.execs[i]
+ context = self.contexts[i]
+ shared_data_arrays = self.shared_data_arrays[i]
+
+ input_shapes = dict(data_shapes)
+ if label_shapes is not None:
+ input_shapes.update(dict(label_shapes))
+
+ arg_shapes, _, aux_shapes = self.symbol.infer_shape(**input_shapes)
+ assert arg_shapes is not None, "shape inference failed"
+
+ input_types = {x.name: x.dtype for x in data_shapes}
+ if label_shapes is not None:
+ input_types.update({x.name: x.dtype for x in label_shapes})
+ arg_types, _, aux_types = self.symbol.infer_type(**input_types)
+ assert arg_types is not None, "type inference failed"
+
+ arg_arrays = []
+ grad_arrays = {} if self.for_training else None
+
+ def _get_or_reshape(name, shared_data_arrays, arg_shape, arg_type, context, logger):
+ """Internal helper to get a memory block or re-use by re-shaping"""
+ if name in shared_data_arrays:
+ arg_arr = shared_data_arrays[name]
+
+ if np.prod(arg_arr.shape) >= np.prod(arg_shape):
+ # nice, we can directly re-use this data blob
+ assert arg_arr.dtype == arg_type
+ arg_arr = arg_arr.reshape(arg_shape)
+ else:
+ logger.warning(('bucketing: data "%s" has a shape %s' % (name, arg_shape)) +
+ (', which is larger than already allocated ') +
+ ('shape %s' % (arg_arr.shape,)) +
+ ('. Need to re-allocate. Consider putting ') +
+ ('default_bucket_key to') +
+ (' be the bucket taking the largest input for better ') +
+ ('memory sharing.'))
+ arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
+
+ # replace existing shared array because the new one is bigger
+ shared_data_arrays[name] = arg_arr
+ else:
+ arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
+ shared_data_arrays[name] = arg_arr
+
+ return arg_arr
+
+ # create or borrow arguments and gradients
+ for j in range(len(self.arg_names)):
+ name = self.arg_names[j]
+ if name in self.param_names: # model parameters
+ if shared_exec is None:
+ arg_arr = nd.zeros(arg_shapes[j], context, dtype=arg_types[j])
+ if self.grad_req[name] != 'null':
+ grad_arr = nd.zeros(arg_shapes[j], context, dtype=arg_types[j])
+ grad_arrays[name] = grad_arr
+ else:
+ arg_arr = shared_exec.arg_dict[name]
+ assert arg_arr.shape == arg_shapes[j]
+ assert arg_arr.dtype == arg_types[j]
+ if self.grad_req[name] != 'null':
+ grad_arrays[name] = shared_exec.grad_dict[name]
+ else: # data, label, or states
+ arg_arr = _get_or_reshape(name, shared_data_arrays, arg_shapes[j], arg_types[j],
+ context, self.logger)
+
+ # data might also need grad if inputs_need_grad is True
+ if self.grad_req[name] != 'null':
+ grad_arrays[name] = _get_or_reshape('grad of ' + name, shared_data_arrays,
+ arg_shapes[j], arg_types[j], context,
+ self.logger)
+
+ arg_arrays.append(arg_arr)
+
+ # create or borrow aux variables
+ if shared_exec is None:
+ aux_arrays = [nd.zeros(s, context, dtype=t) for s, t in zip(aux_shapes, aux_types)]
+ else:
+ for j, arr in enumerate(shared_exec.aux_arrays):
+ assert aux_shapes[j] == arr.shape
+ assert aux_types[j] == arr.dtype
+ aux_arrays = shared_exec.aux_arrays[:]
+
+ executor = self.symbol.bind(ctx=context, args=arg_arrays,
+ args_grad=grad_arrays, aux_states=aux_arrays,
+ grad_req=self.grad_req, shared_exec=shared_exec)
+ # Get the total bytes allocated for this executor
+ # self._total_exec_bytes += int(executor.debug_str().split('\n')[-3].split()[1])
+ return executor
+
+ def _sliced_shape(self, shapes, i, major_axis):
+ """Get the sliced shapes for the i-th executor.
+
+ Parameters
+ ----------
+ shapes : list of (str, tuple)
+ The original (name, shape) pairs.
+ i : int
+ Which executor we are dealing with.
+ """
+ sliced_shapes = []
+ for desc, axis in zip(shapes, major_axis):
+ shape = list(desc.shape)
+ if axis >= 0:
+ shape[axis] = self.slices[i].stop - self.slices[i].start
+ sliced_shapes.append(DataDesc(desc.name, tuple(shape), desc.dtype, desc.layout))
+ return sliced_shapes
+
+ def install_monitor(self, mon):
+ """Install monitor on all executors"""
+ for exe in self.execs:
+ mon.install(exe)
diff --git a/deeplab/core/__init__.py b/deeplab/core/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/deeplab/core/callback.py b/deeplab/core/callback.py
new file mode 100644
index 0000000..f970d5e
--- /dev/null
+++ b/deeplab/core/callback.py
@@ -0,0 +1,45 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import time
+import logging
+import mxnet as mx
+
+class Speedometer(object):
+ def __init__(self, batch_size, frequent=50):
+ self.batch_size = batch_size
+ self.frequent = frequent
+ self.init = False
+ self.tic = 0
+ self.last_count = 0
+
+ def __call__(self, param):
+ """Callback to Show speed."""
+ count = param.nbatch
+ if self.last_count > count:
+ self.init = False
+ self.last_count = count
+
+ if self.init:
+ if count % self.frequent == 0:
+ speed = self.frequent * self.batch_size / (time.time() - self.tic)
+ s = ''
+ if param.eval_metric is not None:
+ name, value = param.eval_metric.get()
+ s = "Epoch[%d] Batch [%d]\tSpeed: %.2f samples/sec\tTrain-" % (param.epoch, count, speed)
+ for n, v in zip(name, value):
+ s += "%s=%f,\t" % (n, v)
+ else:
+ s = "Iter[%d] Batch [%d]\tSpeed: %.2f samples/sec" % (param.epoch, count, speed)
+
+ logging.info(s)
+ print(s)
+ self.tic = time.time()
+ else:
+ self.init = True
+ self.tic = time.time()
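+
+# A minimal usage sketch (batch_size and frequent below are illustrative):
+#   batch_end_callback = Speedometer(batch_size=4, frequent=100)
+#   # then pass batch_end_callback to the training module's fit() loop, which
+#   # invokes it once per batch with a BatchEndParam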
diff --git a/deeplab/core/loader.py b/deeplab/core/loader.py
new file mode 100644
index 0000000..c796fae
--- /dev/null
+++ b/deeplab/core/loader.py
@@ -0,0 +1,266 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import numpy as np
+import mxnet as mx
+import random
+import math
+
+from mxnet.executor_manager import _split_input_slice
+from utils.image import tensor_vstack
+from segmentation.segmentation import get_segmentation_train_batch, get_segmentation_test_batch
+from PIL import Image
+from multiprocessing import Pool
+
+class TestDataLoader(mx.io.DataIter):
+ def __init__(self, segdb, config, batch_size=1, shuffle=False):
+ super(TestDataLoader, self).__init__()
+
+ # save parameters as properties
+ self.segdb = segdb
+ self.batch_size = batch_size
+ self.shuffle = shuffle
+ self.config = config
+
+ # infer properties from roidb
+ self.size = len(self.segdb)
+ self.index = np.arange(self.size)
+
+ # decide data and label names (only for training)
+ self.data_name = ['data']
+ self.label_name = None
+
+ # status variable for synchronization between get_data and get_label
+ self.cur = 0
+ self.data = None
+ self.label = []
+ self.im_info = None
+
+ # get first batch to fill in provide_data and provide_label
+ self.reset()
+ self.get_batch()
+
+ @property
+ def provide_data(self):
+ return [[(k, v.shape) for k, v in zip(self.data_name, self.data[i])] for i in xrange(len(self.data))]
+
+ @property
+ def provide_label(self):
+ return [None for i in xrange(len(self.data))]
+
+ @property
+ def provide_data_single(self):
+ return [(k, v.shape) for k, v in zip(self.data_name, self.data[0])]
+
+ @property
+ def provide_label_single(self):
+ return None
+
+ def reset(self):
+ self.cur = 0
+ if self.shuffle:
+ np.random.shuffle(self.index)
+
+ def iter_next(self):
+ return self.cur < self.size
+
+ def next(self):
+ if self.iter_next():
+ self.get_batch()
+ self.cur += self.batch_size
+ return mx.io.DataBatch(data=self.data, label=self.label,
+ pad=self.getpad(), index=self.getindex(),
+ provide_data=self.provide_data, provide_label=self.provide_label)
+ else:
+ raise StopIteration
+
+ def getindex(self):
+ return self.cur / self.batch_size
+
+ def getpad(self):
+ if self.cur + self.batch_size > self.size:
+ return self.cur + self.batch_size - self.size
+ else:
+ return 0
+
+ def get_batch(self):
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ segdb = [self.segdb[self.index[i]] for i in range(cur_from, cur_to)]
+
+ data, label, im_info = get_segmentation_test_batch(segdb, self.config)
+
+ self.data = [[mx.nd.array(data[i][name]) for name in self.data_name] for i in xrange(len(data))]
+ self.im_info = im_info
+
+class TrainDataLoader(mx.io.DataIter):
+ def __init__(self, sym, segdb, config, batch_size=1, crop_height = 768, crop_width = 1024, shuffle=False, ctx=None, work_load_list=None):
+ """
+        This iterator provides segmentation data to the DeepLab network
+ :param sym: to infer shape
+ :param segdb: must be preprocessed
+ :param config: config file
+        :param batch_size: number of images per batch, split across the devices in ctx
+ :param crop_height: the height of cropped image
+ :param crop_width: the width of cropped image
+ :param shuffle: bool
+ :param ctx: list of contexts
+ :param work_load_list: list of work load
+ :return: DataLoader
+ """
+ super(TrainDataLoader, self).__init__()
+
+ # save parameters as properties
+ self.sym = sym
+ self.segdb = segdb
+ self.config = config
+ self.batch_size = batch_size
+ if self.config.TRAIN.ENABLE_CROP:
+ self.crop_height = crop_height
+ self.crop_width = crop_width
+ else:
+ self.crop_height = None
+ self.crop_width = None
+
+ self.shuffle = shuffle
+ self.ctx = ctx
+
+ if self.ctx is None:
+ self.ctx = [mx.cpu()]
+ self.work_load_list = work_load_list
+
+ # infer properties from segdb
+ self.size = len(segdb)
+ self.index = np.arange(self.size)
+
+ # decide data and label names
+ self.data_name = ['data']
+ self.label_name = ['label']
+
+ # status variable for synchronization between get_data and get_label
+ self.cur = 0
+ self.batch = None
+ self.data = None
+ self.label = None
+
+ # init multi-process pool
+ self.pool = Pool(processes = len(self.ctx))
+
+ # get first batch to fill in provide_data and provide_label
+ self.reset()
+ self.get_batch_parallel()
+ random.seed()
+
+ @property
+ def provide_data(self):
+ return [[(k, v.shape) for k, v in zip(self.data_name, self.data[i])] for i in xrange(len(self.data))]
+
+ @property
+ def provide_label(self):
+ return [[(k, v.shape) for k, v in zip(self.label_name, self.label[i])] for i in xrange(len(self.data))]
+
+ @property
+ def provide_data_single(self):
+ return [(k, v.shape) for k, v in zip(self.data_name, self.data[0])]
+
+ @property
+ def provide_label_single(self):
+ return [(k, v.shape) for k, v in zip(self.label_name, self.label[0])]
+
+ def reset(self):
+ self.cur = 0
+ if self.shuffle:
+ np.random.shuffle(self.index)
+
+ def iter_next(self):
+ return self.cur + self.batch_size <= self.size
+
+ def next(self):
+ if self.iter_next():
+ self.get_batch_parallel()
+ self.cur += self.batch_size
+ return mx.io.DataBatch(data=self.data, label=self.label,
+ pad=self.getpad(), index=self.getindex(),
+ provide_data=self.provide_data, provide_label=self.provide_label)
+ else:
+ raise StopIteration
+
+ def getindex(self):
+ return self.cur / self.batch_size
+
+ def getpad(self):
+ if self.cur + self.batch_size > self.size:
+ return self.cur + self.batch_size - self.size
+ else:
+ return 0
+
+ def infer_shape(self, max_data_shape=None, max_label_shape=None):
+ """ Return maximum data and label shape for single gpu """
+ if max_data_shape is None:
+ max_data_shape = []
+ if max_label_shape is None:
+ max_label_shape = []
+
+ max_shapes = dict(max_data_shape + max_label_shape)
+ _, label_shape, _ = self.sym.infer_shape(**max_shapes)
+ label_shape = [(self.label_name[0], label_shape)]
+ return max_data_shape, label_shape
+
+ def get_batch_parallel(self):
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ segdb = [self.segdb[self.index[i]] for i in range(cur_from, cur_to)]
+
+ # decide multi device slice
+ work_load_list = self.work_load_list
+ ctx = self.ctx
+ if work_load_list is None:
+ work_load_list = [1] * len(ctx)
+ assert isinstance(work_load_list, list) and len(work_load_list) == len(ctx), \
+ "Invalid settings for work load. "
+ slices = _split_input_slice(self.batch_size, work_load_list)
+
+ multiprocess_results = []
+
+ for idx, islice in enumerate(slices):
+ isegdb = [segdb[i] for i in range(islice.start, islice.stop)]
+ multiprocess_results.append(self.pool.apply_async(parfetch, (self.config, self.crop_width, self.crop_height, isegdb)))
+
+ rst = [multiprocess_result.get() for multiprocess_result in multiprocess_results]
+
+ all_data = [_['data'] for _ in rst]
+ all_label = [_['label'] for _ in rst]
+ self.data = [[mx.nd.array(data[key]) for key in self.data_name] for data in all_data]
+ self.label = [[mx.nd.array(label[key]) for key in self.label_name] for label in all_label]
+
+def parfetch(config, crop_width, crop_height, isegdb):
+ # get testing data for multigpu
+ data, label = get_segmentation_train_batch(isegdb, config)
+ if config.TRAIN.ENABLE_CROP:
+ data_internal = data['data']
+ label_internal = label['label']
+
+        # pick a random top-left corner so the crop window stays inside the image
+        sx = int(math.floor(random.random() * (data_internal.shape[3] - crop_width + 1)))
+        sy = int(math.floor(random.random() * (data_internal.shape[2] - crop_height + 1)))
+        assert 0 <= sx < data_internal.shape[3] - crop_width + 1
+        assert 0 <= sy < data_internal.shape[2] - crop_height + 1
+
+        # inclusive bottom-right corner of the crop window
+        ex = sx + crop_width - 1
+        ey = sy + crop_height - 1
+
+ data_internal = data_internal[:, :, sy : ey + 1, sx : ex + 1]
+ label_internal = label_internal[:, :, sy : ey + 1, sx : ex + 1]
+
+ data['data'] = data_internal
+ label['label'] = label_internal
+ assert (data['data'].shape[2] == crop_height) and (data['data'].shape[3] == crop_width)
+ assert (label['label'].shape[2] == crop_height) and (label['label'].shape[3] == crop_width)
+
+ return {'data': data, 'label': label}
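+
+# A minimal usage sketch (sym, segdb, config and ctx are assumed to be built elsewhere):
+#   train_data = TrainDataLoader(sym, segdb, config, batch_size=len(ctx),
+#                                shuffle=True, ctx=ctx)
+#   for batch in train_data:
+#       pass  # each batch is an mx.io.DataBatch ready for the training module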
diff --git a/deeplab/core/metric.py b/deeplab/core/metric.py
new file mode 100644
index 0000000..b3eb4b8
--- /dev/null
+++ b/deeplab/core/metric.py
@@ -0,0 +1,39 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import mxnet as mx
+import numpy as np
+
+class FCNLogLossMetric(mx.metric.EvalMetric):
+ def __init__(self, show_interval):
+ super(FCNLogLossMetric, self).__init__('FCNLogLoss')
+ self.show_interval = show_interval
+ self.sum_metric = 0
+ self.num_inst = 0
+
+ def update(self, labels, preds):
+ pred = preds[0]
+ label = labels[0]
+
+ # label (b, p)
+ label = label.asnumpy().astype('int32').reshape((-1))
+ # pred (b, c, p) or (b, c, h, w) --> (b, p, c) --> (b*p, c)
+ pred = pred.asnumpy().reshape((pred.shape[0], pred.shape[1], -1)).transpose((0, 2, 1))
+ pred = pred.reshape((label.shape[0], -1))
+
+ # filter with keep_inds
+ keep_inds = np.where(label != 255)[0]
+ label = label[keep_inds]
+ cls = pred[keep_inds, label]
+
+ cls += 1e-14
+ cls_loss = -1 * np.log(cls)
+ cls_loss = np.sum(cls_loss)
+
+ self.sum_metric += cls_loss
+ self.num_inst += label.shape[0]
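+
+# A minimal usage sketch (show_interval below is illustrative):
+#   fcn_metric = FCNLogLossMetric(show_interval=100)
+#   # pass fcn_metric as eval_metric to the training loop; update() drops
+#   # pixels with the ignore label 255 before accumulating the log loss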
diff --git a/deeplab/core/module.py b/deeplab/core/module.py
new file mode 100644
index 0000000..8eff831
--- /dev/null
+++ b/deeplab/core/module.py
@@ -0,0 +1,1069 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+"""A `MutableModule` implement the `BaseModule` API, and allows input shape
+varying with training iterations. If shapes vary, executors will rebind,
+using shared arrays from the initial module binded with maximum shape.
+"""
+
+import time
+import logging
+import warnings
+
+from mxnet import context as ctx
+from mxnet.initializer import Uniform, InitDesc
+from mxnet.module.base_module import BaseModule, _check_input_names, _parse_data_desc, _as_list
+from mxnet.model import _create_kvstore, _initialize_kvstore, _update_params, _update_params_on_kvstore, load_checkpoint, BatchEndParam
+from mxnet import metric
+
+from .DataParallelExecutorGroup import DataParallelExecutorGroup
+from mxnet import ndarray as nd
+from mxnet import optimizer as opt
+
+
+class Module(BaseModule):
+ """Module is a basic module that wrap a `Symbol`. It is functionally the same
+ as the `FeedForward` model, except under the module API.
+
+ Parameters
+ ----------
+ symbol : Symbol
+ data_names : list of str
+ Default is `('data')` for a typical model used in image classification.
+ label_names : list of str
+ Default is `('softmax_label')` for a typical model used in image
+ classification.
+ logger : Logger
+ Default is `logging`.
+ context : Context or list of Context
+ Default is `cpu()`.
+ work_load_list : list of number
+ Default `None`, indicating uniform workload.
+ fixed_param_names: list of str
+ Default `None`, indicating no network parameters are fixed.
+ state_names : list of str
+        states are similar to data and label, but not provided by the data iterator.
+        Instead, they are initialized to 0 and can be set by `set_states()`.
+ """
+ def __init__(self, symbol, data_names=('data',), label_names=('softmax_label',),
+ logger=logging, context=ctx.cpu(), work_load_list=None,
+ fixed_param_names=None, state_names=None):
+ super(Module, self).__init__(logger=logger)
+
+ if isinstance(context, ctx.Context):
+ context = [context]
+ self._context = context
+ if work_load_list is None:
+ work_load_list = [1] * len(self._context)
+ assert len(work_load_list) == len(self._context)
+ self._work_load_list = work_load_list
+
+ self._symbol = symbol
+
+ data_names = list(data_names) if data_names is not None else []
+ label_names = list(label_names) if label_names is not None else []
+ state_names = list(state_names) if state_names is not None else []
+ fixed_param_names = list(fixed_param_names) if fixed_param_names is not None else []
+
+ _check_input_names(symbol, data_names, "data", True)
+ _check_input_names(symbol, label_names, "label", False)
+ _check_input_names(symbol, state_names, "state", True)
+ _check_input_names(symbol, fixed_param_names, "fixed_param", True)
+
+ arg_names = symbol.list_arguments()
+ input_names = data_names + label_names + state_names
+ self._param_names = [x for x in arg_names if x not in input_names]
+ self._fixed_param_names = fixed_param_names
+ self._aux_names = symbol.list_auxiliary_states()
+ self._data_names = data_names
+ self._label_names = label_names
+ self._state_names = state_names
+ self._output_names = symbol.list_outputs()
+
+ self._arg_params = None
+ self._aux_params = None
+ self._params_dirty = False
+
+ self._optimizer = None
+ self._kvstore = None
+ self._update_on_kvstore = None
+ self._updater = None
+ self._preload_opt_states = None
+ self._grad_req = None
+
+ self._exec_group = None
+ self._data_shapes = None
+ self._label_shapes = None
+
+ @staticmethod
+ def load(prefix, epoch, load_optimizer_states=False, **kwargs):
+ """Create a model from previously saved checkpoint.
+
+ Parameters
+ ----------
+ prefix : str
+ path prefix of saved model files. You should have
+ "prefix-symbol.json", "prefix-xxxx.params", and
+ optionally "prefix-xxxx.states", where xxxx is the
+ epoch number.
+ epoch : int
+ epoch to load.
+ load_optimizer_states : bool
+ whether to load optimizer states. Checkpoint needs
+ to have been made with save_optimizer_states=True.
+ data_names : list of str
+ Default is `('data')` for a typical model used in image classification.
+ label_names : list of str
+ Default is `('softmax_label')` for a typical model used in image
+ classification.
+ logger : Logger
+ Default is `logging`.
+ context : Context or list of Context
+ Default is `cpu()`.
+ work_load_list : list of number
+ Default `None`, indicating uniform workload.
+ fixed_param_names: list of str
+ Default `None`, indicating no network parameters are fixed.
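+
+        Examples
+        --------
+        A sketch (the prefix and epoch below are illustrative)::
+
+            >>> mod = Module.load('model/deeplab_cityscapes', 0)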
+ """
+ sym, args, auxs = load_checkpoint(prefix, epoch)
+ mod = Module(symbol=sym, **kwargs)
+ mod._arg_params = args
+ mod._aux_params = auxs
+ mod.params_initialized = True
+ if load_optimizer_states:
+ mod._preload_opt_states = '%s-%04d.states'%(prefix, epoch)
+ return mod
+
+ def save_checkpoint(self, prefix, epoch, save_optimizer_states=False):
+ """Save current progress to checkpoint.
+ Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
+
+ Parameters
+ ----------
+ prefix : str
+ The file prefix to checkpoint to
+ epoch : int
+ The current epoch number
+ save_optimizer_states : bool
+            Whether to save optimizer states for continued training
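+
+        Examples
+        --------
+        A sketch (the prefix and epoch below are illustrative)::
+
+            >>> mod.save_checkpoint('output/deeplab', 5, save_optimizer_states=True)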
+ """
+ self._symbol.save('%s-symbol.json'%prefix)
+ param_name = '%s-%04d.params' % (prefix, epoch)
+ self.save_params(param_name)
+ logging.info('Saved checkpoint to \"%s\"', param_name)
+ if save_optimizer_states:
+ state_name = '%s-%04d.states' % (prefix, epoch)
+ self.save_optimizer_states(state_name)
+ logging.info('Saved optimizer state to \"%s\"', state_name)
+
+ def _reset_bind(self):
+ """Internal function to reset binded state."""
+ self.binded = False
+ self._exec_group = None
+ self._data_shapes = None
+ self._label_shapes = None
+
+ @property
+ def data_names(self):
+ """A list of names for data required by this module."""
+ return self._data_names
+
+ @property
+ def label_names(self):
+ """A list of names for labels required by this module."""
+ return self._label_names
+
+ @property
+ def output_names(self):
+ """A list of names for the outputs of this module."""
+ return self._output_names
+
+ @property
+ def data_shapes(self):
+ """Get data shapes.
+ Returns
+ -------
+ A list of `(name, shape)` pairs.
+ """
+ assert self.binded
+ return self._data_shapes
+
+ @property
+ def label_shapes(self):
+ """Get label shapes.
+ Returns
+ -------
+ A list of `(name, shape)` pairs. The return value could be `None` if
+ the module does not need labels, or if the module is not binded for
+ training (in this case, label information is not available).
+ """
+ assert self.binded
+ return self._label_shapes
+
+ @property
+ def output_shapes(self):
+ """Get output shapes.
+ Returns
+ -------
+ A list of `(name, shape)` pairs.
+ """
+ assert self.binded
+ return self._exec_group.get_output_shapes()
+
+ def get_params(self):
+ """Get current parameters.
+ Returns
+ -------
+ `(arg_params, aux_params)`, each a dictionary of name to parameters (in
+ `NDArray`) mapping.
+ """
+ assert self.binded and self.params_initialized
+
+ if self._params_dirty:
+ self._sync_params_from_devices()
+ return (self._arg_params, self._aux_params)
+
+ def init_params(self, initializer=Uniform(0.01), arg_params=None, aux_params=None,
+ allow_missing=False, force_init=False):
+ """Initialize the parameters and auxiliary states.
+
+ Parameters
+ ----------
+ initializer : Initializer
+ Called to initialize parameters if needed.
+ arg_params : dict
+ If not None, should be a dictionary of existing arg_params. Initialization
+ will be copied from that.
+ aux_params : dict
+ If not None, should be a dictionary of existing aux_params. Initialization
+ will be copied from that.
+ allow_missing : bool
+ If true, params could contain missing values, and the initializer will be
+ called to fill those missing params.
+ force_init : bool
+ If true, will force re-initialize even if already initialized.
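+
+        Examples
+        --------
+        A minimal sketch (assumes `bind` has already been called)::
+            >>> mod.init_params()  # uses the default Uniform(0.01) initializer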
+ """
+ if self.params_initialized and not force_init:
+ warnings.warn("Parameters already initialized and force_init=False. "
+ "init_params call ignored.", stacklevel=2)
+ return
+ assert self.binded, 'call bind before initializing the parameters'
+
+ def _impl(name, arr, cache):
+ """Internal helper for parameter initialization"""
+ if cache is not None:
+ if name in cache:
+ cache_arr = cache[name]
+
+ # just in case the cached array is just the target itself
+ if cache_arr is not arr:
+ cache_arr.copyto(arr)
+ else:
+                    if not allow_missing:
+                        raise RuntimeError("%s is not present" % name)
+                    if initializer is not None:
+ initializer(name, arr)
+ else:
+ initializer(name, arr)
+
+ attrs = self._symbol.attr_dict()
+ for name, arr in self._arg_params.items():
+ desc = InitDesc(name, attrs.get(name, None))
+ _impl(desc, arr, arg_params)
+
+ for name, arr in self._aux_params.items():
+ desc = InitDesc(name, attrs.get(name, None))
+ _impl(desc, arr, aux_params)
+
+ self.params_initialized = True
+ self._params_dirty = False
+
+ # copy the initialized parameters to devices
+ self._exec_group.set_params(self._arg_params, self._aux_params)
+
+ def set_params(self, arg_params, aux_params, allow_missing=False, force_init=True):
+ """Assign parameter and aux state values.
+
+ Parameters
+ ----------
+ arg_params : dict
+ Dictionary of name to value (`NDArray`) mapping.
+ aux_params : dict
+ Dictionary of name to value (`NDArray`) mapping.
+ allow_missing : bool
+ If true, params could contain missing values, and the initializer will be
+ called to fill those missing params.
+ force_init : bool
+ If true, will force re-initialize even if already initialized.
+
+ Examples
+ --------
+ An example of setting module parameters::
+ >>> sym, arg_params, aux_params = \
+ >>> mx.model.load_checkpoint(model_prefix, n_epoch_load)
+ >>> mod.set_params(arg_params=arg_params, aux_params=aux_params)
+ """
+ if not allow_missing:
+ self.init_params(initializer=None, arg_params=arg_params, aux_params=aux_params,
+ allow_missing=allow_missing, force_init=force_init)
+ return
+
+ if self.params_initialized and not force_init:
+ warnings.warn("Parameters already initialized and force_init=False. "
+ "set_params call ignored.", stacklevel=2)
+ return
+
+ self._exec_group.set_params(arg_params, aux_params)
+
+ # because we didn't update self._arg_params, they are dirty now.
+ self._params_dirty = True
+ self.params_initialized = True
+
+ def bind(self, data_shapes, label_shapes=None, for_training=True,
+ inputs_need_grad=False, force_rebind=False, shared_module=None,
+ grad_req='write'):
+ """Bind the symbols to construct executors. This is necessary before one
+ can perform computation with the module.
+
+ Parameters
+ ----------
+ data_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_data`.
+ label_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_label`.
+ for_training : bool
+            Default is `True`. Whether the executors should be bound for training.
+ inputs_need_grad : bool
+ Default is `False`. Whether the gradients to the input data need to be computed.
+ Typically this is not needed. But this might be needed when implementing composition
+ of modules.
+ force_rebind : bool
+            Default is `False`. This function does nothing if the executors are already
+            bound. But with this set to `True`, the executors will be forced to rebind.
+ shared_module : Module
+ Default is `None`. This is used in bucketing. When not `None`, the shared module
+ essentially corresponds to a different bucket -- a module with different symbol
+            but with the same sets of parameters (e.g. unrolled RNNs with different lengths).
+        grad_req : str, list of str, dict of str to str
+            Requirement for gradient accumulation. Can be 'write', 'add', or 'null'
+            (default to 'write').
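+
+        Examples
+        --------
+        A minimal sketch; note that in this module `data_shapes` is a list of
+        per-device shape lists (the shapes themselves are hypothetical)::
+            >>> mod.bind(data_shapes=[[('data', (1, 3, 224, 224))]],
+                         label_shapes=[[('softmax_label', (1,))]])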
+ """
+        # force rebinding is typically used when one wants to switch from
+ # training to prediction phase.
+ if force_rebind:
+ self._reset_bind()
+
+ if self.binded:
+            self.logger.warning('Already bound, ignoring bind()')
+ return
+
+ self.for_training = for_training
+ self.inputs_need_grad = inputs_need_grad
+ self.binded = True
+ self._grad_req = grad_req
+
+ if not for_training:
+ assert not inputs_need_grad
+ else:
+ pass
+            # this is not True, as some modules might not contain a loss function
+ # that consumes the labels
+ # assert label_shapes is not None
+
+ # self._data_shapes, self._label_shapes = _parse_data_desc(
+ # self.data_names, self.label_names, data_shapes, label_shapes)
+ self._data_shapes, self._label_shapes = zip(*[_parse_data_desc(self.data_names, self.label_names, data_shape, label_shape)
+ for data_shape, label_shape in zip(data_shapes, label_shapes)])
+ if self._label_shapes.count(None) == len(self._label_shapes):
+ self._label_shapes = None
+
+ if shared_module is not None:
+ assert isinstance(shared_module, Module) and \
+ shared_module.binded and shared_module.params_initialized
+ shared_group = shared_module._exec_group
+ else:
+ shared_group = None
+
+ self._exec_group = DataParallelExecutorGroup(self._symbol, self._context,
+ self._work_load_list, self._data_shapes,
+ self._label_shapes, self._param_names,
+ for_training, inputs_need_grad,
+ shared_group, logger=self.logger,
+ fixed_param_names=self._fixed_param_names,
+ grad_req=grad_req,
+ state_names=self._state_names)
+ # self._total_exec_bytes = self._exec_group._total_exec_bytes
+ if shared_module is not None:
+ self.params_initialized = True
+ self._arg_params = shared_module._arg_params
+ self._aux_params = shared_module._aux_params
+ elif self.params_initialized:
+ # if the parameters are already initialized, we are re-binding
+ # so automatically copy the already initialized params
+ self._exec_group.set_params(self._arg_params, self._aux_params)
+ else:
+ assert self._arg_params is None and self._aux_params is None
+ param_arrays = [
+ nd.zeros(x[0].shape, dtype=x[0].dtype)
+ for x in self._exec_group.param_arrays
+ ]
+ self._arg_params = {name:arr for name, arr in zip(self._param_names, param_arrays)}
+
+ aux_arrays = [
+ nd.zeros(x[0].shape, dtype=x[0].dtype)
+ for x in self._exec_group.aux_arrays
+ ]
+ self._aux_params = {name:arr for name, arr in zip(self._aux_names, aux_arrays)}
+
+ if shared_module is not None and shared_module.optimizer_initialized:
+ self.borrow_optimizer(shared_module)
+
+ def reshape(self, data_shapes, label_shapes=None):
+ """Reshape the module for new input shapes.
+
+ Parameters
+ ----------
+ data_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_data`.
+ label_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_label`.
+ """
+ assert self.binded
+ # self._data_shapes, self._label_shapes = _parse_data_desc(
+ # self.data_names, self.label_names, data_shapes, label_shapes)
+ self._data_shapes, self._label_shapes = zip(*[_parse_data_desc(self.data_names, self.label_names, data_shape, label_shape)
+ for data_shape, label_shape in zip(data_shapes, label_shapes)])
+
+ self._exec_group.reshape(self._data_shapes, self._label_shapes)
+
+ def init_optimizer(self, kvstore='local', optimizer='sgd',
+ optimizer_params=(('learning_rate', 0.01),), force_init=False):
+ """Install and initialize optimizers.
+
+ Parameters
+ ----------
+ kvstore : str or KVStore
+ Default `'local'`.
+ optimizer : str or Optimizer
+ Default `'sgd'`
+ optimizer_params : dict
+ Default `(('learning_rate', 0.01),)`. The default value is not a dictionary,
+ just to avoid pylint warning of dangerous default values.
+ force_init : bool
+ Default `False`, indicating whether we should force re-initializing the
+ optimizer in the case an optimizer is already installed.
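+
+        Examples
+        --------
+        A minimal sketch using plain SGD (the hyper-parameters are
+        hypothetical)::
+            >>> mod.init_optimizer(optimizer='sgd',
+                                   optimizer_params=(('learning_rate', 0.01), ('momentum', 0.9)))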
+ """
+ assert self.binded and self.params_initialized
+
+ if self.optimizer_initialized and not force_init:
+ self.logger.warning('optimizer already initialized, ignoring...')
+ return
+
+ (kvstore, update_on_kvstore) = \
+ _create_kvstore(kvstore, len(self._context), self._arg_params)
+
+ batch_size = self._exec_group.batch_size
+ if kvstore and 'dist' in kvstore.type and '_sync' in kvstore.type:
+ batch_size *= kvstore.num_workers
+ rescale_grad = 1.0/batch_size
+
+ if isinstance(optimizer, str):
+ idx2name = {}
+ if update_on_kvstore:
+ idx2name.update(enumerate(self._exec_group.param_names))
+ else:
+ for k in range(len(self._context)):
+ idx2name.update({i*len(self._context)+k: n
+ for i, n in enumerate(self._exec_group.param_names)})
+ optimizer_params = dict(optimizer_params)
+ if 'rescale_grad' not in optimizer_params:
+ optimizer_params['rescale_grad'] = rescale_grad
+ optimizer = opt.create(optimizer,
+ sym=self.symbol, param_idx2name=idx2name,
+ **optimizer_params)
+ else:
+ assert isinstance(optimizer, opt.Optimizer)
+ if optimizer.rescale_grad != rescale_grad:
+ #pylint: disable=no-member
+ warnings.warn(
+ "Optimizer created manually outside Module but rescale_grad " +
+ "is not normalized to 1.0/batch_size/num_workers (%s vs. %s). "%(
+ optimizer.rescale_grad, rescale_grad) +
+ "Is this intended?", stacklevel=2)
+
+ self._optimizer = optimizer
+ self._kvstore = kvstore
+ self._update_on_kvstore = update_on_kvstore
+ self._updater = None
+
+ if kvstore:
+ # copy initialized local parameters to kvstore
+ _initialize_kvstore(kvstore=kvstore,
+ param_arrays=self._exec_group.param_arrays,
+ arg_params=self._arg_params,
+ param_names=self._param_names,
+ update_on_kvstore=update_on_kvstore)
+ if update_on_kvstore:
+ kvstore.set_optimizer(self._optimizer)
+ else:
+ self._updater = opt.get_updater(optimizer)
+
+ self.optimizer_initialized = True
+
+ if self._preload_opt_states is not None:
+ self.load_optimizer_states(self._preload_opt_states)
+ self._preload_opt_states = None
+
+ def borrow_optimizer(self, shared_module):
+ """Borrow optimizer from a shared module. Used in bucketing, where exactly the same
+ optimizer (esp. kvstore) is used.
+
+ Parameters
+ ----------
+ shared_module : Module
+ """
+ assert shared_module.optimizer_initialized
+ self._optimizer = shared_module._optimizer
+ self._kvstore = shared_module._kvstore
+ self._update_on_kvstore = shared_module._update_on_kvstore
+ self._updater = shared_module._updater
+ self.optimizer_initialized = True
+
+ def forward(self, data_batch, is_train=None):
+ """Forward computation.
+
+ Parameters
+ ----------
+ data_batch : DataBatch
+ Could be anything with similar API implemented.
+ is_train : bool
+ Default is `None`, which means `is_train` takes the value of `self.for_training`.
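+
+        Examples
+        --------
+        A minimal sketch of one training step (assumes `bind`, `init_params`
+        and `init_optimizer` have already been called)::
+            >>> mod.forward(data_batch, is_train=True)
+            >>> mod.backward()
+            >>> mod.update()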
+ """
+ assert self.binded and self.params_initialized
+ self._exec_group.forward(data_batch, is_train)
+
+ def backward(self, out_grads=None):
+ """Backward computation.
+
+ Parameters
+ ----------
+ out_grads : NDArray or list of NDArray, optional
+ Gradient on the outputs to be propagated back.
+ This parameter is only needed when bind is called
+ on outputs that are not a loss function.
+ """
+ assert self.binded and self.params_initialized
+ self._exec_group.backward(out_grads=out_grads)
+
+ def update(self):
+ """Update parameters according to the installed optimizer and the gradients computed
+ in the previous forward-backward batch.
+ """
+ assert self.binded and self.params_initialized and self.optimizer_initialized
+
+ self._params_dirty = True
+ if self._update_on_kvstore:
+ _update_params_on_kvstore(self._exec_group.param_arrays,
+ self._exec_group.grad_arrays,
+ self._kvstore)
+ else:
+ _update_params(self._exec_group.param_arrays,
+ self._exec_group.grad_arrays,
+ updater=self._updater,
+ num_device=len(self._context),
+ kvstore=self._kvstore)
+
+ def get_outputs(self, merge_multi_context=True):
+ """Get outputs of the previous forward computation.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the outputs
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from a single
+ executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
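+
+        Examples
+        --------
+        A minimal sketch of reading the first output back to numpy after a
+        forward pass::
+            >>> out = mod.get_outputs()[0].asnumpy()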
+ """
+ assert self.binded and self.params_initialized
+ return self._exec_group.get_outputs(merge_multi_context=merge_multi_context)
+
+ def get_input_grads(self, merge_multi_context=True):
+ """Get the gradients with respect to the inputs of the module.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the outputs
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from a single
+ executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[grad1, grad2]`. Otherwise, it
+ is like `[[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert self.binded and self.params_initialized and self.inputs_need_grad
+ return self._exec_group.get_input_grads(merge_multi_context=merge_multi_context)
+
+ def get_states(self, merge_multi_context=True):
+ """Get states from all devices
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the states
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from a single
+ executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert self.binded and self.params_initialized
+ return self._exec_group.get_states(merge_multi_context=merge_multi_context)
+
+ def set_states(self, states=None, value=None):
+ """Set value for states. Only one of states & value can be specified.
+
+ Parameters
+ ----------
+ states : list of list of NDArrays
+ source states arrays formatted like [[state1_dev1, state1_dev2],
+ [state2_dev1, state2_dev2]].
+ value : number
+ a single scalar value for all state arrays.
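+
+        Examples
+        --------
+        A minimal sketch of zeroing all state arrays::
+            >>> mod.set_states(value=0)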
+ """
+ assert self.binded and self.params_initialized
+ self._exec_group.set_states(states, value)
+
+ def update_metric(self, eval_metric, labels):
+ """Evaluate and accumulate evaluation metric on outputs of the last forward computation.
+
+ Parameters
+ ----------
+ eval_metric : EvalMetric
+ labels : list of NDArray
+ Typically `data_batch.label`.
+ """
+ self._exec_group.update_metric(eval_metric, labels)
+
+ def _sync_params_from_devices(self):
+ """Synchronize parameters from devices to CPU. This function should be called after
+        calling `update`, which updates the parameters on the devices, and before reading the
+ latest parameters from `self._arg_params` and `self._aux_params`.
+ """
+ self._exec_group.get_params(self._arg_params, self._aux_params)
+ self._params_dirty = False
+
+ def save_optimizer_states(self, fname):
+ """Save optimizer (updater) state to file
+
+ Parameters
+ ----------
+ fname : str
+ Path to output states file.
+ """
+ assert self.optimizer_initialized
+
+ if self._update_on_kvstore:
+ self._kvstore.save_optimizer_states(fname)
+ else:
+ with open(fname, 'wb') as fout:
+ fout.write(self._updater.get_states())
+
+ def load_optimizer_states(self, fname):
+ """Load optimizer (updater) state from file
+
+ Parameters
+ ----------
+ fname : str
+ Path to input states file.
+ """
+ assert self.optimizer_initialized
+
+ if self._update_on_kvstore:
+ self._kvstore.load_optimizer_states(fname)
+ else:
+            with open(fname, 'rb') as fin:
+                self._updater.set_states(fin.read())
+
+ def install_monitor(self, mon):
+ """ Install monitor on all executors """
+ assert self.binded
+ self._exec_group.install_monitor(mon)
+
+
+
+class MutableModule(BaseModule):
+ """A mutable module is a module that supports variable input data.
+
+ Parameters
+ ----------
+ symbol : Symbol
+ data_names : list of str
+ label_names : list of str
+ logger : Logger
+ context : Context or list of Context
+ work_load_list : list of number
+    max_data_shapes : list of (name, shape) tuple, designating inputs whose shapes vary
+    max_label_shapes : list of (name, shape) tuple, designating labels whose shapes vary
+ fixed_param_prefix : list of str, indicating fixed parameters
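+
+    Examples
+    --------
+    A minimal sketch of constructing a mutable module (the symbol, names and
+    maximum shapes are hypothetical)::
+        >>> mod = MutableModule(sym, data_names=['data'], label_names=['softmax_label'],
+                                context=[mx.gpu(0)],
+                                max_data_shapes=[[('data', (1, 3, 1024, 2048))]])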
+ """
+ def __init__(self, symbol, data_names, label_names,
+ logger=logging, context=ctx.cpu(), work_load_list=None,
+ max_data_shapes=None, max_label_shapes=None, fixed_param_prefix=None):
+ super(MutableModule, self).__init__(logger=logger)
+ self._symbol = symbol
+ self._data_names = data_names
+ self._label_names = label_names
+ self._context = context
+ self._work_load_list = work_load_list
+
+ self._curr_module = None
+ self._max_data_shapes = max_data_shapes
+ self._max_label_shapes = max_label_shapes
+ self._fixed_param_prefix = fixed_param_prefix
+
+ fixed_param_names = list()
+ if fixed_param_prefix is not None:
+ for name in self._symbol.list_arguments():
+ for prefix in self._fixed_param_prefix:
+ if prefix in name:
+ fixed_param_names.append(name)
+ self._fixed_param_names = fixed_param_names
+ self._preload_opt_states = None
+
+ def _reset_bind(self):
+ self.binded = False
+ self._curr_module = None
+
+ @property
+ def data_names(self):
+ return self._data_names
+
+ @property
+ def output_names(self):
+ return self._symbol.list_outputs()
+
+ @property
+ def data_shapes(self):
+ assert self.binded
+ return self._curr_module.data_shapes
+
+ @property
+ def label_shapes(self):
+ assert self.binded
+ return self._curr_module.label_shapes
+
+ @property
+ def output_shapes(self):
+ assert self.binded
+ return self._curr_module.output_shapes
+
+ def get_params(self):
+ assert self.binded and self.params_initialized
+ return self._curr_module.get_params()
+
+ def init_params(self, initializer=Uniform(0.01), arg_params=None, aux_params=None,
+ allow_missing=False, force_init=False):
+ if self.params_initialized and not force_init:
+ return
+ assert self.binded, 'call bind before initializing the parameters'
+ self._curr_module.init_params(initializer=initializer, arg_params=arg_params,
+ aux_params=aux_params, allow_missing=allow_missing,
+ force_init=force_init)
+ self.params_initialized = True
+
+ def bind(self, data_shapes, label_shapes=None, for_training=True,
+ inputs_need_grad=False, force_rebind=False, shared_module=None, grad_req='write'):
+        # in case we already initialized params, keep them
+ if self.params_initialized:
+ arg_params, aux_params = self.get_params()
+
+        # force rebinding is typically used when one wants to switch from
+ # training to prediction phase.
+ if force_rebind:
+ self._reset_bind()
+
+ if self.binded:
+            self.logger.warning('Already bound, ignoring bind()')
+ return
+
+ assert shared_module is None, 'shared_module for MutableModule is not supported'
+
+ self.for_training = for_training
+ self.inputs_need_grad = inputs_need_grad
+ self.binded = True
+
+ max_shapes_dict = dict()
+ if self._max_data_shapes is not None:
+ max_shapes_dict.update(dict(self._max_data_shapes[0]))
+ if self._max_label_shapes is not None:
+ max_shapes_dict.update(dict(self._max_label_shapes[0]))
+
+ max_data_shapes = list()
+ for name, shape in data_shapes[0]:
+ if name in max_shapes_dict:
+ max_data_shapes.append((name, max_shapes_dict[name]))
+ else:
+ max_data_shapes.append((name, shape))
+
+ max_label_shapes = list()
+        if label_shapes is not None and label_shapes.count(None) != len(label_shapes):
+ for name, shape in label_shapes[0]:
+ if name in max_shapes_dict:
+ max_label_shapes.append((name, max_shapes_dict[name]))
+ else:
+ max_label_shapes.append((name, shape))
+
+ if len(max_label_shapes) == 0:
+ max_label_shapes = None
+
+ module = Module(self._symbol, self._data_names, self._label_names, logger=self.logger,
+ context=self._context, work_load_list=self._work_load_list,
+ fixed_param_names=self._fixed_param_names)
+ module.bind([max_data_shapes for _ in xrange(len(self._context))], [max_label_shapes for _ in xrange(len(self._context))],
+ for_training, inputs_need_grad, force_rebind=False, shared_module=None)
+ self._curr_module = module
+
+ # copy back saved params, if already initialized
+ if self.params_initialized:
+ self.set_params(arg_params, aux_params)
+
+ def save_checkpoint(self, prefix, epoch, save_optimizer_states=False):
+ """Save current progress to checkpoint.
+ Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
+
+ Parameters
+ ----------
+ prefix : str
+ The file prefix to checkpoint to
+ epoch : int
+ The current epoch number
+ save_optimizer_states : bool
+            Whether to save optimizer states for continued training
+ """
+ self._curr_module.save_checkpoint(prefix, epoch, save_optimizer_states)
+
+ def init_optimizer(self, kvstore='local', optimizer='sgd',
+ optimizer_params=(('learning_rate', 0.01),), force_init=False):
+ assert self.binded and self.params_initialized
+ if self.optimizer_initialized and not force_init:
+ self.logger.warning('optimizer already initialized, ignoring.')
+ return
+
+ self._curr_module._preload_opt_states = self._preload_opt_states
+ self._curr_module.init_optimizer(kvstore, optimizer, optimizer_params,
+ force_init=force_init)
+ self.optimizer_initialized = True
+
+ def fit(self, train_data, eval_data=None, eval_metric='acc',
+ epoch_end_callback=None, batch_end_callback=None, kvstore='local',
+ optimizer='sgd', optimizer_params=(('learning_rate', 0.01),),
+ eval_end_callback=None,
+ eval_batch_end_callback=None, initializer=Uniform(0.01),
+ arg_params=None, aux_params=None, allow_missing=False,
+ force_rebind=False, force_init=False, begin_epoch=0, num_epoch=None,
+ validation_metric=None, monitor=None, prefix=None):
+ """Train the module parameters.
+
+ Parameters
+ ----------
+ train_data : DataIter
+ eval_data : DataIter
+ If not `None`, will be used as validation set and evaluate the performance
+ after each epoch.
+ eval_metric : str or EvalMetric
+            Default `'acc'`. The performance measure displayed during training.
+ epoch_end_callback : function or list of function
+ Each callback will be called with the current `epoch`, `symbol`, `arg_params`
+ and `aux_params`.
+ batch_end_callback : function or list of function
+ Each callback will be called with a `BatchEndParam`.
+ kvstore : str or KVStore
+ Default `'local'`.
+ optimizer : str or Optimizer
+ Default `'sgd'`
+ optimizer_params : dict
+ Default `(('learning_rate', 0.01),)`. The parameters for the optimizer constructor.
+ The default value is not a `dict`, just to avoid pylint warning on dangerous
+ default values.
+ eval_end_callback : function or list of function
+ These will be called at the end of each full evaluation, with the metrics over
+ the entire evaluation set.
+ eval_batch_end_callback : function or list of function
+ These will be called at the end of each minibatch during evaluation
+ initializer : Initializer
+ Will be called to initialize the module parameters if not already initialized.
+ arg_params : dict
+ Default `None`, if not `None`, should be existing parameters from a trained
+ model or loaded from a checkpoint (previously saved model). In this case,
+ the value here will be used to initialize the module parameters, unless they
+ are already initialized by the user via a call to `init_params` or `fit`.
+            `arg_params` takes precedence over `initializer`.
+ aux_params : dict
+ Default `None`. Similar to `arg_params`, except for auxiliary states.
+ allow_missing : bool
+ Default `False`. Indicate whether we allow missing parameters when `arg_params`
+ and `aux_params` are not `None`. If this is `True`, then the missing parameters
+ will be initialized via the `initializer`.
+ force_rebind : bool
+ Default `False`. Whether to force rebinding the executors if already binded.
+ force_init : bool
+ Default `False`. Indicate whether we should force initialization even if the
+ parameters are already initialized.
+ begin_epoch : int
+ Default `0`. Indicate the starting epoch. Usually, if we are resuming from a
+ checkpoint saved at a previous training phase at epoch N, then we should specify
+ this value as N+1.
+ num_epoch : int
+ Number of epochs to run training.
+
+ Examples
+ --------
+ An example of using fit for training::
+ >>> #Assume training dataIter and validation dataIter are ready
+ >>> mod.fit(train_data=train_dataiter, eval_data=val_dataiter,
+ optimizer_params={'learning_rate':0.01, 'momentum': 0.9},
+ num_epoch=10)
+ """
+ assert num_epoch is not None, 'please specify number of epochs'
+
+ self.bind(data_shapes=train_data.provide_data, label_shapes=train_data.provide_label,
+ for_training=True, force_rebind=force_rebind)
+ if monitor is not None:
+ self.install_monitor(monitor)
+ self.init_params(initializer=initializer, arg_params=arg_params, aux_params=aux_params,
+ allow_missing=allow_missing, force_init=force_init)
+ self.init_optimizer(kvstore=kvstore, optimizer=optimizer,
+ optimizer_params=optimizer_params)
+
+ if validation_metric is None:
+ validation_metric = eval_metric
+ if not isinstance(eval_metric, metric.EvalMetric):
+ eval_metric = metric.create(eval_metric)
+
+ ################################################################################
+ # training loop
+ ################################################################################
+ for epoch in range(begin_epoch, num_epoch):
+ tic = time.time()
+ eval_metric.reset()
+ for nbatch, data_batch in enumerate(train_data):
+ if monitor is not None:
+ monitor.tic()
+ self.forward_backward(data_batch)
+ self.update()
+ self.update_metric(eval_metric, data_batch.label)
+
+ if monitor is not None:
+ monitor.toc_print()
+
+ if batch_end_callback is not None:
+ batch_end_params = BatchEndParam(epoch=epoch, nbatch=nbatch,
+ eval_metric=eval_metric,
+ locals=locals())
+ for callback in _as_list(batch_end_callback):
+ callback(batch_end_params)
+
+ # one epoch of training is finished
+ for name, val in eval_metric.get_name_value():
+ self.logger.info('Epoch[%d] Train-%s=%f', epoch, name, val)
+ toc = time.time()
+ self.logger.info('Epoch[%d] Time cost=%.3f', epoch, (toc-tic))
+
+ # sync aux params across devices
+ arg_params, aux_params = self.get_params()
+ self.set_params(arg_params, aux_params)
+
+ if epoch_end_callback is not None:
+ for callback in _as_list(epoch_end_callback):
+ callback(epoch, self.symbol, arg_params, aux_params)
+
+ #----------------------------------------
+ # evaluation on validation set
+ if eval_data:
+ res = self.score(eval_data, validation_metric,
+ score_end_callback=eval_end_callback,
+ batch_end_callback=eval_batch_end_callback, epoch=epoch)
+ #TODO: pull this into default
+ for name, val in res:
+ self.logger.info('Epoch[%d] Validation-%s=%f', epoch, name, val)
+
+ # end of 1 epoch, reset the data-iter for another epoch
+ train_data.reset()
+
+ def forward(self, data_batch, is_train=None):
+ assert self.binded and self.params_initialized
+
+ # get current_shapes
+ if self._curr_module.label_shapes is not None:
+ current_shapes = [dict(self._curr_module.data_shapes[i] + self._curr_module.label_shapes[i]) for i in xrange(len(self._context))]
+ else:
+ current_shapes = [dict(self._curr_module.data_shapes[i]) for i in xrange(len(self._context))]
+
+ # get input_shapes
+ if is_train:
+ input_shapes = [dict(data_batch.provide_data[i] + data_batch.provide_label[i]) for i in xrange(len(self._context))]
+ else:
+ input_shapes = [dict(data_batch.provide_data[i]) for i in xrange(len(data_batch.provide_data))]
+
+ # decide if shape changed
+ shape_changed = len(current_shapes) != len(input_shapes)
+ for pre, cur in zip(current_shapes, input_shapes):
+ for k, v in pre.items():
+ if v != cur[k]:
+ shape_changed = True
+
+ if shape_changed:
+ # self._curr_module.reshape(data_batch.provide_data, data_batch.provide_label)
+ module = Module(self._symbol, self._data_names, self._label_names,
+ logger=self.logger, context=[self._context[i] for i in xrange(len(data_batch.provide_data))],
+ work_load_list=self._work_load_list,
+ fixed_param_names=self._fixed_param_names)
+ module.bind(data_batch.provide_data, data_batch.provide_label, self._curr_module.for_training,
+ self._curr_module.inputs_need_grad, force_rebind=False,
+ shared_module=self._curr_module)
+ self._curr_module = module
+
+ self._curr_module.forward(data_batch, is_train=is_train)
+
+ def backward(self, out_grads=None):
+ assert self.binded and self.params_initialized
+ self._curr_module.backward(out_grads=out_grads)
+
+ def update(self):
+ assert self.binded and self.params_initialized and self.optimizer_initialized
+ self._curr_module.update()
+
+ def get_outputs(self, merge_multi_context=True):
+ assert self.binded and self.params_initialized
+ return self._curr_module.get_outputs(merge_multi_context=merge_multi_context)
+
+    def get_input_grads(self, merge_multi_context=True):
+ assert self.binded and self.params_initialized and self.inputs_need_grad
+ return self._curr_module.get_input_grads(merge_multi_context=merge_multi_context)
+
+ def update_metric(self, eval_metric, labels):
+ assert self.binded and self.params_initialized
+ self._curr_module.update_metric(eval_metric, labels)
+
+ def install_monitor(self, mon):
+ """ Install monitor on all executors """
+ assert self.binded
+ self._curr_module.install_monitor(mon)
diff --git a/deeplab/core/tester.py b/deeplab/core/tester.py
new file mode 100644
index 0000000..7309cac
--- /dev/null
+++ b/deeplab/core/tester.py
@@ -0,0 +1,123 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import cPickle
+import os
+import time
+import mxnet as mx
+import numpy as np
+
+from PIL import Image
+from module import MutableModule
+from config.config import config
+from utils import image
+from utils.PrefetchingIter import PrefetchingIter
+
+
+class Predictor(object):
+ def __init__(self, symbol, data_names, label_names,
+ context=mx.cpu(), max_data_shapes=None,
+ provide_data=None, provide_label=None,
+ arg_params=None, aux_params=None):
+ self._mod = MutableModule(symbol, data_names, label_names,
+ context=context, max_data_shapes=max_data_shapes)
+ self._mod.bind(provide_data, provide_label, for_training=False)
+ self._mod.init_params(arg_params=arg_params, aux_params=aux_params)
+
+ def predict(self, data_batch):
+ self._mod.forward(data_batch)
+ return [dict(zip(self._mod.output_names, _)) for _ in zip(*self._mod.get_outputs(merge_multi_context=False))]
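+
+# Illustrative usage of Predictor (a sketch; `sym`, `provide_data`, `provide_label`,
+# `arg_params` and `aux_params` are assumed to be prepared as in demo.py below):
+#   predictor = Predictor(sym, ['data'], ['softmax_label'], context=[mx.gpu(0)],
+#                         provide_data=provide_data, provide_label=provide_label,
+#                         arg_params=arg_params, aux_params=aux_params)
+#   outputs = predictor.predict(data_batch)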
+
+def pred_eval(predictor, test_data, imdb, vis=False, ignore_cache=None, logger=None):
+ """
+    Wrapper for calculating offline validation for faster data analysis.
+    In this example, all thresholds are set by hand.
+ :param predictor: Predictor
+ :param test_data: data iterator, must be non-shuffle
+ :param imdb: image database
+ :param vis: controls visualization
+ :param ignore_cache: ignore the saved cache file
+ :param logger: the logger instance
+ :return:
+ """
+ res_file = os.path.join(imdb.result_path, imdb.name + '_segmentations.pkl')
+ if os.path.exists(res_file) and not ignore_cache:
+        with open(res_file, 'rb') as fid:
+ evaluation_results = cPickle.load(fid)
+ print 'evaluate segmentation: \n'
+ if logger:
+ logger.info('evaluate segmentation: \n')
+
+ meanIU = evaluation_results['meanIU']
+ IU_array = evaluation_results['IU_array']
+ print 'IU_array:\n'
+ if logger:
+ logger.info('IU_array:\n')
+ for i in range(len(IU_array)):
+ print '%.5f'%IU_array[i]
+ if logger:
+ logger.info('%.5f'%IU_array[i])
+ print 'meanIU:%.5f'%meanIU
+ if logger:
+            logger.info('meanIU:%.5f' % meanIU)
+ return
+
+ assert vis or not test_data.shuffle
+ if not isinstance(test_data, PrefetchingIter):
+ test_data = PrefetchingIter(test_data)
+
+ num_images = imdb.num_images
+ all_segmentation_result = [[] for _ in xrange(num_images)]
+ idx = 0
+
+ data_time, net_time, post_time = 0.0, 0.0, 0.0
+ t = time.time()
+ for data_batch in test_data:
+ t1 = time.time() - t
+ t = time.time()
+ output_all = predictor.predict(data_batch)
+ output_all = [mx.ndarray.argmax(output['softmax_output'], axis=1).asnumpy() for output in output_all]
+ t2 = time.time() - t
+ t = time.time()
+
+ all_segmentation_result[idx: idx+test_data.batch_size] = [output.astype('int8') for output in output_all]
+
+ idx += test_data.batch_size
+ t3 = time.time() - t
+ t = time.time()
+
+ data_time += t1
+ net_time += t2
+ post_time += t3
+ print 'testing {}/{} data {:.4f}s net {:.4f}s post {:.4f}s'.format(idx, imdb.num_images, data_time / idx * test_data.batch_size, net_time / idx * test_data.batch_size, post_time / idx * test_data.batch_size)
+ if logger:
+ logger.info('testing {}/{} data {:.4f}s net {:.4f}s post {:.4f}s'.format(idx, imdb.num_images, data_time / idx * test_data.batch_size, net_time / idx * test_data.batch_size, post_time / idx * test_data.batch_size))
+
+ evaluation_results = imdb.evaluate_segmentations(all_segmentation_result)
+
+ if not os.path.exists(res_file) or ignore_cache:
+ with open(res_file, 'wb') as f:
+ cPickle.dump(evaluation_results, f, protocol=cPickle.HIGHEST_PROTOCOL)
+
+ print 'evaluate segmentation: \n'
+ if logger:
+ logger.info('evaluate segmentation: \n')
+
+ meanIU = evaluation_results['meanIU']
+ IU_array = evaluation_results['IU_array']
+ print 'IU_array:\n'
+ if logger:
+ logger.info('IU_array:\n')
+ for i in range(len(IU_array)):
+ print '%.5f'%IU_array[i]
+ if logger:
+ logger.info('%.5f'%IU_array[i])
+ print 'meanIU:%.5f'%meanIU
+ if logger:
+        logger.info('meanIU:%.5f' % meanIU)
diff --git a/deeplab/demo.py b/deeplab/demo.py
new file mode 100644
index 0000000..9ce80b3
--- /dev/null
+++ b/deeplab/demo.py
@@ -0,0 +1,165 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Written by Zheng Zhang
+# --------------------------------------------------------
+
+import _init_paths
+
+import argparse
+import os
+import sys
+import logging
+import pprint
+import cv2
+from config.config import config, update_config
+from utils.image import resize, transform
+from PIL import Image
+import numpy as np
+
+# get config
+os.environ['PYTHONUNBUFFERED'] = '1'
+os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'
+os.environ['MXNET_ENABLE_GPU_P2P'] = '0'
+cur_path = os.path.abspath(os.path.dirname(__file__))
+update_config(cur_path + '/../experiments/deeplab/cfgs/deeplab_cityscapes_demo.yaml')
+
+sys.path.insert(0, os.path.join(cur_path, '../external/mxnet', config.MXNET_VERSION))
+import mxnet as mx
+from core.tester import pred_eval, Predictor
+from symbols import *
+from utils.load_model import load_param
+from utils.tictoc import tic, toc
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='Show Deformable ConvNets demo')
+ # general
+ parser.add_argument('--deeplab_only', help='whether use Deeplab only (w/o Deformable ConvNets)', default=False, action='store_true')
+
+ args = parser.parse_args()
+ return args
+
+args = parse_args()
+
+def getpallete(num_cls):
+ """
+    Get the colormap for visualizing the segmentation mask.
+    :param num_cls: the number of visualized classes
+    :return: the palette
+ """
+ n = num_cls
+ pallete_raw = np.zeros((n, 3)).astype('uint8')
+ pallete = np.zeros((n, 3)).astype('uint8')
+
+ pallete_raw[6, :] = [111, 74, 0]
+ pallete_raw[7, :] = [ 81, 0, 81]
+ pallete_raw[8, :] = [128, 64, 128]
+ pallete_raw[9, :] = [244, 35, 232]
+ pallete_raw[10, :] = [250, 170, 160]
+ pallete_raw[11, :] = [230, 150, 140]
+ pallete_raw[12, :] = [ 70, 70, 70]
+ pallete_raw[13, :] = [102, 102, 156]
+ pallete_raw[14, :] = [190, 153, 153]
+ pallete_raw[15, :] = [180, 165, 180]
+ pallete_raw[16, :] = [150, 100, 100]
+ pallete_raw[17, :] = [150, 120, 90]
+ pallete_raw[18, :] = [153, 153, 153]
+ pallete_raw[19, :] = [153, 153, 153]
+ pallete_raw[20, :] = [250, 170, 30]
+ pallete_raw[21, :] = [220, 220, 0]
+ pallete_raw[22, :] = [107, 142, 35]
+ pallete_raw[23, :] = [152, 251, 152]
+ pallete_raw[24, :] = [ 70, 130, 180]
+ pallete_raw[25, :] = [220, 20, 60]
+ pallete_raw[26, :] = [255, 0, 0]
+ pallete_raw[27, :] = [ 0, 0, 142]
+ pallete_raw[28, :] = [ 0, 0, 70]
+ pallete_raw[29, :] = [ 0, 60, 100]
+ pallete_raw[30, :] = [ 0, 0, 90]
+ pallete_raw[31, :] = [ 0, 0, 110]
+ pallete_raw[32, :] = [ 0, 80, 100]
+ pallete_raw[33, :] = [ 0, 0, 230]
+ pallete_raw[34, :] = [119, 11, 32]
+
+ train2regular = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 31, 32, 33]
+
+ for i in range(len(train2regular)):
+ pallete[i, :] = pallete_raw[train2regular[i]+1, :]
+
+ pallete = pallete.reshape(-1)
+
+ return pallete
+
+def main():
+ # get symbol
+ pprint.pprint(config)
+ config.symbol = 'resnet_v1_101_deeplab_dcn' if not args.deeplab_only else 'resnet_v1_101_deeplab'
+ sym_instance = eval(config.symbol + '.' + config.symbol)()
+ sym = sym_instance.get_symbol(config, is_train=False)
+
+ # set up class names
+ num_classes = 19
+
+ # load demo data
+ image_names = ['frankfurt_000001_073088_leftImg8bit.png', 'lindau_000024_000019_leftImg8bit.png']
+ data = []
+ for im_name in image_names:
+        assert os.path.exists(cur_path + '/../demo/' + im_name), ('{} does not exist'.format('../demo/' + im_name))
+ im = cv2.imread(cur_path + '/../demo/' + im_name, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)
+ target_size = config.SCALES[0][0]
+ max_size = config.SCALES[0][1]
+ im, im_scale = resize(im, target_size, max_size, stride=config.network.IMAGE_STRIDE)
+ im_tensor = transform(im, config.network.PIXEL_MEANS)
+ im_info = np.array([[im_tensor.shape[2], im_tensor.shape[3], im_scale]], dtype=np.float32)
+ data.append({'data': im_tensor, 'im_info': im_info})
+
+ # get predictor
+ data_names = ['data']
+ label_names = ['softmax_label']
+ data = [[mx.nd.array(data[i][name]) for name in data_names] for i in xrange(len(data))]
+ max_data_shape = [[('data', (1, 3, max([v[0] for v in config.SCALES]), max([v[1] for v in config.SCALES])))]]
+ provide_data = [[(k, v.shape) for k, v in zip(data_names, data[i])] for i in xrange(len(data))]
+ provide_label = [None for i in xrange(len(data))]
+ arg_params, aux_params = load_param(cur_path + '/../model/' + ('deeplab_dcn_cityscapes' if not args.deeplab_only else 'deeplab_cityscapes'), 0, process=True)
+ predictor = Predictor(sym, data_names, label_names,
+ context=[mx.gpu(0)], max_data_shapes=max_data_shape,
+ provide_data=provide_data, provide_label=provide_label,
+ arg_params=arg_params, aux_params=aux_params)
+
+ # warm up
+ for j in xrange(2):
+ data_batch = mx.io.DataBatch(data=[data[0]], label=[], pad=0, index=0,
+ provide_data=[[(k, v.shape) for k, v in zip(data_names, data[0])]],
+ provide_label=[None])
+ output_all = predictor.predict(data_batch)
+ output_all = [mx.ndarray.argmax(output['softmax_output'], axis=1).asnumpy() for output in output_all]
+
+ # test
+ for idx, im_name in enumerate(image_names):
+ data_batch = mx.io.DataBatch(data=[data[idx]], label=[], pad=0, index=idx,
+ provide_data=[[(k, v.shape) for k, v in zip(data_names, data[idx])]],
+ provide_label=[None])
+
+ tic()
+ output_all = predictor.predict(data_batch)
+ output_all = [mx.ndarray.argmax(output['softmax_output'], axis=1).asnumpy() for output in output_all]
+ pallete = getpallete(256)
+
+ segmentation_result = np.uint8(np.squeeze(output_all))
+ segmentation_result = Image.fromarray(segmentation_result)
+ segmentation_result.putpalette(pallete)
+ print 'testing {} {:.4f}s'.format(im_name, toc())
+ pure_im_name, ext_im_name = os.path.splitext(im_name)
+ segmentation_result.save(cur_path + '/../demo/seg_' + pure_im_name + '.png')
+ # visualize
+ im_raw = cv2.imread(cur_path + '/../demo/' + im_name)
+ seg_res = cv2.imread(cur_path + '/../demo/seg_' + pure_im_name + '.png')
+ cv2.imshow('Raw Image', im_raw)
+ cv2.imshow('segmentation_result', seg_res)
+ cv2.waitKey(0)
+ print 'done'
+
+if __name__ == '__main__':
+ main()
diff --git a/deeplab/function/__init__.py b/deeplab/function/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/deeplab/function/reeval.py b/deeplab/function/reeval.py
new file mode 100644
index 0000000..0fdd77f
--- /dev/null
+++ b/deeplab/function/reeval.py
@@ -0,0 +1,54 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import argparse
+import cPickle
+import os
+import mxnet as mx
+
+from config.config import config, default, generate_config
+from dataset import *
+
+
+def reeval(args):
+ # load imdb
+ imdb = eval(args.dataset)(args.image_set, args.root_path, args.dataset_path)
+
+ # load detection results
+ cache_file = os.path.join(imdb.cache_path, imdb.name, 'detections.pkl')
+    with open(cache_file, 'rb') as f:
+ detections = cPickle.load(f)
+
+ # eval
+ imdb.evaluate_detections(detections)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='imdb test')
+ # general
+ parser.add_argument('--network', help='network name', default=default.network, type=str)
+ parser.add_argument('--dataset', help='dataset name', default=default.dataset, type=str)
+ args, rest = parser.parse_known_args()
+ generate_config(args.network, args.dataset)
+ parser.add_argument('--image_set', help='image_set name', default=default.image_set, type=str)
+ parser.add_argument('--root_path', help='output data folder', default=default.root_path, type=str)
+ parser.add_argument('--dataset_path', help='dataset path', default=default.dataset_path, type=str)
+ # other
+ parser.add_argument('--no_shuffle', help='disable random shuffle', action='store_true')
+ args = parser.parse_args()
+ return args
+
+
+def main():
+ args = parse_args()
+    print 'Called with arguments:', args
+ reeval(args)
+
+
+if __name__ == '__main__':
+ main()
diff --git a/deeplab/function/test_deeplab.py b/deeplab/function/test_deeplab.py
new file mode 100644
index 0000000..0853c41
--- /dev/null
+++ b/deeplab/function/test_deeplab.py
@@ -0,0 +1,78 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Zheng Zhang
+# --------------------------------------------------------
+
+import argparse
+import pprint
+import logging
+import time
+import os
+import mxnet as mx
+
+from config.config import config, generate_config, update_config
+from config.dataset_conf import dataset
+from config.network_conf import network
+from symbols import *
+from dataset import *
+from core.loader import TestDataLoader
+from core.tester import Predictor, pred_eval
+from utils.load_model import load_param
+
+def test_deeplab(network, dataset, image_set, root_path, dataset_path,
+ ctx, prefix, epoch,
+ vis, logger=None, output_path=None):
+    assert logger is not None, 'a logger is required'
+
+ # print config
+ pprint.pprint(config)
+ logger.info('testing config:{}\n'.format(pprint.pformat(config)))
+
+ # load symbol and testing data
+ sym = eval('get_' + network + '_test')(num_classes=config.dataset.NUM_CLASSES)
+ imdb = eval(dataset)(image_set, root_path, dataset_path, result_path=output_path)
+ segdb = imdb.gt_segdb()
+
+ # get test data iter
+ test_data = TestDataLoader(segdb, batch_size=len(ctx))
+
+ # load model
+ # arg_params, aux_params = load_param(prefix, epoch, convert=True, ctx=ctx, process=True)
+ arg_params, aux_params = load_param(prefix, epoch, process=True)
+
+ # infer shape
+ data_shape_dict = dict(test_data.provide_data_single)
+ arg_shape, _, aux_shape = sym.infer_shape(**data_shape_dict)
+ arg_shape_dict = dict(zip(sym.list_arguments(), arg_shape))
+ aux_shape_dict = dict(zip(sym.list_auxiliary_states(), aux_shape))
+
+ # check parameters
+ for k in sym.list_arguments():
+ if k in data_shape_dict or k in ['softmax_label']:
+ continue
+ assert k in arg_params, k + ' not initialized'
+ assert arg_params[k].shape == arg_shape_dict[k], \
+ 'shape inconsistent for ' + k + ' inferred ' + str(arg_shape_dict[k]) + ' provided ' + str(arg_params[k].shape)
+ for k in sym.list_auxiliary_states():
+ assert k in aux_params, k + ' not initialized'
+ assert aux_params[k].shape == aux_shape_dict[k], \
+ 'shape inconsistent for ' + k + ' inferred ' + str(aux_shape_dict[k]) + ' provided ' + str(aux_params[k].shape)
+
+ # decide maximum shape
+ data_names = [k[0] for k in test_data.provide_data_single]
+ label_names = ['softmax_label']
+ max_data_shape = [[('data', (1, 3, max([v[0] for v in config.SCALES]), max([v[1] for v in config.SCALES])))]]
+
+ # create predictor
+ predictor = Predictor(sym, data_names, label_names,
+ context=ctx, max_data_shapes=max_data_shape,
+ provide_data=test_data.provide_data, provide_label=test_data.provide_label,
+ arg_params=arg_params, aux_params=aux_params)
+
+    # start evaluation
+ pred_eval(predictor, test_data, imdb, vis=vis, logger=logger)
+
diff --git a/deeplab/symbols/__init__.py b/deeplab/symbols/__init__.py
new file mode 100644
index 0000000..54c71c0
--- /dev/null
+++ b/deeplab/symbols/__init__.py
@@ -0,0 +1,2 @@
+import resnet_v1_101_deeplab
+import resnet_v1_101_deeplab_dcn
diff --git a/deeplab/symbols/resnet_v1_101_deeplab.py b/deeplab/symbols/resnet_v1_101_deeplab.py
new file mode 100644
index 0000000..fc41aee
--- /dev/null
+++ b/deeplab/symbols/resnet_v1_101_deeplab.py
@@ -0,0 +1,828 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Written by Zheng Zhang
+# --------------------------------------------------------
+
+import cPickle
+import mxnet as mx
+from utils.symbol import Symbol
+
+class resnet_v1_101_deeplab(Symbol):
+ def __init__(self):
+ """
+        Use __init__ to define the parameters the network needs
+ """
+ self.eps = 1e-5
+ self.use_global_stats = True
+ self.workspace = 4096
+        self.units = (3, 4, 23, 3)  # used for ResNet-101
+ self.filter_list = [256, 512, 1024, 2048]
+
+ def get_resnet_conv(self, data):
+ conv1 = mx.symbol.Convolution(name='conv1', data=data, num_filter=64, pad=(3, 3), kernel=(7, 7), stride=(2, 2),
+ no_bias=True)
+ bn_conv1 = mx.symbol.BatchNorm(name='bn_conv1', data=conv1, use_global_stats=True, fix_gamma=False, eps = self.eps)
+ scale_conv1 = bn_conv1
+ conv1_relu = mx.symbol.Activation(name='conv1_relu', data=scale_conv1, act_type='relu')
+ pool1 = mx.symbol.Pooling(name='pool1', data=conv1_relu, pooling_convention='full', pad=(0, 0), kernel=(3, 3),
+ stride=(2, 2), pool_type='max')
+ res2a_branch1 = mx.symbol.Convolution(name='res2a_branch1', data=pool1, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2a_branch1 = mx.symbol.BatchNorm(name='bn2a_branch1', data=res2a_branch1, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch1 = bn2a_branch1
+ res2a_branch2a = mx.symbol.Convolution(name='res2a_branch2a', data=pool1, num_filter=64, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2a_branch2a = mx.symbol.BatchNorm(name='bn2a_branch2a', data=res2a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch2a = bn2a_branch2a
+ res2a_branch2a_relu = mx.symbol.Activation(name='res2a_branch2a_relu', data=scale2a_branch2a, act_type='relu')
+ res2a_branch2b = mx.symbol.Convolution(name='res2a_branch2b', data=res2a_branch2a_relu, num_filter=64,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn2a_branch2b = mx.symbol.BatchNorm(name='bn2a_branch2b', data=res2a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch2b = bn2a_branch2b
+ res2a_branch2b_relu = mx.symbol.Activation(name='res2a_branch2b_relu', data=scale2a_branch2b, act_type='relu')
+ res2a_branch2c = mx.symbol.Convolution(name='res2a_branch2c', data=res2a_branch2b_relu, num_filter=256,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2a_branch2c = mx.symbol.BatchNorm(name='bn2a_branch2c', data=res2a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch2c = bn2a_branch2c
+ res2a = mx.symbol.broadcast_add(name='res2a', *[scale2a_branch1, scale2a_branch2c])
+ res2a_relu = mx.symbol.Activation(name='res2a_relu', data=res2a, act_type='relu')
+ res2b_branch2a = mx.symbol.Convolution(name='res2b_branch2a', data=res2a_relu, num_filter=64, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2b_branch2a = mx.symbol.BatchNorm(name='bn2b_branch2a', data=res2b_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2b_branch2a = bn2b_branch2a
+ res2b_branch2a_relu = mx.symbol.Activation(name='res2b_branch2a_relu', data=scale2b_branch2a, act_type='relu')
+ res2b_branch2b = mx.symbol.Convolution(name='res2b_branch2b', data=res2b_branch2a_relu, num_filter=64,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn2b_branch2b = mx.symbol.BatchNorm(name='bn2b_branch2b', data=res2b_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2b_branch2b = bn2b_branch2b
+ res2b_branch2b_relu = mx.symbol.Activation(name='res2b_branch2b_relu', data=scale2b_branch2b, act_type='relu')
+ res2b_branch2c = mx.symbol.Convolution(name='res2b_branch2c', data=res2b_branch2b_relu, num_filter=256,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2b_branch2c = mx.symbol.BatchNorm(name='bn2b_branch2c', data=res2b_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2b_branch2c = bn2b_branch2c
+ res2b = mx.symbol.broadcast_add(name='res2b', *[res2a_relu, scale2b_branch2c])
+ res2b_relu = mx.symbol.Activation(name='res2b_relu', data=res2b, act_type='relu')
+ res2c_branch2a = mx.symbol.Convolution(name='res2c_branch2a', data=res2b_relu, num_filter=64, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2c_branch2a = mx.symbol.BatchNorm(name='bn2c_branch2a', data=res2c_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2c_branch2a = bn2c_branch2a
+ res2c_branch2a_relu = mx.symbol.Activation(name='res2c_branch2a_relu', data=scale2c_branch2a, act_type='relu')
+ res2c_branch2b = mx.symbol.Convolution(name='res2c_branch2b', data=res2c_branch2a_relu, num_filter=64,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn2c_branch2b = mx.symbol.BatchNorm(name='bn2c_branch2b', data=res2c_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2c_branch2b = bn2c_branch2b
+ res2c_branch2b_relu = mx.symbol.Activation(name='res2c_branch2b_relu', data=scale2c_branch2b, act_type='relu')
+ res2c_branch2c = mx.symbol.Convolution(name='res2c_branch2c', data=res2c_branch2b_relu, num_filter=256,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2c_branch2c = mx.symbol.BatchNorm(name='bn2c_branch2c', data=res2c_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2c_branch2c = bn2c_branch2c
+ res2c = mx.symbol.broadcast_add(name='res2c', *[res2b_relu, scale2c_branch2c])
+ res2c_relu = mx.symbol.Activation(name='res2c_relu', data=res2c, act_type='relu')
+ res3a_branch1 = mx.symbol.Convolution(name='res3a_branch1', data=res2c_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn3a_branch1 = mx.symbol.BatchNorm(name='bn3a_branch1', data=res3a_branch1, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch1 = bn3a_branch1
+ res3a_branch2a = mx.symbol.Convolution(name='res3a_branch2a', data=res2c_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn3a_branch2a = mx.symbol.BatchNorm(name='bn3a_branch2a', data=res3a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch2a = bn3a_branch2a
+ res3a_branch2a_relu = mx.symbol.Activation(name='res3a_branch2a_relu', data=scale3a_branch2a, act_type='relu')
+ res3a_branch2b = mx.symbol.Convolution(name='res3a_branch2b', data=res3a_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3a_branch2b = mx.symbol.BatchNorm(name='bn3a_branch2b', data=res3a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch2b = bn3a_branch2b
+ res3a_branch2b_relu = mx.symbol.Activation(name='res3a_branch2b_relu', data=scale3a_branch2b, act_type='relu')
+ res3a_branch2c = mx.symbol.Convolution(name='res3a_branch2c', data=res3a_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3a_branch2c = mx.symbol.BatchNorm(name='bn3a_branch2c', data=res3a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch2c = bn3a_branch2c
+ res3a = mx.symbol.broadcast_add(name='res3a', *[scale3a_branch1, scale3a_branch2c])
+ res3a_relu = mx.symbol.Activation(name='res3a_relu', data=res3a, act_type='relu')
+ res3b1_branch2a = mx.symbol.Convolution(name='res3b1_branch2a', data=res3a_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b1_branch2a = mx.symbol.BatchNorm(name='bn3b1_branch2a', data=res3b1_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b1_branch2a = bn3b1_branch2a
+ res3b1_branch2a_relu = mx.symbol.Activation(name='res3b1_branch2a_relu', data=scale3b1_branch2a,
+ act_type='relu')
+ res3b1_branch2b = mx.symbol.Convolution(name='res3b1_branch2b', data=res3b1_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3b1_branch2b = mx.symbol.BatchNorm(name='bn3b1_branch2b', data=res3b1_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b1_branch2b = bn3b1_branch2b
+ res3b1_branch2b_relu = mx.symbol.Activation(name='res3b1_branch2b_relu', data=scale3b1_branch2b,
+ act_type='relu')
+ res3b1_branch2c = mx.symbol.Convolution(name='res3b1_branch2c', data=res3b1_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b1_branch2c = mx.symbol.BatchNorm(name='bn3b1_branch2c', data=res3b1_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b1_branch2c = bn3b1_branch2c
+ res3b1 = mx.symbol.broadcast_add(name='res3b1', *[res3a_relu, scale3b1_branch2c])
+ res3b1_relu = mx.symbol.Activation(name='res3b1_relu', data=res3b1, act_type='relu')
+ res3b2_branch2a = mx.symbol.Convolution(name='res3b2_branch2a', data=res3b1_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b2_branch2a = mx.symbol.BatchNorm(name='bn3b2_branch2a', data=res3b2_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b2_branch2a = bn3b2_branch2a
+ res3b2_branch2a_relu = mx.symbol.Activation(name='res3b2_branch2a_relu', data=scale3b2_branch2a,
+ act_type='relu')
+ res3b2_branch2b = mx.symbol.Convolution(name='res3b2_branch2b', data=res3b2_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3b2_branch2b = mx.symbol.BatchNorm(name='bn3b2_branch2b', data=res3b2_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b2_branch2b = bn3b2_branch2b
+ res3b2_branch2b_relu = mx.symbol.Activation(name='res3b2_branch2b_relu', data=scale3b2_branch2b,
+ act_type='relu')
+ res3b2_branch2c = mx.symbol.Convolution(name='res3b2_branch2c', data=res3b2_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b2_branch2c = mx.symbol.BatchNorm(name='bn3b2_branch2c', data=res3b2_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b2_branch2c = bn3b2_branch2c
+ res3b2 = mx.symbol.broadcast_add(name='res3b2', *[res3b1_relu, scale3b2_branch2c])
+ res3b2_relu = mx.symbol.Activation(name='res3b2_relu', data=res3b2, act_type='relu')
+ res3b3_branch2a = mx.symbol.Convolution(name='res3b3_branch2a', data=res3b2_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b3_branch2a = mx.symbol.BatchNorm(name='bn3b3_branch2a', data=res3b3_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b3_branch2a = bn3b3_branch2a
+ res3b3_branch2a_relu = mx.symbol.Activation(name='res3b3_branch2a_relu', data=scale3b3_branch2a,
+ act_type='relu')
+ res3b3_branch2b = mx.symbol.Convolution(name='res3b3_branch2b', data=res3b3_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3b3_branch2b = mx.symbol.BatchNorm(name='bn3b3_branch2b', data=res3b3_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b3_branch2b = bn3b3_branch2b
+ res3b3_branch2b_relu = mx.symbol.Activation(name='res3b3_branch2b_relu', data=scale3b3_branch2b,
+ act_type='relu')
+ res3b3_branch2c = mx.symbol.Convolution(name='res3b3_branch2c', data=res3b3_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b3_branch2c = mx.symbol.BatchNorm(name='bn3b3_branch2c', data=res3b3_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b3_branch2c = bn3b3_branch2c
+ res3b3 = mx.symbol.broadcast_add(name='res3b3', *[res3b2_relu, scale3b3_branch2c])
+ res3b3_relu = mx.symbol.Activation(name='res3b3_relu', data=res3b3, act_type='relu')
+ res4a_branch1 = mx.symbol.Convolution(name='res4a_branch1', data=res3b3_relu, num_filter=1024, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn4a_branch1 = mx.symbol.BatchNorm(name='bn4a_branch1', data=res4a_branch1, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch1 = bn4a_branch1
+ res4a_branch2a = mx.symbol.Convolution(name='res4a_branch2a', data=res3b3_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn4a_branch2a = mx.symbol.BatchNorm(name='bn4a_branch2a', data=res4a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch2a = bn4a_branch2a
+ res4a_branch2a_relu = mx.symbol.Activation(name='res4a_branch2a_relu', data=scale4a_branch2a, act_type='relu')
+ res4a_branch2b = mx.symbol.Convolution(name='res4a_branch2b', data=res4a_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4a_branch2b = mx.symbol.BatchNorm(name='bn4a_branch2b', data=res4a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch2b = bn4a_branch2b
+ res4a_branch2b_relu = mx.symbol.Activation(name='res4a_branch2b_relu', data=scale4a_branch2b, act_type='relu')
+ res4a_branch2c = mx.symbol.Convolution(name='res4a_branch2c', data=res4a_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4a_branch2c = mx.symbol.BatchNorm(name='bn4a_branch2c', data=res4a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch2c = bn4a_branch2c
+ res4a = mx.symbol.broadcast_add(name='res4a', *[scale4a_branch1, scale4a_branch2c])
+ res4a_relu = mx.symbol.Activation(name='res4a_relu', data=res4a, act_type='relu')
+ res4b1_branch2a = mx.symbol.Convolution(name='res4b1_branch2a', data=res4a_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b1_branch2a = mx.symbol.BatchNorm(name='bn4b1_branch2a', data=res4b1_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b1_branch2a = bn4b1_branch2a
+ res4b1_branch2a_relu = mx.symbol.Activation(name='res4b1_branch2a_relu', data=scale4b1_branch2a,
+ act_type='relu')
+ res4b1_branch2b = mx.symbol.Convolution(name='res4b1_branch2b', data=res4b1_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b1_branch2b = mx.symbol.BatchNorm(name='bn4b1_branch2b', data=res4b1_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b1_branch2b = bn4b1_branch2b
+ res4b1_branch2b_relu = mx.symbol.Activation(name='res4b1_branch2b_relu', data=scale4b1_branch2b,
+ act_type='relu')
+ res4b1_branch2c = mx.symbol.Convolution(name='res4b1_branch2c', data=res4b1_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b1_branch2c = mx.symbol.BatchNorm(name='bn4b1_branch2c', data=res4b1_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b1_branch2c = bn4b1_branch2c
+ res4b1 = mx.symbol.broadcast_add(name='res4b1', *[res4a_relu, scale4b1_branch2c])
+ res4b1_relu = mx.symbol.Activation(name='res4b1_relu', data=res4b1, act_type='relu')
+ res4b2_branch2a = mx.symbol.Convolution(name='res4b2_branch2a', data=res4b1_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b2_branch2a = mx.symbol.BatchNorm(name='bn4b2_branch2a', data=res4b2_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b2_branch2a = bn4b2_branch2a
+ res4b2_branch2a_relu = mx.symbol.Activation(name='res4b2_branch2a_relu', data=scale4b2_branch2a,
+ act_type='relu')
+ res4b2_branch2b = mx.symbol.Convolution(name='res4b2_branch2b', data=res4b2_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b2_branch2b = mx.symbol.BatchNorm(name='bn4b2_branch2b', data=res4b2_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b2_branch2b = bn4b2_branch2b
+ res4b2_branch2b_relu = mx.symbol.Activation(name='res4b2_branch2b_relu', data=scale4b2_branch2b,
+ act_type='relu')
+ res4b2_branch2c = mx.symbol.Convolution(name='res4b2_branch2c', data=res4b2_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b2_branch2c = mx.symbol.BatchNorm(name='bn4b2_branch2c', data=res4b2_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b2_branch2c = bn4b2_branch2c
+ res4b2 = mx.symbol.broadcast_add(name='res4b2', *[res4b1_relu, scale4b2_branch2c])
+ res4b2_relu = mx.symbol.Activation(name='res4b2_relu', data=res4b2, act_type='relu')
+ res4b3_branch2a = mx.symbol.Convolution(name='res4b3_branch2a', data=res4b2_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b3_branch2a = mx.symbol.BatchNorm(name='bn4b3_branch2a', data=res4b3_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b3_branch2a = bn4b3_branch2a
+ res4b3_branch2a_relu = mx.symbol.Activation(name='res4b3_branch2a_relu', data=scale4b3_branch2a,
+ act_type='relu')
+ res4b3_branch2b = mx.symbol.Convolution(name='res4b3_branch2b', data=res4b3_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b3_branch2b = mx.symbol.BatchNorm(name='bn4b3_branch2b', data=res4b3_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b3_branch2b = bn4b3_branch2b
+ res4b3_branch2b_relu = mx.symbol.Activation(name='res4b3_branch2b_relu', data=scale4b3_branch2b,
+ act_type='relu')
+ res4b3_branch2c = mx.symbol.Convolution(name='res4b3_branch2c', data=res4b3_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b3_branch2c = mx.symbol.BatchNorm(name='bn4b3_branch2c', data=res4b3_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b3_branch2c = bn4b3_branch2c
+ res4b3 = mx.symbol.broadcast_add(name='res4b3', *[res4b2_relu, scale4b3_branch2c])
+ res4b3_relu = mx.symbol.Activation(name='res4b3_relu', data=res4b3, act_type='relu')
+ res4b4_branch2a = mx.symbol.Convolution(name='res4b4_branch2a', data=res4b3_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b4_branch2a = mx.symbol.BatchNorm(name='bn4b4_branch2a', data=res4b4_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b4_branch2a = bn4b4_branch2a
+ res4b4_branch2a_relu = mx.symbol.Activation(name='res4b4_branch2a_relu', data=scale4b4_branch2a,
+ act_type='relu')
+ res4b4_branch2b = mx.symbol.Convolution(name='res4b4_branch2b', data=res4b4_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b4_branch2b = mx.symbol.BatchNorm(name='bn4b4_branch2b', data=res4b4_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b4_branch2b = bn4b4_branch2b
+ res4b4_branch2b_relu = mx.symbol.Activation(name='res4b4_branch2b_relu', data=scale4b4_branch2b,
+ act_type='relu')
+ res4b4_branch2c = mx.symbol.Convolution(name='res4b4_branch2c', data=res4b4_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b4_branch2c = mx.symbol.BatchNorm(name='bn4b4_branch2c', data=res4b4_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b4_branch2c = bn4b4_branch2c
+ res4b4 = mx.symbol.broadcast_add(name='res4b4', *[res4b3_relu, scale4b4_branch2c])
+ res4b4_relu = mx.symbol.Activation(name='res4b4_relu', data=res4b4, act_type='relu')
+ res4b5_branch2a = mx.symbol.Convolution(name='res4b5_branch2a', data=res4b4_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b5_branch2a = mx.symbol.BatchNorm(name='bn4b5_branch2a', data=res4b5_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b5_branch2a = bn4b5_branch2a
+ res4b5_branch2a_relu = mx.symbol.Activation(name='res4b5_branch2a_relu', data=scale4b5_branch2a,
+ act_type='relu')
+ res4b5_branch2b = mx.symbol.Convolution(name='res4b5_branch2b', data=res4b5_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b5_branch2b = mx.symbol.BatchNorm(name='bn4b5_branch2b', data=res4b5_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b5_branch2b = bn4b5_branch2b
+ res4b5_branch2b_relu = mx.symbol.Activation(name='res4b5_branch2b_relu', data=scale4b5_branch2b,
+ act_type='relu')
+ res4b5_branch2c = mx.symbol.Convolution(name='res4b5_branch2c', data=res4b5_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b5_branch2c = mx.symbol.BatchNorm(name='bn4b5_branch2c', data=res4b5_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b5_branch2c = bn4b5_branch2c
+ res4b5 = mx.symbol.broadcast_add(name='res4b5', *[res4b4_relu, scale4b5_branch2c])
+ res4b5_relu = mx.symbol.Activation(name='res4b5_relu', data=res4b5, act_type='relu')
+ res4b6_branch2a = mx.symbol.Convolution(name='res4b6_branch2a', data=res4b5_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b6_branch2a = mx.symbol.BatchNorm(name='bn4b6_branch2a', data=res4b6_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b6_branch2a = bn4b6_branch2a
+ res4b6_branch2a_relu = mx.symbol.Activation(name='res4b6_branch2a_relu', data=scale4b6_branch2a,
+ act_type='relu')
+ res4b6_branch2b = mx.symbol.Convolution(name='res4b6_branch2b', data=res4b6_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b6_branch2b = mx.symbol.BatchNorm(name='bn4b6_branch2b', data=res4b6_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b6_branch2b = bn4b6_branch2b
+ res4b6_branch2b_relu = mx.symbol.Activation(name='res4b6_branch2b_relu', data=scale4b6_branch2b,
+ act_type='relu')
+ res4b6_branch2c = mx.symbol.Convolution(name='res4b6_branch2c', data=res4b6_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b6_branch2c = mx.symbol.BatchNorm(name='bn4b6_branch2c', data=res4b6_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b6_branch2c = bn4b6_branch2c
+ res4b6 = mx.symbol.broadcast_add(name='res4b6', *[res4b5_relu, scale4b6_branch2c])
+ res4b6_relu = mx.symbol.Activation(name='res4b6_relu', data=res4b6, act_type='relu')
+ res4b7_branch2a = mx.symbol.Convolution(name='res4b7_branch2a', data=res4b6_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b7_branch2a = mx.symbol.BatchNorm(name='bn4b7_branch2a', data=res4b7_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b7_branch2a = bn4b7_branch2a
+ res4b7_branch2a_relu = mx.symbol.Activation(name='res4b7_branch2a_relu', data=scale4b7_branch2a,
+ act_type='relu')
+ res4b7_branch2b = mx.symbol.Convolution(name='res4b7_branch2b', data=res4b7_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b7_branch2b = mx.symbol.BatchNorm(name='bn4b7_branch2b', data=res4b7_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b7_branch2b = bn4b7_branch2b
+ res4b7_branch2b_relu = mx.symbol.Activation(name='res4b7_branch2b_relu', data=scale4b7_branch2b,
+ act_type='relu')
+ res4b7_branch2c = mx.symbol.Convolution(name='res4b7_branch2c', data=res4b7_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b7_branch2c = mx.symbol.BatchNorm(name='bn4b7_branch2c', data=res4b7_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b7_branch2c = bn4b7_branch2c
+ res4b7 = mx.symbol.broadcast_add(name='res4b7', *[res4b6_relu, scale4b7_branch2c])
+ res4b7_relu = mx.symbol.Activation(name='res4b7_relu', data=res4b7, act_type='relu')
+ res4b8_branch2a = mx.symbol.Convolution(name='res4b8_branch2a', data=res4b7_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b8_branch2a = mx.symbol.BatchNorm(name='bn4b8_branch2a', data=res4b8_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b8_branch2a = bn4b8_branch2a
+ res4b8_branch2a_relu = mx.symbol.Activation(name='res4b8_branch2a_relu', data=scale4b8_branch2a,
+ act_type='relu')
+ res4b8_branch2b = mx.symbol.Convolution(name='res4b8_branch2b', data=res4b8_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b8_branch2b = mx.symbol.BatchNorm(name='bn4b8_branch2b', data=res4b8_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b8_branch2b = bn4b8_branch2b
+ res4b8_branch2b_relu = mx.symbol.Activation(name='res4b8_branch2b_relu', data=scale4b8_branch2b,
+ act_type='relu')
+ res4b8_branch2c = mx.symbol.Convolution(name='res4b8_branch2c', data=res4b8_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b8_branch2c = mx.symbol.BatchNorm(name='bn4b8_branch2c', data=res4b8_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b8_branch2c = bn4b8_branch2c
+ res4b8 = mx.symbol.broadcast_add(name='res4b8', *[res4b7_relu, scale4b8_branch2c])
+ res4b8_relu = mx.symbol.Activation(name='res4b8_relu', data=res4b8, act_type='relu')
+ res4b9_branch2a = mx.symbol.Convolution(name='res4b9_branch2a', data=res4b8_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b9_branch2a = mx.symbol.BatchNorm(name='bn4b9_branch2a', data=res4b9_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b9_branch2a = bn4b9_branch2a
+ res4b9_branch2a_relu = mx.symbol.Activation(name='res4b9_branch2a_relu', data=scale4b9_branch2a,
+ act_type='relu')
+ res4b9_branch2b = mx.symbol.Convolution(name='res4b9_branch2b', data=res4b9_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b9_branch2b = mx.symbol.BatchNorm(name='bn4b9_branch2b', data=res4b9_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b9_branch2b = bn4b9_branch2b
+ res4b9_branch2b_relu = mx.symbol.Activation(name='res4b9_branch2b_relu', data=scale4b9_branch2b,
+ act_type='relu')
+ res4b9_branch2c = mx.symbol.Convolution(name='res4b9_branch2c', data=res4b9_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b9_branch2c = mx.symbol.BatchNorm(name='bn4b9_branch2c', data=res4b9_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b9_branch2c = bn4b9_branch2c
+ res4b9 = mx.symbol.broadcast_add(name='res4b9', *[res4b8_relu, scale4b9_branch2c])
+ res4b9_relu = mx.symbol.Activation(name='res4b9_relu', data=res4b9, act_type='relu')
+ res4b10_branch2a = mx.symbol.Convolution(name='res4b10_branch2a', data=res4b9_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b10_branch2a = mx.symbol.BatchNorm(name='bn4b10_branch2a', data=res4b10_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b10_branch2a = bn4b10_branch2a
+ res4b10_branch2a_relu = mx.symbol.Activation(name='res4b10_branch2a_relu', data=scale4b10_branch2a,
+ act_type='relu')
+ res4b10_branch2b = mx.symbol.Convolution(name='res4b10_branch2b', data=res4b10_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b10_branch2b = mx.symbol.BatchNorm(name='bn4b10_branch2b', data=res4b10_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b10_branch2b = bn4b10_branch2b
+ res4b10_branch2b_relu = mx.symbol.Activation(name='res4b10_branch2b_relu', data=scale4b10_branch2b,
+ act_type='relu')
+ res4b10_branch2c = mx.symbol.Convolution(name='res4b10_branch2c', data=res4b10_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b10_branch2c = mx.symbol.BatchNorm(name='bn4b10_branch2c', data=res4b10_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b10_branch2c = bn4b10_branch2c
+ res4b10 = mx.symbol.broadcast_add(name='res4b10', *[res4b9_relu, scale4b10_branch2c])
+ res4b10_relu = mx.symbol.Activation(name='res4b10_relu', data=res4b10, act_type='relu')
+ res4b11_branch2a = mx.symbol.Convolution(name='res4b11_branch2a', data=res4b10_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b11_branch2a = mx.symbol.BatchNorm(name='bn4b11_branch2a', data=res4b11_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b11_branch2a = bn4b11_branch2a
+ res4b11_branch2a_relu = mx.symbol.Activation(name='res4b11_branch2a_relu', data=scale4b11_branch2a,
+ act_type='relu')
+ res4b11_branch2b = mx.symbol.Convolution(name='res4b11_branch2b', data=res4b11_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b11_branch2b = mx.symbol.BatchNorm(name='bn4b11_branch2b', data=res4b11_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b11_branch2b = bn4b11_branch2b
+ res4b11_branch2b_relu = mx.symbol.Activation(name='res4b11_branch2b_relu', data=scale4b11_branch2b,
+ act_type='relu')
+ res4b11_branch2c = mx.symbol.Convolution(name='res4b11_branch2c', data=res4b11_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b11_branch2c = mx.symbol.BatchNorm(name='bn4b11_branch2c', data=res4b11_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b11_branch2c = bn4b11_branch2c
+ res4b11 = mx.symbol.broadcast_add(name='res4b11', *[res4b10_relu, scale4b11_branch2c])
+ res4b11_relu = mx.symbol.Activation(name='res4b11_relu', data=res4b11, act_type='relu')
+ res4b12_branch2a = mx.symbol.Convolution(name='res4b12_branch2a', data=res4b11_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b12_branch2a = mx.symbol.BatchNorm(name='bn4b12_branch2a', data=res4b12_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b12_branch2a = bn4b12_branch2a
+ res4b12_branch2a_relu = mx.symbol.Activation(name='res4b12_branch2a_relu', data=scale4b12_branch2a,
+ act_type='relu')
+ res4b12_branch2b = mx.symbol.Convolution(name='res4b12_branch2b', data=res4b12_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b12_branch2b = mx.symbol.BatchNorm(name='bn4b12_branch2b', data=res4b12_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b12_branch2b = bn4b12_branch2b
+ res4b12_branch2b_relu = mx.symbol.Activation(name='res4b12_branch2b_relu', data=scale4b12_branch2b,
+ act_type='relu')
+ res4b12_branch2c = mx.symbol.Convolution(name='res4b12_branch2c', data=res4b12_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b12_branch2c = mx.symbol.BatchNorm(name='bn4b12_branch2c', data=res4b12_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b12_branch2c = bn4b12_branch2c
+ res4b12 = mx.symbol.broadcast_add(name='res4b12', *[res4b11_relu, scale4b12_branch2c])
+ res4b12_relu = mx.symbol.Activation(name='res4b12_relu', data=res4b12, act_type='relu')
+ res4b13_branch2a = mx.symbol.Convolution(name='res4b13_branch2a', data=res4b12_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b13_branch2a = mx.symbol.BatchNorm(name='bn4b13_branch2a', data=res4b13_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b13_branch2a = bn4b13_branch2a
+ res4b13_branch2a_relu = mx.symbol.Activation(name='res4b13_branch2a_relu', data=scale4b13_branch2a,
+ act_type='relu')
+ res4b13_branch2b = mx.symbol.Convolution(name='res4b13_branch2b', data=res4b13_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b13_branch2b = mx.symbol.BatchNorm(name='bn4b13_branch2b', data=res4b13_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b13_branch2b = bn4b13_branch2b
+ res4b13_branch2b_relu = mx.symbol.Activation(name='res4b13_branch2b_relu', data=scale4b13_branch2b,
+ act_type='relu')
+ res4b13_branch2c = mx.symbol.Convolution(name='res4b13_branch2c', data=res4b13_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b13_branch2c = mx.symbol.BatchNorm(name='bn4b13_branch2c', data=res4b13_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b13_branch2c = bn4b13_branch2c
+ res4b13 = mx.symbol.broadcast_add(name='res4b13', *[res4b12_relu, scale4b13_branch2c])
+ res4b13_relu = mx.symbol.Activation(name='res4b13_relu', data=res4b13, act_type='relu')
+ res4b14_branch2a = mx.symbol.Convolution(name='res4b14_branch2a', data=res4b13_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b14_branch2a = mx.symbol.BatchNorm(name='bn4b14_branch2a', data=res4b14_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b14_branch2a = bn4b14_branch2a
+ res4b14_branch2a_relu = mx.symbol.Activation(name='res4b14_branch2a_relu', data=scale4b14_branch2a,
+ act_type='relu')
+ res4b14_branch2b = mx.symbol.Convolution(name='res4b14_branch2b', data=res4b14_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b14_branch2b = mx.symbol.BatchNorm(name='bn4b14_branch2b', data=res4b14_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b14_branch2b = bn4b14_branch2b
+ res4b14_branch2b_relu = mx.symbol.Activation(name='res4b14_branch2b_relu', data=scale4b14_branch2b,
+ act_type='relu')
+ res4b14_branch2c = mx.symbol.Convolution(name='res4b14_branch2c', data=res4b14_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b14_branch2c = mx.symbol.BatchNorm(name='bn4b14_branch2c', data=res4b14_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b14_branch2c = bn4b14_branch2c
+ res4b14 = mx.symbol.broadcast_add(name='res4b14', *[res4b13_relu, scale4b14_branch2c])
+ res4b14_relu = mx.symbol.Activation(name='res4b14_relu', data=res4b14, act_type='relu')
+ res4b15_branch2a = mx.symbol.Convolution(name='res4b15_branch2a', data=res4b14_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b15_branch2a = mx.symbol.BatchNorm(name='bn4b15_branch2a', data=res4b15_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b15_branch2a = bn4b15_branch2a
+ res4b15_branch2a_relu = mx.symbol.Activation(name='res4b15_branch2a_relu', data=scale4b15_branch2a,
+ act_type='relu')
+ res4b15_branch2b = mx.symbol.Convolution(name='res4b15_branch2b', data=res4b15_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b15_branch2b = mx.symbol.BatchNorm(name='bn4b15_branch2b', data=res4b15_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b15_branch2b = bn4b15_branch2b
+ res4b15_branch2b_relu = mx.symbol.Activation(name='res4b15_branch2b_relu', data=scale4b15_branch2b,
+ act_type='relu')
+ res4b15_branch2c = mx.symbol.Convolution(name='res4b15_branch2c', data=res4b15_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b15_branch2c = mx.symbol.BatchNorm(name='bn4b15_branch2c', data=res4b15_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b15_branch2c = bn4b15_branch2c
+ res4b15 = mx.symbol.broadcast_add(name='res4b15', *[res4b14_relu, scale4b15_branch2c])
+ res4b15_relu = mx.symbol.Activation(name='res4b15_relu', data=res4b15, act_type='relu')
+ res4b16_branch2a = mx.symbol.Convolution(name='res4b16_branch2a', data=res4b15_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b16_branch2a = mx.symbol.BatchNorm(name='bn4b16_branch2a', data=res4b16_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b16_branch2a = bn4b16_branch2a
+ res4b16_branch2a_relu = mx.symbol.Activation(name='res4b16_branch2a_relu', data=scale4b16_branch2a,
+ act_type='relu')
+ res4b16_branch2b = mx.symbol.Convolution(name='res4b16_branch2b', data=res4b16_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b16_branch2b = mx.symbol.BatchNorm(name='bn4b16_branch2b', data=res4b16_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b16_branch2b = bn4b16_branch2b
+ res4b16_branch2b_relu = mx.symbol.Activation(name='res4b16_branch2b_relu', data=scale4b16_branch2b,
+ act_type='relu')
+ res4b16_branch2c = mx.symbol.Convolution(name='res4b16_branch2c', data=res4b16_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b16_branch2c = mx.symbol.BatchNorm(name='bn4b16_branch2c', data=res4b16_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b16_branch2c = bn4b16_branch2c
+ res4b16 = mx.symbol.broadcast_add(name='res4b16', *[res4b15_relu, scale4b16_branch2c])
+ res4b16_relu = mx.symbol.Activation(name='res4b16_relu', data=res4b16, act_type='relu')
+ res4b17_branch2a = mx.symbol.Convolution(name='res4b17_branch2a', data=res4b16_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b17_branch2a = mx.symbol.BatchNorm(name='bn4b17_branch2a', data=res4b17_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b17_branch2a = bn4b17_branch2a
+ res4b17_branch2a_relu = mx.symbol.Activation(name='res4b17_branch2a_relu', data=scale4b17_branch2a,
+ act_type='relu')
+ res4b17_branch2b = mx.symbol.Convolution(name='res4b17_branch2b', data=res4b17_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b17_branch2b = mx.symbol.BatchNorm(name='bn4b17_branch2b', data=res4b17_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b17_branch2b = bn4b17_branch2b
+ res4b17_branch2b_relu = mx.symbol.Activation(name='res4b17_branch2b_relu', data=scale4b17_branch2b,
+ act_type='relu')
+ res4b17_branch2c = mx.symbol.Convolution(name='res4b17_branch2c', data=res4b17_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b17_branch2c = mx.symbol.BatchNorm(name='bn4b17_branch2c', data=res4b17_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b17_branch2c = bn4b17_branch2c
+ res4b17 = mx.symbol.broadcast_add(name='res4b17', *[res4b16_relu, scale4b17_branch2c])
+ res4b17_relu = mx.symbol.Activation(name='res4b17_relu', data=res4b17, act_type='relu')
+ res4b18_branch2a = mx.symbol.Convolution(name='res4b18_branch2a', data=res4b17_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b18_branch2a = mx.symbol.BatchNorm(name='bn4b18_branch2a', data=res4b18_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b18_branch2a = bn4b18_branch2a
+ res4b18_branch2a_relu = mx.symbol.Activation(name='res4b18_branch2a_relu', data=scale4b18_branch2a,
+ act_type='relu')
+ res4b18_branch2b = mx.symbol.Convolution(name='res4b18_branch2b', data=res4b18_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b18_branch2b = mx.symbol.BatchNorm(name='bn4b18_branch2b', data=res4b18_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b18_branch2b = bn4b18_branch2b
+ res4b18_branch2b_relu = mx.symbol.Activation(name='res4b18_branch2b_relu', data=scale4b18_branch2b,
+ act_type='relu')
+ res4b18_branch2c = mx.symbol.Convolution(name='res4b18_branch2c', data=res4b18_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b18_branch2c = mx.symbol.BatchNorm(name='bn4b18_branch2c', data=res4b18_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b18_branch2c = bn4b18_branch2c
+ res4b18 = mx.symbol.broadcast_add(name='res4b18', *[res4b17_relu, scale4b18_branch2c])
+ res4b18_relu = mx.symbol.Activation(name='res4b18_relu', data=res4b18, act_type='relu')
+ res4b19_branch2a = mx.symbol.Convolution(name='res4b19_branch2a', data=res4b18_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b19_branch2a = mx.symbol.BatchNorm(name='bn4b19_branch2a', data=res4b19_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b19_branch2a = bn4b19_branch2a
+ res4b19_branch2a_relu = mx.symbol.Activation(name='res4b19_branch2a_relu', data=scale4b19_branch2a,
+ act_type='relu')
+ res4b19_branch2b = mx.symbol.Convolution(name='res4b19_branch2b', data=res4b19_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b19_branch2b = mx.symbol.BatchNorm(name='bn4b19_branch2b', data=res4b19_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b19_branch2b = bn4b19_branch2b
+ res4b19_branch2b_relu = mx.symbol.Activation(name='res4b19_branch2b_relu', data=scale4b19_branch2b,
+ act_type='relu')
+ res4b19_branch2c = mx.symbol.Convolution(name='res4b19_branch2c', data=res4b19_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b19_branch2c = mx.symbol.BatchNorm(name='bn4b19_branch2c', data=res4b19_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b19_branch2c = bn4b19_branch2c
+ res4b19 = mx.symbol.broadcast_add(name='res4b19', *[res4b18_relu, scale4b19_branch2c])
+ res4b19_relu = mx.symbol.Activation(name='res4b19_relu', data=res4b19, act_type='relu')
+ res4b20_branch2a = mx.symbol.Convolution(name='res4b20_branch2a', data=res4b19_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b20_branch2a = mx.symbol.BatchNorm(name='bn4b20_branch2a', data=res4b20_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b20_branch2a = bn4b20_branch2a
+ res4b20_branch2a_relu = mx.symbol.Activation(name='res4b20_branch2a_relu', data=scale4b20_branch2a,
+ act_type='relu')
+ res4b20_branch2b = mx.symbol.Convolution(name='res4b20_branch2b', data=res4b20_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b20_branch2b = mx.symbol.BatchNorm(name='bn4b20_branch2b', data=res4b20_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b20_branch2b = bn4b20_branch2b
+ res4b20_branch2b_relu = mx.symbol.Activation(name='res4b20_branch2b_relu', data=scale4b20_branch2b,
+ act_type='relu')
+ res4b20_branch2c = mx.symbol.Convolution(name='res4b20_branch2c', data=res4b20_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b20_branch2c = mx.symbol.BatchNorm(name='bn4b20_branch2c', data=res4b20_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b20_branch2c = bn4b20_branch2c
+ res4b20 = mx.symbol.broadcast_add(name='res4b20', *[res4b19_relu, scale4b20_branch2c])
+ res4b20_relu = mx.symbol.Activation(name='res4b20_relu', data=res4b20, act_type='relu')
+ res4b21_branch2a = mx.symbol.Convolution(name='res4b21_branch2a', data=res4b20_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b21_branch2a = mx.symbol.BatchNorm(name='bn4b21_branch2a', data=res4b21_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b21_branch2a = bn4b21_branch2a
+ res4b21_branch2a_relu = mx.symbol.Activation(name='res4b21_branch2a_relu', data=scale4b21_branch2a,
+ act_type='relu')
+ res4b21_branch2b = mx.symbol.Convolution(name='res4b21_branch2b', data=res4b21_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b21_branch2b = mx.symbol.BatchNorm(name='bn4b21_branch2b', data=res4b21_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b21_branch2b = bn4b21_branch2b
+ res4b21_branch2b_relu = mx.symbol.Activation(name='res4b21_branch2b_relu', data=scale4b21_branch2b,
+ act_type='relu')
+ res4b21_branch2c = mx.symbol.Convolution(name='res4b21_branch2c', data=res4b21_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b21_branch2c = mx.symbol.BatchNorm(name='bn4b21_branch2c', data=res4b21_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b21_branch2c = bn4b21_branch2c
+ res4b21 = mx.symbol.broadcast_add(name='res4b21', *[res4b20_relu, scale4b21_branch2c])
+ res4b21_relu = mx.symbol.Activation(name='res4b21_relu', data=res4b21, act_type='relu')
+ res4b22_branch2a = mx.symbol.Convolution(name='res4b22_branch2a', data=res4b21_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b22_branch2a = mx.symbol.BatchNorm(name='bn4b22_branch2a', data=res4b22_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b22_branch2a = bn4b22_branch2a
+ res4b22_branch2a_relu = mx.symbol.Activation(name='res4b22_branch2a_relu', data=scale4b22_branch2a,
+ act_type='relu')
+ res4b22_branch2b = mx.symbol.Convolution(name='res4b22_branch2b', data=res4b22_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b22_branch2b = mx.symbol.BatchNorm(name='bn4b22_branch2b', data=res4b22_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b22_branch2b = bn4b22_branch2b
+ res4b22_branch2b_relu = mx.symbol.Activation(name='res4b22_branch2b_relu', data=scale4b22_branch2b,
+ act_type='relu')
+ res4b22_branch2c = mx.symbol.Convolution(name='res4b22_branch2c', data=res4b22_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b22_branch2c = mx.symbol.BatchNorm(name='bn4b22_branch2c', data=res4b22_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b22_branch2c = bn4b22_branch2c
+ res4b22 = mx.symbol.broadcast_add(name='res4b22', *[res4b21_relu, scale4b22_branch2c])
+ res4b22_relu = mx.symbol.Activation(name='res4b22_relu', data=res4b22, act_type='relu')
+
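+        # conv5 stage: the 3x3 convolutions below use dilation 2 with stride 1
+        # (atrous convolution), so the feature map stays at 1/16 of the input
+        # resolution instead of dropping to 1/32 as in the original ResNet.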
+ res5a_branch1 = mx.symbol.Convolution(name='res5a_branch1', data=res4b22_relu, num_filter=2048, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5a_branch1 = mx.symbol.BatchNorm(name='bn5a_branch1', data=res5a_branch1, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5a_branch1 = bn5a_branch1
+ res5a_branch2a = mx.symbol.Convolution(name='res5a_branch2a', data=res4b22_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5a_branch2a = mx.symbol.BatchNorm(name='bn5a_branch2a', data=res5a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5a_branch2a = bn5a_branch2a
+ res5a_branch2a_relu = mx.symbol.Activation(name='res5a_branch2a_relu', data=scale5a_branch2a, act_type='relu')
+ res5a_branch2b = mx.symbol.Convolution(name='res5a_branch2b', data=res5a_branch2a_relu, num_filter=512,
+ pad=(2, 2), kernel=(3, 3), dilate=(2, 2), stride=(1, 1), no_bias=True)
+ bn5a_branch2b = mx.symbol.BatchNorm(name='bn5a_branch2b', data=res5a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5a_branch2b = bn5a_branch2b
+ res5a_branch2b_relu = mx.symbol.Activation(name='res5a_branch2b_relu', data=scale5a_branch2b, act_type='relu')
+ res5a_branch2c = mx.symbol.Convolution(name='res5a_branch2c', data=res5a_branch2b_relu, num_filter=2048,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5a_branch2c = mx.symbol.BatchNorm(name='bn5a_branch2c', data=res5a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5a_branch2c = bn5a_branch2c
+ res5a = mx.symbol.broadcast_add(name='res5a', *[scale5a_branch1, scale5a_branch2c])
+ res5a_relu = mx.symbol.Activation(name='res5a_relu', data=res5a, act_type='relu')
+ res5b_branch2a = mx.symbol.Convolution(name='res5b_branch2a', data=res5a_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5b_branch2a = mx.symbol.BatchNorm(name='bn5b_branch2a', data=res5b_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5b_branch2a = bn5b_branch2a
+ res5b_branch2a_relu = mx.symbol.Activation(name='res5b_branch2a_relu', data=scale5b_branch2a, act_type='relu')
+ res5b_branch2b = mx.symbol.Convolution(name='res5b_branch2b', data=res5b_branch2a_relu, num_filter=512,
+ pad=(2, 2), kernel=(3, 3), dilate=(2, 2), stride=(1, 1), no_bias=True)
+ bn5b_branch2b = mx.symbol.BatchNorm(name='bn5b_branch2b', data=res5b_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5b_branch2b = bn5b_branch2b
+ res5b_branch2b_relu = mx.symbol.Activation(name='res5b_branch2b_relu', data=scale5b_branch2b, act_type='relu')
+ res5b_branch2c = mx.symbol.Convolution(name='res5b_branch2c', data=res5b_branch2b_relu, num_filter=2048,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5b_branch2c = mx.symbol.BatchNorm(name='bn5b_branch2c', data=res5b_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5b_branch2c = bn5b_branch2c
+ res5b = mx.symbol.broadcast_add(name='res5b', *[res5a_relu, scale5b_branch2c])
+ res5b_relu = mx.symbol.Activation(name='res5b_relu', data=res5b, act_type='relu')
+ res5c_branch2a = mx.symbol.Convolution(name='res5c_branch2a', data=res5b_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5c_branch2a = mx.symbol.BatchNorm(name='bn5c_branch2a', data=res5c_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5c_branch2a = bn5c_branch2a
+ res5c_branch2a_relu = mx.symbol.Activation(name='res5c_branch2a_relu', data=scale5c_branch2a, act_type='relu')
+ res5c_branch2b = mx.symbol.Convolution(name='res5c_branch2b', data=res5c_branch2a_relu, num_filter=512,
+ pad=(2, 2), kernel=(3, 3), dilate=(2, 2), stride=(1, 1), no_bias=True)
+ bn5c_branch2b = mx.symbol.BatchNorm(name='bn5c_branch2b', data=res5c_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5c_branch2b = bn5c_branch2b
+ res5c_branch2b_relu = mx.symbol.Activation(name='res5c_branch2b_relu', data=scale5c_branch2b, act_type='relu')
+ res5c_branch2c = mx.symbol.Convolution(name='res5c_branch2c', data=res5c_branch2b_relu, num_filter=2048,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5c_branch2c = mx.symbol.BatchNorm(name='bn5c_branch2c', data=res5c_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale5c_branch2c = bn5c_branch2c
+ res5c = mx.symbol.broadcast_add(name='res5c', *[res5b_relu, scale5c_branch2c])
+ res5c_relu = mx.symbol.Activation(name='res5c_relu', data=res5c, act_type='relu')
+ return res5c_relu
+
+ def get_train_symbol(self, num_classes):
+ """
+        Get the symbol for training.
+        :param num_classes: number of classes
+        :return: the symbol for training
+ """
+ data = mx.symbol.Variable(name="data")
+ seg_cls_gt = mx.symbol.Variable(name='label')
+
+ # shared convolutional layers
+ conv_feat = self.get_resnet_conv(data)
+
+ # subsequent fc layers by haozhi
+ fc6_bias = mx.symbol.Variable('fc6_bias', lr_mult=2.0)
+ fc6_weight = mx.symbol.Variable('fc6_weight', lr_mult=1.0)
+
+ fc6 = mx.symbol.Convolution(data=conv_feat, kernel=(1, 1), pad=(0, 0), num_filter=1024, name="fc6",
+ bias=fc6_bias, weight=fc6_weight, workspace=self.workspace)
+ relu_fc6 = mx.sym.Activation(data=fc6, act_type='relu', name='relu_fc6')
+
+ score_bias = mx.symbol.Variable('score_bias', lr_mult=2.0)
+ score_weight = mx.symbol.Variable('score_weight', lr_mult=1.0)
+
+ score = mx.symbol.Convolution(data=relu_fc6, kernel=(1, 1), pad=(0, 0), num_filter=num_classes, name="score",
+ bias=score_bias, weight=score_weight, workspace=self.workspace)
+
+ upsampling = mx.symbol.Deconvolution(data=score, num_filter=num_classes, kernel=(32, 32), stride=(16, 16),
+ num_group=num_classes, no_bias=True, name='upsampling',
+ attr={'lr_mult': '0.0'}, workspace=self.workspace)
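+        # lr_mult is fixed at 0.0, so this deconvolution is frozen; init_weights
+        # fills its weight with a bilinear interpolation kernel, making it a fixed
+        # 16x bilinear upsampling.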
+
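+        # the deconvolution output is slightly larger than the input, so Crop trims
+        # it back to the input spatial size with a centering offset of (8, 8).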
+ croped_score = mx.symbol.Crop(*[upsampling, data], offset=(8, 8), name='croped_score')
+ softmax = mx.symbol.SoftmaxOutput(data=croped_score, label=seg_cls_gt, normalization='valid', multi_output=True,
+ use_ignore=True, ignore_label=255, name="softmax")
+
+ return softmax
+
+ def get_test_symbol(self, num_classes):
+ """
+        Get the symbol for testing.
+        :param num_classes: number of classes
+        :return: the symbol for testing
+ """
+ data = mx.symbol.Variable(name="data")
+
+ # shared convolutional layers
+ conv_feat = self.get_resnet_conv(data)
+
+ fc6_bias = mx.symbol.Variable('fc6_bias', lr_mult=2.0)
+ fc6_weight = mx.symbol.Variable('fc6_weight', lr_mult=1.0)
+
+ fc6 = mx.symbol.Convolution(
+ data=conv_feat, kernel=(1, 1), pad=(0, 0), num_filter=1024, name="fc6", bias=fc6_bias, weight=fc6_weight,
+ workspace=self.workspace)
+ relu_fc6 = mx.sym.Activation(data=fc6, act_type='relu', name='relu_fc6')
+
+ score_bias = mx.symbol.Variable('score_bias', lr_mult=2.0)
+ score_weight = mx.symbol.Variable('score_weight', lr_mult=1.0)
+
+ score = mx.symbol.Convolution(
+ data=relu_fc6, kernel=(1, 1), pad=(0, 0), num_filter=num_classes, name="score", bias=score_bias,
+ weight=score_weight, workspace=self.workspace)
+
+ upsampling = mx.symbol.Deconvolution(
+ data=score, num_filter=num_classes, kernel=(32, 32), stride=(16, 16), num_group=num_classes, no_bias=True,
+ name='upsampling', attr={'lr_mult': '0.0'}, workspace=self.workspace)
+
+ croped_score = mx.symbol.Crop(*[upsampling, data], offset=(8, 8), name='croped_score')
+
+ softmax = mx.symbol.SoftmaxOutput(data=croped_score, normalization='valid', multi_output=True, use_ignore=True,
+ ignore_label=255, name="softmax")
+
+ return softmax
+
+ def get_symbol(self, cfg, is_train=True):
+ """
+        Return the generated symbol; it also needs to be assigned to self.sym.
+ """
+
+        # config alias for convenience
+ num_classes = cfg.dataset.NUM_CLASSES
+
+ if is_train:
+ self.sym = self.get_train_symbol(num_classes=num_classes)
+ else:
+ self.sym = self.get_test_symbol(num_classes=num_classes)
+
+ return self.sym
+
+ def init_weights(self, cfg, arg_params, aux_params):
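+        # self.arg_shape_dict is assumed to be populated by the Symbol base class
+        # (utils.symbol) from the generated symbol before init_weights is called.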
+ arg_params['fc6_weight'] = mx.random.normal(0, 0.01, shape=self.arg_shape_dict['fc6_weight'])
+ arg_params['fc6_bias'] = mx.nd.zeros(shape=self.arg_shape_dict['fc6_bias'])
+ arg_params['score_weight'] = mx.random.normal(0, 0.01, shape=self.arg_shape_dict['score_weight'])
+ arg_params['score_bias'] = mx.nd.zeros(shape=self.arg_shape_dict['score_bias'])
+ arg_params['upsampling_weight'] = mx.nd.zeros(shape=self.arg_shape_dict['upsampling_weight'])
+
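+        # fill the zero-initialized deconvolution weight with a bilinear
+        # interpolation kernel; with lr_mult=0.0 on 'upsampling', the layer stays a
+        # fixed bilinear resize throughout training.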
+ init = mx.init.Initializer()
+        init._init_bilinear('upsampling_weight', arg_params['upsampling_weight'])
diff --git a/deeplab/symbols/resnet_v1_101_deeplab_dcn.py b/deeplab/symbols/resnet_v1_101_deeplab_dcn.py
new file mode 100644
index 0000000..ebecc0d
--- /dev/null
+++ b/deeplab/symbols/resnet_v1_101_deeplab_dcn.py
@@ -0,0 +1,852 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Written by Zheng Zhang
+# --------------------------------------------------------
+
+import cPickle
+import mxnet as mx
+from utils.symbol import Symbol
+
+class resnet_v1_101_deeplab_dcn(Symbol):
+ def __init__(self):
+ """
+        Use __init__ to define the parameters the network needs.
+ """
+ self.eps = 1e-5
+ self.use_global_stats = True
+ self.workspace = 4096
+        self.units = (3, 4, 23, 3)  # bottleneck units per stage for ResNet-101
+ self.filter_list = [256, 512, 1024, 2048]
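+        # this class is the DCN variant of the DeepLab symbol; as the file name
+        # indicates, it swaps deformable convolution operators into the backbone
+        # (presumably in the later, res5 stage) further down in this file.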
+
+ def get_resnet_conv(self, data):
+ conv1 = mx.symbol.Convolution(name='conv1', data=data, num_filter=64, pad=(3, 3), kernel=(7, 7), stride=(2, 2),
+ no_bias=True)
+ bn_conv1 = mx.symbol.BatchNorm(name='bn_conv1', data=conv1, use_global_stats=True, fix_gamma=False, eps = self.eps)
+ scale_conv1 = bn_conv1
+ conv1_relu = mx.symbol.Activation(name='conv1_relu', data=scale_conv1, act_type='relu')
+ pool1 = mx.symbol.Pooling(name='pool1', data=conv1_relu, pooling_convention='full', pad=(0, 0), kernel=(3, 3),
+ stride=(2, 2), pool_type='max')
+ res2a_branch1 = mx.symbol.Convolution(name='res2a_branch1', data=pool1, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2a_branch1 = mx.symbol.BatchNorm(name='bn2a_branch1', data=res2a_branch1, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch1 = bn2a_branch1
+ res2a_branch2a = mx.symbol.Convolution(name='res2a_branch2a', data=pool1, num_filter=64, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2a_branch2a = mx.symbol.BatchNorm(name='bn2a_branch2a', data=res2a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch2a = bn2a_branch2a
+ res2a_branch2a_relu = mx.symbol.Activation(name='res2a_branch2a_relu', data=scale2a_branch2a, act_type='relu')
+ res2a_branch2b = mx.symbol.Convolution(name='res2a_branch2b', data=res2a_branch2a_relu, num_filter=64,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn2a_branch2b = mx.symbol.BatchNorm(name='bn2a_branch2b', data=res2a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch2b = bn2a_branch2b
+ res2a_branch2b_relu = mx.symbol.Activation(name='res2a_branch2b_relu', data=scale2a_branch2b, act_type='relu')
+ res2a_branch2c = mx.symbol.Convolution(name='res2a_branch2c', data=res2a_branch2b_relu, num_filter=256,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2a_branch2c = mx.symbol.BatchNorm(name='bn2a_branch2c', data=res2a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2a_branch2c = bn2a_branch2c
+ res2a = mx.symbol.broadcast_add(name='res2a', *[scale2a_branch1, scale2a_branch2c])
+ res2a_relu = mx.symbol.Activation(name='res2a_relu', data=res2a, act_type='relu')
+ res2b_branch2a = mx.symbol.Convolution(name='res2b_branch2a', data=res2a_relu, num_filter=64, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2b_branch2a = mx.symbol.BatchNorm(name='bn2b_branch2a', data=res2b_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2b_branch2a = bn2b_branch2a
+ res2b_branch2a_relu = mx.symbol.Activation(name='res2b_branch2a_relu', data=scale2b_branch2a, act_type='relu')
+ res2b_branch2b = mx.symbol.Convolution(name='res2b_branch2b', data=res2b_branch2a_relu, num_filter=64,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn2b_branch2b = mx.symbol.BatchNorm(name='bn2b_branch2b', data=res2b_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2b_branch2b = bn2b_branch2b
+ res2b_branch2b_relu = mx.symbol.Activation(name='res2b_branch2b_relu', data=scale2b_branch2b, act_type='relu')
+ res2b_branch2c = mx.symbol.Convolution(name='res2b_branch2c', data=res2b_branch2b_relu, num_filter=256,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2b_branch2c = mx.symbol.BatchNorm(name='bn2b_branch2c', data=res2b_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2b_branch2c = bn2b_branch2c
+ res2b = mx.symbol.broadcast_add(name='res2b', *[res2a_relu, scale2b_branch2c])
+ res2b_relu = mx.symbol.Activation(name='res2b_relu', data=res2b, act_type='relu')
+ res2c_branch2a = mx.symbol.Convolution(name='res2c_branch2a', data=res2b_relu, num_filter=64, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2c_branch2a = mx.symbol.BatchNorm(name='bn2c_branch2a', data=res2c_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2c_branch2a = bn2c_branch2a
+ res2c_branch2a_relu = mx.symbol.Activation(name='res2c_branch2a_relu', data=scale2c_branch2a, act_type='relu')
+ res2c_branch2b = mx.symbol.Convolution(name='res2c_branch2b', data=res2c_branch2a_relu, num_filter=64,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn2c_branch2b = mx.symbol.BatchNorm(name='bn2c_branch2b', data=res2c_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2c_branch2b = bn2c_branch2b
+ res2c_branch2b_relu = mx.symbol.Activation(name='res2c_branch2b_relu', data=scale2c_branch2b, act_type='relu')
+ res2c_branch2c = mx.symbol.Convolution(name='res2c_branch2c', data=res2c_branch2b_relu, num_filter=256,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn2c_branch2c = mx.symbol.BatchNorm(name='bn2c_branch2c', data=res2c_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale2c_branch2c = bn2c_branch2c
+ res2c = mx.symbol.broadcast_add(name='res2c', *[res2b_relu, scale2c_branch2c])
+ res2c_relu = mx.symbol.Activation(name='res2c_relu', data=res2c, act_type='relu')
+ res3a_branch1 = mx.symbol.Convolution(name='res3a_branch1', data=res2c_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn3a_branch1 = mx.symbol.BatchNorm(name='bn3a_branch1', data=res3a_branch1, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch1 = bn3a_branch1
+ res3a_branch2a = mx.symbol.Convolution(name='res3a_branch2a', data=res2c_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn3a_branch2a = mx.symbol.BatchNorm(name='bn3a_branch2a', data=res3a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch2a = bn3a_branch2a
+ res3a_branch2a_relu = mx.symbol.Activation(name='res3a_branch2a_relu', data=scale3a_branch2a, act_type='relu')
+ res3a_branch2b = mx.symbol.Convolution(name='res3a_branch2b', data=res3a_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3a_branch2b = mx.symbol.BatchNorm(name='bn3a_branch2b', data=res3a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch2b = bn3a_branch2b
+ res3a_branch2b_relu = mx.symbol.Activation(name='res3a_branch2b_relu', data=scale3a_branch2b, act_type='relu')
+ res3a_branch2c = mx.symbol.Convolution(name='res3a_branch2c', data=res3a_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3a_branch2c = mx.symbol.BatchNorm(name='bn3a_branch2c', data=res3a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3a_branch2c = bn3a_branch2c
+ res3a = mx.symbol.broadcast_add(name='res3a', *[scale3a_branch1, scale3a_branch2c])
+ res3a_relu = mx.symbol.Activation(name='res3a_relu', data=res3a, act_type='relu')
+ res3b1_branch2a = mx.symbol.Convolution(name='res3b1_branch2a', data=res3a_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b1_branch2a = mx.symbol.BatchNorm(name='bn3b1_branch2a', data=res3b1_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b1_branch2a = bn3b1_branch2a
+ res3b1_branch2a_relu = mx.symbol.Activation(name='res3b1_branch2a_relu', data=scale3b1_branch2a,
+ act_type='relu')
+ res3b1_branch2b = mx.symbol.Convolution(name='res3b1_branch2b', data=res3b1_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3b1_branch2b = mx.symbol.BatchNorm(name='bn3b1_branch2b', data=res3b1_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b1_branch2b = bn3b1_branch2b
+ res3b1_branch2b_relu = mx.symbol.Activation(name='res3b1_branch2b_relu', data=scale3b1_branch2b,
+ act_type='relu')
+ res3b1_branch2c = mx.symbol.Convolution(name='res3b1_branch2c', data=res3b1_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b1_branch2c = mx.symbol.BatchNorm(name='bn3b1_branch2c', data=res3b1_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b1_branch2c = bn3b1_branch2c
+ res3b1 = mx.symbol.broadcast_add(name='res3b1', *[res3a_relu, scale3b1_branch2c])
+ res3b1_relu = mx.symbol.Activation(name='res3b1_relu', data=res3b1, act_type='relu')
+ res3b2_branch2a = mx.symbol.Convolution(name='res3b2_branch2a', data=res3b1_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b2_branch2a = mx.symbol.BatchNorm(name='bn3b2_branch2a', data=res3b2_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b2_branch2a = bn3b2_branch2a
+ res3b2_branch2a_relu = mx.symbol.Activation(name='res3b2_branch2a_relu', data=scale3b2_branch2a,
+ act_type='relu')
+ res3b2_branch2b = mx.symbol.Convolution(name='res3b2_branch2b', data=res3b2_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3b2_branch2b = mx.symbol.BatchNorm(name='bn3b2_branch2b', data=res3b2_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b2_branch2b = bn3b2_branch2b
+ res3b2_branch2b_relu = mx.symbol.Activation(name='res3b2_branch2b_relu', data=scale3b2_branch2b,
+ act_type='relu')
+ res3b2_branch2c = mx.symbol.Convolution(name='res3b2_branch2c', data=res3b2_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b2_branch2c = mx.symbol.BatchNorm(name='bn3b2_branch2c', data=res3b2_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b2_branch2c = bn3b2_branch2c
+ res3b2 = mx.symbol.broadcast_add(name='res3b2', *[res3b1_relu, scale3b2_branch2c])
+ res3b2_relu = mx.symbol.Activation(name='res3b2_relu', data=res3b2, act_type='relu')
+ res3b3_branch2a = mx.symbol.Convolution(name='res3b3_branch2a', data=res3b2_relu, num_filter=128, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b3_branch2a = mx.symbol.BatchNorm(name='bn3b3_branch2a', data=res3b3_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b3_branch2a = bn3b3_branch2a
+ res3b3_branch2a_relu = mx.symbol.Activation(name='res3b3_branch2a_relu', data=scale3b3_branch2a,
+ act_type='relu')
+ res3b3_branch2b = mx.symbol.Convolution(name='res3b3_branch2b', data=res3b3_branch2a_relu, num_filter=128,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn3b3_branch2b = mx.symbol.BatchNorm(name='bn3b3_branch2b', data=res3b3_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b3_branch2b = bn3b3_branch2b
+ res3b3_branch2b_relu = mx.symbol.Activation(name='res3b3_branch2b_relu', data=scale3b3_branch2b,
+ act_type='relu')
+ res3b3_branch2c = mx.symbol.Convolution(name='res3b3_branch2c', data=res3b3_branch2b_relu, num_filter=512,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn3b3_branch2c = mx.symbol.BatchNorm(name='bn3b3_branch2c', data=res3b3_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale3b3_branch2c = bn3b3_branch2c
+ res3b3 = mx.symbol.broadcast_add(name='res3b3', *[res3b2_relu, scale3b3_branch2c])
+ res3b3_relu = mx.symbol.Activation(name='res3b3_relu', data=res3b3, act_type='relu')
+ res4a_branch1 = mx.symbol.Convolution(name='res4a_branch1', data=res3b3_relu, num_filter=1024, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn4a_branch1 = mx.symbol.BatchNorm(name='bn4a_branch1', data=res4a_branch1, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch1 = bn4a_branch1
+ res4a_branch2a = mx.symbol.Convolution(name='res4a_branch2a', data=res3b3_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(2, 2), no_bias=True)
+ bn4a_branch2a = mx.symbol.BatchNorm(name='bn4a_branch2a', data=res4a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch2a = bn4a_branch2a
+ res4a_branch2a_relu = mx.symbol.Activation(name='res4a_branch2a_relu', data=scale4a_branch2a, act_type='relu')
+ res4a_branch2b = mx.symbol.Convolution(name='res4a_branch2b', data=res4a_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4a_branch2b = mx.symbol.BatchNorm(name='bn4a_branch2b', data=res4a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch2b = bn4a_branch2b
+ res4a_branch2b_relu = mx.symbol.Activation(name='res4a_branch2b_relu', data=scale4a_branch2b, act_type='relu')
+ res4a_branch2c = mx.symbol.Convolution(name='res4a_branch2c', data=res4a_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4a_branch2c = mx.symbol.BatchNorm(name='bn4a_branch2c', data=res4a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4a_branch2c = bn4a_branch2c
+ res4a = mx.symbol.broadcast_add(name='res4a', *[scale4a_branch1, scale4a_branch2c])
+ res4a_relu = mx.symbol.Activation(name='res4a_relu', data=res4a, act_type='relu')
+ res4b1_branch2a = mx.symbol.Convolution(name='res4b1_branch2a', data=res4a_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b1_branch2a = mx.symbol.BatchNorm(name='bn4b1_branch2a', data=res4b1_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b1_branch2a = bn4b1_branch2a
+ res4b1_branch2a_relu = mx.symbol.Activation(name='res4b1_branch2a_relu', data=scale4b1_branch2a,
+ act_type='relu')
+ res4b1_branch2b = mx.symbol.Convolution(name='res4b1_branch2b', data=res4b1_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b1_branch2b = mx.symbol.BatchNorm(name='bn4b1_branch2b', data=res4b1_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b1_branch2b = bn4b1_branch2b
+ res4b1_branch2b_relu = mx.symbol.Activation(name='res4b1_branch2b_relu', data=scale4b1_branch2b,
+ act_type='relu')
+ res4b1_branch2c = mx.symbol.Convolution(name='res4b1_branch2c', data=res4b1_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b1_branch2c = mx.symbol.BatchNorm(name='bn4b1_branch2c', data=res4b1_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b1_branch2c = bn4b1_branch2c
+ res4b1 = mx.symbol.broadcast_add(name='res4b1', *[res4a_relu, scale4b1_branch2c])
+ res4b1_relu = mx.symbol.Activation(name='res4b1_relu', data=res4b1, act_type='relu')
+ res4b2_branch2a = mx.symbol.Convolution(name='res4b2_branch2a', data=res4b1_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b2_branch2a = mx.symbol.BatchNorm(name='bn4b2_branch2a', data=res4b2_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b2_branch2a = bn4b2_branch2a
+ res4b2_branch2a_relu = mx.symbol.Activation(name='res4b2_branch2a_relu', data=scale4b2_branch2a,
+ act_type='relu')
+ res4b2_branch2b = mx.symbol.Convolution(name='res4b2_branch2b', data=res4b2_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b2_branch2b = mx.symbol.BatchNorm(name='bn4b2_branch2b', data=res4b2_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b2_branch2b = bn4b2_branch2b
+ res4b2_branch2b_relu = mx.symbol.Activation(name='res4b2_branch2b_relu', data=scale4b2_branch2b,
+ act_type='relu')
+ res4b2_branch2c = mx.symbol.Convolution(name='res4b2_branch2c', data=res4b2_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b2_branch2c = mx.symbol.BatchNorm(name='bn4b2_branch2c', data=res4b2_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b2_branch2c = bn4b2_branch2c
+ res4b2 = mx.symbol.broadcast_add(name='res4b2', *[res4b1_relu, scale4b2_branch2c])
+ res4b2_relu = mx.symbol.Activation(name='res4b2_relu', data=res4b2, act_type='relu')
+ res4b3_branch2a = mx.symbol.Convolution(name='res4b3_branch2a', data=res4b2_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b3_branch2a = mx.symbol.BatchNorm(name='bn4b3_branch2a', data=res4b3_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b3_branch2a = bn4b3_branch2a
+ res4b3_branch2a_relu = mx.symbol.Activation(name='res4b3_branch2a_relu', data=scale4b3_branch2a,
+ act_type='relu')
+ res4b3_branch2b = mx.symbol.Convolution(name='res4b3_branch2b', data=res4b3_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b3_branch2b = mx.symbol.BatchNorm(name='bn4b3_branch2b', data=res4b3_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b3_branch2b = bn4b3_branch2b
+ res4b3_branch2b_relu = mx.symbol.Activation(name='res4b3_branch2b_relu', data=scale4b3_branch2b,
+ act_type='relu')
+ res4b3_branch2c = mx.symbol.Convolution(name='res4b3_branch2c', data=res4b3_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b3_branch2c = mx.symbol.BatchNorm(name='bn4b3_branch2c', data=res4b3_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b3_branch2c = bn4b3_branch2c
+ res4b3 = mx.symbol.broadcast_add(name='res4b3', *[res4b2_relu, scale4b3_branch2c])
+ res4b3_relu = mx.symbol.Activation(name='res4b3_relu', data=res4b3, act_type='relu')
+ res4b4_branch2a = mx.symbol.Convolution(name='res4b4_branch2a', data=res4b3_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b4_branch2a = mx.symbol.BatchNorm(name='bn4b4_branch2a', data=res4b4_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b4_branch2a = bn4b4_branch2a
+ res4b4_branch2a_relu = mx.symbol.Activation(name='res4b4_branch2a_relu', data=scale4b4_branch2a,
+ act_type='relu')
+ res4b4_branch2b = mx.symbol.Convolution(name='res4b4_branch2b', data=res4b4_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b4_branch2b = mx.symbol.BatchNorm(name='bn4b4_branch2b', data=res4b4_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b4_branch2b = bn4b4_branch2b
+ res4b4_branch2b_relu = mx.symbol.Activation(name='res4b4_branch2b_relu', data=scale4b4_branch2b,
+ act_type='relu')
+ res4b4_branch2c = mx.symbol.Convolution(name='res4b4_branch2c', data=res4b4_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b4_branch2c = mx.symbol.BatchNorm(name='bn4b4_branch2c', data=res4b4_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b4_branch2c = bn4b4_branch2c
+ res4b4 = mx.symbol.broadcast_add(name='res4b4', *[res4b3_relu, scale4b4_branch2c])
+ res4b4_relu = mx.symbol.Activation(name='res4b4_relu', data=res4b4, act_type='relu')
+ res4b5_branch2a = mx.symbol.Convolution(name='res4b5_branch2a', data=res4b4_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b5_branch2a = mx.symbol.BatchNorm(name='bn4b5_branch2a', data=res4b5_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b5_branch2a = bn4b5_branch2a
+ res4b5_branch2a_relu = mx.symbol.Activation(name='res4b5_branch2a_relu', data=scale4b5_branch2a,
+ act_type='relu')
+ res4b5_branch2b = mx.symbol.Convolution(name='res4b5_branch2b', data=res4b5_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b5_branch2b = mx.symbol.BatchNorm(name='bn4b5_branch2b', data=res4b5_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b5_branch2b = bn4b5_branch2b
+ res4b5_branch2b_relu = mx.symbol.Activation(name='res4b5_branch2b_relu', data=scale4b5_branch2b,
+ act_type='relu')
+ res4b5_branch2c = mx.symbol.Convolution(name='res4b5_branch2c', data=res4b5_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b5_branch2c = mx.symbol.BatchNorm(name='bn4b5_branch2c', data=res4b5_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b5_branch2c = bn4b5_branch2c
+ res4b5 = mx.symbol.broadcast_add(name='res4b5', *[res4b4_relu, scale4b5_branch2c])
+ res4b5_relu = mx.symbol.Activation(name='res4b5_relu', data=res4b5, act_type='relu')
+ res4b6_branch2a = mx.symbol.Convolution(name='res4b6_branch2a', data=res4b5_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b6_branch2a = mx.symbol.BatchNorm(name='bn4b6_branch2a', data=res4b6_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b6_branch2a = bn4b6_branch2a
+ res4b6_branch2a_relu = mx.symbol.Activation(name='res4b6_branch2a_relu', data=scale4b6_branch2a,
+ act_type='relu')
+ res4b6_branch2b = mx.symbol.Convolution(name='res4b6_branch2b', data=res4b6_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b6_branch2b = mx.symbol.BatchNorm(name='bn4b6_branch2b', data=res4b6_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b6_branch2b = bn4b6_branch2b
+ res4b6_branch2b_relu = mx.symbol.Activation(name='res4b6_branch2b_relu', data=scale4b6_branch2b,
+ act_type='relu')
+ res4b6_branch2c = mx.symbol.Convolution(name='res4b6_branch2c', data=res4b6_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b6_branch2c = mx.symbol.BatchNorm(name='bn4b6_branch2c', data=res4b6_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b6_branch2c = bn4b6_branch2c
+ res4b6 = mx.symbol.broadcast_add(name='res4b6', *[res4b5_relu, scale4b6_branch2c])
+ res4b6_relu = mx.symbol.Activation(name='res4b6_relu', data=res4b6, act_type='relu')
+ res4b7_branch2a = mx.symbol.Convolution(name='res4b7_branch2a', data=res4b6_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b7_branch2a = mx.symbol.BatchNorm(name='bn4b7_branch2a', data=res4b7_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b7_branch2a = bn4b7_branch2a
+ res4b7_branch2a_relu = mx.symbol.Activation(name='res4b7_branch2a_relu', data=scale4b7_branch2a,
+ act_type='relu')
+ res4b7_branch2b = mx.symbol.Convolution(name='res4b7_branch2b', data=res4b7_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b7_branch2b = mx.symbol.BatchNorm(name='bn4b7_branch2b', data=res4b7_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b7_branch2b = bn4b7_branch2b
+ res4b7_branch2b_relu = mx.symbol.Activation(name='res4b7_branch2b_relu', data=scale4b7_branch2b,
+ act_type='relu')
+ res4b7_branch2c = mx.symbol.Convolution(name='res4b7_branch2c', data=res4b7_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b7_branch2c = mx.symbol.BatchNorm(name='bn4b7_branch2c', data=res4b7_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b7_branch2c = bn4b7_branch2c
+ res4b7 = mx.symbol.broadcast_add(name='res4b7', *[res4b6_relu, scale4b7_branch2c])
+ res4b7_relu = mx.symbol.Activation(name='res4b7_relu', data=res4b7, act_type='relu')
+ res4b8_branch2a = mx.symbol.Convolution(name='res4b8_branch2a', data=res4b7_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b8_branch2a = mx.symbol.BatchNorm(name='bn4b8_branch2a', data=res4b8_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b8_branch2a = bn4b8_branch2a
+ res4b8_branch2a_relu = mx.symbol.Activation(name='res4b8_branch2a_relu', data=scale4b8_branch2a,
+ act_type='relu')
+ res4b8_branch2b = mx.symbol.Convolution(name='res4b8_branch2b', data=res4b8_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b8_branch2b = mx.symbol.BatchNorm(name='bn4b8_branch2b', data=res4b8_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b8_branch2b = bn4b8_branch2b
+ res4b8_branch2b_relu = mx.symbol.Activation(name='res4b8_branch2b_relu', data=scale4b8_branch2b,
+ act_type='relu')
+ res4b8_branch2c = mx.symbol.Convolution(name='res4b8_branch2c', data=res4b8_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b8_branch2c = mx.symbol.BatchNorm(name='bn4b8_branch2c', data=res4b8_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b8_branch2c = bn4b8_branch2c
+ res4b8 = mx.symbol.broadcast_add(name='res4b8', *[res4b7_relu, scale4b8_branch2c])
+ res4b8_relu = mx.symbol.Activation(name='res4b8_relu', data=res4b8, act_type='relu')
+ res4b9_branch2a = mx.symbol.Convolution(name='res4b9_branch2a', data=res4b8_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b9_branch2a = mx.symbol.BatchNorm(name='bn4b9_branch2a', data=res4b9_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b9_branch2a = bn4b9_branch2a
+ res4b9_branch2a_relu = mx.symbol.Activation(name='res4b9_branch2a_relu', data=scale4b9_branch2a,
+ act_type='relu')
+ res4b9_branch2b = mx.symbol.Convolution(name='res4b9_branch2b', data=res4b9_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b9_branch2b = mx.symbol.BatchNorm(name='bn4b9_branch2b', data=res4b9_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b9_branch2b = bn4b9_branch2b
+ res4b9_branch2b_relu = mx.symbol.Activation(name='res4b9_branch2b_relu', data=scale4b9_branch2b,
+ act_type='relu')
+ res4b9_branch2c = mx.symbol.Convolution(name='res4b9_branch2c', data=res4b9_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b9_branch2c = mx.symbol.BatchNorm(name='bn4b9_branch2c', data=res4b9_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b9_branch2c = bn4b9_branch2c
+ res4b9 = mx.symbol.broadcast_add(name='res4b9', *[res4b8_relu, scale4b9_branch2c])
+ res4b9_relu = mx.symbol.Activation(name='res4b9_relu', data=res4b9, act_type='relu')
+ res4b10_branch2a = mx.symbol.Convolution(name='res4b10_branch2a', data=res4b9_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b10_branch2a = mx.symbol.BatchNorm(name='bn4b10_branch2a', data=res4b10_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b10_branch2a = bn4b10_branch2a
+ res4b10_branch2a_relu = mx.symbol.Activation(name='res4b10_branch2a_relu', data=scale4b10_branch2a,
+ act_type='relu')
+ res4b10_branch2b = mx.symbol.Convolution(name='res4b10_branch2b', data=res4b10_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b10_branch2b = mx.symbol.BatchNorm(name='bn4b10_branch2b', data=res4b10_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b10_branch2b = bn4b10_branch2b
+ res4b10_branch2b_relu = mx.symbol.Activation(name='res4b10_branch2b_relu', data=scale4b10_branch2b,
+ act_type='relu')
+ res4b10_branch2c = mx.symbol.Convolution(name='res4b10_branch2c', data=res4b10_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b10_branch2c = mx.symbol.BatchNorm(name='bn4b10_branch2c', data=res4b10_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b10_branch2c = bn4b10_branch2c
+ res4b10 = mx.symbol.broadcast_add(name='res4b10', *[res4b9_relu, scale4b10_branch2c])
+ res4b10_relu = mx.symbol.Activation(name='res4b10_relu', data=res4b10, act_type='relu')
+ res4b11_branch2a = mx.symbol.Convolution(name='res4b11_branch2a', data=res4b10_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b11_branch2a = mx.symbol.BatchNorm(name='bn4b11_branch2a', data=res4b11_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b11_branch2a = bn4b11_branch2a
+ res4b11_branch2a_relu = mx.symbol.Activation(name='res4b11_branch2a_relu', data=scale4b11_branch2a,
+ act_type='relu')
+ res4b11_branch2b = mx.symbol.Convolution(name='res4b11_branch2b', data=res4b11_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b11_branch2b = mx.symbol.BatchNorm(name='bn4b11_branch2b', data=res4b11_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b11_branch2b = bn4b11_branch2b
+ res4b11_branch2b_relu = mx.symbol.Activation(name='res4b11_branch2b_relu', data=scale4b11_branch2b,
+ act_type='relu')
+ res4b11_branch2c = mx.symbol.Convolution(name='res4b11_branch2c', data=res4b11_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b11_branch2c = mx.symbol.BatchNorm(name='bn4b11_branch2c', data=res4b11_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b11_branch2c = bn4b11_branch2c
+ res4b11 = mx.symbol.broadcast_add(name='res4b11', *[res4b10_relu, scale4b11_branch2c])
+ res4b11_relu = mx.symbol.Activation(name='res4b11_relu', data=res4b11, act_type='relu')
+ res4b12_branch2a = mx.symbol.Convolution(name='res4b12_branch2a', data=res4b11_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b12_branch2a = mx.symbol.BatchNorm(name='bn4b12_branch2a', data=res4b12_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b12_branch2a = bn4b12_branch2a
+ res4b12_branch2a_relu = mx.symbol.Activation(name='res4b12_branch2a_relu', data=scale4b12_branch2a,
+ act_type='relu')
+ res4b12_branch2b = mx.symbol.Convolution(name='res4b12_branch2b', data=res4b12_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b12_branch2b = mx.symbol.BatchNorm(name='bn4b12_branch2b', data=res4b12_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b12_branch2b = bn4b12_branch2b
+ res4b12_branch2b_relu = mx.symbol.Activation(name='res4b12_branch2b_relu', data=scale4b12_branch2b,
+ act_type='relu')
+ res4b12_branch2c = mx.symbol.Convolution(name='res4b12_branch2c', data=res4b12_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b12_branch2c = mx.symbol.BatchNorm(name='bn4b12_branch2c', data=res4b12_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b12_branch2c = bn4b12_branch2c
+ res4b12 = mx.symbol.broadcast_add(name='res4b12', *[res4b11_relu, scale4b12_branch2c])
+ res4b12_relu = mx.symbol.Activation(name='res4b12_relu', data=res4b12, act_type='relu')
+ res4b13_branch2a = mx.symbol.Convolution(name='res4b13_branch2a', data=res4b12_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b13_branch2a = mx.symbol.BatchNorm(name='bn4b13_branch2a', data=res4b13_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b13_branch2a = bn4b13_branch2a
+ res4b13_branch2a_relu = mx.symbol.Activation(name='res4b13_branch2a_relu', data=scale4b13_branch2a,
+ act_type='relu')
+ res4b13_branch2b = mx.symbol.Convolution(name='res4b13_branch2b', data=res4b13_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b13_branch2b = mx.symbol.BatchNorm(name='bn4b13_branch2b', data=res4b13_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b13_branch2b = bn4b13_branch2b
+ res4b13_branch2b_relu = mx.symbol.Activation(name='res4b13_branch2b_relu', data=scale4b13_branch2b,
+ act_type='relu')
+ res4b13_branch2c = mx.symbol.Convolution(name='res4b13_branch2c', data=res4b13_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b13_branch2c = mx.symbol.BatchNorm(name='bn4b13_branch2c', data=res4b13_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b13_branch2c = bn4b13_branch2c
+ res4b13 = mx.symbol.broadcast_add(name='res4b13', *[res4b12_relu, scale4b13_branch2c])
+ res4b13_relu = mx.symbol.Activation(name='res4b13_relu', data=res4b13, act_type='relu')
+ res4b14_branch2a = mx.symbol.Convolution(name='res4b14_branch2a', data=res4b13_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b14_branch2a = mx.symbol.BatchNorm(name='bn4b14_branch2a', data=res4b14_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b14_branch2a = bn4b14_branch2a
+ res4b14_branch2a_relu = mx.symbol.Activation(name='res4b14_branch2a_relu', data=scale4b14_branch2a,
+ act_type='relu')
+ res4b14_branch2b = mx.symbol.Convolution(name='res4b14_branch2b', data=res4b14_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b14_branch2b = mx.symbol.BatchNorm(name='bn4b14_branch2b', data=res4b14_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b14_branch2b = bn4b14_branch2b
+ res4b14_branch2b_relu = mx.symbol.Activation(name='res4b14_branch2b_relu', data=scale4b14_branch2b,
+ act_type='relu')
+ res4b14_branch2c = mx.symbol.Convolution(name='res4b14_branch2c', data=res4b14_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b14_branch2c = mx.symbol.BatchNorm(name='bn4b14_branch2c', data=res4b14_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b14_branch2c = bn4b14_branch2c
+ res4b14 = mx.symbol.broadcast_add(name='res4b14', *[res4b13_relu, scale4b14_branch2c])
+ res4b14_relu = mx.symbol.Activation(name='res4b14_relu', data=res4b14, act_type='relu')
+ res4b15_branch2a = mx.symbol.Convolution(name='res4b15_branch2a', data=res4b14_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b15_branch2a = mx.symbol.BatchNorm(name='bn4b15_branch2a', data=res4b15_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b15_branch2a = bn4b15_branch2a
+ res4b15_branch2a_relu = mx.symbol.Activation(name='res4b15_branch2a_relu', data=scale4b15_branch2a,
+ act_type='relu')
+ res4b15_branch2b = mx.symbol.Convolution(name='res4b15_branch2b', data=res4b15_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b15_branch2b = mx.symbol.BatchNorm(name='bn4b15_branch2b', data=res4b15_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b15_branch2b = bn4b15_branch2b
+ res4b15_branch2b_relu = mx.symbol.Activation(name='res4b15_branch2b_relu', data=scale4b15_branch2b,
+ act_type='relu')
+ res4b15_branch2c = mx.symbol.Convolution(name='res4b15_branch2c', data=res4b15_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b15_branch2c = mx.symbol.BatchNorm(name='bn4b15_branch2c', data=res4b15_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b15_branch2c = bn4b15_branch2c
+ res4b15 = mx.symbol.broadcast_add(name='res4b15', *[res4b14_relu, scale4b15_branch2c])
+ res4b15_relu = mx.symbol.Activation(name='res4b15_relu', data=res4b15, act_type='relu')
+ res4b16_branch2a = mx.symbol.Convolution(name='res4b16_branch2a', data=res4b15_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b16_branch2a = mx.symbol.BatchNorm(name='bn4b16_branch2a', data=res4b16_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b16_branch2a = bn4b16_branch2a
+ res4b16_branch2a_relu = mx.symbol.Activation(name='res4b16_branch2a_relu', data=scale4b16_branch2a,
+ act_type='relu')
+ res4b16_branch2b = mx.symbol.Convolution(name='res4b16_branch2b', data=res4b16_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b16_branch2b = mx.symbol.BatchNorm(name='bn4b16_branch2b', data=res4b16_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b16_branch2b = bn4b16_branch2b
+ res4b16_branch2b_relu = mx.symbol.Activation(name='res4b16_branch2b_relu', data=scale4b16_branch2b,
+ act_type='relu')
+ res4b16_branch2c = mx.symbol.Convolution(name='res4b16_branch2c', data=res4b16_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b16_branch2c = mx.symbol.BatchNorm(name='bn4b16_branch2c', data=res4b16_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b16_branch2c = bn4b16_branch2c
+ res4b16 = mx.symbol.broadcast_add(name='res4b16', *[res4b15_relu, scale4b16_branch2c])
+ res4b16_relu = mx.symbol.Activation(name='res4b16_relu', data=res4b16, act_type='relu')
+ res4b17_branch2a = mx.symbol.Convolution(name='res4b17_branch2a', data=res4b16_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b17_branch2a = mx.symbol.BatchNorm(name='bn4b17_branch2a', data=res4b17_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b17_branch2a = bn4b17_branch2a
+ res4b17_branch2a_relu = mx.symbol.Activation(name='res4b17_branch2a_relu', data=scale4b17_branch2a,
+ act_type='relu')
+ res4b17_branch2b = mx.symbol.Convolution(name='res4b17_branch2b', data=res4b17_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b17_branch2b = mx.symbol.BatchNorm(name='bn4b17_branch2b', data=res4b17_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b17_branch2b = bn4b17_branch2b
+ res4b17_branch2b_relu = mx.symbol.Activation(name='res4b17_branch2b_relu', data=scale4b17_branch2b,
+ act_type='relu')
+ res4b17_branch2c = mx.symbol.Convolution(name='res4b17_branch2c', data=res4b17_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b17_branch2c = mx.symbol.BatchNorm(name='bn4b17_branch2c', data=res4b17_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b17_branch2c = bn4b17_branch2c
+ res4b17 = mx.symbol.broadcast_add(name='res4b17', *[res4b16_relu, scale4b17_branch2c])
+ res4b17_relu = mx.symbol.Activation(name='res4b17_relu', data=res4b17, act_type='relu')
+ res4b18_branch2a = mx.symbol.Convolution(name='res4b18_branch2a', data=res4b17_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b18_branch2a = mx.symbol.BatchNorm(name='bn4b18_branch2a', data=res4b18_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b18_branch2a = bn4b18_branch2a
+ res4b18_branch2a_relu = mx.symbol.Activation(name='res4b18_branch2a_relu', data=scale4b18_branch2a,
+ act_type='relu')
+ res4b18_branch2b = mx.symbol.Convolution(name='res4b18_branch2b', data=res4b18_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b18_branch2b = mx.symbol.BatchNorm(name='bn4b18_branch2b', data=res4b18_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b18_branch2b = bn4b18_branch2b
+ res4b18_branch2b_relu = mx.symbol.Activation(name='res4b18_branch2b_relu', data=scale4b18_branch2b,
+ act_type='relu')
+ res4b18_branch2c = mx.symbol.Convolution(name='res4b18_branch2c', data=res4b18_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b18_branch2c = mx.symbol.BatchNorm(name='bn4b18_branch2c', data=res4b18_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b18_branch2c = bn4b18_branch2c
+ res4b18 = mx.symbol.broadcast_add(name='res4b18', *[res4b17_relu, scale4b18_branch2c])
+ res4b18_relu = mx.symbol.Activation(name='res4b18_relu', data=res4b18, act_type='relu')
+ res4b19_branch2a = mx.symbol.Convolution(name='res4b19_branch2a', data=res4b18_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b19_branch2a = mx.symbol.BatchNorm(name='bn4b19_branch2a', data=res4b19_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b19_branch2a = bn4b19_branch2a
+ res4b19_branch2a_relu = mx.symbol.Activation(name='res4b19_branch2a_relu', data=scale4b19_branch2a,
+ act_type='relu')
+ res4b19_branch2b = mx.symbol.Convolution(name='res4b19_branch2b', data=res4b19_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b19_branch2b = mx.symbol.BatchNorm(name='bn4b19_branch2b', data=res4b19_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b19_branch2b = bn4b19_branch2b
+ res4b19_branch2b_relu = mx.symbol.Activation(name='res4b19_branch2b_relu', data=scale4b19_branch2b,
+ act_type='relu')
+ res4b19_branch2c = mx.symbol.Convolution(name='res4b19_branch2c', data=res4b19_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b19_branch2c = mx.symbol.BatchNorm(name='bn4b19_branch2c', data=res4b19_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b19_branch2c = bn4b19_branch2c
+ res4b19 = mx.symbol.broadcast_add(name='res4b19', *[res4b18_relu, scale4b19_branch2c])
+ res4b19_relu = mx.symbol.Activation(name='res4b19_relu', data=res4b19, act_type='relu')
+ res4b20_branch2a = mx.symbol.Convolution(name='res4b20_branch2a', data=res4b19_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b20_branch2a = mx.symbol.BatchNorm(name='bn4b20_branch2a', data=res4b20_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b20_branch2a = bn4b20_branch2a
+ res4b20_branch2a_relu = mx.symbol.Activation(name='res4b20_branch2a_relu', data=scale4b20_branch2a,
+ act_type='relu')
+ res4b20_branch2b = mx.symbol.Convolution(name='res4b20_branch2b', data=res4b20_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b20_branch2b = mx.symbol.BatchNorm(name='bn4b20_branch2b', data=res4b20_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b20_branch2b = bn4b20_branch2b
+ res4b20_branch2b_relu = mx.symbol.Activation(name='res4b20_branch2b_relu', data=scale4b20_branch2b,
+ act_type='relu')
+ res4b20_branch2c = mx.symbol.Convolution(name='res4b20_branch2c', data=res4b20_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b20_branch2c = mx.symbol.BatchNorm(name='bn4b20_branch2c', data=res4b20_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b20_branch2c = bn4b20_branch2c
+ res4b20 = mx.symbol.broadcast_add(name='res4b20', *[res4b19_relu, scale4b20_branch2c])
+ res4b20_relu = mx.symbol.Activation(name='res4b20_relu', data=res4b20, act_type='relu')
+ res4b21_branch2a = mx.symbol.Convolution(name='res4b21_branch2a', data=res4b20_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b21_branch2a = mx.symbol.BatchNorm(name='bn4b21_branch2a', data=res4b21_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b21_branch2a = bn4b21_branch2a
+ res4b21_branch2a_relu = mx.symbol.Activation(name='res4b21_branch2a_relu', data=scale4b21_branch2a,
+ act_type='relu')
+ res4b21_branch2b = mx.symbol.Convolution(name='res4b21_branch2b', data=res4b21_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b21_branch2b = mx.symbol.BatchNorm(name='bn4b21_branch2b', data=res4b21_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b21_branch2b = bn4b21_branch2b
+ res4b21_branch2b_relu = mx.symbol.Activation(name='res4b21_branch2b_relu', data=scale4b21_branch2b,
+ act_type='relu')
+ res4b21_branch2c = mx.symbol.Convolution(name='res4b21_branch2c', data=res4b21_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b21_branch2c = mx.symbol.BatchNorm(name='bn4b21_branch2c', data=res4b21_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b21_branch2c = bn4b21_branch2c
+ res4b21 = mx.symbol.broadcast_add(name='res4b21', *[res4b20_relu, scale4b21_branch2c])
+ res4b21_relu = mx.symbol.Activation(name='res4b21_relu', data=res4b21, act_type='relu')
+ res4b22_branch2a = mx.symbol.Convolution(name='res4b22_branch2a', data=res4b21_relu, num_filter=256, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b22_branch2a = mx.symbol.BatchNorm(name='bn4b22_branch2a', data=res4b22_branch2a, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b22_branch2a = bn4b22_branch2a
+ res4b22_branch2a_relu = mx.symbol.Activation(name='res4b22_branch2a_relu', data=scale4b22_branch2a,
+ act_type='relu')
+ res4b22_branch2b = mx.symbol.Convolution(name='res4b22_branch2b', data=res4b22_branch2a_relu, num_filter=256,
+ pad=(1, 1), kernel=(3, 3), stride=(1, 1), no_bias=True)
+ bn4b22_branch2b = mx.symbol.BatchNorm(name='bn4b22_branch2b', data=res4b22_branch2b, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b22_branch2b = bn4b22_branch2b
+ res4b22_branch2b_relu = mx.symbol.Activation(name='res4b22_branch2b_relu', data=scale4b22_branch2b,
+ act_type='relu')
+ res4b22_branch2c = mx.symbol.Convolution(name='res4b22_branch2c', data=res4b22_branch2b_relu, num_filter=1024,
+ pad=(0, 0), kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn4b22_branch2c = mx.symbol.BatchNorm(name='bn4b22_branch2c', data=res4b22_branch2c, use_global_stats=True,
+ fix_gamma=False, eps = self.eps)
+ scale4b22_branch2c = bn4b22_branch2c
+ res4b22 = mx.symbol.broadcast_add(name='res4b22', *[res4b21_relu, scale4b22_branch2c])
+ res4b22_relu = mx.symbol.Activation(name='res4b22_relu', data=res4b22, act_type='relu')
+
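+ # stage 5 (res5a-res5c): the shortcut keeps stride 1 so the output stride stays at 16,
+ # and each 3x3 conv becomes a deformable convolution with dilation 2 whose offsets are
+ # predicted by a plain 3x3 conv with 18 channels (2 offsets x 9 kernel taps); the offset
+ # weights are zero-initialized in init_weights, so training starts from the regular
+ # dilated sampling grid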
+ res5a_branch1 = mx.symbol.Convolution(name='res5a_branch1', data=res4b22_relu, num_filter=2048, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5a_branch1 = mx.symbol.BatchNorm(name='bn5a_branch1', data=res5a_branch1, use_global_stats=True, fix_gamma=False, eps=self.eps)
+ scale5a_branch1 = bn5a_branch1
+ res5a_branch2a = mx.symbol.Convolution(name='res5a_branch2a', data=res4b22_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5a_branch2a = mx.symbol.BatchNorm(name='bn5a_branch2a', data=res5a_branch2a, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5a_branch2a = bn5a_branch2a
+ res5a_branch2a_relu = mx.symbol.Activation(name='res5a_branch2a_relu', data=scale5a_branch2a, act_type='relu')
+ res5a_branch2b_offset_weight = mx.symbol.Variable('res5a_branch2b_offset_weight', lr_mult=1.0)
+ res5a_branch2b_offset_bias = mx.symbol.Variable('res5a_branch2b_offset_bias', lr_mult=2.0)
+ res5a_branch2b_offset = mx.symbol.Convolution(name='res5a_branch2b_offset', data = res5a_branch2a_relu,
+ num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1),
+ weight=res5a_branch2b_offset_weight, bias=res5a_branch2b_offset_bias)
+ res5a_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5a_branch2b', data=res5a_branch2a_relu, offset=res5a_branch2b_offset,
+ num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1,
+ stride=(1, 1), dilate=(2, 2), no_bias=True)
+ bn5a_branch2b = mx.symbol.BatchNorm(name='bn5a_branch2b', data=res5a_branch2b, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5a_branch2b = bn5a_branch2b
+ res5a_branch2b_relu = mx.symbol.Activation(name='res5a_branch2b_relu', data=scale5a_branch2b, act_type='relu')
+ res5a_branch2c = mx.symbol.Convolution(name='res5a_branch2c', data=res5a_branch2b_relu, num_filter=2048, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5a_branch2c = mx.symbol.BatchNorm(name='bn5a_branch2c', data=res5a_branch2c, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5a_branch2c = bn5a_branch2c
+ res5a = mx.symbol.broadcast_add(name='res5a', *[scale5a_branch1, scale5a_branch2c])
+ res5a_relu = mx.symbol.Activation(name='res5a_relu', data=res5a, act_type='relu')
+ res5b_branch2a = mx.symbol.Convolution(name='res5b_branch2a', data=res5a_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5b_branch2a = mx.symbol.BatchNorm(name='bn5b_branch2a', data=res5b_branch2a, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5b_branch2a = bn5b_branch2a
+ res5b_branch2a_relu = mx.symbol.Activation(name='res5b_branch2a_relu', data=scale5b_branch2a, act_type='relu')
+ res5b_branch2b_offset_weight = mx.symbol.Variable('res5b_branch2b_offset_weight', lr_mult=1.0)
+ res5b_branch2b_offset_bias = mx.symbol.Variable('res5b_branch2b_offset_bias', lr_mult=2.0)
+ res5b_branch2b_offset = mx.symbol.Convolution(name='res5b_branch2b_offset', data = res5b_branch2a_relu,
+ num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1),
+ weight=res5b_branch2b_offset_weight, bias=res5b_branch2b_offset_bias)
+ res5b_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5b_branch2b', data=res5b_branch2a_relu, offset=res5b_branch2b_offset,
+ num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1,
+ stride=(1, 1), dilate=(2, 2), no_bias=True)
+ bn5b_branch2b = mx.symbol.BatchNorm(name='bn5b_branch2b', data=res5b_branch2b, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5b_branch2b = bn5b_branch2b
+ res5b_branch2b_relu = mx.symbol.Activation(name='res5b_branch2b_relu', data=scale5b_branch2b, act_type='relu')
+ res5b_branch2c = mx.symbol.Convolution(name='res5b_branch2c', data=res5b_branch2b_relu, num_filter=2048, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5b_branch2c = mx.symbol.BatchNorm(name='bn5b_branch2c', data=res5b_branch2c, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5b_branch2c = bn5b_branch2c
+ res5b = mx.symbol.broadcast_add(name='res5b', *[res5a_relu, scale5b_branch2c])
+ res5b_relu = mx.symbol.Activation(name='res5b_relu', data=res5b, act_type='relu')
+ res5c_branch2a = mx.symbol.Convolution(name='res5c_branch2a', data=res5b_relu, num_filter=512, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5c_branch2a = mx.symbol.BatchNorm(name='bn5c_branch2a', data=res5c_branch2a, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5c_branch2a = bn5c_branch2a
+ res5c_branch2a_relu = mx.symbol.Activation(name='res5c_branch2a_relu', data=scale5c_branch2a, act_type='relu')
+ res5c_branch2b_offset_weight = mx.symbol.Variable('res5c_branch2b_offset_weight', lr_mult=1.0)
+ res5c_branch2b_offset_bias = mx.symbol.Variable('res5c_branch2b_offset_bias', lr_mult=2.0)
+ res5c_branch2b_offset = mx.symbol.Convolution(name='res5c_branch2b_offset', data = res5c_branch2a_relu,
+ num_filter=18, pad=(1, 1), kernel=(3, 3), stride=(1, 1),
+ weight=res5c_branch2b_offset_weight, bias=res5c_branch2b_offset_bias)
+ res5c_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5c_branch2b', data=res5c_branch2a_relu, offset=res5c_branch2b_offset,
+ num_filter=512, pad=(2, 2), kernel=(3, 3), num_deformable_group=1,
+ stride=(1, 1), dilate=(2, 2), no_bias=True)
+ bn5c_branch2b = mx.symbol.BatchNorm(name='bn5c_branch2b', data=res5c_branch2b, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5c_branch2b = bn5c_branch2b
+ res5c_branch2b_relu = mx.symbol.Activation(name='res5c_branch2b_relu', data=scale5c_branch2b, act_type='relu')
+ res5c_branch2c = mx.symbol.Convolution(name='res5c_branch2c', data=res5c_branch2b_relu, num_filter=2048, pad=(0, 0),
+ kernel=(1, 1), stride=(1, 1), no_bias=True)
+ bn5c_branch2c = mx.symbol.BatchNorm(name='bn5c_branch2c', data=res5c_branch2c, use_global_stats=True,
+ fix_gamma=False, eps=self.eps)
+ scale5c_branch2c = bn5c_branch2c
+ res5c = mx.symbol.broadcast_add(name='res5c', *[res5b_relu, scale5c_branch2c])
+ res5c_relu = mx.symbol.Activation(name='res5c_relu', data=res5c, act_type='relu')
+
+ return res5c_relu
+
+ def get_train_symbol(self, num_classes):
+ """
+ get symbol for training
+ :param num_classes: num of classes
+ :return: the symbol for training
+ """
+ data = mx.symbol.Variable(name="data")
+ seg_cls_gt = mx.symbol.Variable(name='label')
+
+ # shared convolutional layers
+ conv_feat = self.get_resnet_conv(data)
+
+ # subsequent fc layers by haozhi
+ fc6_bias = mx.symbol.Variable('fc6_bias', lr_mult=2.0)
+ fc6_weight = mx.symbol.Variable('fc6_weight', lr_mult=1.0)
+
+ fc6 = mx.symbol.Convolution(data=conv_feat, kernel=(1, 1), pad=(0, 0), num_filter=1024, name="fc6",
+ bias=fc6_bias, weight=fc6_weight, workspace=self.workspace)
+ relu_fc6 = mx.sym.Activation(data=fc6, act_type='relu', name='relu_fc6')
+
+ score_bias = mx.symbol.Variable('score_bias', lr_mult=2.0)
+ score_weight = mx.symbol.Variable('score_weight', lr_mult=1.0)
+
+ score = mx.symbol.Convolution(data=relu_fc6, kernel=(1, 1), pad=(0, 0), num_filter=num_classes, name="score",
+ bias=score_bias, weight=score_weight, workspace=self.workspace)
+
+ upsampling = mx.symbol.Deconvolution(data=score, num_filter=num_classes, kernel=(32, 32), stride=(16, 16),
+ num_group=num_classes, no_bias=True, name='upsampling',
+ attr={'lr_mult': '0.0'}, workspace=self.workspace)
+
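+ # the deconvolution upsamples the 1/16-resolution scores by 16x with per-class bilinear
+ # kernels (num_group=num_classes; lr_mult 0 keeps them fixed), and Crop trims the
+ # (kernel - stride) / 2 = 8 pixel border so the output matches the input resolution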
+ croped_score = mx.symbol.Crop(*[upsampling, data], offset=(8, 8), name='croped_score')
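+ # pixels labeled 255 are void in the ground truth and are excluded from the loss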
+ softmax = mx.symbol.SoftmaxOutput(data=croped_score, label=seg_cls_gt, normalization='valid', multi_output=True,
+ use_ignore=True, ignore_label=255, name="softmax")
+
+ return softmax
+
+ def get_test_symbol(self, num_classes):
+ """
+ get symbol for testing
+ :param num_classes: num of classes
+ :return: the symbol for testing
+ """
+ data = mx.symbol.Variable(name="data")
+
+ # shared convolutional layers
+ conv_feat = self.get_resnet_conv(data)
+
+ fc6_bias = mx.symbol.Variable('fc6_bias', lr_mult=2.0)
+ fc6_weight = mx.symbol.Variable('fc6_weight', lr_mult=1.0)
+
+ fc6 = mx.symbol.Convolution(
+ data=conv_feat, kernel=(1, 1), pad=(0, 0), num_filter=1024, name="fc6", bias=fc6_bias, weight=fc6_weight,
+ workspace=self.workspace)
+ relu_fc6 = mx.sym.Activation(data=fc6, act_type='relu', name='relu_fc6')
+
+ score_bias = mx.symbol.Variable('score_bias', lr_mult=2.0)
+ score_weight = mx.symbol.Variable('score_weight', lr_mult=1.0)
+
+ score = mx.symbol.Convolution(
+ data=relu_fc6, kernel=(1, 1), pad=(0, 0), num_filter=num_classes, name="score", bias=score_bias,
+ weight=score_weight, workspace=self.workspace)
+
+ upsampling = mx.symbol.Deconvolution(
+ data=score, num_filter=num_classes, kernel=(32, 32), stride=(16, 16), num_group=num_classes, no_bias=True,
+ name='upsampling', attr={'lr_mult': '0.0'}, workspace=self.workspace)
+
+ croped_score = mx.symbol.Crop(*[upsampling, data], offset=(8, 8), name='croped_score')
+
+ softmax = mx.symbol.SoftmaxOutput(data=croped_score, normalization='valid', multi_output=True, use_ignore=True,
+ ignore_label=255, name="softmax")
+
+ return softmax
+
+ def get_symbol(self, cfg, is_train=True):
+ """
+ return the generated symbol; it also needs to be assigned to self.sym
+ """
+
+ # config alias for convenience
+ num_classes = cfg.dataset.NUM_CLASSES
+
+ if is_train:
+ self.sym = self.get_train_symbol(num_classes=num_classes)
+ else:
+ self.sym = self.get_test_symbol(num_classes=num_classes)
+
+ return self.sym
+
+ def init_weights(self, cfg, arg_params, aux_params):
+ arg_params['res5a_branch2b_offset_weight'] = mx.nd.zeros(shape=self.arg_shape_dict['res5a_branch2b_offset_weight'])
+ arg_params['res5a_branch2b_offset_bias'] = mx.nd.zeros(shape=self.arg_shape_dict['res5a_branch2b_offset_bias'])
+ arg_params['res5b_branch2b_offset_weight'] = mx.nd.zeros(shape=self.arg_shape_dict['res5b_branch2b_offset_weight'])
+ arg_params['res5b_branch2b_offset_bias'] = mx.nd.zeros(shape=self.arg_shape_dict['res5b_branch2b_offset_bias'])
+ arg_params['res5c_branch2b_offset_weight'] = mx.nd.zeros(shape=self.arg_shape_dict['res5c_branch2b_offset_weight'])
+ arg_params['res5c_branch2b_offset_bias'] = mx.nd.zeros(shape=self.arg_shape_dict['res5c_branch2b_offset_bias'])
+ arg_params['fc6_weight'] = mx.random.normal(0, 0.01, shape=self.arg_shape_dict['fc6_weight'])
+ arg_params['fc6_bias'] = mx.nd.zeros(shape=self.arg_shape_dict['fc6_bias'])
+ arg_params['score_weight'] = mx.random.normal(0, 0.01, shape=self.arg_shape_dict['score_weight'])
+ arg_params['score_bias'] = mx.nd.zeros(shape=self.arg_shape_dict['score_bias'])
+ arg_params['upsampling_weight'] = mx.nd.zeros(shape=self.arg_shape_dict['upsampling_weight'])
+
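+ # fill the frozen upsampling kernel with bilinear interpolation weights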
+ init = mx.init.Initializer()
+ init._init_bilinear('upsampling_weight', arg_params['upsampling_weight'])
diff --git a/deeplab/test.py b/deeplab/test.py
new file mode 100644
index 0000000..ab59064
--- /dev/null
+++ b/deeplab/test.py
@@ -0,0 +1,105 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Written by Zheng Zhang
+# --------------------------------------------------------
+
+import _init_paths
+
+import argparse
+import os
+import sys
+import time
+import logging
+from config.config import config, update_config
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='Test a Deeplab Network')
+ # general
+ parser.add_argument('--cfg', help='experiment configure file name', required=True, type=str)
+
+ args, rest = parser.parse_known_args()
+ update_config(args.cfg)
+
+ # testing
+ parser.add_argument('--vis', help='turn on visualization', action='store_true')
+ parser.add_argument('--ignore_cache', help='ignore cached results', action='store_true')
+ parser.add_argument('--shuffle', help='shuffle data on visualization', action='store_true')
+ args = parser.parse_args()
+ return args
+
+args = parse_args()
+curr_path = os.path.abspath(os.path.dirname(__file__))
+sys.path.insert(0, os.path.join(curr_path, '../external/mxnet', config.MXNET_VERSION))
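+# the config decides which MXNet build to use, so sys.path is patched before the
+# `import mxnet` below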
+
+import pprint
+import mxnet as mx
+
+from symbols import *
+from dataset import *
+from core.loader import TestDataLoader
+from core.tester import Predictor, pred_eval
+from utils.load_data import load_gt_segdb, merge_segdb
+from utils.load_model import load_param
+from utils.create_logger import create_logger
+
+def test_deeplab():
+ epoch = config.TEST.test_epoch
+ ctx = [mx.gpu(int(i)) for i in config.gpus.split(',')]
+ image_set = config.dataset.test_image_set
+ root_path = config.dataset.root_path
+ dataset = config.dataset.dataset
+ dataset_path = config.dataset.dataset_path
+
+ logger, final_output_path = create_logger(config.output_path, args.cfg, image_set)
+ prefix = os.path.join(final_output_path, '..', '_'.join([iset for iset in config.dataset.image_set.split('+')]), config.TRAIN.model_prefix)
+
+ # print config
+ pprint.pprint(config)
+ logger.info('testing config:{}\n'.format(pprint.pformat(config)))
+
+ # load symbol and testing data
+ sym_instance = eval(config.symbol + '.' + config.symbol)()
+ sym = sym_instance.get_symbol(config, is_train=False)
+
+ imdb = eval(dataset)(image_set, root_path, dataset_path, result_path=final_output_path)
+ segdb = imdb.gt_segdb()
+
+ # get test data iter
+ test_data = TestDataLoader(segdb, config=config, batch_size=len(ctx))
+
+ # infer shape
+ data_shape_dict = dict(test_data.provide_data_single)
+ sym_instance.infer_shape(data_shape_dict)
+
+ # load model and check parameters
+ arg_params, aux_params = load_param(prefix, epoch, process=True)
+
+ sym_instance.check_parameter_shapes(arg_params, aux_params, data_shape_dict, is_train=False)
+
+ # decide maximum shape
+ data_names = [k[0] for k in test_data.provide_data_single]
+ label_names = ['softmax_label']
+ max_data_shape = [[('data', (1, 3, max([v[0] for v in config.SCALES]), max([v[1] for v in config.SCALES])))]]
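+ # pre-allocate executor buffers for the largest configured test scale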
+
+ # create predictor
+ predictor = Predictor(sym, data_names, label_names,
+ context=ctx, max_data_shapes=max_data_shape,
+ provide_data=test_data.provide_data, provide_label=test_data.provide_label,
+ arg_params=arg_params, aux_params=aux_params)
+
+ # start detection
+ pred_eval(predictor, test_data, imdb, vis=args.vis, ignore_cache=args.ignore_cache, logger=logger)
+
+def main():
+ print args
+ test_deeplab()
+
+
+if __name__ == '__main__':
+ main()
diff --git a/deeplab/train.py b/deeplab/train.py
new file mode 100644
index 0000000..fb2722a
--- /dev/null
+++ b/deeplab/train.py
@@ -0,0 +1,166 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Written by Zheng Zhang
+# --------------------------------------------------------
+
+import _init_paths
+
+import time
+import argparse
+import logging
+import pprint
+import os
+import sys
+from config.config import config, update_config
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='Train deeplab network')
+ # general
+ parser.add_argument('--cfg', help='experiment configure file name', required=True, type=str)
+
+ args, rest = parser.parse_known_args()
+ # update config
+ update_config(args.cfg)
+
+ # training
+ parser.add_argument('--frequent', help='frequency of logging', default=config.default.frequent, type=int)
+ args = parser.parse_args()
+ return args
+
+args = parse_args()
+curr_path = os.path.abspath(os.path.dirname(__file__))
+sys.path.insert(0, os.path.join(curr_path, '../external/mxnet', config.MXNET_VERSION))
+
+
+import shutil
+import numpy as np
+import mxnet as mx
+
+from symbols import *
+from core import callback, metric
+from core.loader import TrainDataLoader
+from core.module import MutableModule
+from utils.load_data import load_gt_segdb, merge_segdb
+from utils.load_model import load_param
+from utils.PrefetchingIter import PrefetchingIter
+from utils.create_logger import create_logger
+from utils.lr_scheduler import WarmupMultiFactorScheduler
+
+def train_net(args, ctx, pretrained, epoch, prefix, begin_epoch, end_epoch, lr, lr_step):
+ logger, final_output_path = create_logger(config.output_path, args.cfg, config.dataset.image_set)
+ prefix = os.path.join(final_output_path, prefix)
+
+ # load symbol
+ shutil.copy2(os.path.join(curr_path, 'symbols', config.symbol + '.py'), final_output_path)
+ sym_instance = eval(config.symbol + '.' + config.symbol)()
+ sym = sym_instance.get_symbol(config, is_train=True)
+ #sym = eval('get_' + args.network + '_train')(num_classes=config.dataset.NUM_CLASSES)
+
+ # setup multi-gpu
+ batch_size = len(ctx)
+ input_batch_size = config.TRAIN.BATCH_IMAGES * batch_size
+
+ # print config
+ pprint.pprint(config)
+ logger.info('training config:{}\n'.format(pprint.pformat(config)))
+
+ # load dataset and prepare imdb for training
+ image_sets = [iset for iset in config.dataset.image_set.split('+')]
+ segdbs = [load_gt_segdb(config.dataset.dataset, image_set, config.dataset.root_path, config.dataset.dataset_path,
+ result_path=final_output_path, flip=config.TRAIN.FLIP)
+ for image_set in image_sets]
+ segdb = merge_segdb(segdbs)
+
+ # load training data
+ train_data = TrainDataLoader(sym, segdb, config, batch_size=input_batch_size, crop_height=config.TRAIN.CROP_HEIGHT, crop_width=config.TRAIN.CROP_WIDTH,
+ shuffle=config.TRAIN.SHUFFLE, ctx=ctx)
+
+ # infer max shape
+ max_scale = [(config.TRAIN.CROP_HEIGHT, config.TRAIN.CROP_WIDTH)]
+ max_data_shape = [('data', (config.TRAIN.BATCH_IMAGES, 3, max([v[0] for v in max_scale]), max([v[1] for v in max_scale])))]
+ max_label_shape = [('label', (config.TRAIN.BATCH_IMAGES, 1, max([v[0] for v in max_scale]), max([v[1] for v in max_scale])))]
+ max_data_shape, max_label_shape = train_data.infer_shape(max_data_shape, max_label_shape)
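+ # cropping fixes the input size, so the buffers inferred here cover every batch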
+ print 'providing maximum shape', max_data_shape, max_label_shape
+
+ # infer shape
+ data_shape_dict = dict(train_data.provide_data_single + train_data.provide_label_single)
+ pprint.pprint(data_shape_dict)
+ sym_instance.infer_shape(data_shape_dict)
+
+ # load and initialize params
+ if config.TRAIN.RESUME:
+ print 'continue training from ', begin_epoch
+ arg_params, aux_params = load_param(prefix, begin_epoch, convert=True)
+ else:
+ print pretrained
+ arg_params, aux_params = load_param(pretrained, epoch, convert=True)
+ sym_instance.init_weights(config, arg_params, aux_params)
+
+ # check parameter shapes
+ sym_instance.check_parameter_shapes(arg_params, aux_params, data_shape_dict)
+
+ # create solver
+ fixed_param_prefix = config.network.FIXED_PARAMS
+ data_names = [k[0] for k in train_data.provide_data_single]
+ label_names = [k[0] for k in train_data.provide_label_single]
+
+ mod = MutableModule(sym, data_names=data_names, label_names=label_names,
+ logger=logger, context=ctx, max_data_shapes=[max_data_shape for _ in xrange(batch_size)],
+ max_label_shapes=[max_label_shape for _ in xrange(batch_size)], fixed_param_prefix=fixed_param_prefix)
+
+ # decide training params
+ # metric
+ fcn_loss_metric = metric.FCNLogLossMetric(config.default.frequent * batch_size)
+ eval_metrics = mx.metric.CompositeEvalMetric()
+
+ # the FCN log-loss is the only metric tracked during training
+ for child_metric in [fcn_loss_metric]:
+ eval_metrics.add(child_metric)
+
+ # callback
+ batch_end_callback = callback.Speedometer(train_data.batch_size, frequent=args.frequent)
+ epoch_end_callback = mx.callback.module_checkpoint(mod, prefix, period=1, save_optimizer_states=True)
+
+ # decide learning rate
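+ # lr_step lists (possibly fractional) epochs at which the rate is divided by 10;
+ # steps already passed at begin_epoch are folded into the starting rate, and the
+ # rest are converted to iteration indices for the warmup scheduler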
+ base_lr = lr
+ lr_factor = 0.1
+ lr_epoch = [float(epoch) for epoch in lr_step.split(',')]
+ lr_epoch_diff = [epoch - begin_epoch for epoch in lr_epoch if epoch > begin_epoch]
+ lr = base_lr * (lr_factor ** (len(lr_epoch) - len(lr_epoch_diff)))
+ lr_iters = [int(epoch * len(segdb) / batch_size) for epoch in lr_epoch_diff]
+ print 'lr', lr, 'lr_epoch_diff', lr_epoch_diff, 'lr_iters', lr_iters
+
+ lr_scheduler = WarmupMultiFactorScheduler(lr_iters, lr_factor, config.TRAIN.warmup, config.TRAIN.warmup_lr, config.TRAIN.warmup_step)
+
+ # optimizer
+ optimizer_params = {'momentum': config.TRAIN.momentum,
+ 'wd': config.TRAIN.wd,
+ 'learning_rate': lr,
+ 'lr_scheduler': lr_scheduler,
+ 'rescale_grad': 1.0,
+ 'clip_gradient': None}
+
+ if not isinstance(train_data, PrefetchingIter):
+ train_data = PrefetchingIter(train_data)
+
+ # train
+ mod.fit(train_data, eval_metric=eval_metrics, epoch_end_callback=epoch_end_callback,
+ batch_end_callback=batch_end_callback, kvstore=config.default.kvstore,
+ optimizer='sgd', optimizer_params=optimizer_params,
+ arg_params=arg_params, aux_params=aux_params, begin_epoch=begin_epoch, num_epoch=end_epoch)
+
+def main():
+ print 'Called with argument:', args
+ ctx = [mx.gpu(int(i)) for i in config.gpus.split(',')]
+ train_net(args, ctx, config.network.pretrained, config.network.pretrained_epoch, config.TRAIN.model_prefix,
+ config.TRAIN.begin_epoch, config.TRAIN.end_epoch, config.TRAIN.lr, config.TRAIN.lr_step)
+
+if __name__ == '__main__':
+ main()
diff --git a/demo/deform_conv/000240.jpg b/demo/deform_conv/000240.jpg
new file mode 100644
index 0000000..6becaf4
Binary files /dev/null and b/demo/deform_conv/000240.jpg differ
diff --git a/demo/deform_conv/000437.jpg b/demo/deform_conv/000437.jpg
new file mode 100644
index 0000000..0a90a4a
Binary files /dev/null and b/demo/deform_conv/000437.jpg differ
diff --git a/demo/deform_conv/004072.jpg b/demo/deform_conv/004072.jpg
new file mode 100644
index 0000000..c20ed5c
Binary files /dev/null and b/demo/deform_conv/004072.jpg differ
diff --git a/demo/deform_conv/007912.jpg b/demo/deform_conv/007912.jpg
new file mode 100644
index 0000000..02f6466
Binary files /dev/null and b/demo/deform_conv/007912.jpg differ
diff --git a/demo/deform_psroi/000057.jpg b/demo/deform_psroi/000057.jpg
new file mode 100644
index 0000000..0483cf8
Binary files /dev/null and b/demo/deform_psroi/000057.jpg differ
diff --git a/demo/deform_psroi/000149.jpg b/demo/deform_psroi/000149.jpg
new file mode 100644
index 0000000..574a200
Binary files /dev/null and b/demo/deform_psroi/000149.jpg differ
diff --git a/demo/deform_psroi/000351.jpg b/demo/deform_psroi/000351.jpg
new file mode 100644
index 0000000..d9d6b92
Binary files /dev/null and b/demo/deform_psroi/000351.jpg differ
diff --git a/demo/deform_psroi/002535.jpg b/demo/deform_psroi/002535.jpg
new file mode 100644
index 0000000..6b56d11
Binary files /dev/null and b/demo/deform_psroi/002535.jpg differ
diff --git a/demo/frankfurt_000001_073088_leftImg8bit.png b/demo/frankfurt_000001_073088_leftImg8bit.png
new file mode 100644
index 0000000..9605e69
Binary files /dev/null and b/demo/frankfurt_000001_073088_leftImg8bit.png differ
diff --git a/demo/lindau_000024_000019_leftImg8bit.png b/demo/lindau_000024_000019_leftImg8bit.png
new file mode 100644
index 0000000..3c6217b
Binary files /dev/null and b/demo/lindau_000024_000019_leftImg8bit.png differ
diff --git a/experiments/deeplab/cfgs/deeplab_cityscapes_demo.yaml b/experiments/deeplab/cfgs/deeplab_cityscapes_demo.yaml
new file mode 100644
index 0000000..aac0e21
--- /dev/null
+++ b/experiments/deeplab/cfgs/deeplab_cityscapes_demo.yaml
@@ -0,0 +1,72 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/cityscape"
+symbol: resnet_v1_101_deeplab
+gpus: '0'
+SCALES:
+- 1024
+- 2048
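+# (height, width) of the input; Cityscapes frames are 1024 x 2048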
+default:
+ frequent: 10
+ kvstore: device
+dataset:
+ NUM_CLASSES: 19
+ dataset: CityScape
+ dataset_path: "./data/cityscapes/"
+ image_set: leftImg8bit_train
+ root_path: "./data/"
+ test_image_set: leftImg8bit_val
+network:
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ IMAGE_STRIDE: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+TRAIN:
+ warmup: true
+ warmup_lr: 0.00005
+ # we typically use 4000 warmup steps when training on a single GPU
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 53
+ lr: 0.0005
+ lr_step: '40.336'
+ model_prefix: "deeplab_resnet_v1_101_cityscapes_segmentation_dcn"
+ # whether to flip images
+ FLIP: true
+ # number of images per device
+ BATCH_IMAGES: 1
+ # whether to crop images during training
+ ENABLE_CROP: True
+ # size of the training crop
+ CROP_HEIGHT: 768
+ CROP_WIDTH: 1024
+ # whether to resume training
+ RESUME: false
+ # whether to shuffle images
+ SHUFFLE: true
+TEST:
+ # number of images per device
+ BATCH_IMAGES: 1
+ test_epoch: 53
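`lr_step` above is a comma-separated list of (possibly fractional) epochs at which the learning rate is multiplied by the decay factor (0.1 by default in the config modules added in this patch). Below is a minimal sketch of how such a value is typically turned into an MXNet scheduler; `epoch_size` is a hypothetical images-per-epoch figure, not a value read from this file, and the repo's warmup variant would additionally ramp the rate from `warmup_lr` to `lr` over `warmup_step` iterations:

```python
import mxnet as mx

lr_step = '40.336'   # from this config: decay once, at fractional epoch 40.336
lr_factor = 0.1      # default TRAIN.lr_factor
epoch_size = 2975    # hypothetical: training images / BATCH_IMAGES / num GPUs
steps = [int(float(epoch) * epoch_size) for epoch in lr_step.split(',')]
scheduler = mx.lr_scheduler.MultiFactorScheduler(step=steps, factor=lr_factor)
```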
diff --git a/experiments/deeplab/cfgs/deeplab_resnet_v1_101_cityscapes_segmentation_base.yaml b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_cityscapes_segmentation_base.yaml
new file mode 100644
index 0000000..b595850
--- /dev/null
+++ b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_cityscapes_segmentation_base.yaml
@@ -0,0 +1,71 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/cityscape"
+symbol: resnet_v1_101_deeplab
+gpus: '0'
+SCALES:
+- 1024
+- 2048
+default:
+ frequent: 10
+ kvstore: device
+dataset:
+ NUM_CLASSES: 19
+ dataset: CityScape
+ dataset_path: "./data/cityscapes/"
+ image_set: leftImg8bit_train
+ root_path: "./data/"
+ test_image_set: leftImg8bit_val
+network:
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ IMAGE_STRIDE: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+TRAIN:
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 53
+ lr: 0.0005
+ lr_step: '40.336'
+ model_prefix: "deeplab_resnet_v1_101_cityscapes_segmentation_base"
+ # whether flip image
+ FLIP: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # whether to crop images during training
+ ENABLE_CROP: True
+ # size of the cropped image during training
+ CROP_HEIGHT: 768
+ CROP_WIDTH: 1024
+ # whether resume training
+ RESUME: false
+ # whether shuffle image
+ SHUFFLE: true
+TEST:
+ # size of images for each device
+ BATCH_IMAGES: 1
+ test_epoch: 53
diff --git a/experiments/deeplab/cfgs/deeplab_resnet_v1_101_cityscapes_segmentation_dcn.yaml b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_cityscapes_segmentation_dcn.yaml
new file mode 100644
index 0000000..6c4f0c9
--- /dev/null
+++ b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_cityscapes_segmentation_dcn.yaml
@@ -0,0 +1,71 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/cityscape"
+symbol: resnet_v1_101_deeplab_dcn
+gpus: '0'
+SCALES:
+- 1024
+- 2048
+default:
+ frequent: 10
+ kvstore: device
+dataset:
+ NUM_CLASSES: 19
+ dataset: CityScape
+ dataset_path: "./data/cityscapes/"
+ image_set: leftImg8bit_train
+ root_path: "./data/"
+ test_image_set: leftImg8bit_val
+network:
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ IMAGE_STRIDE: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+TRAIN:
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 53
+ lr: 0.0005
+ lr_step: '40.336'
+ model_prefix: "deeplab_resnet_v1_101_cityscapes_segmentation_dcn"
+ # whether flip image
+ FLIP: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # whether to crop images during training
+ ENABLE_CROP: True
+ # size of the cropped image during training
+ CROP_HEIGHT: 768
+ CROP_WIDTH: 1024
+ # whether resume training
+ RESUME: false
+ # whether shuffle image
+ SHUFFLE: true
+TEST:
+ # size of images for each device
+ BATCH_IMAGES: 1
+ test_epoch: 53
diff --git a/experiments/deeplab/cfgs/deeplab_resnet_v1_101_voc12_segmentation_base.yaml b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_voc12_segmentation_base.yaml
new file mode 100644
index 0000000..b8d51de
--- /dev/null
+++ b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_voc12_segmentation_base.yaml
@@ -0,0 +1,71 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/voc12"
+symbol: resnet_v1_101_deeplab
+gpus: '0'
+SCALES:
+- 360
+- 600
+default:
+ frequent: 10
+ kvstore: device
+dataset:
+ NUM_CLASSES: 21
+ dataset: PascalVOC
+ dataset_path: "./data/VOCdevkit2012/"
+ image_set: 2012_train_seg
+ root_path: "./data/"
+ test_image_set: 2012_val_seg
+network:
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ IMAGE_STRIDE: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+TRAIN:
+ warmup: false
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 12
+ lr: 0.0005
+ lr_step: '8'
+ model_prefix: "deeplab_resnet_v1_101_voc12_segmentation_base"
+ # whether flip image
+ FLIP: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # whether to crop images during training
+ ENABLE_CROP: False
+ # size of the cropped image during training
+ CROP_HEIGHT: 768
+ CROP_WIDTH: 1024
+ # whether resume training
+ RESUME: false
+ # whether shuffle image
+ SHUFFLE: true
+TEST:
+ # size of images for each device
+ BATCH_IMAGES: 1
+ test_epoch: 12
diff --git a/experiments/deeplab/cfgs/deeplab_resnet_v1_101_voc12_segmentation_dcn.yaml b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_voc12_segmentation_dcn.yaml
new file mode 100644
index 0000000..256134b
--- /dev/null
+++ b/experiments/deeplab/cfgs/deeplab_resnet_v1_101_voc12_segmentation_dcn.yaml
@@ -0,0 +1,71 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/voc12"
+symbol: resnet_v1_101_deeplab_dcn
+gpus: '0'
+SCALES:
+- 360
+- 600
+default:
+ frequent: 10
+ kvstore: device
+dataset:
+ NUM_CLASSES: 21
+ dataset: PascalVOC
+ dataset_path: "./data/VOCdevkit2012/"
+ image_set: 2012_train_seg
+ root_path: "./data/"
+ test_image_set: 2012_val_seg
+network:
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ IMAGE_STRIDE: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+TRAIN:
+ warmup: false
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 12
+ lr: 0.0005
+ lr_step: '8'
+ model_prefix: "deeplab_resnet_v1_101_voc12_segmentation_dcn"
+ # whether flip image
+ FLIP: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # whether to crop images during training
+ ENABLE_CROP: False
+ # size of the cropped image during training
+ CROP_HEIGHT: 768
+ CROP_WIDTH: 1024
+ # whether resume training
+ RESUME: false
+ # whether shuffle image
+ SHUFFLE: true
+TEST:
+ # size of images for each device
+ BATCH_IMAGES: 1
+ test_epoch: 12
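`FIXED_PARAMS` and `FIXED_PARAMS_SHARED` list fragments of parameter names to freeze during training (the `_SHARED` set applies when two symbol instances share weights). Entries like `gamma` and `beta` suggest the fragments are matched as substrings of full parameter names rather than as exact names; a minimal sketch of that interpretation, with hypothetical ResNet parameter names:

```python
fixed_fragments = ['conv1', 'bn_conv1', 'res2', 'bn2', 'gamma', 'beta']
arg_names = ['conv1_weight', 'res2a_branch1_weight',
             'res5c_branch2b_weight', 'bn5c_branch2b_gamma']
fixed = [name for name in arg_names
         if any(fragment in name for fragment in fixed_fragments)]
# -> ['conv1_weight', 'res2a_branch1_weight', 'bn5c_branch2b_gamma']
```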
diff --git a/experiments/deeplab/deeplab_test.py b/experiments/deeplab/deeplab_test.py
new file mode 100644
index 0000000..5ba2f67
--- /dev/null
+++ b/experiments/deeplab/deeplab_test.py
@@ -0,0 +1,24 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Written by Zheng Zhang
+# --------------------------------------------------------
+
+import os
+import sys
+os.environ['PYTHONUNBUFFERED'] = '1'
+os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'
+os.environ['MXNET_ENABLE_GPU_P2P'] = '0'
+this_dir = os.path.dirname(__file__)
+sys.path.insert(0, os.path.join(this_dir, '..', '..', 'deeplab'))
+
+import test
+
+if __name__ == "__main__":
+ test.main()
+
+
+
+
diff --git a/experiments/deeplab/deeplab_train_test.py b/experiments/deeplab/deeplab_train_test.py
new file mode 100644
index 0000000..22fb362
--- /dev/null
+++ b/experiments/deeplab/deeplab_train_test.py
@@ -0,0 +1,26 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Written by Zheng Zhang
+# --------------------------------------------------------
+
+import os
+import sys
+os.environ['PYTHONUNBUFFERED'] = '1'
+os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'
+os.environ['MXNET_ENABLE_GPU_P2P'] = '0'
+this_dir = os.path.dirname(__file__)
+sys.path.insert(0, os.path.join(this_dir, '..', '..', 'deeplab'))
+
+import train
+import test
+
+if __name__ == "__main__":
+ train.main()
+ test.main()
+
+
+
+
diff --git a/experiments/faster_rcnn/cfgs/resnet_v1_101_coco_trainval_rcnn_dcn_end2end.yaml b/experiments/faster_rcnn/cfgs/resnet_v1_101_coco_trainval_rcnn_dcn_end2end.yaml
new file mode 100644
index 0000000..90b92ca
--- /dev/null
+++ b/experiments/faster_rcnn/cfgs/resnet_v1_101_coco_trainval_rcnn_dcn_end2end.yaml
@@ -0,0 +1,154 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/rcnn/coco"
+symbol: resnet_v1_101_rcnn_dcn
+gpus: '0,1,2,3'
+CLASS_AGNOSTIC: false
+SCALES:
+- 600
+- 1000
+default:
+ frequent: 100
+ kvstore: device
+network:
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ IMAGE_STRIDE: 0
+ RCNN_FEAT_STRIDE: 16
+ RPN_FEAT_STRIDE: 16
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ ANCHOR_RATIOS:
+ - 0.5
+ - 1
+ - 2
+ ANCHOR_SCALES:
+ - 4
+ - 8
+ - 16
+ - 32
+ NUM_ANCHORS: 12
+dataset:
+ NUM_CLASSES: 81
+ dataset: coco
+ dataset_path: "./data/coco"
+ image_set: train2014+val2014
+ root_path: "./data"
+ test_image_set: test-dev2015
+ proposal: rpn
+TRAIN:
+ lr: 0.0005
+ lr_step: '5.333'
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 8000 warmup steps for a single GPU on COCO
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 8
+ model_prefix: 'rcnn_coco'
+ # whether resume training
+ RESUME: false
+ # whether flip image
+ FLIP: true
+ # whether shuffle image
+ SHUFFLE: true
+ # whether use OHEM
+ ENABLE_OHEM: false
+ # size of images for each device, 2 for rcnn, 1 for rpn and e2e
+ BATCH_IMAGES: 1
+ # e2e changes behavior of anchor loader and metric
+ END2END: true
+ # group images with similar aspect ratio
+ ASPECT_GROUPING: true
+ # R-CNN
+ # rcnn rois batch size
+ BATCH_ROIS: 128
+ BATCH_ROIS_OHEM: 128
+ # rcnn rois sampling params
+ FG_FRACTION: 0.25
+ FG_THRESH: 0.5
+ BG_THRESH_HI: 0.5
+ BG_THRESH_LO: 0.1
+ # rcnn bounding box regression params
+ BBOX_REGRESSION_THRESH: 0.5
+ BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+
+ # RPN anchor loader
+ # rpn anchors batch size
+ RPN_BATCH_SIZE: 256
+ # rpn anchors sampling params
+ RPN_FG_FRACTION: 0.5
+ RPN_POSITIVE_OVERLAP: 0.7
+ RPN_NEGATIVE_OVERLAP: 0.3
+ RPN_CLOBBER_POSITIVES: false
+ # rpn bounding box regression params
+ RPN_BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+ RPN_POSITIVE_WEIGHT: -1.0
+ # used for end2end training
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # approximate bounding box regression
+ BBOX_NORMALIZATION_PRECOMPUTED: true
+ BBOX_MEANS:
+ - 0.0
+ - 0.0
+ - 0.0
+ - 0.0
+ BBOX_STDS:
+ - 0.1
+ - 0.1
+ - 0.2
+ - 0.2
+TEST:
+ # use rpn to generate proposal
+ HAS_RPN: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # RPN generate proposal
+ PROPOSAL_NMS_THRESH: 0.7
+ PROPOSAL_PRE_NMS_TOP_N: 20000
+ PROPOSAL_POST_NMS_TOP_N: 2000
+ PROPOSAL_MIN_SIZE: 0
+ # RCNN nms
+ NMS: 0.3
+ test_epoch: 8
+ max_per_image: 100
+
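Note that `NUM_ANCHORS` must equal `len(ANCHOR_RATIOS) * len(ANCHOR_SCALES)`: 3 ratios × 4 scales = 12 here, while the VOC configs below use 3 × 3 = 9 (COCO adds the extra scale of 4, presumably for its smaller objects).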
diff --git a/experiments/faster_rcnn/cfgs/resnet_v1_101_coco_trainval_rcnn_end2end.yaml b/experiments/faster_rcnn/cfgs/resnet_v1_101_coco_trainval_rcnn_end2end.yaml
new file mode 100644
index 0000000..59439e2
--- /dev/null
+++ b/experiments/faster_rcnn/cfgs/resnet_v1_101_coco_trainval_rcnn_end2end.yaml
@@ -0,0 +1,154 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/rcnn/coco"
+symbol: resnet_v1_101_rcnn
+gpus: '0,1,2,3'
+CLASS_AGNOSTIC: false
+SCALES:
+- 600
+- 1000
+default:
+ frequent: 100
+ kvstore: device
+network:
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ IMAGE_STRIDE: 0
+ RCNN_FEAT_STRIDE: 16
+ RPN_FEAT_STRIDE: 16
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ ANCHOR_RATIOS:
+ - 0.5
+ - 1
+ - 2
+ ANCHOR_SCALES:
+ - 4
+ - 8
+ - 16
+ - 32
+ NUM_ANCHORS: 12
+dataset:
+ NUM_CLASSES: 81
+ dataset: coco
+ dataset_path: "./data/coco"
+ image_set: train2014+val2014
+ root_path: "./data"
+ test_image_set: test-dev2015
+ proposal: rpn
+TRAIN:
+ lr: 0.0005
+ lr_step: '5.333'
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 8000 warmup steps for a single GPU on COCO
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 8
+ model_prefix: 'rcnn_coco'
+ # whether resume training
+ RESUME: false
+ # whether flip image
+ FLIP: true
+ # whether shuffle image
+ SHUFFLE: true
+ # whether use OHEM
+ ENABLE_OHEM: false
+ # size of images for each device, 2 for rcnn, 1 for rpn and e2e
+ BATCH_IMAGES: 1
+ # e2e changes behavior of anchor loader and metric
+ END2END: true
+ # group images with similar aspect ratio
+ ASPECT_GROUPING: true
+ # R-CNN
+ # rcnn rois batch size
+ BATCH_ROIS: 128
+ BATCH_ROIS_OHEM: 128
+ # rcnn rois sampling params
+ FG_FRACTION: 0.25
+ FG_THRESH: 0.5
+ BG_THRESH_HI: 0.5
+ BG_THRESH_LO: 0.1
+ # rcnn bounding box regression params
+ BBOX_REGRESSION_THRESH: 0.5
+ BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+
+ # RPN anchor loader
+ # rpn anchors batch size
+ RPN_BATCH_SIZE: 256
+ # rpn anchors sampling params
+ RPN_FG_FRACTION: 0.5
+ RPN_POSITIVE_OVERLAP: 0.7
+ RPN_NEGATIVE_OVERLAP: 0.3
+ RPN_CLOBBER_POSITIVES: false
+ # rpn bounding box regression params
+ RPN_BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+ RPN_POSITIVE_WEIGHT: -1.0
+ # used for end2end training
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # approximate bounding box regression
+ BBOX_NORMALIZATION_PRECOMPUTED: true
+ BBOX_MEANS:
+ - 0.0
+ - 0.0
+ - 0.0
+ - 0.0
+ BBOX_STDS:
+ - 0.1
+ - 0.1
+ - 0.2
+ - 0.2
+TEST:
+ # use rpn to generate proposal
+ HAS_RPN: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # RPN generate proposal
+ PROPOSAL_NMS_THRESH: 0.7
+ PROPOSAL_PRE_NMS_TOP_N: 20000
+ PROPOSAL_POST_NMS_TOP_N: 2000
+ PROPOSAL_MIN_SIZE: 0
+ # RCNN nms
+ NMS: 0.3
+ test_epoch: 8
+ max_per_image: 100
+
diff --git a/experiments/faster_rcnn/cfgs/resnet_v1_101_voc0712_rcnn_dcn_end2end.yaml b/experiments/faster_rcnn/cfgs/resnet_v1_101_voc0712_rcnn_dcn_end2end.yaml
new file mode 100644
index 0000000..8e2701e
--- /dev/null
+++ b/experiments/faster_rcnn/cfgs/resnet_v1_101_voc0712_rcnn_dcn_end2end.yaml
@@ -0,0 +1,152 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/rcnn/voc"
+symbol: resnet_v1_101_rcnn_dcn
+gpus: '0,1,2,3'
+CLASS_AGNOSTIC: false
+SCALES:
+- 600
+- 1000
+default:
+ frequent: 100
+ kvstore: device
+network:
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ IMAGE_STRIDE: 0
+ RCNN_FEAT_STRIDE: 16
+ RPN_FEAT_STRIDE: 16
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ ANCHOR_RATIOS:
+ - 0.5
+ - 1
+ - 2
+ ANCHOR_SCALES:
+ - 8
+ - 16
+ - 32
+ NUM_ANCHORS: 9
+dataset:
+ NUM_CLASSES: 21
+ dataset: PascalVOC
+ dataset_path: "./data/VOCdevkit"
+ image_set: 2007_trainval+2012_trainval
+ root_path: "./data"
+ test_image_set: 2007_test
+ proposal: rpn
+TRAIN:
+ lr: 0.0005
+ lr_step: '4.83'
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU on VOC
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 7
+ model_prefix: 'rcnn_voc'
+ # whether resume training
+ RESUME: false
+ # whether flip image
+ FLIP: true
+ # whether shuffle image
+ SHUFFLE: true
+ # whether use OHEM
+ ENABLE_OHEM: false
+ # size of images for each device, 2 for rcnn, 1 for rpn and e2e
+ BATCH_IMAGES: 1
+ # e2e changes behavior of anchor loader and metric
+ END2END: true
+ # group images with similar aspect ratio
+ ASPECT_GROUPING: true
+ # R-CNN
+ # rcnn rois batch size
+ BATCH_ROIS: 128
+ BATCH_ROIS_OHEM: 128
+ # rcnn rois sampling params
+ FG_FRACTION: 0.25
+ FG_THRESH: 0.5
+ BG_THRESH_HI: 0.5
+ BG_THRESH_LO: 0.1
+ # rcnn bounding box regression params
+ BBOX_REGRESSION_THRESH: 0.5
+ BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+
+ # RPN anchor loader
+ # rpn anchors batch size
+ RPN_BATCH_SIZE: 256
+ # rpn anchors sampling params
+ RPN_FG_FRACTION: 0.5
+ RPN_POSITIVE_OVERLAP: 0.7
+ RPN_NEGATIVE_OVERLAP: 0.3
+ RPN_CLOBBER_POSITIVES: false
+ # rpn bounding box regression params
+ RPN_BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+ RPN_POSITIVE_WEIGHT: -1.0
+ # used for end2end training
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # approximate bounding box regression
+ BBOX_NORMALIZATION_PRECOMPUTED: true
+ BBOX_MEANS:
+ - 0.0
+ - 0.0
+ - 0.0
+ - 0.0
+ BBOX_STDS:
+ - 0.1
+ - 0.1
+ - 0.2
+ - 0.2
+TEST:
+ # use rpn to generate proposal
+ HAS_RPN: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # RPN generate proposal
+ PROPOSAL_NMS_THRESH: 0.7
+ PROPOSAL_PRE_NMS_TOP_N: 20000
+ PROPOSAL_POST_NMS_TOP_N: 2000
+ PROPOSAL_MIN_SIZE: 0
+ # RCNN nms
+ NMS: 0.3
+ test_epoch: 7
+
diff --git a/experiments/faster_rcnn/cfgs/resnet_v1_101_voc0712_rcnn_end2end.yaml b/experiments/faster_rcnn/cfgs/resnet_v1_101_voc0712_rcnn_end2end.yaml
new file mode 100644
index 0000000..6fad75f
--- /dev/null
+++ b/experiments/faster_rcnn/cfgs/resnet_v1_101_voc0712_rcnn_end2end.yaml
@@ -0,0 +1,152 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/rcnn/voc"
+symbol: resnet_v1_101_rcnn
+gpus: '0,1,2,3'
+CLASS_AGNOSTIC: false
+SCALES:
+- 600
+- 1000
+default:
+ frequent: 100
+ kvstore: device
+network:
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ IMAGE_STRIDE: 0
+ RCNN_FEAT_STRIDE: 16
+ RPN_FEAT_STRIDE: 16
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ ANCHOR_RATIOS:
+ - 0.5
+ - 1
+ - 2
+ ANCHOR_SCALES:
+ - 8
+ - 16
+ - 32
+ NUM_ANCHORS: 9
+dataset:
+ NUM_CLASSES: 21
+ dataset: PascalVOC
+ dataset_path: "./data/VOCdevkit"
+ image_set: 2007_trainval+2012_trainval
+ root_path: "./data"
+ test_image_set: 2007_test
+ proposal: rpn
+TRAIN:
+ lr: 0.0005
+ lr_step: '4.83'
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU on VOC
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 7
+ model_prefix: 'rcnn_voc'
+ # whether resume training
+ RESUME: false
+ # whether flip image
+ FLIP: true
+ # whether shuffle image
+ SHUFFLE: true
+ # whether use OHEM
+ ENABLE_OHEM: false
+ # size of images for each device, 2 for rcnn, 1 for rpn and e2e
+ BATCH_IMAGES: 1
+ # e2e changes behavior of anchor loader and metric
+ END2END: true
+ # group images with similar aspect ratio
+ ASPECT_GROUPING: true
+ # R-CNN
+ # rcnn rois batch size
+ BATCH_ROIS: 128
+ BATCH_ROIS_OHEM: 128
+ # rcnn rois sampling params
+ FG_FRACTION: 0.25
+ FG_THRESH: 0.5
+ BG_THRESH_HI: 0.5
+ BG_THRESH_LO: 0.1
+ # rcnn bounding box regression params
+ BBOX_REGRESSION_THRESH: 0.5
+ BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+
+ # RPN anchor loader
+ # rpn anchors batch size
+ RPN_BATCH_SIZE: 256
+ # rpn anchors sampling params
+ RPN_FG_FRACTION: 0.5
+ RPN_POSITIVE_OVERLAP: 0.7
+ RPN_NEGATIVE_OVERLAP: 0.3
+ RPN_CLOBBER_POSITIVES: false
+ # rpn bounding box regression params
+ RPN_BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+ RPN_POSITIVE_WEIGHT: -1.0
+ # used for end2end training
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # approximate bounding box regression
+ BBOX_NORMALIZATION_PRECOMPUTED: true
+ BBOX_MEANS:
+ - 0.0
+ - 0.0
+ - 0.0
+ - 0.0
+ BBOX_STDS:
+ - 0.1
+ - 0.1
+ - 0.2
+ - 0.2
+TEST:
+ # use rpn to generate proposal
+ HAS_RPN: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # RPN generate proposal
+ PROPOSAL_NMS_THRESH: 0.7
+ PROPOSAL_PRE_NMS_TOP_N: 20000
+ PROPOSAL_POST_NMS_TOP_N: 2000
+ PROPOSAL_MIN_SIZE: 0
+ # RCNN nms
+ NMS: 0.3
+ test_epoch: 7
+
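With `BBOX_NORMALIZATION_PRECOMPUTED: true`, regression targets are standardized with the fixed `BBOX_MEANS`/`BBOX_STDS` above rather than with statistics estimated from the data: a raw width offset of, say, t_w = 0.05 is trained as (0.05 - 0.0) / 0.2 = 0.25, and predictions must be scaled back by the same constants at test time (the `do_checkpoint` callback later in this patch folds that de-normalization into the saved test-time weights).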
diff --git a/experiments/faster_rcnn/rcnn_end2end_train_test.py b/experiments/faster_rcnn/rcnn_end2end_train_test.py
new file mode 100644
index 0000000..5598dd0
--- /dev/null
+++ b/experiments/faster_rcnn/rcnn_end2end_train_test.py
@@ -0,0 +1,25 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Guodong Zhang
+# --------------------------------------------------------
+import os
+import sys
+os.environ['PYTHONUNBUFFERED'] = '1'
+os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'
+os.environ['MXNET_ENABLE_GPU_P2P'] = '0'
+#os.environ['MXNET_ENGINE_TYPE'] = 'NaiveEngine'
+this_dir = os.path.dirname(__file__)
+sys.path.insert(0, os.path.join(this_dir, '..', '..', 'faster_rcnn'))
+
+import train_end2end
+import test
+
+if __name__ == "__main__":
+ train_end2end.main()
+ test.main()
+
+
+
+
diff --git a/experiments/faster_rcnn/rcnn_test.py b/experiments/faster_rcnn/rcnn_test.py
new file mode 100644
index 0000000..a4ece9a
--- /dev/null
+++ b/experiments/faster_rcnn/rcnn_test.py
@@ -0,0 +1,19 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Guodong Zhang
+# --------------------------------------------------------
+
+import os
+import sys
+os.environ['PYTHONUNBUFFERED'] = '1'
+os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'
+os.environ['MXNET_ENABLE_GPU_P2P'] = '0'
+this_dir = os.path.dirname(__file__)
+sys.path.insert(0, os.path.join(this_dir, '..', '..', 'faster_rcnn'))
+
+import test
+
+if __name__ == "__main__":
+ test.main()
diff --git a/experiments/faster_rcnn/rcnn_train_test.py b/experiments/faster_rcnn/rcnn_train_test.py
new file mode 100644
index 0000000..38f8540
--- /dev/null
+++ b/experiments/faster_rcnn/rcnn_train_test.py
@@ -0,0 +1,25 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Guodong Zhang
+# --------------------------------------------------------
+
+import os
+import sys
+os.environ['PYTHONUNBUFFERED'] = '1'
+os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'
+os.environ['MXNET_ENABLE_GPU_P2P'] = '0'
+this_dir = os.path.dirname(__file__)
+sys.path.insert(0, os.path.join(this_dir, '..', '..', 'faster_rcnn'))
+
+import train_rcnn
+import test
+
+if __name__ == "__main__":
+ train_rcnn.main()
+ test.main()
+
+
+
+
diff --git a/experiments/rfcn/cfgs/deform_conv_demo.yaml b/experiments/rfcn/cfgs/deform_conv_demo.yaml
new file mode 100644
index 0000000..6e144f8
--- /dev/null
+++ b/experiments/rfcn/cfgs/deform_conv_demo.yaml
@@ -0,0 +1,152 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/rfcn"
+symbol: deform_conv_demo
+gpus: '0'
+CLASS_AGNOSTIC: true
+SCALES:
+- 600
+- 1000
+default:
+ frequent: 100
+ kvstore: device
+network:
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ IMAGE_STRIDE: 0
+ RCNN_FEAT_STRIDE: 16
+ RPN_FEAT_STRIDE: 16
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ ANCHOR_RATIOS:
+ - 0.5
+ - 1
+ - 2
+ ANCHOR_SCALES:
+ - 8
+ - 16
+ - 32
+ NUM_ANCHORS: 9
+dataset:
+ NUM_CLASSES: 21
+ dataset: PascalVOC
+ dataset_path: "./data/VOCdevkit"
+ image_set: 2007_trainval+2012_trainval
+ root_path: "./data"
+ test_image_set: 2007_test
+ proposal: rpn
+TRAIN:
+ lr: 0.0005
+ lr_step: '4.83'
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU on VOC
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 8
+ model_prefix: 'rfcn_voc'
+ # whether resume training
+ RESUME: false
+ # whether flip image
+ FLIP: true
+ # whether shuffle image
+ SHUFFLE: true
+ # whether use OHEM
+ ENABLE_OHEM: true
+ # size of images for each device, 2 for rcnn, 1 for rpn and e2e
+ BATCH_IMAGES: 1
+ # e2e changes behavior of anchor loader and metric
+ END2END: true
+ # group images with similar aspect ratio
+ ASPECT_GROUPING: true
+ # R-CNN
+ # rcnn rois batch size
+ BATCH_ROIS: -1
+ BATCH_ROIS_OHEM: 128
+ # rcnn rois sampling params
+ FG_FRACTION: 0.25
+ FG_THRESH: 0.5
+ BG_THRESH_HI: 0.5
+ BG_THRESH_LO: 0.0
+ # rcnn bounding box regression params
+ BBOX_REGRESSION_THRESH: 0.5
+ BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+
+ # RPN anchor loader
+ # rpn anchors batch size
+ RPN_BATCH_SIZE: 256
+ # rpn anchors sampling params
+ RPN_FG_FRACTION: 0.5
+ RPN_POSITIVE_OVERLAP: 0.7
+ RPN_NEGATIVE_OVERLAP: 0.3
+ RPN_CLOBBER_POSITIVES: false
+ # rpn bounding box regression params
+ RPN_BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+ RPN_POSITIVE_WEIGHT: -1.0
+ # used for end2end training
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # approximate bounding box regression
+ BBOX_NORMALIZATION_PRECOMPUTED: true
+ BBOX_MEANS:
+ - 0.0
+ - 0.0
+ - 0.0
+ - 0.0
+ BBOX_STDS:
+ - 0.1
+ - 0.1
+ - 0.2
+ - 0.2
+TEST:
+ # use rpn to generate proposal
+ HAS_RPN: true
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # RPN generate proposal
+ PROPOSAL_NMS_THRESH: 0.7
+ PROPOSAL_PRE_NMS_TOP_N: 20000
+ PROPOSAL_POST_NMS_TOP_N: 2000
+ PROPOSAL_MIN_SIZE: 0
+ # RCNN nms
+ NMS: 0.3
+ test_epoch: 7
+
diff --git a/experiments/rfcn/cfgs/deform_psroi_demo.yaml b/experiments/rfcn/cfgs/deform_psroi_demo.yaml
new file mode 100644
index 0000000..4961749
--- /dev/null
+++ b/experiments/rfcn/cfgs/deform_psroi_demo.yaml
@@ -0,0 +1,152 @@
+---
+MXNET_VERSION: "mxnet"
+output_path: "./output/rfcn"
+symbol: deform_psroi_demo
+gpus: '0'
+CLASS_AGNOSTIC: true
+SCALES:
+- 600
+- 1000
+default:
+ frequent: 100
+ kvstore: device
+network:
+ pretrained: "./model/pretrained_model/resnet_v1_101"
+ pretrained_epoch: 0
+ PIXEL_MEANS:
+ - 103.06
+ - 115.90
+ - 123.15
+ IMAGE_STRIDE: 0
+ RCNN_FEAT_STRIDE: 16
+ RPN_FEAT_STRIDE: 16
+ FIXED_PARAMS:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - gamma
+ - beta
+ FIXED_PARAMS_SHARED:
+ - conv1
+ - bn_conv1
+ - res2
+ - bn2
+ - res3
+ - bn3
+ - res4
+ - bn4
+ - gamma
+ - beta
+ ANCHOR_RATIOS:
+ - 0.5
+ - 1
+ - 2
+ ANCHOR_SCALES:
+ - 8
+ - 16
+ - 32
+ NUM_ANCHORS: 9
+dataset:
+ NUM_CLASSES: 21
+ dataset: PascalVOC
+ dataset_path: "./data/VOCdevkit"
+ image_set: 2007_trainval+2012_trainval
+ root_path: "./data"
+ test_image_set: 2007_test
+ proposal: selective_search
+TRAIN:
+ lr: 0.0005
+ lr_step: '4.83'
+ warmup: true
+ warmup_lr: 0.00005
+ # typically we use 4000 warmup steps for a single GPU on VOC
+ warmup_step: 1000
+ begin_epoch: 0
+ end_epoch: 8
+ model_prefix: 'rfcn_voc'
+ # whether resume training
+ RESUME: false
+ # whether flip image
+ FLIP: true
+ # whether shuffle image
+ SHUFFLE: true
+ # whether use OHEM
+ ENABLE_OHEM: true
+ # size of images for each device, 2 for rcnn, 1 for rpn and e2e
+ BATCH_IMAGES: 1
+ # e2e changes behavior of anchor loader and metric
+ END2END: false
+ # group images with similar aspect ratio
+ ASPECT_GROUPING: true
+ # R-CNN
+ # rcnn rois batch size
+ BATCH_ROIS: -1
+ BATCH_ROIS_OHEM: 128
+ # rcnn rois sampling params
+ FG_FRACTION: 0.25
+ FG_THRESH: 0.5
+ BG_THRESH_HI: 0.5
+ BG_THRESH_LO: 0.0
+ # rcnn bounding box regression params
+ BBOX_REGRESSION_THRESH: 0.5
+ BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+
+ # RPN anchor loader
+ # rpn anchors batch size
+ RPN_BATCH_SIZE: 256
+ # rpn anchors sampling params
+ RPN_FG_FRACTION: 0.5
+ RPN_POSITIVE_OVERLAP: 0.7
+ RPN_NEGATIVE_OVERLAP: 0.3
+ RPN_CLOBBER_POSITIVES: false
+ # rpn bounding box regression params
+ RPN_BBOX_WEIGHTS:
+ - 1.0
+ - 1.0
+ - 1.0
+ - 1.0
+ RPN_POSITIVE_WEIGHT: -1.0
+ # used for end2end training
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # approximate bounding box regression
+ BBOX_NORMALIZATION_PRECOMPUTED: true
+ BBOX_MEANS:
+ - 0.0
+ - 0.0
+ - 0.0
+ - 0.0
+ BBOX_STDS:
+ - 0.1
+ - 0.1
+ - 0.2
+ - 0.2
+TEST:
+ # use rpn to generate proposal
+ HAS_RPN: false
+ # size of images for each device
+ BATCH_IMAGES: 1
+ # RPN proposal
+ CXX_PROPOSAL: false
+ RPN_NMS_THRESH: 0.7
+ RPN_PRE_NMS_TOP_N: 6000
+ RPN_POST_NMS_TOP_N: 300
+ RPN_MIN_SIZE: 0
+ # RPN generate proposal
+ PROPOSAL_NMS_THRESH: 0.7
+ PROPOSAL_PRE_NMS_TOP_N: 20000
+ PROPOSAL_POST_NMS_TOP_N: 2000
+ PROPOSAL_MIN_SIZE: 0
+ # RCNN nms
+ NMS: 0.3
+ test_epoch: 7
+
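Unlike `deform_conv_demo.yaml` above, this demo config runs without an RPN (`TEST.HAS_RPN: false`) and sets `proposal: selective_search`, i.e. it expects externally supplied region proposals, presumably so that the deformable PS-RoI pooling parts can be visualized for given boxes.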
diff --git a/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_dcn_end2end_ohem.yaml b/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_dcn_end2end_ohem.yaml
index 78198e3..dbb8cd3 100644
--- a/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_dcn_end2end_ohem.yaml
+++ b/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_dcn_end2end_ohem.yaml
@@ -1,6 +1,6 @@
---
MXNET_VERSION: "mxnet"
-output_path: "./output/dcn_rfcn/voc"
+output_path: "./output/rfcn_dcn/voc"
symbol: resnet_v1_101_rfcn_dcn
gpus: '0,1,2,3'
CLASS_AGNOSTIC: true
diff --git a/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_end2end_ohem.yaml b/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_end2end_ohem.yaml
index 496d121..79d0494 100644
--- a/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_end2end_ohem.yaml
+++ b/experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_end2end_ohem.yaml
@@ -1,6 +1,6 @@
---
MXNET_VERSION: "mxnet"
-output_path: "./output/dcn_rfcn/voc"
+output_path: "./output/rfcn/voc"
symbol: resnet_v1_101_rfcn
gpus: '0,1,2,3'
CLASS_AGNOSTIC: true
diff --git a/faster_rcnn/__init__.py b/faster_rcnn/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/faster_rcnn/_init_paths.py b/faster_rcnn/_init_paths.py
new file mode 100644
index 0000000..5bbe057
--- /dev/null
+++ b/faster_rcnn/_init_paths.py
@@ -0,0 +1,11 @@
+import os.path as osp
+import sys
+
+def add_path(path):
+ if path not in sys.path:
+ sys.path.insert(0, path)
+
+this_dir = osp.dirname(__file__)
+
+lib_path = osp.join(this_dir, '..', 'lib')
+add_path(lib_path)
diff --git a/faster_rcnn/config/__init__.py b/faster_rcnn/config/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/faster_rcnn/config/config.py b/faster_rcnn/config/config.py
new file mode 100644
index 0000000..70845ea
--- /dev/null
+++ b/faster_rcnn/config/config.py
@@ -0,0 +1,188 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong, Bin Xiao
+# --------------------------------------------------------
+
+import yaml
+import numpy as np
+from easydict import EasyDict as edict
+
+config = edict()
+
+config.MXNET_VERSION = ''
+config.output_path = ''
+config.symbol = ''
+config.gpus = ''
+config.CLASS_AGNOSTIC = True
+config.SCALES = [(600, 1000)] # first is scale (the shorter side); second is max size
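+# e.g. a 375x500 input is scaled by 600/375 = 1.6 to 600x800; if the longer
+# side would then exceed 1000 px, the scale is clamped to 1000/longer_side.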
+
+# default training
+config.default = edict()
+config.default.frequent = 20
+config.default.kvstore = 'device'
+
+# network related params
+config.network = edict()
+config.network.pretrained = ''
+config.network.pretrained_epoch = 0
+config.network.PIXEL_MEANS = np.array([0, 0, 0])
+config.network.IMAGE_STRIDE = 0
+config.network.RPN_FEAT_STRIDE = 16
+config.network.RCNN_FEAT_STRIDE = 16
+config.network.FIXED_PARAMS = ['gamma', 'beta']
+config.network.FIXED_PARAMS_SHARED = ['gamma', 'beta']
+config.network.ANCHOR_SCALES = (8, 16, 32)
+config.network.ANCHOR_RATIOS = (0.5, 1, 2)
+config.network.NUM_ANCHORS = len(config.network.ANCHOR_SCALES) * len(config.network.ANCHOR_RATIOS)
+
+# dataset related params
+config.dataset = edict()
+config.dataset.dataset = 'PascalVOC'
+config.dataset.image_set = '2007_trainval'
+config.dataset.test_image_set = '2007_test'
+config.dataset.root_path = './data'
+config.dataset.dataset_path = './data/VOCdevkit'
+config.dataset.NUM_CLASSES = 21
+
+
+config.TRAIN = edict()
+
+config.TRAIN.lr = 0
+config.TRAIN.lr_step = ''
+config.TRAIN.lr_factor = 0.1
+config.TRAIN.warmup = False
+config.TRAIN.warmup_lr = 0
+config.TRAIN.warmup_step = 0
+config.TRAIN.momentum = 0.9
+config.TRAIN.wd = 0.0005
+config.TRAIN.begin_epoch = 0
+config.TRAIN.end_epoch = 0
+config.TRAIN.model_prefix = ''
+
+config.TRAIN.ALTERNATE = edict()
+config.TRAIN.ALTERNATE.RPN_BATCH_IMAGES = 0
+config.TRAIN.ALTERNATE.RCNN_BATCH_IMAGES = 0
+config.TRAIN.ALTERNATE.rpn1_lr = 0
+config.TRAIN.ALTERNATE.rpn1_lr_step = '' # recommend '2'
+config.TRAIN.ALTERNATE.rpn1_epoch = 0 # recommend 3
+config.TRAIN.ALTERNATE.rfcn1_lr = 0
+config.TRAIN.ALTERNATE.rfcn1_lr_step = '' # recommend '5'
+config.TRAIN.ALTERNATE.rfcn1_epoch = 0 # recommend 8
+config.TRAIN.ALTERNATE.rpn2_lr = 0
+config.TRAIN.ALTERNATE.rpn2_lr_step = '' # recommend '2'
+config.TRAIN.ALTERNATE.rpn2_epoch = 0 # recommend 3
+config.TRAIN.ALTERNATE.rfcn2_lr = 0
+config.TRAIN.ALTERNATE.rfcn2_lr_step = '' # recommend '5'
+config.TRAIN.ALTERNATE.rfcn2_epoch = 0 # recommend 8
+# optional
+config.TRAIN.ALTERNATE.rpn3_lr = 0
+config.TRAIN.ALTERNATE.rpn3_lr_step = '' # recommend '2'
+config.TRAIN.ALTERNATE.rpn3_epoch = 0 # recommend 3
+
+# whether resume training
+config.TRAIN.RESUME = False
+# whether flip image
+config.TRAIN.FLIP = True
+# whether shuffle image
+config.TRAIN.SHUFFLE = True
+# whether use OHEM
+config.TRAIN.ENABLE_OHEM = False
+# size of images for each device, 2 for rcnn, 1 for rpn and e2e
+config.TRAIN.BATCH_IMAGES = 2
+# e2e changes behavior of anchor loader and metric
+config.TRAIN.END2END = False
+# group images with similar aspect ratio
+config.TRAIN.ASPECT_GROUPING = True
+
+# R-CNN
+# rcnn rois batch size
+config.TRAIN.BATCH_ROIS = 128
+config.TRAIN.BATCH_ROIS_OHEM = 128
+# rcnn rois sampling params
+config.TRAIN.FG_FRACTION = 0.25
+config.TRAIN.FG_THRESH = 0.5
+config.TRAIN.BG_THRESH_HI = 0.5
+config.TRAIN.BG_THRESH_LO = 0.0
+# rcnn bounding box regression params
+config.TRAIN.BBOX_REGRESSION_THRESH = 0.5
+config.TRAIN.BBOX_WEIGHTS = np.array([1.0, 1.0, 1.0, 1.0])
+
+# RPN anchor loader
+# rpn anchors batch size
+config.TRAIN.RPN_BATCH_SIZE = 256
+# rpn anchors sampling params
+config.TRAIN.RPN_FG_FRACTION = 0.5
+config.TRAIN.RPN_POSITIVE_OVERLAP = 0.7
+config.TRAIN.RPN_NEGATIVE_OVERLAP = 0.3
+config.TRAIN.RPN_CLOBBER_POSITIVES = False
+# rpn bounding box regression params
+config.TRAIN.RPN_BBOX_WEIGHTS = (1.0, 1.0, 1.0, 1.0)
+config.TRAIN.RPN_POSITIVE_WEIGHT = -1.0
+
+# used for end2end training
+# RPN proposal
+config.TRAIN.CXX_PROPOSAL = True
+config.TRAIN.RPN_NMS_THRESH = 0.7
+config.TRAIN.RPN_PRE_NMS_TOP_N = 12000
+config.TRAIN.RPN_POST_NMS_TOP_N = 2000
+config.TRAIN.RPN_MIN_SIZE = config.network.RPN_FEAT_STRIDE
+# approximate bounding box regression
+config.TRAIN.BBOX_NORMALIZATION_PRECOMPUTED = False
+config.TRAIN.BBOX_MEANS = (0.0, 0.0, 0.0, 0.0)
+config.TRAIN.BBOX_STDS = (0.1, 0.1, 0.2, 0.2)
+
+config.TEST = edict()
+
+# R-CNN testing
+# use rpn to generate proposal
+config.TEST.HAS_RPN = False
+# size of images for each device
+config.TEST.BATCH_IMAGES = 1
+
+# RPN proposal
+config.TEST.CXX_PROPOSAL = True
+config.TEST.RPN_NMS_THRESH = 0.7
+config.TEST.RPN_PRE_NMS_TOP_N = 6000
+config.TEST.RPN_POST_NMS_TOP_N = 300
+config.TEST.RPN_MIN_SIZE = config.network.RPN_FEAT_STRIDE
+
+# RPN generate proposal
+config.TEST.PROPOSAL_NMS_THRESH = 0.7
+config.TEST.PROPOSAL_PRE_NMS_TOP_N = 20000
+config.TEST.PROPOSAL_POST_NMS_TOP_N = 2000
+config.TEST.PROPOSAL_MIN_SIZE = config.network.RPN_FEAT_STRIDE
+
+# RCNN nms
+config.TEST.NMS = 0.3
+
+config.TEST.max_per_image = 300
+
+# Test Model Epoch
+config.TEST.test_epoch = 0
+
+
+def update_config(config_file):
+ exp_config = None
+ with open(config_file) as f:
+ exp_config = edict(yaml.safe_load(f))  # safe_load: these configs need no arbitrary YAML tags
+ for k, v in exp_config.items():
+ if k in config:
+ if isinstance(v, dict):
+ if k == 'TRAIN':
+ if 'BBOX_WEIGHTS' in v:
+ v['BBOX_WEIGHTS'] = np.array(v['BBOX_WEIGHTS'])
+ elif k == 'network':
+ if 'PIXEL_MEANS' in v:
+ v['PIXEL_MEANS'] = np.array(v['PIXEL_MEANS'])
+ for vk, vv in v.items():
+ config[k][vk] = vv
+ else:
+ if k == 'SCALES':
+ config[k][0] = tuple(v)
+ else:
+ config[k] = v
+ else:
+ raise ValueError("key %s must exist in config.py" % k)
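A quick usage sketch for `update_config` (assuming the repository root is on `sys.path`; the printed values correspond to the VOC Faster R-CNN config added earlier in this patch):

```python
from faster_rcnn.config.config import config, update_config

update_config('experiments/faster_rcnn/cfgs/resnet_v1_101_voc0712_rcnn_end2end.yaml')
print(config.symbol)     # resnet_v1_101_rcnn
print(config.TRAIN.lr)   # 0.0005
print(config.SCALES[0])  # (600, 1000)
```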
diff --git a/faster_rcnn/core/DataParallelExecutorGroup.py b/faster_rcnn/core/DataParallelExecutorGroup.py
new file mode 100644
index 0000000..9579514
--- /dev/null
+++ b/faster_rcnn/core/DataParallelExecutorGroup.py
@@ -0,0 +1,591 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+import logging
+import numpy as np
+
+from mxnet import context as ctx
+from mxnet import ndarray as nd
+from mxnet.io import DataDesc
+from mxnet.executor_manager import _split_input_slice
+
+
+
+def _load_general(data, targets, major_axis):
+ """Load a list of arrays into a list of arrays specified by slices"""
+ for d_src, d_targets in zip(data, targets):
+ if isinstance(d_targets, nd.NDArray):
+ d_src.copyto(d_targets)
+ elif isinstance(d_src, (list, tuple)):
+ for src, dst in zip(d_src, d_targets):
+ src.copyto(dst)
+ else:
+ raise NotImplementedError
+
+
+
+def _load_data(batch, targets, major_axis):
+ """Load data into sliced arrays"""
+ _load_general(batch.data, targets, major_axis)
+
+
+def _load_label(batch, targets, major_axis):
+ """Load label into sliced arrays"""
+ _load_general(batch.label, targets, major_axis)
+
+
+def _merge_multi_context(outputs, major_axis):
+ """Merge outputs that lives on multiple context into one, so that they look
+ like living on one context.
+ """
+ rets = []
+ for tensors, axis in zip(outputs, major_axis):
+ if axis >= 0:
+ rets.append(nd.concatenate(tensors, axis=axis, always_copy=False))
+ else:
+ # a negative axis means there is no batch_size axis, and all the
+ # results should be the same on each device. We simply take the
+ # first one, without checking they are actually the same
+ rets.append(tensors[0])
+ return rets
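+# Example: for outputs = [[out_gpu0, out_gpu1]] and major_axis = [0], two
+# per-device (N, C) NDArrays are concatenated into one (2N, C) NDArray, as
+# if produced by a single executor; with a negative axis only the first
+# copy is kept.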
+
+
+
+class DataParallelExecutorGroup(object):
+ """DataParallelExecutorGroup is a group of executors that lives on a group of devices.
+ This is a helper class used to implement data parallelization. Each mini-batch will
+ be split and run on the devices.
+
+ Parameters
+ ----------
+ symbol : Symbol
+ The common symbolic computation graph for all executors.
+ contexts : list
+ A list of contexts.
+ workload : list
+ If not `None`, could be a list of numbers that specify the workload to be assigned
+ to different contexts. Larger numbers indicate heavier workloads.
+ data_shapes : list
+ Should be a list of (name, shape) tuples, for the shapes of data. Note the order is
+ important and should be the same as the order in which the `DataIter` provides the data.
+ label_shapes : list
+ Should be a list of (name, shape) tuples, for the shapes of label. Note the order is
+ important and should be the same as the order in which the `DataIter` provides the labels.
+ param_names : list
+ A list of strings, indicating the names of parameters (e.g. weights, filters, etc.)
+ in the computation graph.
+ for_training : bool
+ Indicate whether the executors should be bound for training. When not doing training,
+ the memory for gradients will not be allocated.
+ inputs_need_grad : bool
+ Indicate whether the gradients for the input data should be computed. This is currently
+ not used. It will be useful for implementing composition of modules.
+ shared_group : DataParallelExecutorGroup
+ Default is `None`. This is used in bucketing. When not `None`, it should be an executor
+ group corresponding to a different bucket. In other words, it will correspond to a different
+ symbol but with the same set of parameters (e.g. unrolled RNNs with different lengths).
+ In this case, much memory will be shared.
+ logger : Logger
+ Default is `logging`.
+ fixed_param_names: list of str
+ Indicate parameters to be fixed during training. Parameters in this list will not have
+ gradient space allocated, nor will gradients be computed for them.
+ grad_req : str, list of str, dict of str to str
+ Requirement for gradient accumulation. Can be 'write', 'add', or 'null'
+ (default to 'write').
+ Can be specified globally (str) or for each argument (list, dict).
+ """
+ def __init__(self, symbol, contexts, workload, data_shapes, label_shapes, param_names,
+ for_training, inputs_need_grad, shared_group=None, logger=logging,
+ fixed_param_names=None, grad_req='write', state_names=None):
+ self.param_names = param_names
+ self.arg_names = symbol.list_arguments()
+ self.aux_names = symbol.list_auxiliary_states()
+
+ self.symbol = symbol
+ self.contexts = contexts
+ self.workload = workload
+
+ self.for_training = for_training
+ self.inputs_need_grad = inputs_need_grad
+
+ self.logger = logger
+ # In the future we should have a better way to profile memory per device (haibin)
+ # self._total_exec_bytes = 0
+ self.fixed_param_names = fixed_param_names
+ if self.fixed_param_names is None:
+ self.fixed_param_names = []
+
+ self.state_names = state_names
+ if self.state_names is None:
+ self.state_names = []
+
+ if not for_training:
+ grad_req = 'null'
+
+ # data_shapes = [x if isinstance(x, DataDesc) else DataDesc(*x) for x in data_shapes]
+ # if label_shapes is not None:
+ # label_shapes = [x if isinstance(x, DataDesc) else DataDesc(*x) for x in label_shapes]
+
+ data_names = [x.name for x in data_shapes[0]]
+
+ if isinstance(grad_req, str):
+ self.grad_req = {}
+ for k in self.arg_names:
+ if k in self.param_names:
+ self.grad_req[k] = 'null' if k in self.fixed_param_names else grad_req
+ elif k in data_names:
+ self.grad_req[k] = grad_req if self.inputs_need_grad else 'null'
+ else:
+ self.grad_req[k] = 'null'
+ elif isinstance(grad_req, (list, tuple)):
+ assert len(grad_req) == len(self.arg_names)
+ self.grad_req = dict(zip(self.arg_names, grad_req))
+ elif isinstance(grad_req, dict):
+ self.grad_req = {}
+ for k in self.arg_names:
+ if k in self.param_names:
+ self.grad_req[k] = 'null' if k in self.fixed_param_names else 'write'
+ elif k in data_names:
+ self.grad_req[k] = 'write' if self.inputs_need_grad else 'null'
+ else:
+ self.grad_req[k] = 'null'
+ self.grad_req.update(grad_req)
+ else:
+ raise ValueError("grad_req must be one of str, list, tuple, or dict.")
+
+ if shared_group is not None:
+ self.shared_data_arrays = shared_group.shared_data_arrays
+ else:
+ self.shared_data_arrays = [{} for _ in contexts]
+
+ # initialize some instance variables
+ self.batch_size = len(data_shapes)
+ self.slices = None
+ self.execs = []
+ self._default_execs = None
+ self.data_arrays = None
+ self.label_arrays = None
+ self.param_arrays = None
+ self.state_arrays = None
+ self.grad_arrays = None
+ self.aux_arrays = None
+ self.input_grad_arrays = None
+
+ self.data_shapes = None
+ self.label_shapes = None
+ self.data_layouts = None
+ self.label_layouts = None
+ self.output_layouts = [DataDesc.get_batch_axis(self.symbol[name].attr('__layout__'))
+ for name in self.symbol.list_outputs()]
+ self.bind_exec(data_shapes, label_shapes, shared_group)
+
+ def decide_slices(self, data_shapes):
+ """Decide the slices for each context according to the workload.
+
+ Parameters
+ ----------
+ data_shapes : list
+ list of (name, shape) specifying the shapes for the input data or label.
+ """
+ assert len(data_shapes) > 0
+ major_axis = [DataDesc.get_batch_axis(x.layout) for x in data_shapes]
+
+ for (name, shape), axis in zip(data_shapes, major_axis):
+ if axis == -1:
+ continue
+
+ batch_size = shape[axis]
+ if self.batch_size is not None:
+ assert batch_size == self.batch_size, ("all data must have the same batch size: "
+ + ("batch_size = %d, but " % self.batch_size)
+ + ("%s has shape %s" % (name, shape)))
+ else:
+ self.batch_size = batch_size
+ self.slices = _split_input_slice(self.batch_size, self.workload)
+
+ return major_axis
+
+ def _collect_arrays(self):
+ """Collect internal arrays from executors."""
+ # convenient data structures
+ self.data_arrays = [[e.arg_dict[name] for name, _ in self.data_shapes[0]] for e in self.execs]
+
+ self.state_arrays = [[e.arg_dict[name] for e in self.execs]
+ for name in self.state_names]
+
+ if self.label_shapes is not None:
+ self.label_arrays = [[e.arg_dict[name] for name, _ in self.label_shapes[0]] for e in self.execs]
+ else:
+ self.label_arrays = None
+
+ self.param_arrays = [[exec_.arg_arrays[i] for exec_ in self.execs]
+ for i, name in enumerate(self.arg_names)
+ if name in self.param_names]
+ if self.for_training:
+ self.grad_arrays = [[exec_.grad_arrays[i] for exec_ in self.execs]
+ for i, name in enumerate(self.arg_names)
+ if name in self.param_names]
+ else:
+ self.grad_arrays = None
+
+ data_names = [x[0] for x in self.data_shapes]
+ if self.inputs_need_grad:
+ self.input_grad_arrays = [[exec_.grad_arrays[i] for exec_ in self.execs]
+ for i, name in enumerate(self.arg_names)
+ if name in data_names]
+ else:
+ self.input_grad_arrays = None
+
+ self.aux_arrays = [[exec_.aux_arrays[i] for exec_ in self.execs]
+ for i in range(len(self.aux_names))]
+
+ def bind_exec(self, data_shapes, label_shapes, shared_group=None, reshape=False):
+ """Bind executors on their respective devices.
+
+ Parameters
+ ----------
+ data_shapes : list
+ label_shapes : list
+ shared_group : DataParallelExecutorGroup
+ reshape : bool
+ """
+ assert reshape or not self.execs
+
+ for i in range(len(self.contexts)):
+ data_shapes_i = data_shapes[i]
+ if label_shapes is not None:
+ label_shapes_i = label_shapes[i]
+ else:
+ label_shapes_i = []
+
+ if reshape:
+ self.execs[i] = self._default_execs[i].reshape(
+ allow_up_sizing=True, **dict(data_shapes_i + label_shapes_i))
+ else:
+ self.execs.append(self._bind_ith_exec(i, data_shapes_i, label_shapes_i,
+ shared_group))
+
+ self.data_shapes = data_shapes
+ self.label_shapes = label_shapes
+ self._collect_arrays()
+
+ def reshape(self, data_shapes, label_shapes):
+ """Reshape executors.
+
+ Parameters
+ ----------
+ data_shapes : list
+ label_shapes : list
+ """
+ if self._default_execs is None:
+ self._default_execs = [i for i in self.execs]
+ for i in range(len(self.contexts)):
+ self.execs[i] = self._default_execs[i].reshape(
+ allow_up_sizing=True, **dict(data_shapes[i] + (label_shapes[i] if label_shapes is not None else []))
+ )
+ self.data_shapes = data_shapes
+ self.label_shapes = label_shapes
+ self._collect_arrays()
+
+
+ def set_params(self, arg_params, aux_params):
+ """Assign, i.e. copy parameters to all the executors.
+
+ Parameters
+ ----------
+ arg_params : dict
+ A dictionary of name to `NDArray` parameter mapping.
+ aux_params : dict
+ A dictionary of name to `NDArray` auxiliary variable mapping.
+ """
+ for exec_ in self.execs:
+ exec_.copy_params_from(arg_params, aux_params)
+
+ def get_params(self, arg_params, aux_params):
+ """ Copy data from each executor to `arg_params` and `aux_params`.
+
+ Parameters
+ ----------
+ arg_params : list of NDArray
+ target parameter arrays
+ aux_params : list of NDArray
+ target aux arrays
+
+ Notes
+ -----
+ - This function updates the NDArrays in arg_params and aux_params in place.
+ """
+ for name, block in zip(self.param_names, self.param_arrays):
+ weight = sum(w.copyto(ctx.cpu()) for w in block) / len(block)
+ weight.astype(arg_params[name].dtype).copyto(arg_params[name])
+ for name, block in zip(self.aux_names, self.aux_arrays):
+ weight = sum(w.copyto(ctx.cpu()) for w in block) / len(block)
+ weight.astype(aux_params[name].dtype).copyto(aux_params[name])
+
+ def forward(self, data_batch, is_train=None):
+ """Split `data_batch` according to workload and run forward on each devices.
+
+ Parameters
+ ----------
+ data_batch : DataBatch
+ Or could be any object implementing similar interface.
+ is_train : bool
+ The hint for the backend, indicating whether we are in the training phase.
+ Default is `None`, in which case the value of `self.for_training` is used.
+ """
+ _load_data(data_batch, self.data_arrays, self.data_layouts)
+ if is_train is None:
+ is_train = self.for_training
+
+ if self.label_arrays is not None:
+ assert not is_train or data_batch.label
+ if data_batch.label:
+ _load_label(data_batch, self.label_arrays, self.label_layouts)
+
+ for exec_ in self.execs:
+ exec_.forward(is_train=is_train)
+
+
+ def get_outputs(self, merge_multi_context=True):
+ """Get outputs of the previous forward computation.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the outputs
+ will be collected from multiple devices. A `True` value indicates that we
+ should merge the collected results so that they look as if they came from a
+ single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ outputs = [[exec_.outputs[i] for exec_ in self.execs]
+ for i in range(len(self.execs[0].outputs))]
+ if merge_multi_context:
+ outputs = _merge_multi_context(outputs, self.output_layouts)
+ return outputs
+
+ def get_states(self, merge_multi_context=True):
+ """Get states from all devices
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the states
+ will be collected from multiple devices. A `True` value indicates that we
+ should merge the collected results so that they look as if they came from a
+ single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert not merge_multi_context, \
+ "merge_multi_context=True is not supported for get_states yet."
+ return self.state_arrays
+
+ def set_states(self, states=None, value=None):
+ """Set value for states. Only one of states & value can be specified.
+
+ Parameters
+ ----------
+ states : list of list of NDArrays
+ source states arrays formatted like [[state1_dev1, state1_dev2],
+ [state2_dev1, state2_dev2]].
+ value : number
+ a single scalar value for all state arrays.
+ """
+ if states is not None:
+ assert value is None, "Only one of states & value can be specified."
+ _load_general(states, self.state_arrays, (0,)*len(states))
+ else:
+ assert value is not None, "At least one of states & value must be specified."
+ assert states is None, "Only one of states & value can be specified."
+ for d_dst in self.state_arrays:
+ for dst in d_dst:
+ dst[:] = value
+
+ def get_input_grads(self, merge_multi_context=True):
+ """Get the gradients with respect to the inputs of the module.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+ Default is `True`. In the case when data-parallelism is used, the outputs
+ will be collected from multiple devices. A `True` value indicates that we
+ should merge the collected results so that they look as if they came from a
+ single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[grad1, grad2]`. Otherwise, it
+ is like `[[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert self.inputs_need_grad
+ if merge_multi_context:
+ return _merge_multi_context(self.input_grad_arrays, self.data_layouts)
+ return self.input_grad_arrays
+
+ def backward(self, out_grads=None):
+ """Run backward on all devices. A backward should be called after
+ a call to the forward function. Backward cannot be called unless
+ `self.for_training` is `True`.
+
+ Parameters
+ ----------
+ out_grads : NDArray or list of NDArray, optional
+ Gradient on the outputs to be propagated back.
+ This parameter is only needed when bind is called
+ on outputs that are not a loss function.
+ """
+ assert self.for_training, 're-bind with for_training=True to run backward'
+ if out_grads is None:
+ out_grads = []
+
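+ # NOTE: this simplified version does not slice `out_grads` across devices;
+ # an empty list is passed to each executor, which assumes the bound symbol
+ # ends in loss ops that provide their own head gradients.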
+ for i, exec_ in enumerate(self.execs):
+ out_grads_slice = []
+ exec_.backward(out_grads=out_grads_slice)
+
+ def update_metric(self, eval_metric, labels):
+ """Accumulate the performance according to `eval_metric` on all devices.
+
+ Parameters
+ ----------
+ eval_metric : EvalMetric
+ The metric used for evaluation.
+ labels : list of NDArray
+ Typically comes from `label` of a `DataBatch`.
+ """
+ for texec, labels_ in zip(self.execs, labels):  # avoid shadowing the argument
+ eval_metric.update(labels_, texec.outputs)
+
+ def _bind_ith_exec(self, i, data_shapes, label_shapes, shared_group):
+ """Internal utility function to bind the i-th executor.
+ """
+ shared_exec = None if shared_group is None else shared_group.execs[i]
+ context = self.contexts[i]
+ shared_data_arrays = self.shared_data_arrays[i]
+
+ input_shapes = dict(data_shapes)
+ if label_shapes is not None:
+ input_shapes.update(dict(label_shapes))
+
+ arg_shapes, _, aux_shapes = self.symbol.infer_shape(**input_shapes)
+ assert arg_shapes is not None, "shape inference failed"
+
+ input_types = {x.name: x.dtype for x in data_shapes}
+ if label_shapes is not None:
+ input_types.update({x.name: x.dtype for x in label_shapes})
+ arg_types, _, aux_types = self.symbol.infer_type(**input_types)
+ assert arg_types is not None, "type inference failed"
+
+ arg_arrays = []
+ grad_arrays = {} if self.for_training else None
+
+ def _get_or_reshape(name, shared_data_arrays, arg_shape, arg_type, context, logger):
+ """Internal helper to get a memory block or re-use by re-shaping"""
+ if name in shared_data_arrays:
+ arg_arr = shared_data_arrays[name]
+
+ if np.prod(arg_arr.shape) >= np.prod(arg_shape):
+ # nice, we can directly re-use this data blob
+ assert arg_arr.dtype == arg_type
+ arg_arr = arg_arr.reshape(arg_shape)
+ else:
+ logger.warning(('bucketing: data "%s" has a shape %s' % (name, arg_shape)) +
+ (', which is larger than already allocated ') +
+ ('shape %s' % (arg_arr.shape,)) +
+ ('. Need to re-allocate. Consider putting ') +
+ ('default_bucket_key to') +
+ (' be the bucket taking the largest input for better ') +
+ ('memory sharing.'))
+ arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
+
+ # replace existing shared array because the new one is bigger
+ shared_data_arrays[name] = arg_arr
+ else:
+ arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
+ shared_data_arrays[name] = arg_arr
+
+ return arg_arr
+
+ # create or borrow arguments and gradients
+ for j in range(len(self.arg_names)):
+ name = self.arg_names[j]
+ if name in self.param_names: # model parameters
+ if shared_exec is None:
+ arg_arr = nd.zeros(arg_shapes[j], context, dtype=arg_types[j])
+ if self.grad_req[name] != 'null':
+ grad_arr = nd.zeros(arg_shapes[j], context, dtype=arg_types[j])
+ grad_arrays[name] = grad_arr
+ else:
+ arg_arr = shared_exec.arg_dict[name]
+ assert arg_arr.shape == arg_shapes[j]
+ assert arg_arr.dtype == arg_types[j]
+ if self.grad_req[name] != 'null':
+ grad_arrays[name] = shared_exec.grad_dict[name]
+ else: # data, label, or states
+ arg_arr = _get_or_reshape(name, shared_data_arrays, arg_shapes[j], arg_types[j],
+ context, self.logger)
+
+ # data might also need grad if inputs_need_grad is True
+ if self.grad_req[name] != 'null':
+ grad_arrays[name] = _get_or_reshape('grad of ' + name, shared_data_arrays,
+ arg_shapes[j], arg_types[j], context,
+ self.logger)
+
+ arg_arrays.append(arg_arr)
+
+ # create or borrow aux variables
+ if shared_exec is None:
+ aux_arrays = [nd.zeros(s, context, dtype=t) for s, t in zip(aux_shapes, aux_types)]
+ else:
+ for j, arr in enumerate(shared_exec.aux_arrays):
+ assert aux_shapes[j] == arr.shape
+ assert aux_types[j] == arr.dtype
+ aux_arrays = shared_exec.aux_arrays[:]
+
+ executor = self.symbol.bind(ctx=context, args=arg_arrays,
+ args_grad=grad_arrays, aux_states=aux_arrays,
+ grad_req=self.grad_req, shared_exec=shared_exec)
+ return executor
+
+ def _sliced_shape(self, shapes, i, major_axis):
+ """Get the sliced shapes for the i-th executor.
+
+ Parameters
+ ----------
+        shapes : list of DataDesc
+            The original data descriptions, carrying (name, shape) information.
+        i : int
+            Which executor we are dealing with.
+        major_axis : list of int
+            The batch axis of each input; a negative axis means the input is
+            not sliced across devices.
+        """
+ sliced_shapes = []
+ for desc, axis in zip(shapes, major_axis):
+ shape = list(desc.shape)
+ if axis >= 0:
+ shape[axis] = self.slices[i].stop - self.slices[i].start
+ sliced_shapes.append(DataDesc(desc.name, tuple(shape), desc.dtype, desc.layout))
+ return sliced_shapes
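+    # Worked example of the slicing above (shapes are hypothetical): with
+    # self.slices = [slice(0, 2), slice(2, 4)], a batch-major 'data' input of
+    # shape (4, 3, 600, 1000) becomes (2, 3, 600, 1000) for the second
+    # executor; an input whose major_axis entry is negative keeps its full
+    # shape on every device.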
+
+ def install_monitor(self, mon):
+ """Install monitor on all executors"""
+ for exe in self.execs:
+ mon.install(exe)
diff --git a/faster_rcnn/core/__init__.py b/faster_rcnn/core/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/faster_rcnn/core/callback.py b/faster_rcnn/core/callback.py
new file mode 100644
index 0000000..4286f43
--- /dev/null
+++ b/faster_rcnn/core/callback.py
@@ -0,0 +1,56 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+import time
+import logging
+import mxnet as mx
+
+
+class Speedometer(object):
+ def __init__(self, batch_size, frequent=50):
+ self.batch_size = batch_size
+ self.frequent = frequent
+ self.init = False
+ self.tic = 0
+ self.last_count = 0
+
+ def __call__(self, param):
+ """Callback to Show speed."""
+ count = param.nbatch
+ if self.last_count > count:
+ self.init = False
+ self.last_count = count
+
+ if self.init:
+ if count % self.frequent == 0:
+ speed = self.frequent * self.batch_size / (time.time() - self.tic)
+ s = ''
+ if param.eval_metric is not None:
+ name, value = param.eval_metric.get()
+ s = "Epoch[%d] Batch [%d]\tSpeed: %.2f samples/sec\tTrain-" % (param.epoch, count, speed)
+ for n, v in zip(name, value):
+ s += "%s=%f,\t" % (n, v)
+ else:
+ s = "Iter[%d] Batch [%d]\tSpeed: %.2f samples/sec" % (param.epoch, count, speed)
+
+ logging.info(s)
+ print(s)
+ self.tic = time.time()
+ else:
+ self.init = True
+ self.tic = time.time()
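+
+# A minimal usage sketch (the module and iterator names are illustrative, not
+# part of this file): pass the callback to a fit()-style training loop, where
+# it logs throughput and the running training metric every `frequent` batches.
+# >>> mod.fit(train_data, eval_data,
+# ...         batch_end_callback=Speedometer(batch_size=2, frequent=100))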
+
+
+def do_checkpoint(prefix, means, stds):
+ def _callback(iter_no, sym, arg, aux):
+ arg['bbox_pred_weight_test'] = (arg['bbox_pred_weight'].T * mx.nd.array(stds)).T
+ arg['bbox_pred_bias_test'] = arg['bbox_pred_bias'] * mx.nd.array(stds) + mx.nd.array(means)
+ mx.model.save_checkpoint(prefix, iter_no + 1, sym, arg, aux)
+ arg.pop('bbox_pred_weight_test')
+ arg.pop('bbox_pred_bias_test')
+ return _callback
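+
+# A usage sketch (prefix and statistics are illustrative): bbox regression is
+# trained on normalized targets, so the callback bakes `means`/`stds` into
+# extra 'bbox_pred_*_test' weights in every checkpoint, letting test code skip
+# the unnormalization step.
+# >>> means, stds = [0., 0., 0., 0.], [0.1, 0.1, 0.2, 0.2]
+# >>> epoch_end_callback = do_checkpoint('output/rfcn_voc', means, stds)
+# >>> epoch_end_callback(0, sym, arg_params, aux_params)  # writes epoch-1 files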
diff --git a/faster_rcnn/core/loader.py b/faster_rcnn/core/loader.py
new file mode 100644
index 0000000..78de81b
--- /dev/null
+++ b/faster_rcnn/core/loader.py
@@ -0,0 +1,506 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+import numpy as np
+import mxnet as mx
+from mxnet.executor_manager import _split_input_slice
+
+from config.config import config
+from utils.image import tensor_vstack
+from rpn.rpn import get_rpn_testbatch, get_rpn_batch, assign_anchor
+from rcnn import get_rcnn_testbatch, get_rcnn_batch
+
+
+class TestLoader(mx.io.DataIter):
+ def __init__(self, roidb, config, batch_size=1, shuffle=False,
+ has_rpn=False):
+ super(TestLoader, self).__init__()
+
+ # save parameters as properties
+ self.cfg = config
+ self.roidb = roidb
+ self.batch_size = batch_size
+ self.shuffle = shuffle
+ self.has_rpn = has_rpn
+
+ # infer properties from roidb
+ self.size = len(self.roidb)
+ self.index = np.arange(self.size)
+
+ # decide data and label names (only for training)
+ if has_rpn:
+ self.data_name = ['data', 'im_info']
+ else:
+ self.data_name = ['data', 'rois']
+ self.label_name = None
+
+ # status variable for synchronization between get_data and get_label
+ self.cur = 0
+ self.data = None
+ self.label = []
+ self.im_info = None
+
+ # get first batch to fill in provide_data and provide_label
+ self.reset()
+ self.get_batch()
+
+ @property
+ def provide_data(self):
+ return [[(k, v.shape) for k, v in zip(self.data_name, idata)] for idata in self.data]
+
+ @property
+ def provide_label(self):
+ return [None for _ in range(len(self.data))]
+
+ @property
+ def provide_data_single(self):
+ return [(k, v.shape) for k, v in zip(self.data_name, self.data[0])]
+
+ @property
+ def provide_label_single(self):
+ return None
+
+ def reset(self):
+ self.cur = 0
+ if self.shuffle:
+ np.random.shuffle(self.index)
+
+ def iter_next(self):
+ return self.cur < self.size
+
+ def next(self):
+ if self.iter_next():
+ self.get_batch()
+ self.cur += self.batch_size
+ return self.im_info, mx.io.DataBatch(data=self.data, label=self.label,
+ pad=self.getpad(), index=self.getindex(),
+ provide_data=self.provide_data, provide_label=self.provide_label)
+ else:
+ raise StopIteration
+
+ def getindex(self):
+        return self.cur // self.batch_size
+
+ def getpad(self):
+ if self.cur + self.batch_size > self.size:
+ return self.cur + self.batch_size - self.size
+ else:
+ return 0
+
+ def get_batch(self):
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ roidb = [self.roidb[self.index[i]] for i in range(cur_from, cur_to)]
+ if self.has_rpn:
+ data, label, im_info = get_rpn_testbatch(roidb, self.cfg)
+ else:
+ data, label, im_info = get_rcnn_testbatch(roidb, self.cfg)
+ self.data = [[mx.nd.array(idata[name]) for name in self.data_name] for idata in data]
+ self.im_info = im_info
+
+ def get_batch_individual(self):
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ roidb = [self.roidb[self.index[i]] for i in range(cur_from, cur_to)]
+ if self.has_rpn:
+ data, label, im_info = get_rpn_testbatch(roidb, self.cfg)
+ else:
+ data, label, im_info = get_rcnn_testbatch(roidb, self.cfg)
+ self.data = [mx.nd.array(data[name]) for name in self.data_name]
+ self.im_info = im_info
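+
+# A test-time sketch (the detector call is illustrative; roidb comes from the
+# dataset code elsewhere in this repo). Note that, unlike a stock
+# mx.io.DataIter, next() returns the image info alongside the DataBatch:
+# >>> test_data = TestLoader(roidb, config, batch_size=1, has_rpn=True)
+# >>> for im_info, data_batch in test_data:
+# ...     pass  # run the detector on data_batch, rescale boxes with im_info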
+
+
+class ROIIter(mx.io.DataIter):
+ def __init__(self, roidb, config, batch_size=2, shuffle=False, ctx=None, work_load_list=None, aspect_grouping=False):
+ """
+        This iterator provides ROI data to the Fast R-CNN network.
+ :param roidb: must be preprocessed
+ :param batch_size: must divide BATCH_SIZE(128)
+ :param shuffle: bool
+ :param ctx: list of contexts
+ :param work_load_list: list of work load
+ :param aspect_grouping: group images with similar aspects
+ :return: ROIIter
+ """
+ super(ROIIter, self).__init__()
+
+ # save parameters as properties
+ self.roidb = roidb
+ self.cfg = config
+ self.batch_size = batch_size
+ self.shuffle = shuffle
+ self.ctx = ctx
+ if self.ctx is None:
+ self.ctx = [mx.cpu()]
+ self.work_load_list = work_load_list
+ self.aspect_grouping = aspect_grouping
+
+ # infer properties from roidb
+ self.size = len(roidb)
+ self.index = np.arange(self.size)
+
+ # decide data and label names (only for training)
+ self.data_name = ['data', 'rois']
+ self.label_name = ['label', 'bbox_target', 'bbox_weight']
+
+ # status variable for synchronization between get_data and get_label
+ self.cur = 0
+ self.batch = None
+ self.data = None
+ self.label = None
+
+ # get first batch to fill in provide_data and provide_label
+ self.reset()
+ self.get_batch_individual()
+
+ @property
+ def provide_data(self):
+ return [[(k, v.shape) for k, v in zip(self.data_name, self.data[i])] for i in xrange(len(self.data))]
+
+ @property
+ def provide_label(self):
+ return [[(k, v.shape) for k, v in zip(self.label_name, self.label[i])] for i in xrange(len(self.data))]
+
+ @property
+ def provide_data_single(self):
+ return [(k, v.shape) for k, v in zip(self.data_name, self.data[0])]
+
+ @property
+ def provide_label_single(self):
+ return [(k, v.shape) for k, v in zip(self.label_name, self.label[0])]
+
+ def reset(self):
+ self.cur = 0
+ if self.shuffle:
+ if self.aspect_grouping:
+ widths = np.array([r['width'] for r in self.roidb])
+ heights = np.array([r['height'] for r in self.roidb])
+ horz = (widths >= heights)
+ vert = np.logical_not(horz)
+ horz_inds = np.where(horz)[0]
+ vert_inds = np.where(vert)[0]
+ inds = np.hstack((np.random.permutation(horz_inds), np.random.permutation(vert_inds)))
+                extra = inds.shape[0] % self.batch_size
+                # guard against extra == 0, where inds[:-extra] would be empty
+                cut = inds.shape[0] - extra
+                inds_ = np.reshape(inds[:cut], (-1, self.batch_size))
+                row_perm = np.random.permutation(np.arange(inds_.shape[0]))
+                inds[:cut] = np.reshape(inds_[row_perm, :], (-1,))
+ self.index = inds
+ else:
+ np.random.shuffle(self.index)
+
+ def iter_next(self):
+ return self.cur + self.batch_size <= self.size
+
+ def next(self):
+ if self.iter_next():
+ self.get_batch_individual()
+ self.cur += self.batch_size
+ return mx.io.DataBatch(data=self.data, label=self.label,
+ pad=self.getpad(), index=self.getindex(),
+ provide_data=self.provide_data, provide_label=self.provide_label)
+ else:
+ raise StopIteration
+
+ def getindex(self):
+        return self.cur // self.batch_size
+
+ def getpad(self):
+ if self.cur + self.batch_size > self.size:
+ return self.cur + self.batch_size - self.size
+ else:
+ return 0
+
+ def get_batch(self):
+ # slice roidb
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ roidb = [self.roidb[self.index[i]] for i in range(cur_from, cur_to)]
+
+ # decide multi device slices
+ work_load_list = self.work_load_list
+ ctx = self.ctx
+ if work_load_list is None:
+ work_load_list = [1] * len(ctx)
+ assert isinstance(work_load_list, list) and len(work_load_list) == len(ctx), \
+ "Invalid settings for work load. "
+ slices = _split_input_slice(self.batch_size, work_load_list)
+
+ # get each device
+ data_list = []
+ label_list = []
+ for islice in slices:
+ iroidb = [roidb[i] for i in range(islice.start, islice.stop)]
+ data, label = get_rcnn_batch(iroidb, self.cfg)
+ data_list.append(data)
+ label_list.append(label)
+
+ all_data = dict()
+ for key in data_list[0].keys():
+ all_data[key] = tensor_vstack([batch[key] for batch in data_list])
+
+ all_label = dict()
+ for key in label_list[0].keys():
+ all_label[key] = tensor_vstack([batch[key] for batch in label_list])
+
+ self.data = [mx.nd.array(all_data[name]) for name in self.data_name]
+ self.label = [mx.nd.array(all_label[name]) for name in self.label_name]
+
+ def get_batch_individual(self):
+ # slice roidb
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ roidb = [self.roidb[self.index[i]] for i in range(cur_from, cur_to)]
+
+ # decide multi device slices
+ work_load_list = self.work_load_list
+ ctx = self.ctx
+ if work_load_list is None:
+ work_load_list = [1] * len(ctx)
+ assert isinstance(work_load_list, list) and len(work_load_list) == len(ctx), \
+ "Invalid settings for work load. "
+ slices = _split_input_slice(self.batch_size, work_load_list)
+
+ rst = []
+ for idx, islice in enumerate(slices):
+ iroidb = [roidb[i] for i in range(islice.start, islice.stop)]
+ rst.append(self.parfetch(iroidb))
+
+ all_data = [_['data'] for _ in rst]
+ all_label = [_['label'] for _ in rst]
+ self.data = [[mx.nd.array(data[key]) for key in self.data_name] for data in all_data]
+ self.label = [[mx.nd.array(label[key]) for key in self.label_name] for label in all_label]
+
+ def parfetch(self, iroidb):
+ data, label = get_rcnn_batch(iroidb, self.cfg)
+ return {'data': data, 'label': label}
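+
+# Usage sketch (argument values are illustrative): one image per device, with
+# aspect grouping so mostly-horizontal and mostly-vertical images are not
+# mixed within a batch, which would otherwise inflate padding.
+# >>> train_data = ROIIter(roidb, config, batch_size=len(ctx), shuffle=True,
+# ...                      ctx=ctx, aspect_grouping=True)
+# >>> batch = train_data.next()  # DataBatch with per-device data/label lists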
+
+
+class AnchorLoader(mx.io.DataIter):
+
+ def __init__(self, feat_sym, roidb, cfg, batch_size=1, shuffle=False, ctx=None, work_load_list=None,
+ feat_stride=16, anchor_scales=(8, 16, 32), anchor_ratios=(0.5, 1, 2), allowed_border=0,
+ aspect_grouping=False):
+ """
+        This iterator provides anchor data for training the RPN (and, end to end,
+        the full detection network).
+        :param feat_sym: to infer shape of assign_output
+ :param roidb: must be preprocessed
+ :param batch_size: must divide BATCH_SIZE(128)
+ :param shuffle: bool
+ :param ctx: list of contexts
+ :param work_load_list: list of work load
+ :param aspect_grouping: group images with similar aspects
+ :return: AnchorLoader
+ """
+ super(AnchorLoader, self).__init__()
+
+ # save parameters as properties
+ self.feat_sym = feat_sym
+ self.roidb = roidb
+ self.cfg = cfg
+ self.batch_size = batch_size
+ self.shuffle = shuffle
+ self.ctx = ctx
+ if self.ctx is None:
+ self.ctx = [mx.cpu()]
+ self.work_load_list = work_load_list
+ self.feat_stride = feat_stride
+ self.anchor_scales = anchor_scales
+ self.anchor_ratios = anchor_ratios
+ self.allowed_border = allowed_border
+ self.aspect_grouping = aspect_grouping
+
+ # infer properties from roidb
+ self.size = len(roidb)
+ self.index = np.arange(self.size)
+
+ # decide data and label names
+ if config.TRAIN.END2END:
+ self.data_name = ['data', 'im_info', 'gt_boxes']
+ else:
+ self.data_name = ['data']
+ self.label_name = ['label', 'bbox_target', 'bbox_weight']
+
+ # status variable for synchronization between get_data and get_label
+ self.cur = 0
+ self.batch = None
+ self.data = None
+ self.label = None
+
+ # get first batch to fill in provide_data and provide_label
+ self.reset()
+ self.get_batch_individual()
+
+ @property
+ def provide_data(self):
+ return [[(k, v.shape) for k, v in zip(self.data_name, self.data[i])] for i in xrange(len(self.data))]
+
+ @property
+ def provide_label(self):
+ return [[(k, v.shape) for k, v in zip(self.label_name, self.label[i])] for i in xrange(len(self.data))]
+
+ @property
+ def provide_data_single(self):
+ return [(k, v.shape) for k, v in zip(self.data_name, self.data[0])]
+
+ @property
+ def provide_label_single(self):
+ return [(k, v.shape) for k, v in zip(self.label_name, self.label[0])]
+
+ def reset(self):
+ self.cur = 0
+ if self.shuffle:
+ if self.aspect_grouping:
+ widths = np.array([r['width'] for r in self.roidb])
+ heights = np.array([r['height'] for r in self.roidb])
+ horz = (widths >= heights)
+ vert = np.logical_not(horz)
+ horz_inds = np.where(horz)[0]
+ vert_inds = np.where(vert)[0]
+ inds = np.hstack((np.random.permutation(horz_inds), np.random.permutation(vert_inds)))
+                extra = inds.shape[0] % self.batch_size
+                # guard against extra == 0, where inds[:-extra] would be empty
+                cut = inds.shape[0] - extra
+                inds_ = np.reshape(inds[:cut], (-1, self.batch_size))
+                row_perm = np.random.permutation(np.arange(inds_.shape[0]))
+                inds[:cut] = np.reshape(inds_[row_perm, :], (-1,))
+ self.index = inds
+ else:
+ np.random.shuffle(self.index)
+
+ def iter_next(self):
+ return self.cur + self.batch_size <= self.size
+
+ def next(self):
+ if self.iter_next():
+ self.get_batch_individual()
+ self.cur += self.batch_size
+ return mx.io.DataBatch(data=self.data, label=self.label,
+ pad=self.getpad(), index=self.getindex(),
+ provide_data=self.provide_data, provide_label=self.provide_label)
+ else:
+ raise StopIteration
+
+ def getindex(self):
+        return self.cur // self.batch_size
+
+ def getpad(self):
+ if self.cur + self.batch_size > self.size:
+ return self.cur + self.batch_size - self.size
+ else:
+ return 0
+
+ def infer_shape(self, max_data_shape=None, max_label_shape=None):
+ """ Return maximum data and label shape for single gpu """
+ if max_data_shape is None:
+ max_data_shape = []
+ if max_label_shape is None:
+ max_label_shape = []
+ max_shapes = dict(max_data_shape + max_label_shape)
+ input_batch_size = max_shapes['data'][0]
+ im_info = [[max_shapes['data'][2], max_shapes['data'][3], 1.0]]
+ _, feat_shape, _ = self.feat_sym.infer_shape(**max_shapes)
+ label = assign_anchor(feat_shape[0], np.zeros((0, 5)), im_info, self.cfg,
+ self.feat_stride, self.anchor_scales, self.anchor_ratios, self.allowed_border)
+ label = [label[k] for k in self.label_name]
+ label_shape = [(k, tuple([input_batch_size] + list(v.shape[1:]))) for k, v in zip(self.label_name, label)]
+ return max_data_shape, label_shape
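+    # Example use of infer_shape for max-shape binding (values illustrative):
+    # the largest data shape fixes the RPN feature-map size, from which the
+    # largest label shapes follow.
+    # >>> max_data_shape = [('data', (1, 3, 1000, 1000))]
+    # >>> max_data_shape, max_label_shape = train_data.infer_shape(max_data_shape)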
+
+ def get_batch(self):
+ # slice roidb
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ roidb = [self.roidb[self.index[i]] for i in range(cur_from, cur_to)]
+
+ # decide multi device slice
+ work_load_list = self.work_load_list
+ ctx = self.ctx
+ if work_load_list is None:
+ work_load_list = [1] * len(ctx)
+ assert isinstance(work_load_list, list) and len(work_load_list) == len(ctx), \
+ "Invalid settings for work load. "
+ slices = _split_input_slice(self.batch_size, work_load_list)
+
+ # get testing data for multigpu
+ data_list = []
+ label_list = []
+ for islice in slices:
+ iroidb = [roidb[i] for i in range(islice.start, islice.stop)]
+ data, label = get_rpn_batch(iroidb, self.cfg)
+ data_list.append(data)
+ label_list.append(label)
+
+ # pad data first and then assign anchor (read label)
+ data_tensor = tensor_vstack([batch['data'] for batch in data_list])
+ for data, data_pad in zip(data_list, data_tensor):
+ data['data'] = data_pad[np.newaxis, :]
+
+ new_label_list = []
+ for data, label in zip(data_list, label_list):
+ # infer label shape
+ data_shape = {k: v.shape for k, v in data.items()}
+ del data_shape['im_info']
+ _, feat_shape, _ = self.feat_sym.infer_shape(**data_shape)
+ feat_shape = [int(i) for i in feat_shape[0]]
+
+ # add gt_boxes to data for e2e
+ data['gt_boxes'] = label['gt_boxes'][np.newaxis, :, :]
+
+ # assign anchor for label
+ label = assign_anchor(feat_shape, label['gt_boxes'], data['im_info'], self.cfg,
+ self.feat_stride, self.anchor_scales,
+ self.anchor_ratios, self.allowed_border)
+ new_label_list.append(label)
+
+ all_data = dict()
+ for key in self.data_name:
+ all_data[key] = tensor_vstack([batch[key] for batch in data_list])
+
+ all_label = dict()
+ for key in self.label_name:
+ pad = -1 if key == 'label' else 0
+ all_label[key] = tensor_vstack([batch[key] for batch in new_label_list], pad=pad)
+
+ self.data = [mx.nd.array(all_data[key]) for key in self.data_name]
+ self.label = [mx.nd.array(all_label[key]) for key in self.label_name]
+
+ def get_batch_individual(self):
+ cur_from = self.cur
+ cur_to = min(cur_from + self.batch_size, self.size)
+ roidb = [self.roidb[self.index[i]] for i in range(cur_from, cur_to)]
+ # decide multi device slice
+ work_load_list = self.work_load_list
+ ctx = self.ctx
+ if work_load_list is None:
+ work_load_list = [1] * len(ctx)
+ assert isinstance(work_load_list, list) and len(work_load_list) == len(ctx), \
+ "Invalid settings for work load. "
+ slices = _split_input_slice(self.batch_size, work_load_list)
+ rst = []
+ for idx, islice in enumerate(slices):
+ iroidb = [roidb[i] for i in range(islice.start, islice.stop)]
+ rst.append(self.parfetch(iroidb))
+ all_data = [_['data'] for _ in rst]
+ all_label = [_['label'] for _ in rst]
+ self.data = [[mx.nd.array(data[key]) for key in self.data_name] for data in all_data]
+ self.label = [[mx.nd.array(label[key]) for key in self.label_name] for label in all_label]
+
+ def parfetch(self, iroidb):
+ # get testing data for multigpu
+ data, label = get_rpn_batch(iroidb, self.cfg)
+ data_shape = {k: v.shape for k, v in data.items()}
+ del data_shape['im_info']
+ _, feat_shape, _ = self.feat_sym.infer_shape(**data_shape)
+ feat_shape = [int(i) for i in feat_shape[0]]
+
+ # add gt_boxes to data for e2e
+ data['gt_boxes'] = label['gt_boxes'][np.newaxis, :, :]
+
+ # assign anchor for label
+ label = assign_anchor(feat_shape, label['gt_boxes'], data['im_info'], self.cfg,
+ self.feat_stride, self.anchor_scales,
+ self.anchor_ratios, self.allowed_border)
+ return {'data': data, 'label': label}
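+
+# Construction sketch, following this repo's training scripts (argument values
+# are illustrative): feat_sym is the RPN score output, used only to infer the
+# feature-map size for anchor assignment.
+# >>> feat_sym = sym.get_internals()['rpn_cls_score_output']
+# >>> train_data = AnchorLoader(feat_sym, roidb, cfg, batch_size=len(ctx),
+# ...                           shuffle=True, ctx=ctx, feat_stride=16,
+# ...                           anchor_scales=(8, 16, 32), aspect_grouping=True)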
+
diff --git a/faster_rcnn/core/metric.py b/faster_rcnn/core/metric.py
new file mode 100644
index 0000000..52f885b
--- /dev/null
+++ b/faster_rcnn/core/metric.py
@@ -0,0 +1,176 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+import mxnet as mx
+import numpy as np
+
+
+def get_rpn_names():
+ pred = ['rpn_cls_prob', 'rpn_bbox_loss']
+ label = ['rpn_label', 'rpn_bbox_target', 'rpn_bbox_weight']
+ return pred, label
+
+
+def get_rcnn_names(cfg):
+ pred = ['rcnn_cls_prob', 'rcnn_bbox_loss']
+ label = ['rcnn_label', 'rcnn_bbox_target', 'rcnn_bbox_weight']
+ if cfg.TRAIN.ENABLE_OHEM or cfg.TRAIN.END2END:
+ pred.append('rcnn_label')
+ if cfg.TRAIN.END2END:
+ rpn_pred, rpn_label = get_rpn_names()
+ pred = rpn_pred + pred
+ label = rpn_label
+ return pred, label
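+
+# For example, with cfg.TRAIN.END2END = True and OHEM disabled, this resolves to
+#   pred  = ['rpn_cls_prob', 'rpn_bbox_loss', 'rcnn_cls_prob', 'rcnn_bbox_loss', 'rcnn_label']
+#   label = ['rpn_label', 'rpn_bbox_target', 'rpn_bbox_weight']
+# i.e. the RCNN label is read from the network outputs rather than the data batch.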
+
+
+class RPNAccMetric(mx.metric.EvalMetric):
+ def __init__(self):
+ super(RPNAccMetric, self).__init__('RPNAcc')
+ self.pred, self.label = get_rpn_names()
+
+ def update(self, labels, preds):
+ pred = preds[self.pred.index('rpn_cls_prob')]
+ label = labels[self.label.index('rpn_label')]
+
+ # pred (b, c, p) or (b, c, h, w)
+ pred_label = mx.ndarray.argmax_channel(pred).asnumpy().astype('int32')
+ pred_label = pred_label.reshape((pred_label.shape[0], -1))
+ # label (b, p)
+ label = label.asnumpy().astype('int32')
+
+ # filter with keep_inds
+ keep_inds = np.where(label != -1)
+ pred_label = pred_label[keep_inds]
+ label = label[keep_inds]
+
+ self.sum_metric += np.sum(pred_label.flat == label.flat)
+ self.num_inst += len(pred_label.flat)
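+
+# Tiny worked example of the masking above: for label = [1, -1, 0, 1] and
+# pred_label = [1, 0, 0, 0], the -1 entry is ignored, so sum_metric grows by 2
+# (two of the three kept anchors are correct) and num_inst by 3.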
+
+
+class RCNNAccMetric(mx.metric.EvalMetric):
+ def __init__(self, cfg):
+ super(RCNNAccMetric, self).__init__('RCNNAcc')
+ self.e2e = cfg.TRAIN.END2END
+ self.ohem = cfg.TRAIN.ENABLE_OHEM
+ self.pred, self.label = get_rcnn_names(cfg)
+
+ def update(self, labels, preds):
+ pred = preds[self.pred.index('rcnn_cls_prob')]
+ if self.ohem or self.e2e:
+ label = preds[self.pred.index('rcnn_label')]
+ else:
+ label = labels[self.label.index('rcnn_label')]
+
+ last_dim = pred.shape[-1]
+ pred_label = pred.asnumpy().reshape(-1, last_dim).argmax(axis=1).astype('int32')
+ label = label.asnumpy().reshape(-1,).astype('int32')
+
+ # filter with keep_inds
+ keep_inds = np.where(label != -1)
+ pred_label = pred_label[keep_inds]
+ label = label[keep_inds]
+
+ self.sum_metric += np.sum(pred_label.flat == label.flat)
+ self.num_inst += len(pred_label.flat)
+
+
+class RPNLogLossMetric(mx.metric.EvalMetric):
+ def __init__(self):
+ super(RPNLogLossMetric, self).__init__('RPNLogLoss')
+ self.pred, self.label = get_rpn_names()
+
+ def update(self, labels, preds):
+ pred = preds[self.pred.index('rpn_cls_prob')]
+ label = labels[self.label.index('rpn_label')]
+
+ # label (b, p)
+ label = label.asnumpy().astype('int32').reshape((-1))
+ # pred (b, c, p) or (b, c, h, w) --> (b, p, c) --> (b*p, c)
+ pred = pred.asnumpy().reshape((pred.shape[0], pred.shape[1], -1)).transpose((0, 2, 1))
+ pred = pred.reshape((label.shape[0], -1))
+
+ # filter with keep_inds
+ keep_inds = np.where(label != -1)[0]
+ label = label[keep_inds]
+ cls = pred[keep_inds, label]
+
+ cls += 1e-14
+ cls_loss = -1 * np.log(cls)
+ cls_loss = np.sum(cls_loss)
+ self.sum_metric += cls_loss
+ self.num_inst += label.shape[0]
+
+
+class RCNNLogLossMetric(mx.metric.EvalMetric):
+ def __init__(self, cfg):
+ super(RCNNLogLossMetric, self).__init__('RCNNLogLoss')
+ self.e2e = cfg.TRAIN.END2END
+ self.ohem = cfg.TRAIN.ENABLE_OHEM
+ self.pred, self.label = get_rcnn_names(cfg)
+
+ def update(self, labels, preds):
+ pred = preds[self.pred.index('rcnn_cls_prob')]
+ if self.ohem or self.e2e:
+ label = preds[self.pred.index('rcnn_label')]
+ else:
+ label = labels[self.label.index('rcnn_label')]
+
+ last_dim = pred.shape[-1]
+ pred = pred.asnumpy().reshape(-1, last_dim)
+ label = label.asnumpy().reshape(-1,).astype('int32')
+
+ # filter with keep_inds
+ keep_inds = np.where(label != -1)[0]
+ label = label[keep_inds]
+ cls = pred[keep_inds, label]
+
+ cls += 1e-14
+ cls_loss = -1 * np.log(cls)
+ cls_loss = np.sum(cls_loss)
+ self.sum_metric += cls_loss
+ self.num_inst += label.shape[0]
+
+
+class RPNL1LossMetric(mx.metric.EvalMetric):
+ def __init__(self):
+ super(RPNL1LossMetric, self).__init__('RPNL1Loss')
+ self.pred, self.label = get_rpn_names()
+
+ def update(self, labels, preds):
+ bbox_loss = preds[self.pred.index('rpn_bbox_loss')].asnumpy()
+
+ # calculate num_inst (average on those kept anchors)
+ label = labels[self.label.index('rpn_label')].asnumpy()
+ num_inst = np.sum(label != -1)
+
+ self.sum_metric += np.sum(bbox_loss)
+ self.num_inst += num_inst
+
+
+class RCNNL1LossMetric(mx.metric.EvalMetric):
+ def __init__(self, cfg):
+ super(RCNNL1LossMetric, self).__init__('RCNNL1Loss')
+ self.e2e = cfg.TRAIN.END2END
+ self.ohem = cfg.TRAIN.ENABLE_OHEM
+ self.pred, self.label = get_rcnn_names(cfg)
+
+ def update(self, labels, preds):
+ bbox_loss = preds[self.pred.index('rcnn_bbox_loss')].asnumpy()
+        if self.ohem or self.e2e:
+            label = preds[self.pred.index('rcnn_label')].asnumpy()
+        else:
+            label = labels[self.label.index('rcnn_label')].asnumpy()
+
+ # calculate num_inst (average on those kept anchors)
+ num_inst = np.sum(label != -1)
+
+ self.sum_metric += np.sum(bbox_loss)
+ self.num_inst += num_inst
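+
+# These metrics are meant to be combined into one composite metric, as in this
+# repo's training scripts; a sketch, assuming an end-to-end config `cfg`:
+# >>> eval_metrics = mx.metric.CompositeEvalMetric()
+# >>> for child_metric in [RPNAccMetric(), RPNLogLossMetric(), RPNL1LossMetric(),
+# ...                      RCNNAccMetric(cfg), RCNNLogLossMetric(cfg), RCNNL1LossMetric(cfg)]:
+# ...     eval_metrics.add(child_metric)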
diff --git a/faster_rcnn/core/module.py b/faster_rcnn/core/module.py
new file mode 100644
index 0000000..25924fb
--- /dev/null
+++ b/faster_rcnn/core/module.py
@@ -0,0 +1,1067 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+"""A `MutableModule` implement the `BaseModule` API, and allows input shape
+varying with training iterations. If shapes vary, executors will rebind,
+using shared arrays from the initial module binded with maximum shape.
+"""
+
+import time
+import logging
+import warnings
+
+from mxnet import context as ctx
+from mxnet.initializer import Uniform, InitDesc
+from mxnet.module.base_module import BaseModule, _check_input_names, _parse_data_desc, _as_list
+from mxnet.model import _create_kvstore, _initialize_kvstore, _update_params, _update_params_on_kvstore, load_checkpoint, BatchEndParam
+from mxnet import metric
+
+from .DataParallelExecutorGroup import DataParallelExecutorGroup
+from mxnet import ndarray as nd
+from mxnet import optimizer as opt
+
+
+class Module(BaseModule):
+ """Module is a basic module that wrap a `Symbol`. It is functionally the same
+ as the `FeedForward` model, except under the module API.
+
+ Parameters
+ ----------
+ symbol : Symbol
+ data_names : list of str
+ Default is `('data')` for a typical model used in image classification.
+ label_names : list of str
+ Default is `('softmax_label')` for a typical model used in image
+ classification.
+ logger : Logger
+ Default is `logging`.
+ context : Context or list of Context
+ Default is `cpu()`.
+ work_load_list : list of number
+ Default `None`, indicating uniform workload.
+ fixed_param_names: list of str
+ Default `None`, indicating no network parameters are fixed.
+ state_names : list of str
+ states are similar to data and label, but not provided by data iterator.
+ Instead they are initialized to 0 and can be set by set_states()
+ """
+ def __init__(self, symbol, data_names=('data',), label_names=('softmax_label',),
+ logger=logging, context=ctx.cpu(), work_load_list=None,
+ fixed_param_names=None, state_names=None):
+ super(Module, self).__init__(logger=logger)
+
+ if isinstance(context, ctx.Context):
+ context = [context]
+ self._context = context
+ if work_load_list is None:
+ work_load_list = [1] * len(self._context)
+ assert len(work_load_list) == len(self._context)
+ self._work_load_list = work_load_list
+
+ self._symbol = symbol
+
+ data_names = list(data_names) if data_names is not None else []
+ label_names = list(label_names) if label_names is not None else []
+ state_names = list(state_names) if state_names is not None else []
+ fixed_param_names = list(fixed_param_names) if fixed_param_names is not None else []
+
+ _check_input_names(symbol, data_names, "data", True)
+ _check_input_names(symbol, label_names, "label", False)
+ _check_input_names(symbol, state_names, "state", True)
+ _check_input_names(symbol, fixed_param_names, "fixed_param", True)
+
+ arg_names = symbol.list_arguments()
+ input_names = data_names + label_names + state_names
+ self._param_names = [x for x in arg_names if x not in input_names]
+ self._fixed_param_names = fixed_param_names
+ self._aux_names = symbol.list_auxiliary_states()
+ self._data_names = data_names
+ self._label_names = label_names
+ self._state_names = state_names
+ self._output_names = symbol.list_outputs()
+
+ self._arg_params = None
+ self._aux_params = None
+ self._params_dirty = False
+
+ self._optimizer = None
+ self._kvstore = None
+ self._update_on_kvstore = None
+ self._updater = None
+ self._preload_opt_states = None
+ self._grad_req = None
+
+ self._exec_group = None
+ self._data_shapes = None
+ self._label_shapes = None
+
+ @staticmethod
+ def load(prefix, epoch, load_optimizer_states=False, **kwargs):
+ """Create a model from previously saved checkpoint.
+
+ Parameters
+ ----------
+ prefix : str
+ path prefix of saved model files. You should have
+ "prefix-symbol.json", "prefix-xxxx.params", and
+ optionally "prefix-xxxx.states", where xxxx is the
+ epoch number.
+ epoch : int
+ epoch to load.
+ load_optimizer_states : bool
+ whether to load optimizer states. Checkpoint needs
+ to have been made with save_optimizer_states=True.
+ data_names : list of str
+ Default is `('data')` for a typical model used in image classification.
+ label_names : list of str
+ Default is `('softmax_label')` for a typical model used in image
+ classification.
+ logger : Logger
+ Default is `logging`.
+ context : Context or list of Context
+ Default is `cpu()`.
+ work_load_list : list of number
+ Default `None`, indicating uniform workload.
+ fixed_param_names: list of str
+ Default `None`, indicating no network parameters are fixed.
+ """
+ sym, args, auxs = load_checkpoint(prefix, epoch)
+ mod = Module(symbol=sym, **kwargs)
+ mod._arg_params = args
+ mod._aux_params = auxs
+ mod.params_initialized = True
+ if load_optimizer_states:
+ mod._preload_opt_states = '%s-%04d.states'%(prefix, epoch)
+ return mod
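+    # Example (prefix and epoch are illustrative): resume training from epoch 8,
+    # restoring the optimizer states saved alongside the parameters.
+    # >>> mod = Module.load('model/rfcn', 8, load_optimizer_states=True,
+    # ...                   data_names=('data', 'im_info'), context=ctx.gpu(0))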
+
+ def save_checkpoint(self, prefix, epoch, save_optimizer_states=False):
+ """Save current progress to checkpoint.
+ Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
+
+ Parameters
+ ----------
+ prefix : str
+ The file prefix to checkpoint to
+ epoch : int
+ The current epoch number
+ save_optimizer_states : bool
+ Whether to save optimizer states for continue training
+ """
+ self._symbol.save('%s-symbol.json'%prefix)
+ param_name = '%s-%04d.params' % (prefix, epoch)
+ self.save_params(param_name)
+ logging.info('Saved checkpoint to \"%s\"', param_name)
+ if save_optimizer_states:
+ state_name = '%s-%04d.states' % (prefix, epoch)
+ self.save_optimizer_states(state_name)
+ logging.info('Saved optimizer state to \"%s\"', state_name)
+
+ def _reset_bind(self):
+ """Internal function to reset binded state."""
+ self.binded = False
+ self._exec_group = None
+ self._data_shapes = None
+ self._label_shapes = None
+
+ @property
+ def data_names(self):
+ """A list of names for data required by this module."""
+ return self._data_names
+
+ @property
+ def label_names(self):
+ """A list of names for labels required by this module."""
+ return self._label_names
+
+ @property
+ def output_names(self):
+ """A list of names for the outputs of this module."""
+ return self._output_names
+
+ @property
+ def data_shapes(self):
+ """Get data shapes.
+ Returns
+ -------
+ A list of `(name, shape)` pairs.
+ """
+ assert self.binded
+ return self._data_shapes
+
+ @property
+ def label_shapes(self):
+ """Get label shapes.
+ Returns
+ -------
+ A list of `(name, shape)` pairs. The return value could be `None` if
+ the module does not need labels, or if the module is not binded for
+ training (in this case, label information is not available).
+ """
+ assert self.binded
+ return self._label_shapes
+
+ @property
+ def output_shapes(self):
+ """Get output shapes.
+ Returns
+ -------
+ A list of `(name, shape)` pairs.
+ """
+ assert self.binded
+ return self._exec_group.get_output_shapes()
+
+ def get_params(self):
+ """Get current parameters.
+ Returns
+ -------
+ `(arg_params, aux_params)`, each a dictionary of name to parameters (in
+ `NDArray`) mapping.
+ """
+ assert self.binded and self.params_initialized
+
+ if self._params_dirty:
+ self._sync_params_from_devices()
+ return (self._arg_params, self._aux_params)
+
+ def init_params(self, initializer=Uniform(0.01), arg_params=None, aux_params=None,
+ allow_missing=False, force_init=False):
+ """Initialize the parameters and auxiliary states.
+
+ Parameters
+ ----------
+ initializer : Initializer
+ Called to initialize parameters if needed.
+ arg_params : dict
+ If not None, should be a dictionary of existing arg_params. Initialization
+ will be copied from that.
+ aux_params : dict
+ If not None, should be a dictionary of existing aux_params. Initialization
+ will be copied from that.
+ allow_missing : bool
+ If true, params could contain missing values, and the initializer will be
+ called to fill those missing params.
+ force_init : bool
+ If true, will force re-initialize even if already initialized.
+ """
+ if self.params_initialized and not force_init:
+ warnings.warn("Parameters already initialized and force_init=False. "
+ "init_params call ignored.", stacklevel=2)
+ return
+ assert self.binded, 'call bind before initializing the parameters'
+
+ def _impl(name, arr, cache):
+ """Internal helper for parameter initialization"""
+ if cache is not None:
+ if name in cache:
+ cache_arr = cache[name]
+
+ # just in case the cached array is just the target itself
+ if cache_arr is not arr:
+ cache_arr.copyto(arr)
+ else:
+ if not allow_missing:
+ raise RuntimeError("%s is not presented" % name)
+                if initializer is not None:
+ initializer(name, arr)
+ else:
+ initializer(name, arr)
+
+ attrs = self._symbol.attr_dict()
+ for name, arr in self._arg_params.items():
+ desc = InitDesc(name, attrs.get(name, None))
+ _impl(desc, arr, arg_params)
+
+ for name, arr in self._aux_params.items():
+ desc = InitDesc(name, attrs.get(name, None))
+ _impl(desc, arr, aux_params)
+
+ self.params_initialized = True
+ self._params_dirty = False
+
+ # copy the initialized parameters to devices
+ self._exec_group.set_params(self._arg_params, self._aux_params)
+
+ def set_params(self, arg_params, aux_params, allow_missing=False, force_init=True):
+ """Assign parameter and aux state values.
+
+ Parameters
+ ----------
+ arg_params : dict
+ Dictionary of name to value (`NDArray`) mapping.
+ aux_params : dict
+ Dictionary of name to value (`NDArray`) mapping.
+ allow_missing : bool
+ If true, params could contain missing values, and the initializer will be
+ called to fill those missing params.
+ force_init : bool
+ If true, will force re-initialize even if already initialized.
+
+ Examples
+ --------
+ An example of setting module parameters::
+ >>> sym, arg_params, aux_params = \
+ >>> mx.model.load_checkpoint(model_prefix, n_epoch_load)
+ >>> mod.set_params(arg_params=arg_params, aux_params=aux_params)
+ """
+ if not allow_missing:
+ self.init_params(initializer=None, arg_params=arg_params, aux_params=aux_params,
+ allow_missing=allow_missing, force_init=force_init)
+ return
+
+ if self.params_initialized and not force_init:
+ warnings.warn("Parameters already initialized and force_init=False. "
+ "set_params call ignored.", stacklevel=2)
+ return
+
+ self._exec_group.set_params(arg_params, aux_params)
+
+ # because we didn't update self._arg_params, they are dirty now.
+ self._params_dirty = True
+ self.params_initialized = True
+
+ def bind(self, data_shapes, label_shapes=None, for_training=True,
+ inputs_need_grad=False, force_rebind=False, shared_module=None,
+ grad_req='write'):
+ """Bind the symbols to construct executors. This is necessary before one
+ can perform computation with the module.
+
+ Parameters
+ ----------
+ data_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_data`.
+ label_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_label`.
+ for_training : bool
+ Default is `True`. Whether the executors should be bind for training.
+ inputs_need_grad : bool
+ Default is `False`. Whether the gradients to the input data need to be computed.
+ Typically this is not needed. But this might be needed when implementing composition
+ of modules.
+ force_rebind : bool
+ Default is `False`. This function does nothing if the executors are already
+ binded. But with this `True`, the executors will be forced to rebind.
+ shared_module : Module
+ Default is `None`. This is used in bucketing. When not `None`, the shared module
+ essentially corresponds to a different bucket -- a module with different symbol
+ but with the same sets of parameters (e.g. unrolled RNNs with different lengths).
+ """
+        # force rebinding is typically used when one wants to switch from
+        # training to the prediction phase.
+ if force_rebind:
+ self._reset_bind()
+
+ if self.binded:
+ self.logger.warning('Already binded, ignoring bind()')
+ return
+
+ self.for_training = for_training
+ self.inputs_need_grad = inputs_need_grad
+ self.binded = True
+ self._grad_req = grad_req
+
+ if not for_training:
+ assert not inputs_need_grad
+ else:
+ pass
+            # this is not true, as some modules might not contain a loss function
+ # that consumes the labels
+ # assert label_shapes is not None
+
+ # self._data_shapes, self._label_shapes = _parse_data_desc(
+ # self.data_names, self.label_names, data_shapes, label_shapes)
+ self._data_shapes, self._label_shapes = zip(*[_parse_data_desc(self.data_names, self.label_names, data_shape, label_shape)
+ for data_shape, label_shape in zip(data_shapes, label_shapes)])
+ if self._label_shapes.count(None) == len(self._label_shapes):
+ self._label_shapes = None
+
+ if shared_module is not None:
+ assert isinstance(shared_module, Module) and \
+ shared_module.binded and shared_module.params_initialized
+ shared_group = shared_module._exec_group
+ else:
+ shared_group = None
+
+ self._exec_group = DataParallelExecutorGroup(self._symbol, self._context,
+ self._work_load_list, self._data_shapes,
+ self._label_shapes, self._param_names,
+ for_training, inputs_need_grad,
+ shared_group, logger=self.logger,
+ fixed_param_names=self._fixed_param_names,
+ grad_req=grad_req,
+ state_names=self._state_names)
+ # self._total_exec_bytes = self._exec_group._total_exec_bytes
+ if shared_module is not None:
+ self.params_initialized = True
+ self._arg_params = shared_module._arg_params
+ self._aux_params = shared_module._aux_params
+ elif self.params_initialized:
+ # if the parameters are already initialized, we are re-binding
+ # so automatically copy the already initialized params
+ self._exec_group.set_params(self._arg_params, self._aux_params)
+ else:
+ assert self._arg_params is None and self._aux_params is None
+ param_arrays = [
+ nd.zeros(x[0].shape, dtype=x[0].dtype)
+ for x in self._exec_group.param_arrays
+ ]
+ self._arg_params = {name:arr for name, arr in zip(self._param_names, param_arrays)}
+
+ aux_arrays = [
+ nd.zeros(x[0].shape, dtype=x[0].dtype)
+ for x in self._exec_group.aux_arrays
+ ]
+ self._aux_params = {name:arr for name, arr in zip(self._aux_names, aux_arrays)}
+
+ if shared_module is not None and shared_module.optimizer_initialized:
+ self.borrow_optimizer(shared_module)
+
+
+ def reshape(self, data_shapes, label_shapes=None):
+ """Reshape the module for new input shapes.
+
+ Parameters
+ ----------
+ data_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_data`.
+ label_shapes : list of (str, tuple)
+ Typically is `data_iter.provide_label`.
+ """
+ assert self.binded
+ # self._data_shapes, self._label_shapes = _parse_data_desc(
+ # self.data_names, self.label_names, data_shapes, label_shapes)
+ self._data_shapes, self._label_shapes = zip(*[_parse_data_desc(self.data_names, self.label_names, data_shape, label_shape)
+ for data_shape, label_shape in zip(data_shapes, label_shapes)])
+
+ self._exec_group.reshape(self._data_shapes, self._label_shapes)
+
+
+ def init_optimizer(self, kvstore='local', optimizer='sgd',
+ optimizer_params=(('learning_rate', 0.01),), force_init=False):
+ """Install and initialize optimizers.
+
+ Parameters
+ ----------
+ kvstore : str or KVStore
+ Default `'local'`.
+ optimizer : str or Optimizer
+ Default `'sgd'`
+ optimizer_params : dict
+ Default `(('learning_rate', 0.01),)`. The default value is not a dictionary,
+ just to avoid pylint warning of dangerous default values.
+ force_init : bool
+ Default `False`, indicating whether we should force re-initializing the
+ optimizer in the case an optimizer is already installed.
+ """
+ assert self.binded and self.params_initialized
+
+ if self.optimizer_initialized and not force_init:
+ self.logger.warning('optimizer already initialized, ignoring...')
+ return
+
+ (kvstore, update_on_kvstore) = \
+ _create_kvstore(kvstore, len(self._context), self._arg_params)
+
+ batch_size = self._exec_group.batch_size
+ if kvstore and 'dist' in kvstore.type and '_sync' in kvstore.type:
+ batch_size *= kvstore.num_workers
+ rescale_grad = 1.0/batch_size
+
+ if isinstance(optimizer, str):
+ idx2name = {}
+ if update_on_kvstore:
+ idx2name.update(enumerate(self._exec_group.param_names))
+ else:
+ for k in range(len(self._context)):
+ idx2name.update({i*len(self._context)+k: n
+ for i, n in enumerate(self._exec_group.param_names)})
+ optimizer_params = dict(optimizer_params)
+ if 'rescale_grad' not in optimizer_params:
+ optimizer_params['rescale_grad'] = rescale_grad
+ optimizer = opt.create(optimizer,
+ sym=self.symbol, param_idx2name=idx2name,
+ **optimizer_params)
+ else:
+ assert isinstance(optimizer, opt.Optimizer)
+ if optimizer.rescale_grad != rescale_grad:
+ #pylint: disable=no-member
+ warnings.warn(
+ "Optimizer created manually outside Module but rescale_grad " +
+ "is not normalized to 1.0/batch_size/num_workers (%s vs. %s). "%(
+ optimizer.rescale_grad, rescale_grad) +
+ "Is this intended?", stacklevel=2)
+
+ self._optimizer = optimizer
+ self._kvstore = kvstore
+ self._update_on_kvstore = update_on_kvstore
+ self._updater = None
+
+ if kvstore:
+ # copy initialized local parameters to kvstore
+ _initialize_kvstore(kvstore=kvstore,
+ param_arrays=self._exec_group.param_arrays,
+ arg_params=self._arg_params,
+ param_names=self._param_names,
+ update_on_kvstore=update_on_kvstore)
+ if update_on_kvstore:
+ kvstore.set_optimizer(self._optimizer)
+ else:
+ self._updater = opt.get_updater(optimizer)
+
+ self.optimizer_initialized = True
+
+ if self._preload_opt_states is not None:
+ self.load_optimizer_states(self._preload_opt_states)
+ self._preload_opt_states = None
+
+ def borrow_optimizer(self, shared_module):
+ """Borrow optimizer from a shared module. Used in bucketing, where exactly the same
+ optimizer (esp. kvstore) is used.
+
+ Parameters
+ ----------
+ shared_module : Module
+ """
+ assert shared_module.optimizer_initialized
+ self._optimizer = shared_module._optimizer
+ self._kvstore = shared_module._kvstore
+ self._update_on_kvstore = shared_module._update_on_kvstore
+ self._updater = shared_module._updater
+ self.optimizer_initialized = True
+
+ def forward(self, data_batch, is_train=None):
+ """Forward computation.
+
+ Parameters
+ ----------
+ data_batch : DataBatch
+ Could be anything with similar API implemented.
+ is_train : bool
+ Default is `None`, which means `is_train` takes the value of `self.for_training`.
+ """
+ assert self.binded and self.params_initialized
+ self._exec_group.forward(data_batch, is_train)
+
+ def backward(self, out_grads=None):
+ """Backward computation.
+
+ Parameters
+ ----------
+ out_grads : NDArray or list of NDArray, optional
+ Gradient on the outputs to be propagated back.
+ This parameter is only needed when bind is called
+ on outputs that are not a loss function.
+ """
+ assert self.binded and self.params_initialized
+ self._exec_group.backward(out_grads=out_grads)
+
+ def update(self):
+ """Update parameters according to the installed optimizer and the gradients computed
+ in the previous forward-backward batch.
+ """
+ assert self.binded and self.params_initialized and self.optimizer_initialized
+
+ self._params_dirty = True
+ if self._update_on_kvstore:
+ _update_params_on_kvstore(self._exec_group.param_arrays,
+ self._exec_group.grad_arrays,
+ self._kvstore)
+ else:
+ _update_params(self._exec_group.param_arrays,
+ self._exec_group.grad_arrays,
+ updater=self._updater,
+ num_device=len(self._context),
+ kvstore=self._kvstore)
+
+ def get_outputs(self, merge_multi_context=True):
+ """Get outputs of the previous forward computation.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+            Default is `True`. In the case when data-parallelism is used, the outputs
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from
+            a single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert self.binded and self.params_initialized
+ return self._exec_group.get_outputs(merge_multi_context=merge_multi_context)
+
+ def get_input_grads(self, merge_multi_context=True):
+ """Get the gradients with respect to the inputs of the module.
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+            Default is `True`. In the case when data-parallelism is used, the outputs
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from
+            a single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[grad1, grad2]`. Otherwise, it
+ is like `[[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert self.binded and self.params_initialized and self.inputs_need_grad
+ return self._exec_group.get_input_grads(merge_multi_context=merge_multi_context)
+
+ def get_states(self, merge_multi_context=True):
+ """Get states from all devices
+
+ Parameters
+ ----------
+ merge_multi_context : bool
+            Default is `True`. In the case when data-parallelism is used, the states
+            will be collected from multiple devices. A `True` value indicates that we
+            should merge the collected results so that they look as if they came from
+            a single executor.
+
+ Returns
+ -------
+ If `merge_multi_context` is `True`, it is like `[out1, out2]`. Otherwise, it
+ is like `[[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]`. All the output
+ elements are `NDArray`.
+ """
+ assert self.binded and self.params_initialized
+ return self._exec_group.get_states(merge_multi_context=merge_multi_context)
+
+ def set_states(self, states=None, value=None):
+ """Set value for states. Only one of states & value can be specified.
+
+ Parameters
+ ----------
+ states : list of list of NDArrays
+ source states arrays formatted like [[state1_dev1, state1_dev2],
+ [state2_dev1, state2_dev2]].
+ value : number
+ a single scalar value for all state arrays.
+ """
+ assert self.binded and self.params_initialized
+ self._exec_group.set_states(states, value)
+
+ def update_metric(self, eval_metric, labels):
+ """Evaluate and accumulate evaluation metric on outputs of the last forward computation.
+
+ Parameters
+ ----------
+ eval_metric : EvalMetric
+ labels : list of NDArray
+ Typically `data_batch.label`.
+ """
+ self._exec_group.update_metric(eval_metric, labels)
+
+ def _sync_params_from_devices(self):
+ """Synchronize parameters from devices to CPU. This function should be called after
+ calling `update` that updates the parameters on the devices, before one can read the
+ latest parameters from `self._arg_params` and `self._aux_params`.
+ """
+ self._exec_group.get_params(self._arg_params, self._aux_params)
+ self._params_dirty = False
+
+ def save_optimizer_states(self, fname):
+ """Save optimizer (updater) state to file
+
+ Parameters
+ ----------
+ fname : str
+ Path to output states file.
+ """
+ assert self.optimizer_initialized
+
+ if self._update_on_kvstore:
+ self._kvstore.save_optimizer_states(fname)
+ else:
+ with open(fname, 'wb') as fout:
+ fout.write(self._updater.get_states())
+
+ def load_optimizer_states(self, fname):
+ """Load optimizer (updater) state from file
+
+ Parameters
+ ----------
+ fname : str
+ Path to input states file.
+ """
+ assert self.optimizer_initialized
+
+ if self._update_on_kvstore:
+ self._kvstore.load_optimizer_states(fname)
+ else:
+            with open(fname, 'rb') as fin:
+                self._updater.set_states(fin.read())
+
+ def install_monitor(self, mon):
+ """ Install monitor on all executors """
+ assert self.binded
+ self._exec_group.install_monitor(mon)
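+
+# Typical lifecycle of this Module, as a sketch (the loader and metric are
+# placeholders; note that data_shapes/label_shapes are per-device lists here,
+# as produced by the loaders in this repo):
+# >>> mod = Module(sym, data_names=('data',), label_names=('softmax_label',),
+# ...              context=[ctx.gpu(0), ctx.gpu(1)])
+# >>> mod.bind(data_shapes=train_data.provide_data, label_shapes=train_data.provide_label)
+# >>> mod.init_params(initializer=Uniform(0.01))
+# >>> mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.01),))
+# >>> for batch in train_data:
+# ...     mod.forward(batch); mod.backward(); mod.update()
+# ...     mod.update_metric(eval_metric, batch.label)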
+
+
+class MutableModule(BaseModule):
+ """A mutable module is a module that supports variable input data.
+
+ Parameters
+ ----------
+ symbol : Symbol
+ data_names : list of str
+ label_names : list of str
+ logger : Logger
+ context : Context or list of Context
+ work_load_list : list of number
+    max_data_shapes : list of (name, shape) tuple, designating inputs whose shapes vary
+    max_label_shapes : list of (name, shape) tuple, designating labels whose shapes vary
+    fixed_param_prefix : list of str, indicating fixed parameters
+ """
+ def __init__(self, symbol, data_names, label_names,
+ logger=logging, context=ctx.cpu(), work_load_list=None,
+ max_data_shapes=None, max_label_shapes=None, fixed_param_prefix=None):
+ super(MutableModule, self).__init__(logger=logger)
+ self._symbol = symbol
+ self._data_names = data_names
+ self._label_names = label_names
+ self._context = context
+ self._work_load_list = work_load_list
+
+ self._curr_module = None
+ self._max_data_shapes = max_data_shapes
+ self._max_label_shapes = max_label_shapes
+ self._fixed_param_prefix = fixed_param_prefix
+
+ fixed_param_names = list()
+ if fixed_param_prefix is not None:
+ for name in self._symbol.list_arguments():
+ for prefix in self._fixed_param_prefix:
+ if prefix in name:
+ fixed_param_names.append(name)
+ self._fixed_param_names = fixed_param_names
+ self._preload_opt_states = None
+
+ def _reset_bind(self):
+ self.binded = False
+ self._curr_module = None
+
+ @property
+ def data_names(self):
+ return self._data_names
+
+ @property
+ def output_names(self):
+ return self._symbol.list_outputs()
+
+ @property
+ def data_shapes(self):
+ assert self.binded
+ return self._curr_module.data_shapes
+
+ @property
+ def label_shapes(self):
+ assert self.binded
+ return self._curr_module.label_shapes
+
+ @property
+ def output_shapes(self):
+ assert self.binded
+ return self._curr_module.output_shapes
+
+ def get_params(self):
+ assert self.binded and self.params_initialized
+ return self._curr_module.get_params()
+
+ def init_params(self, initializer=Uniform(0.01), arg_params=None, aux_params=None,
+ allow_missing=False, force_init=False):
+ if self.params_initialized and not force_init:
+ return
+ assert self.binded, 'call bind before initializing the parameters'
+ self._curr_module.init_params(initializer=initializer, arg_params=arg_params,
+ aux_params=aux_params, allow_missing=allow_missing,
+ force_init=force_init)
+ self.params_initialized = True
+
+ def bind(self, data_shapes, label_shapes=None, for_training=True,
+ inputs_need_grad=False, force_rebind=False, shared_module=None, grad_req='write'):
+ # in case we already initialized params, keep it
+ if self.params_initialized:
+ arg_params, aux_params = self.get_params()
+
+        # force rebinding is typically used when one wants to switch from
+        # training to the prediction phase.
+ if force_rebind:
+ self._reset_bind()
+
+ if self.binded:
+ self.logger.warning('Already binded, ignoring bind()')
+ return
+
+ assert shared_module is None, 'shared_module for MutableModule is not supported'
+
+ self.for_training = for_training
+ self.inputs_need_grad = inputs_need_grad
+ self.binded = True
+
+ max_shapes_dict = dict()
+ if self._max_data_shapes is not None:
+ max_shapes_dict.update(dict(self._max_data_shapes[0]))
+ if self._max_label_shapes is not None:
+ max_shapes_dict.update(dict(self._max_label_shapes[0]))
+
+ max_data_shapes = list()
+ for name, shape in data_shapes[0]:
+ if name in max_shapes_dict:
+ max_data_shapes.append((name, max_shapes_dict[name]))
+ else:
+ max_data_shapes.append((name, shape))
+
+ max_label_shapes = list()
+        if label_shapes.count(None) != len(label_shapes):
+ for name, shape in label_shapes[0]:
+ if name in max_shapes_dict:
+ max_label_shapes.append((name, max_shapes_dict[name]))
+ else:
+ max_label_shapes.append((name, shape))
+
+ if len(max_label_shapes) == 0:
+ max_label_shapes = None
+
+ module = Module(self._symbol, self._data_names, self._label_names, logger=self.logger,
+ context=self._context, work_load_list=self._work_load_list,
+ fixed_param_names=self._fixed_param_names)
+ module.bind([max_data_shapes for _ in xrange(len(self._context))], [max_label_shapes for _ in xrange(len(self._context))],
+ for_training, inputs_need_grad, force_rebind=False, shared_module=None)
+ self._curr_module = module
+
+ # copy back saved params, if already initialized
+ if self.params_initialized:
+ self.set_params(arg_params, aux_params)
+
+ def save_checkpoint(self, prefix, epoch, save_optimizer_states=False):
+ """Save current progress to checkpoint.
+ Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
+
+ Parameters
+ ----------
+ prefix : str
+ The file prefix to checkpoint to
+ epoch : int
+ The current epoch number
+ save_optimizer_states : bool
+ Whether to save optimizer states for continue training
+ """
+ self._curr_module.save_checkpoint(prefix, epoch, save_optimizer_states)
+
+ def init_optimizer(self, kvstore='local', optimizer='sgd',
+ optimizer_params=(('learning_rate', 0.01),), force_init=False):
+ assert self.binded and self.params_initialized
+ if self.optimizer_initialized and not force_init:
+ self.logger.warning('optimizer already initialized, ignoring.')
+ return
+
+ self._curr_module._preload_opt_states = self._preload_opt_states
+ self._curr_module.init_optimizer(kvstore, optimizer, optimizer_params,
+ force_init=force_init)
+ self.optimizer_initialized = True
+
+ def fit(self, train_data, eval_data=None, eval_metric='acc',
+ epoch_end_callback=None, batch_end_callback=None, kvstore='local',
+ optimizer='sgd', optimizer_params=(('learning_rate', 0.01),),
+ eval_end_callback=None,
+ eval_batch_end_callback=None, initializer=Uniform(0.01),
+ arg_params=None, aux_params=None, allow_missing=False,
+ force_rebind=False, force_init=False, begin_epoch=0, num_epoch=None,
+ validation_metric=None, monitor=None, prefix=None):
+ """Train the module parameters.
+
+ Parameters
+ ----------
+ train_data : DataIter
+ eval_data : DataIter
+ If not `None`, will be used as validation set and evaluate the performance
+ after each epoch.
+ eval_metric : str or EvalMetric
+ Default `'acc'`. The performance measure used to display during training.
+ epoch_end_callback : function or list of function
+ Each callback will be called with the current `epoch`, `symbol`, `arg_params`
+ and `aux_params`.
+ batch_end_callback : function or list of function
+ Each callback will be called with a `BatchEndParam`.
+ kvstore : str or KVStore
+ Default `'local'`.
+ optimizer : str or Optimizer
+ Default `'sgd'`
+ optimizer_params : dict
+ Default `(('learning_rate', 0.01),)`. The parameters for the optimizer constructor.
+ The default value is not a `dict`, just to avoid pylint warning on dangerous
+ default values.
+ eval_end_callback : function or list of function
+ These will be called at the end of each full evaluation, with the metrics over
+ the entire evaluation set.
+ eval_batch_end_callback : function or list of function
+ These will be called at the end of each minibatch during evaluation
+ initializer : Initializer
+ Will be called to initialize the module parameters if not already initialized.
+ arg_params : dict
+ Default `None`, if not `None`, should be existing parameters from a trained
+ model or loaded from a checkpoint (previously saved model). In this case,
+ the value here will be used to initialize the module parameters, unless they
+ are already initialized by the user via a call to `init_params` or `fit`.
+            `arg_params` has higher priority than `initializer`.
+ aux_params : dict
+ Default `None`. Similar to `arg_params`, except for auxiliary states.
+ allow_missing : bool
+ Default `False`. Indicate whether we allow missing parameters when `arg_params`
+ and `aux_params` are not `None`. If this is `True`, then the missing parameters
+ will be initialized via the `initializer`.
+ force_rebind : bool
+            Default `False`. Whether to force rebinding the executors if they are already bound.
+ force_init : bool
+ Default `False`. Indicate whether we should force initialization even if the
+ parameters are already initialized.
+ begin_epoch : int
+ Default `0`. Indicate the starting epoch. Usually, if we are resuming from a
+ checkpoint saved at a previous training phase at epoch N, then we should specify
+ this value as N+1.
+ num_epoch : int
+ Number of epochs to run training.
+
+ Examples
+ --------
+ An example of using fit for training::
+            >>> # assume the training and validation DataIters are ready
+            >>> mod.fit(train_data=train_dataiter, eval_data=val_dataiter,
+            ...         optimizer_params={'learning_rate': 0.01, 'momentum': 0.9},
+            ...         num_epoch=10)
+ """
+ assert num_epoch is not None, 'please specify number of epochs'
+
+ self.bind(data_shapes=train_data.provide_data, label_shapes=train_data.provide_label,
+ for_training=True, force_rebind=force_rebind)
+ if monitor is not None:
+ self.install_monitor(monitor)
+ self.init_params(initializer=initializer, arg_params=arg_params, aux_params=aux_params,
+ allow_missing=allow_missing, force_init=force_init)
+ self.init_optimizer(kvstore=kvstore, optimizer=optimizer,
+ optimizer_params=optimizer_params)
+
+ if validation_metric is None:
+ validation_metric = eval_metric
+ if not isinstance(eval_metric, metric.EvalMetric):
+ eval_metric = metric.create(eval_metric)
+
+ ################################################################################
+ # training loop
+ ################################################################################
+ for epoch in range(begin_epoch, num_epoch):
+ tic = time.time()
+ eval_metric.reset()
+ for nbatch, data_batch in enumerate(train_data):
+ if monitor is not None:
+ monitor.tic()
+ self.forward_backward(data_batch)
+ self.update()
+ self.update_metric(eval_metric, data_batch.label)
+
+ if monitor is not None:
+ monitor.toc_print()
+
+ if batch_end_callback is not None:
+ batch_end_params = BatchEndParam(epoch=epoch, nbatch=nbatch,
+ eval_metric=eval_metric,
+ locals=locals())
+ for callback in _as_list(batch_end_callback):
+ callback(batch_end_params)
+
+ # one epoch of training is finished
+ for name, val in eval_metric.get_name_value():
+ self.logger.info('Epoch[%d] Train-%s=%f', epoch, name, val)
+ toc = time.time()
+ self.logger.info('Epoch[%d] Time cost=%.3f', epoch, (toc-tic))
+
+ # sync aux params across devices
+ arg_params, aux_params = self.get_params()
+ self.set_params(arg_params, aux_params)
+
+ if epoch_end_callback is not None:
+ for callback in _as_list(epoch_end_callback):
+ callback(epoch, self.symbol, arg_params, aux_params)
+
+ #----------------------------------------
+ # evaluation on validation set
+ if eval_data:
+ res = self.score(eval_data, validation_metric,
+ score_end_callback=eval_end_callback,
+ batch_end_callback=eval_batch_end_callback, epoch=epoch)
+                # TODO: pull this into default
+ for name, val in res:
+ self.logger.info('Epoch[%d] Validation-%s=%f', epoch, name, val)
+
+ # end of 1 epoch, reset the data-iter for another epoch
+ train_data.reset()
+
+
+ def forward(self, data_batch, is_train=None):
+ assert self.binded and self.params_initialized
+
+ # get current_shapes
+ if self._curr_module.label_shapes is not None:
+ current_shapes = [dict(self._curr_module.data_shapes[i] + self._curr_module.label_shapes[i]) for i in xrange(len(self._context))]
+ else:
+ current_shapes = [dict(self._curr_module.data_shapes[i]) for i in xrange(len(self._context))]
+
+ # get input_shapes
+ if is_train:
+ input_shapes = [dict(data_batch.provide_data[i] + data_batch.provide_label[i]) for i in xrange(len(self._context))]
+ else:
+ input_shapes = [dict(data_batch.provide_data[i]) for i in xrange(len(data_batch.provide_data))]
+
+ # decide if shape changed
+ shape_changed = len(current_shapes) != len(input_shapes)
+ for pre, cur in zip(current_shapes, input_shapes):
+ for k, v in pre.items():
+ if v != cur[k]:
+ shape_changed = True
+
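+        # any mismatch (e.g. a different image scale in this batch) forces a fresh
+        # Module bound to the new shapes; parameters are shared with the old one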
+ if shape_changed:
+ # self._curr_module.reshape(data_batch.provide_data, data_batch.provide_label)
+ module = Module(self._symbol, self._data_names, self._label_names,
+ logger=self.logger, context=[self._context[i] for i in xrange(len(data_batch.provide_data))],
+ work_load_list=self._work_load_list,
+ fixed_param_names=self._fixed_param_names)
+ module.bind(data_batch.provide_data, data_batch.provide_label, self._curr_module.for_training,
+ self._curr_module.inputs_need_grad, force_rebind=False,
+ shared_module=self._curr_module)
+ self._curr_module = module
+
+ self._curr_module.forward(data_batch, is_train=is_train)
+
+ def backward(self, out_grads=None):
+ assert self.binded and self.params_initialized
+ self._curr_module.backward(out_grads=out_grads)
+
+ def update(self):
+ assert self.binded and self.params_initialized and self.optimizer_initialized
+ self._curr_module.update()
+
+ def get_outputs(self, merge_multi_context=True):
+ assert self.binded and self.params_initialized
+ return self._curr_module.get_outputs(merge_multi_context=merge_multi_context)
+
+    def get_input_grads(self, merge_multi_context=True):
+ assert self.binded and self.params_initialized and self.inputs_need_grad
+ return self._curr_module.get_input_grads(merge_multi_context=merge_multi_context)
+
+ def update_metric(self, eval_metric, labels):
+ assert self.binded and self.params_initialized
+ self._curr_module.update_metric(eval_metric, labels)
+
+ def install_monitor(self, mon):
+ """ Install monitor on all executors """
+ assert self.binded
+ self._curr_module.install_monitor(mon)
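+
+# A minimal usage sketch of MutableModule (all names below are illustrative and
+# not defined in this file); each batch may carry a different image shape, and
+# forward() transparently rebinds when the shape changes:
+#
+#   mod = MutableModule(sym, data_names=['data'], label_names=['label'],
+#                       context=[mx.gpu(0)],
+#                       max_data_shapes=[[('data', (1, 3, 1000, 1000))]])
+#   mod.fit(train_data, eval_metric='acc', num_epoch=8)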
diff --git a/faster_rcnn/core/rcnn.py b/faster_rcnn/core/rcnn.py
new file mode 100644
index 0000000..d3863ac
--- /dev/null
+++ b/faster_rcnn/core/rcnn.py
@@ -0,0 +1,186 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+"""
+Fast R-CNN:
+data =
+ {'data': [num_images, c, h, w],
+ 'rois': [num_rois, 5]}
+label =
+ {'label': [num_rois],
+ 'bbox_target': [num_rois, 4 * num_classes],
+ 'bbox_weight': [num_rois, 4 * num_classes]}
+roidb extended format [image_index]
+ ['image', 'height', 'width', 'flipped',
+ 'boxes', 'gt_classes', 'gt_overlaps', 'max_classes', 'max_overlaps', 'bbox_targets']
+"""
+
+import numpy as np
+import numpy.random as npr
+
+from utils.image import get_image, tensor_vstack
+from bbox.bbox_transform import bbox_overlaps, bbox_transform
+from bbox.bbox_regression import expand_bbox_regression_targets
+
+
+def get_rcnn_testbatch(roidb, cfg):
+ """
+ return a dict of testbatch
+ :param roidb: ['image', 'flipped'] + ['boxes']
+ :return: data, label, im_info
+ """
+ # assert len(roidb) == 1, 'Single batch only'
+ imgs, roidb = get_image(roidb, cfg)
+ im_array = imgs
+ im_info = [np.array([roidb[i]['im_info']], dtype=np.float32) for i in range(len(roidb))]
+
+ im_rois = [roidb[i]['boxes'] for i in range(len(roidb))]
+ rois = im_rois
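+    # prepend a batch index (always 0 at test time): [batch_idx, x1, y1, x2, y2]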
+    rois_array = [np.hstack((np.zeros((rois[i].shape[0], 1)), rois[i])) for i in range(len(rois))]
+
+ data = [{'data': im_array[i],
+ 'rois': rois_array[i]} for i in range(len(roidb))]
+ label = {}
+
+ return data, label, im_info
+
+
+def get_rcnn_batch(roidb, cfg):
+ """
+ return a dict of multiple images
+ :param roidb: a list of dict, whose length controls batch size
+    ['image', 'flipped'] + ['gt_boxes', 'boxes', 'gt_overlaps'] => ['bbox_targets']
+ :return: data, label
+ """
+ num_images = len(roidb)
+ imgs, roidb = get_image(roidb, cfg)
+ im_array = tensor_vstack(imgs)
+
+ assert cfg.TRAIN.BATCH_ROIS == -1 or cfg.TRAIN.BATCH_ROIS % cfg.TRAIN.BATCH_IMAGES == 0, \
+        'BATCH_IMAGES {} must divide BATCH_ROIS {}'.format(cfg.TRAIN.BATCH_IMAGES, cfg.TRAIN.BATCH_ROIS)
+
+ if cfg.TRAIN.BATCH_ROIS == -1:
+ rois_per_image = np.sum([iroidb['boxes'].shape[0] for iroidb in roidb])
+ fg_rois_per_image = rois_per_image
+ else:
+ rois_per_image = cfg.TRAIN.BATCH_ROIS / cfg.TRAIN.BATCH_IMAGES
+ fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(int)
+
+ rois_array = list()
+ labels_array = list()
+ bbox_targets_array = list()
+ bbox_weights_array = list()
+
+ for im_i in range(num_images):
+ roi_rec = roidb[im_i]
+
+ # infer num_classes from gt_overlaps
+ num_classes = roi_rec['gt_overlaps'].shape[1]
+
+ # label = class RoI has max overlap with
+ rois = roi_rec['boxes']
+ labels = roi_rec['max_classes']
+ overlaps = roi_rec['max_overlaps']
+ bbox_targets = roi_rec['bbox_targets']
+
+ im_rois, labels, bbox_targets, bbox_weights = \
+ sample_rois(rois, fg_rois_per_image, rois_per_image, num_classes, cfg,
+ labels, overlaps, bbox_targets)
+
+ # project im_rois
+ # do not round roi
+ rois = im_rois
+ batch_index = im_i * np.ones((rois.shape[0], 1))
+ rois_array_this_image = np.hstack((batch_index, rois))
+ rois_array.append(rois_array_this_image)
+
+ # add labels
+ labels_array.append(labels)
+ bbox_targets_array.append(bbox_targets)
+ bbox_weights_array.append(bbox_weights)
+
+ rois_array = np.array(rois_array)
+ labels_array = np.array(labels_array)
+ bbox_targets_array = np.array(bbox_targets_array)
+ bbox_weights_array = np.array(bbox_weights_array)
+
+ data = {'data': im_array,
+ 'rois': rois_array}
+ label = {'label': labels_array,
+ 'bbox_target': bbox_targets_array,
+ 'bbox_weight': bbox_weights_array}
+
+ return data, label
+
+
+def sample_rois(rois, fg_rois_per_image, rois_per_image, num_classes, cfg,
+ labels=None, overlaps=None, bbox_targets=None, gt_boxes=None):
+ """
+ generate random sample of ROIs comprising foreground and background examples
+ :param rois: all_rois [n, 4]; e2e: [n, 5] with batch_index
+ :param fg_rois_per_image: foreground roi number
+ :param rois_per_image: total roi number
+ :param num_classes: number of classes
+ :param labels: maybe precomputed
+ :param overlaps: maybe precomputed (max_overlaps)
+ :param bbox_targets: maybe precomputed
+ :param gt_boxes: optional for e2e [n, 5] (x1, y1, x2, y2, cls)
+ :return: (labels, rois, bbox_targets, bbox_weights)
+ """
+ if labels is None:
+ overlaps = bbox_overlaps(rois[:, 1:].astype(np.float), gt_boxes[:, :4].astype(np.float))
+ gt_assignment = overlaps.argmax(axis=1)
+ overlaps = overlaps.max(axis=1)
+ labels = gt_boxes[gt_assignment, 4]
+
+ # foreground RoI with FG_THRESH overlap
+ fg_indexes = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
+ # guard against the case when an image has fewer than fg_rois_per_image foreground RoIs
+ fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_indexes.size)
+ # Sample foreground regions without replacement
+ if len(fg_indexes) > fg_rois_per_this_image:
+ fg_indexes = npr.choice(fg_indexes, size=fg_rois_per_this_image, replace=False)
+
+ # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
+ bg_indexes = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) & (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
+ # Compute number of background RoIs to take from this image (guarding against there being fewer than desired)
+ bg_rois_per_this_image = rois_per_image - fg_rois_per_this_image
+ bg_rois_per_this_image = np.minimum(bg_rois_per_this_image, bg_indexes.size)
+    # Sample background regions without replacement
+ if len(bg_indexes) > bg_rois_per_this_image:
+ bg_indexes = npr.choice(bg_indexes, size=bg_rois_per_this_image, replace=False)
+
+ # indexes selected
+ keep_indexes = np.append(fg_indexes, bg_indexes)
+
+ # pad more to ensure a fixed minibatch size
+ while keep_indexes.shape[0] < rois_per_image:
+ gap = np.minimum(len(rois), rois_per_image - keep_indexes.shape[0])
+ gap_indexes = npr.choice(range(len(rois)), size=gap, replace=False)
+ keep_indexes = np.append(keep_indexes, gap_indexes)
+
+ # select labels
+ labels = labels[keep_indexes]
+ # set labels of bg_rois to be 0
+ labels[fg_rois_per_this_image:] = 0
+ rois = rois[keep_indexes]
+
+ # load or compute bbox_target
+ if bbox_targets is not None:
+ bbox_target_data = bbox_targets[keep_indexes, :]
+ else:
+ targets = bbox_transform(rois[:, 1:], gt_boxes[gt_assignment[keep_indexes], :4])
+ if cfg.TRAIN.BBOX_NORMALIZATION_PRECOMPUTED:
+ targets = ((targets - np.array(cfg.TRAIN.BBOX_MEANS))
+ / np.array(cfg.TRAIN.BBOX_STDS))
+ bbox_target_data = np.hstack((labels[:, np.newaxis], targets))
+
+ bbox_targets, bbox_weights = \
+ expand_bbox_regression_targets(bbox_target_data, num_classes, cfg)
+
+ return rois, labels, bbox_targets, bbox_weights
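+
+# Worked example of the sampling arithmetic above (all numbers illustrative):
+# with BATCH_ROIS=128, BATCH_IMAGES=1 and FG_FRACTION=0.25, rois_per_image=128
+# and fg_rois_per_image=32. If only 10 RoIs reach FG_THRESH, 10 foreground and
+# up to 118 background RoIs are kept, and the minibatch is padded by resampling
+# whenever fewer than 128 RoIs survive the thresholds.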
+
diff --git a/faster_rcnn/core/tester.py b/faster_rcnn/core/tester.py
new file mode 100644
index 0000000..db7e433
--- /dev/null
+++ b/faster_rcnn/core/tester.py
@@ -0,0 +1,307 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+import cPickle
+import os
+import time
+import mxnet as mx
+import numpy as np
+
+from module import MutableModule
+from utils import image
+from bbox.bbox_transform import bbox_pred, clip_boxes
+from nms.nms import py_nms_wrapper, cpu_nms_wrapper, gpu_nms_wrapper
+from utils.PrefetchingIter import PrefetchingIter
+
+
+class Predictor(object):
+ def __init__(self, symbol, data_names, label_names,
+ context=mx.cpu(), max_data_shapes=None,
+ provide_data=None, provide_label=None,
+ arg_params=None, aux_params=None):
+ self._mod = MutableModule(symbol, data_names, label_names,
+ context=context, max_data_shapes=max_data_shapes)
+ self._mod.bind(provide_data, provide_label, for_training=False)
+ self._mod.init_params(arg_params=arg_params, aux_params=aux_params)
+
+ def predict(self, data_batch):
+ self._mod.forward(data_batch)
+        return [dict(zip(self._mod.output_names, _)) for _ in zip(*self._mod.get_outputs(merge_multi_context=False))]
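+
+# predict() returns one dict of outputs per device context; the key names below
+# are illustrative and depend on the bound symbol, e.g.:
+#   [{'rois_output': <NDArray>, 'cls_prob_reshape_output': <NDArray>, ...}]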
+
+
+def im_proposal(predictor, data_batch, data_names, scales):
+ output_all = predictor.predict(data_batch)
+
+ data_dict_all = [dict(zip(data_names, data_batch.data[i])) for i in xrange(len(data_batch.data))]
+ scores_all = []
+ boxes_all = []
+
+ for output, data_dict, scale in zip(output_all, data_dict_all, scales):
+ # drop the batch index
+ boxes = output['rois_output'].asnumpy()[:, 1:]
+ scores = output['rois_score'].asnumpy()
+
+ # transform to original scale
+ boxes = boxes / scale
+ scores_all.append(scores)
+ boxes_all.append(boxes)
+
+ return scores_all, boxes_all, data_dict_all
+
+
+def generate_proposals(predictor, test_data, imdb, cfg, vis=False, thresh=0.):
+ """
+    Generate detection results using RPN.
+ :param predictor: Predictor
+    :param test_data: data iterator, must not be shuffled
+ :param imdb: image database
+ :param vis: controls visualization
+ :param thresh: thresh for valid detections
+ :return: list of detected boxes
+ """
+ assert vis or not test_data.shuffle
+ data_names = [k[0] for k in test_data.provide_data[0]]
+
+ if not isinstance(test_data, PrefetchingIter):
+ test_data = PrefetchingIter(test_data)
+
+ idx = 0
+ t = time.time()
+ imdb_boxes = list()
+ original_boxes = list()
+ for im_info, data_batch in test_data:
+ t1 = time.time() - t
+ t = time.time()
+
+ scales = [iim_info[0, 2] for iim_info in im_info]
+ scores_all, boxes_all, data_dict_all = im_proposal(predictor, data_batch, data_names, scales)
+ t2 = time.time() - t
+ t = time.time()
+ for delta, (scores, boxes, data_dict, scale) in enumerate(zip(scores_all, boxes_all, data_dict_all, scales)):
+ # assemble proposals
+ dets = np.hstack((boxes, scores))
+ original_boxes.append(dets)
+
+ # filter proposals
+ keep = np.where(dets[:, 4:] > thresh)[0]
+ dets = dets[keep, :]
+ imdb_boxes.append(dets)
+
+ if vis:
+ vis_all_detection(data_dict['data'].asnumpy(), [dets], ['obj'], scale, cfg)
+
+ print 'generating %d/%d' % (idx + 1, imdb.num_images), 'proposal %d' % (dets.shape[0]), \
+ 'data %.4fs net %.4fs' % (t1, t2 / test_data.batch_size)
+ idx += 1
+
+ assert len(imdb_boxes) == imdb.num_images, 'calculations not complete'
+
+ # save results
+ rpn_folder = os.path.join(imdb.result_path, 'rpn_data')
+ if not os.path.exists(rpn_folder):
+ os.mkdir(rpn_folder)
+
+ rpn_file = os.path.join(rpn_folder, imdb.name + '_rpn.pkl')
+ with open(rpn_file, 'wb') as f:
+ cPickle.dump(imdb_boxes, f, cPickle.HIGHEST_PROTOCOL)
+
+ if thresh > 0:
+ full_rpn_file = os.path.join(rpn_folder, imdb.name + '_full_rpn.pkl')
+ with open(full_rpn_file, 'wb') as f:
+ cPickle.dump(original_boxes, f, cPickle.HIGHEST_PROTOCOL)
+
+ print 'wrote rpn proposals to {}'.format(rpn_file)
+ return imdb_boxes
+
+
+def im_detect(predictor, data_batch, data_names, scales, cfg):
+ output_all = predictor.predict(data_batch)
+
+ data_dict_all = [dict(zip(data_names, idata)) for idata in data_batch.data]
+ scores_all = []
+ pred_boxes_all = []
+ for output, data_dict, scale in zip(output_all, data_dict_all, scales):
+ if cfg.TEST.HAS_RPN:
+ rois = output['rois_output'].asnumpy()[:, 1:]
+ else:
+ rois = data_dict['rois'].asnumpy().reshape((-1, 5))[:, 1:]
+ im_shape = data_dict['data'].shape
+
+ # save output
+ scores = output['cls_prob_reshape_output'].asnumpy()[0]
+ bbox_deltas = output['bbox_pred_reshape_output'].asnumpy()[0]
+
+ # post processing
+ pred_boxes = bbox_pred(rois, bbox_deltas)
+ pred_boxes = clip_boxes(pred_boxes, im_shape[-2:])
+
+ # we used scaled image & roi to train, so it is necessary to transform them back
+ pred_boxes = pred_boxes / scale
+
+ scores_all.append(scores)
+ pred_boxes_all.append(pred_boxes)
+ return scores_all, pred_boxes_all, data_dict_all
+
+
+def pred_eval(predictor, test_data, imdb, cfg, vis=False, thresh=1e-3, logger=None, ignore_cache=True):
+ """
+    wrapper that runs offline validation for faster data analysis
+    in this example, all thresholds are set by hand
+    :param predictor: Predictor
+    :param test_data: data iterator, must not be shuffled
+ :param imdb: image database
+ :param vis: controls visualization
+ :param thresh: valid detection threshold
+ :return:
+ """
+
+ det_file = os.path.join(imdb.result_path, imdb.name + '_detections.pkl')
+ if os.path.exists(det_file) and not ignore_cache:
+ with open(det_file, 'rb') as fid:
+ all_boxes = cPickle.load(fid)
+ info_str = imdb.evaluate_detections(all_boxes)
+ if logger:
+ logger.info('evaluate detections: \n{}'.format(info_str))
+ return
+
+ assert vis or not test_data.shuffle
+ data_names = [k[0] for k in test_data.provide_data[0]]
+
+ if not isinstance(test_data, PrefetchingIter):
+ test_data = PrefetchingIter(test_data)
+
+ nms = py_nms_wrapper(cfg.TEST.NMS)
+
+ # limit detections to max_per_image over all classes
+ max_per_image = cfg.TEST.max_per_image
+
+ num_images = imdb.num_images
+ # all detections are collected into:
+ # all_boxes[cls][image] = N x 5 array of detections in
+ # (x1, y1, x2, y2, score)
+ all_boxes = [[[] for _ in range(num_images)]
+ for _ in range(imdb.num_classes)]
+
+ idx = 0
+ data_time, net_time, post_time = 0.0, 0.0, 0.0
+ t = time.time()
+ for im_info, data_batch in test_data:
+ t1 = time.time() - t
+ t = time.time()
+
+ scales = [iim_info[0, 2] for iim_info in im_info]
+ scores_all, boxes_all, data_dict_all = im_detect(predictor, data_batch, data_names, scales, cfg)
+
+ t2 = time.time() - t
+ t = time.time()
+ for delta, (scores, boxes, data_dict) in enumerate(zip(scores_all, boxes_all, data_dict_all)):
+ for j in range(1, imdb.num_classes):
+ indexes = np.where(scores[:, j] > thresh)[0]
+ cls_scores = scores[indexes, j, np.newaxis]
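+                # with class-agnostic regression the single predicted box lives in
+                # columns 4:8; otherwise class j owns columns j*4:(j+1)*4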
+ cls_boxes = boxes[indexes, 4:8] if cfg.CLASS_AGNOSTIC else boxes[indexes, j * 4:(j + 1) * 4]
+ cls_dets = np.hstack((cls_boxes, cls_scores))
+ keep = nms(cls_dets)
+ all_boxes[j][idx+delta] = cls_dets[keep, :]
+
+ if max_per_image > 0:
+ image_scores = np.hstack([all_boxes[j][idx+delta][:, -1]
+ for j in range(1, imdb.num_classes)])
+ if len(image_scores) > max_per_image:
+ image_thresh = np.sort(image_scores)[-max_per_image]
+ for j in range(1, imdb.num_classes):
+ keep = np.where(all_boxes[j][idx+delta][:, -1] >= image_thresh)[0]
+ all_boxes[j][idx+delta] = all_boxes[j][idx+delta][keep, :]
+
+ if vis:
+ boxes_this_image = [[]] + [all_boxes[j][idx+delta] for j in range(1, imdb.num_classes)]
+ vis_all_detection(data_dict['data'].asnumpy(), boxes_this_image, imdb.classes, scales[delta], cfg)
+
+ idx += test_data.batch_size
+ t3 = time.time() - t
+ t = time.time()
+ data_time += t1
+ net_time += t2
+ post_time += t3
+ print 'testing {}/{} data {:.4f}s net {:.4f}s post {:.4f}s'.format(idx, imdb.num_images, data_time / idx * test_data.batch_size, net_time / idx * test_data.batch_size, post_time / idx * test_data.batch_size)
+ if logger:
+ logger.info('testing {}/{} data {:.4f}s net {:.4f}s post {:.4f}s'.format(idx, imdb.num_images, data_time / idx * test_data.batch_size, net_time / idx * test_data.batch_size, post_time / idx * test_data.batch_size))
+
+ with open(det_file, 'wb') as f:
+ cPickle.dump(all_boxes, f, protocol=cPickle.HIGHEST_PROTOCOL)
+
+ info_str = imdb.evaluate_detections(all_boxes)
+ if logger:
+ logger.info('evaluate detections: \n{}'.format(info_str))
+
+
+def vis_all_detection(im_array, detections, class_names, scale, cfg, threshold=1e-3):
+ """
+ visualize all detections in one image
+ :param im_array: [b=1 c h w] in rgb
+ :param detections: [ numpy.ndarray([[x1 y1 x2 y2 score]]) for j in classes ]
+ :param class_names: list of names in imdb
+ :param scale: visualize the scaled image
+ :return:
+ """
+ import matplotlib.pyplot as plt
+ import random
+ im = image.transform_inverse(im_array, cfg.network.PIXEL_MEANS)
+ plt.imshow(im)
+ for j, name in enumerate(class_names):
+ if name == '__background__':
+ continue
+ color = (random.random(), random.random(), random.random()) # generate a random color
+ dets = detections[j]
+ for det in dets:
+ bbox = det[:4] * scale
+ score = det[-1]
+ if score < threshold:
+ continue
+ rect = plt.Rectangle((bbox[0], bbox[1]),
+ bbox[2] - bbox[0],
+ bbox[3] - bbox[1], fill=False,
+ edgecolor=color, linewidth=3.5)
+ plt.gca().add_patch(rect)
+ plt.gca().text(bbox[0], bbox[1] - 2,
+ '{:s} {:.3f}'.format(name, score),
+ bbox=dict(facecolor=color, alpha=0.5), fontsize=12, color='white')
+ plt.show()
+
+
+def draw_all_detection(im_array, detections, class_names, scale, cfg, threshold=1e-1):
+ """
+ visualize all detections in one image
+ :param im_array: [b=1 c h w] in rgb
+ :param detections: [ numpy.ndarray([[x1 y1 x2 y2 score]]) for j in classes ]
+ :param class_names: list of names in imdb
+ :param scale: visualize the scaled image
+ :return:
+ """
+ import cv2
+ import random
+ color_white = (255, 255, 255)
+ im = image.transform_inverse(im_array, cfg.network.PIXEL_MEANS)
+ # change to bgr
+ im = cv2.cvtColor(im, cv2.COLOR_RGB2BGR)
+ for j, name in enumerate(class_names):
+ if name == '__background__':
+ continue
+        color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))  # generate a random BGR color
+ dets = detections[j]
+ for det in dets:
+ bbox = det[:4] * scale
+ score = det[-1]
+ if score < threshold:
+ continue
+ bbox = map(int, bbox)
+ cv2.rectangle(im, (bbox[0], bbox[1]), (bbox[2], bbox[3]), color=color, thickness=2)
+ cv2.putText(im, '%s %.3f' % (class_names[j], score), (bbox[0], bbox[1] + 10),
+ color=color_white, fontFace=cv2.FONT_HERSHEY_COMPLEX, fontScale=0.5)
+ return im
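+
+# Hypothetical usage (variable names are illustrative, not defined in this file):
+#   im = draw_all_detection(data_dict['data'].asnumpy(), boxes_this_image,
+#                           imdb.classes, scales[delta], cfg)
+#   cv2.imwrite('vis.jpg', im)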
diff --git a/faster_rcnn/function/__init__.py b/faster_rcnn/function/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/faster_rcnn/function/test_rcnn.py b/faster_rcnn/function/test_rcnn.py
new file mode 100644
index 0000000..f25de84
--- /dev/null
+++ b/faster_rcnn/function/test_rcnn.py
@@ -0,0 +1,73 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Guodong Zhang
+# --------------------------------------------------------
+
+import argparse
+import pprint
+import logging
+import time
+import os
+import mxnet as mx
+
+from symbols import *
+from dataset import *
+from core.loader import TestLoader
+from core.tester import Predictor, pred_eval
+from utils.load_model import load_param
+
+
+def test_rcnn(cfg, dataset, image_set, root_path, dataset_path,
+ ctx, prefix, epoch,
+ vis, ignore_cache, shuffle, has_rpn, proposal, thresh, logger=None, output_path=None):
+    assert logger is not None, 'test_rcnn requires a logger'
+
+ # print cfg
+ pprint.pprint(cfg)
+ logger.info('testing cfg:{}\n'.format(pprint.pformat(cfg)))
+
+ # load symbol and testing data
+ if has_rpn:
+ sym_instance = eval(cfg.symbol + '.' + cfg.symbol)()
+ sym = sym_instance.get_symbol(cfg, is_train=False)
+ imdb = eval(dataset)(image_set, root_path, dataset_path, result_path=output_path)
+ roidb = imdb.gt_roidb()
+ else:
+ sym_instance = eval(cfg.symbol + '.' + cfg.symbol)()
+ sym = sym_instance.get_symbol_rcnn(cfg, is_train=False)
+ imdb = eval(dataset)(image_set, root_path, dataset_path, result_path=output_path)
+ gt_roidb = imdb.gt_roidb()
+ roidb = eval('imdb.' + proposal + '_roidb')(gt_roidb)
+
+ # get test data iter
+ test_data = TestLoader(roidb, cfg, batch_size=len(ctx), shuffle=shuffle, has_rpn=has_rpn)
+
+ # load model
+ arg_params, aux_params = load_param(prefix, epoch, process=True)
+
+ # infer shape
+ data_shape_dict = dict(test_data.provide_data_single)
+ sym_instance.infer_shape(data_shape_dict)
+
+ sym_instance.check_parameter_shapes(arg_params, aux_params, data_shape_dict, is_train=False)
+
+ # decide maximum shape
+ data_names = [k[0] for k in test_data.provide_data_single]
+ label_names = None
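+    # bind to the largest configured scale so that any test image fits the executor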
+ max_data_shape = [[('data', (1, 3, max([v[0] for v in cfg.SCALES]), max([v[1] for v in cfg.SCALES])))]]
+ if not has_rpn:
+ max_data_shape.append(('rois', (cfg.TEST.PROPOSAL_POST_NMS_TOP_N + 30, 5)))
+
+ # create predictor
+ predictor = Predictor(sym, data_names, label_names,
+ context=ctx, max_data_shapes=max_data_shape,
+ provide_data=test_data.provide_data, provide_label=test_data.provide_label,
+ arg_params=arg_params, aux_params=aux_params)
+
+ # start detection
+ pred_eval(predictor, test_data, imdb, cfg, vis=vis, ignore_cache=ignore_cache, thresh=thresh, logger=logger)
+
diff --git a/faster_rcnn/function/test_rpn.py b/faster_rcnn/function/test_rpn.py
new file mode 100644
index 0000000..8393495
--- /dev/null
+++ b/faster_rcnn/function/test_rpn.py
@@ -0,0 +1,71 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+import argparse
+import pprint
+import logging
+import mxnet as mx
+
+from symbols import *
+from dataset import *
+from core.loader import TestLoader
+from core.tester import Predictor, generate_proposals
+from utils.load_model import load_param
+
+
+def test_rpn(cfg, dataset, image_set, root_path, dataset_path,
+ ctx, prefix, epoch,
+ vis, shuffle, thresh, logger=None, output_path=None):
+ # set up logger
+ if not logger:
+ logging.basicConfig()
+ logger = logging.getLogger()
+ logger.setLevel(logging.INFO)
+
+ # rpn generate proposal cfg
+ cfg.TEST.HAS_RPN = True
+
+ # print cfg
+ pprint.pprint(cfg)
+ logger.info('testing rpn cfg:{}\n'.format(pprint.pformat(cfg)))
+
+ # load symbol
+ sym_instance = eval(cfg.symbol + '.' + cfg.symbol)()
+ sym = sym_instance.get_symbol_rpn(cfg, is_train=False)
+
+ # load dataset and prepare imdb for training
+ imdb = eval(dataset)(image_set, root_path, dataset_path, result_path=output_path)
+ roidb = imdb.gt_roidb()
+ test_data = TestLoader(roidb, cfg, batch_size=len(ctx), shuffle=shuffle, has_rpn=True)
+
+ # load model
+ arg_params, aux_params = load_param(prefix, epoch)
+
+ # infer shape
+ data_shape_dict = dict(test_data.provide_data_single)
+ sym_instance.infer_shape(data_shape_dict)
+
+ # check parameters
+ sym_instance.check_parameter_shapes(arg_params, aux_params, data_shape_dict, is_train=False)
+
+ # decide maximum shape
+ data_names = [k[0] for k in test_data.provide_data[0]]
+ label_names = None if test_data.provide_label[0] is None else [k[0] for k in test_data.provide_label[0]]
+ max_data_shape = [[('data', (1, 3, max([v[0] for v in cfg.SCALES]), max([v[1] for v in cfg.SCALES])))]]
+
+ # create predictor
+ predictor = Predictor(sym, data_names, label_names,
+ context=ctx, max_data_shapes=max_data_shape,
+ provide_data=test_data.provide_data, provide_label=test_data.provide_label,
+ arg_params=arg_params, aux_params=aux_params)
+
+ # start testing
+ imdb_boxes = generate_proposals(predictor, test_data, imdb, cfg, vis=vis, thresh=thresh)
+
+ all_log_info = imdb.evaluate_recall(roidb, candidate_boxes=imdb_boxes)
+ logger.info(all_log_info)
diff --git a/faster_rcnn/function/train_rcnn.py b/faster_rcnn/function/train_rcnn.py
new file mode 100644
index 0000000..c9e2691
--- /dev/null
+++ b/faster_rcnn/function/train_rcnn.py
@@ -0,0 +1,136 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Guodong Zhang
+# --------------------------------------------------------
+
+import argparse
+import logging
+import pprint
+import os
+import mxnet as mx
+import numpy as np
+
+from symbols import *
+from core import callback, metric
+from core.loader import ROIIter
+from core.module import MutableModule
+from bbox.bbox_regression import add_bbox_regression_targets
+from utils.load_data import load_proposal_roidb, merge_roidb, filter_roidb
+from utils.load_model import load_param
+from utils.PrefetchingIter import PrefetchingIter
+from utils.lr_scheduler import WarmupMultiFactorScheduler
+
+
+def train_rcnn(cfg, dataset, image_set, root_path, dataset_path,
+ frequent, kvstore, flip, shuffle, resume,
+ ctx, pretrained, epoch, prefix, begin_epoch, end_epoch,
+ train_shared, lr, lr_step, proposal, logger=None, output_path=None):
+ mx.random.seed(3)
+ np.random.seed(3)
+ # set up logger
+ if not logger:
+ logging.basicConfig()
+ logger = logging.getLogger()
+ logger.setLevel(logging.INFO)
+
+ # load symbol
+ sym_instance = eval(cfg.symbol + '.' + cfg.symbol)()
+ sym = sym_instance.get_symbol_rcnn(cfg, is_train=True)
+
+ # setup multi-gpu
+ batch_size = len(ctx)
+ input_batch_size = cfg.TRAIN.BATCH_IMAGES * batch_size
+
+ # print cfg
+ pprint.pprint(cfg)
+ logger.info('training rcnn cfg:{}\n'.format(pprint.pformat(cfg)))
+
+ # load dataset and prepare imdb for training
+ image_sets = [iset for iset in image_set.split('+')]
+ roidbs = [load_proposal_roidb(dataset, image_set, root_path, dataset_path,
+ proposal=proposal, append_gt=True, flip=flip, result_path=output_path)
+ for image_set in image_sets]
+ roidb = merge_roidb(roidbs)
+ roidb = filter_roidb(roidb, cfg)
+ means, stds = add_bbox_regression_targets(roidb, cfg)
+
+ # load training data
+ train_data = ROIIter(roidb, cfg, batch_size=input_batch_size, shuffle=shuffle,
+ ctx=ctx, aspect_grouping=cfg.TRAIN.ASPECT_GROUPING)
+
+ # infer max shape
+ max_data_shape = [('data', (cfg.TRAIN.BATCH_IMAGES, 3, max([v[0] for v in cfg.SCALES]), max([v[1] for v in cfg.SCALES])))]
+
+ # infer shape
+ data_shape_dict = dict(train_data.provide_data_single + train_data.provide_label_single)
+ sym_instance.infer_shape(data_shape_dict)
+
+ # load and initialize params
+ if resume:
+        print 'continue training from', begin_epoch
+ arg_params, aux_params = load_param(prefix, begin_epoch, convert=True)
+ else:
+ arg_params, aux_params = load_param(pretrained, epoch, convert=True)
+ sym_instance.init_weight_rcnn(cfg, arg_params, aux_params)
+
+ # check parameter shapes
+ sym_instance.check_parameter_shapes(arg_params, aux_params, data_shape_dict)
+
+ # prepare training
+ # create solver
+ data_names = [k[0] for k in train_data.provide_data_single]
+ label_names = [k[0] for k in train_data.provide_label_single]
+ if train_shared:
+ fixed_param_prefix = cfg.network.FIXED_PARAMS_SHARED
+ else:
+ fixed_param_prefix = cfg.network.FIXED_PARAMS
+ mod = MutableModule(sym, data_names=data_names, label_names=label_names,
+ logger=logger, context=ctx,
+ max_data_shapes=[max_data_shape for _ in range(batch_size)], fixed_param_prefix=fixed_param_prefix)
+
+ if cfg.TRAIN.RESUME:
+        mod._preload_opt_states = '%s-%04d.states' % (prefix, begin_epoch)
+
+ # decide training params
+ # metric
+ eval_metric = metric.RCNNAccMetric(cfg)
+ cls_metric = metric.RCNNLogLossMetric(cfg)
+ bbox_metric = metric.RCNNL1LossMetric(cfg)
+ eval_metrics = mx.metric.CompositeEvalMetric()
+ for child_metric in [eval_metric, cls_metric, bbox_metric]:
+ eval_metrics.add(child_metric)
+ # callback
+ batch_end_callback = callback.Speedometer(train_data.batch_size, frequent=frequent)
+ epoch_end_callback = [mx.callback.module_checkpoint(mod, prefix, period=1, save_optimizer_states=True),
+ callback.do_checkpoint(prefix, means, stds)]
+ # decide learning rate
+ base_lr = lr
+ lr_factor = cfg.TRAIN.lr_factor
+ lr_epoch = [float(epoch) for epoch in lr_step.split(',')]
+ lr_epoch_diff = [epoch - begin_epoch for epoch in lr_epoch if epoch > begin_epoch]
+ lr = base_lr * (lr_factor ** (len(lr_epoch) - len(lr_epoch_diff)))
+ lr_iters = [int(epoch * len(roidb) / batch_size) for epoch in lr_epoch_diff]
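+    # e.g. (illustrative numbers) begin_epoch=0, lr_step='5,7', len(roidb)=30000,
+    # batch_size=4 gives lr_epoch_diff=[5.0, 7.0] and lr_iters=[37500, 52500]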
+    print 'lr', lr, 'lr_epoch_diff', lr_epoch_diff, 'lr_iters', lr_iters
+ lr_scheduler = WarmupMultiFactorScheduler(lr_iters, lr_factor, cfg.TRAIN.warmup, cfg.TRAIN.warmup_lr, cfg.TRAIN.warmup_step)
+ # optimizer
+ optimizer_params = {'momentum': cfg.TRAIN.momentum,
+ 'wd': cfg.TRAIN.wd,
+ 'learning_rate': lr,
+ 'lr_scheduler': lr_scheduler,
+ 'rescale_grad': 1.0,
+ 'clip_gradient': None}
+
+ # train
+
+ if not isinstance(train_data, PrefetchingIter):
+ train_data = PrefetchingIter(train_data)
+
+ mod.fit(train_data, eval_metric=eval_metrics, epoch_end_callback=epoch_end_callback,
+ batch_end_callback=batch_end_callback, kvstore=kvstore,
+ optimizer='sgd', optimizer_params=optimizer_params,
+ arg_params=arg_params, aux_params=aux_params, begin_epoch=begin_epoch, num_epoch=end_epoch)
+
diff --git a/faster_rcnn/function/train_rpn.py b/faster_rcnn/function/train_rpn.py
new file mode 100644
index 0000000..be1be47
--- /dev/null
+++ b/faster_rcnn/function/train_rpn.py
@@ -0,0 +1,131 @@
+# --------------------------------------------------------
+# Deformable Convolutional Networks
+# Copyright (c) 2016 by Contributors
+# Copyright (c) 2017 Microsoft
+# Licensed under The Apache-2.0 License [see LICENSE for details]
+# Modified by Yuwen Xiong
+# --------------------------------------------------------
+
+import argparse
+import logging
+import pprint
+import mxnet as mx
+
+from symbols import *
+from core import callback, metric
+from core.loader import AnchorLoader
+from core.module import MutableModule
+from utils.load_data import load_gt_roidb, merge_roidb, filter_roidb
+from utils.load_model import load_param
+from utils.PrefetchingIter import PrefetchingIter
+from utils.lr_scheduler import WarmupMultiFactorScheduler
+
+
+def train_rpn(cfg, dataset, image_set, root_path, dataset_path,
+ frequent, kvstore, flip, shuffle, resume,
+ ctx, pretrained, epoch, prefix, begin_epoch, end_epoch,
+ train_shared, lr, lr_step, logger=None, output_path=None):
+ # set up logger
+ if not logger:
+ logging.basicConfig()
+ logger = logging.getLogger()
+ logger.setLevel(logging.INFO)
+
+ # set up config
+ cfg.TRAIN.BATCH_IMAGES = cfg.TRAIN.ALTERNATE.RPN_BATCH_IMAGES
+
+ # load symbol
+ sym_instance = eval(cfg.symbol + '.' + cfg.symbol)()
+ sym = sym_instance.get_symbol_rpn(cfg, is_train=True)
+ feat_sym = sym.get_internals()['rpn_cls_score_output']
+
+ # setup multi-gpu
+ batch_size = len(ctx)
+ input_batch_size = cfg.TRAIN.BATCH_IMAGES * batch_size
+
+ # print cfg
+ pprint.pprint(cfg)
+ logger.info('training rpn cfg:{}\n'.format(pprint.pformat(cfg)))
+
+ # load dataset and prepare imdb for training
+ image_sets = [iset for iset in image_set.split('+')]
+ roidbs = [load_gt_roidb(dataset, image_set, root_path, dataset_path, result_path=output_path,
+ flip=flip)
+ for image_set in image_sets]
+ roidb = merge_roidb(roidbs)
+ roidb = filter_roidb(roidb, cfg)
+
+ # load training data
+ train_data = AnchorLoader(feat_sym, roidb, cfg, batch_size=input_batch_size, shuffle=shuffle,
+ ctx=ctx, feat_stride=cfg.network.RPN_FEAT_STRIDE, anchor_scales=cfg.network.ANCHOR_SCALES,
+ anchor_ratios=cfg.network.ANCHOR_RATIOS, aspect_grouping=cfg.TRAIN.ASPECT_GROUPING)
+
+ # infer max shape
+ max_data_shape = [('data', (cfg.TRAIN.BATCH_IMAGES, 3, max([v[0] for v in cfg.SCALES]), max([v[1] for v in cfg.SCALES])))]
+ max_data_shape, max_label_shape = train_data.infer_shape(max_data_shape)
+    print 'providing maximum shape', max_data_shape, max_label_shape
+
+ # infer shape
+ data_shape_dict = dict(train_data.provide_data_single + train_data.provide_label_single)
+ sym_instance.infer_shape(data_shape_dict)
+
+ # load and initialize params
+ if resume:
+        print 'continue training from', begin_epoch
+ arg_params, aux_params = load_param(prefix, begin_epoch, convert=True)
+ else:
+ arg_params, aux_params = load_param(pretrained, epoch, convert=True)
+ sym_instance.init_weight_rpn(cfg, arg_params, aux_params)
+
+ # check parameter shapes
+ sym_instance.check_parameter_shapes(arg_params, aux_params, data_shape_dict)
+
+ # create solver
+ data_names = [k[0] for k in train_data.provide_data_single]
+ label_names = [k[0] for k in train_data.provide_label_single]
+ if train_shared:
+ fixed_param_prefix = cfg.network.FIXED_PARAMS_SHARED
+ else:
+ fixed_param_prefix = cfg.network.FIXED_PARAMS
+ mod = MutableModule(sym, data_names=data_names, label_names=label_names,
+ logger=logger, context=ctx, max_data_shapes=[max_data_shape for _ in xrange(batch_size)],
+ max_label_shapes=[max_label_shape for _ in xrange(batch_size)], fixed_param_prefix=fixed_param_prefix)
+
+ # decide training params
+ # metric
+ eval_metric = metric.RPNAccMetric()
+ cls_metric = metric.RPNLogLossMetric()
+ bbox_metric = metric.RPNL1LossMetric()
+ eval_metrics = mx.metric.CompositeEvalMetric()
+ for child_metric in [eval_metric, cls_metric, bbox_metric]:
+ eval_metrics.add(child_metric)
+ # callback
+ batch_end_callback = callback.Speedometer(train_data.batch_size, frequent=frequent)
+ # epoch_end_callback = mx.callback.do_checkpoint(prefix)
+ epoch_end_callback = mx.callback.module_checkpoint(mod, prefix, period=1, save_optimizer_states=True)
+ # decide learning rate
+ base_lr = lr
+ lr_factor = cfg.TRAIN.lr_factor
+ lr_epoch = [int(epoch) for epoch in lr_step.split(',')]
+ lr_epoch_diff = [epoch - begin_epoch for epoch in lr_epoch if epoch > begin_epoch]
+ lr = base_lr * (lr_factor ** (len(lr_epoch) - len(lr_epoch_diff)))
+ lr_iters = [int(epoch * len(roidb) / batch_size) for epoch in lr_epoch_diff]
+    print 'lr', lr, 'lr_epoch_diff', lr_epoch_diff, 'lr_iters', lr_iters
+ lr_scheduler = WarmupMultiFactorScheduler(lr_iters, lr_factor, cfg.TRAIN.warmup, cfg.TRAIN.warmup_lr, cfg.TRAIN.warmup_step)
+ # optimizer
+ optimizer_params = {'momentum': cfg.TRAIN.momentum,
+ 'wd': cfg.TRAIN.wd,
+ 'learning_rate': lr,
+ 'lr_scheduler': lr_scheduler,
+ 'rescale_grad': 1.0,
+ 'clip_gradient': None}
+
+ if not isinstance(train_data, PrefetchingIter):
+ train_data = PrefetchingIter(train_data)
+
+ # train
+ mod.fit(train_data, eval_metric=eval_metrics, epoch_end_callback=epoch_end_callback,
+ batch_end_callback=batch_end_callback, kvstore=kvstore,
+ optimizer='sgd', optimizer_params=optimizer_params,
+ arg_params=arg_params, aux_params=aux_params, begin_epoch=begin_epoch, num_epoch=end_epoch)
+
diff --git a/faster_rcnn/operator_cxx/deformable_convolution-inl.h b/faster_rcnn/operator_cxx/deformable_convolution-inl.h
new file mode 100644
index 0000000..ccc6bb3
--- /dev/null
+++ b/faster_rcnn/operator_cxx/deformable_convolution-inl.h
@@ -0,0 +1,487 @@
+/*!
+ * Copyright (c) 2017 Microsoft
+ * Licensed under The Apache-2.0 License [see LICENSE for details]
+ * \file deformable_convolution-inl.h
+ * \brief
+ * \ref: https://github.com/Yangqing/caffe/wiki/Convolution-in-Caffe:-a-memo
+ * \ref: https://arxiv.org/abs/1703.06211
+ * \author Yuwen Xiong, Haozhi Qi, Jifeng Dai
+*/
+#ifndef MXNET_OPERATOR_DEFORMABLE_CONVOLUTION_INL_H_
+#define MXNET_OPERATOR_DEFORMABLE_CONVOLUTION_INL_H_
+
+#include <mxnet/io.h>
+#include <mxnet/base.h>
+#include <mxnet/ndarray.h>
+#include <mxnet/operator.h>
+#include <mxnet/operator_util.h>
+#include <dmlc/logging.h>
+#include <dmlc/optional.h>
+#include <algorithm>
+#include <map>