
Commit

Merge branch 'mindspore-lab:main' into main
XixinYang authored Aug 17, 2023
2 parents a9a7e75 + 1fb2453 commit 5150af5
Showing 28 changed files with 3,413 additions and 150 deletions.
1 change: 1 addition & 0 deletions configs/crossvit/crossvit_15_ascend.yaml
@@ -37,6 +37,7 @@ ckpt_save_dir: './ckpt'
epoch_size: 600
dataset_sink_mode: True
amp_level: 'O3'
val_amp_level: 'O3'
drop_path_rate: 0.1

# loss
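The `val_amp_level` key added across these configs lets validation run at its own mixed-precision level, independent of training. As a rough sketch of what an amp level means in MindSpore (the tiny network below is a stand-in for illustration, not MindCV code):

```python
from mindspore import amp, nn

# Stand-in network, just to illustrate applying amp levels.
net = nn.Dense(8, 2)

# Cast the training network at O3 (aggressive float16 on Ascend) ...
train_net = amp.auto_mixed_precision(net, amp_level="O3")
# ... while an evaluation copy can be cast at a different level,
# which is the flexibility a separate val_amp_level key provides.
eval_net = amp.auto_mixed_precision(nn.Dense(8, 2), amp_level="O2")
```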
8 changes: 4 additions & 4 deletions configs/crossvit/crossvit_18_ascend.yaml
@@ -37,6 +37,7 @@ ckpt_save_dir: './ckpt'
epoch_size: 300
dataset_sink_mode: True
amp_level: 'O3'
val_amp_level: 'O3'
drop_path_rate: 0.1

# loss
@@ -45,7 +46,7 @@ label_smoothing: 0.1

# lr scheduler
scheduler: 'warmup_cosine_decay'
lr: 0.004
lr: 0.0015
min_lr: 0.00001
warmup_epochs: 30
decay_epochs: 270
@@ -56,8 +57,7 @@ opt: 'adamw'
weight_decay: 0.05
filter_bias_and_bn: True
loss_scale: 1024
drop_overflow_update: True
loss_scale_type: 'dynamic'
use_nesterov: False
eps: 1e-8

# Scheduler parameters
lr_epoch_stair: True
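This recipe also switches to a dynamic loss scale with `drop_overflow_update: True`. As a hedged, pure-Python sketch of the dynamic-scaling idea (constants are illustrative; MindSpore's `DynamicLossScaleManager` implements the real logic):

```python
class DynamicScaleSketch:
    """Illustrative dynamic loss scaling: shrink on overflow (and drop that
    update), grow again after `window` consecutive overflow-free steps."""

    def __init__(self, scale=1024.0, factor=2.0, window=2000):
        self.scale, self.factor, self.window = scale, factor, window
        self.clean_steps = 0

    def update(self, overflow: bool) -> bool:
        if overflow:                       # shrink the scale, skip this step
            self.scale = max(self.scale / self.factor, 1.0)
            self.clean_steps = 0
            return False                   # caller drops the parameter update
        self.clean_steps += 1
        if self.clean_steps >= self.window:
            self.scale *= self.factor      # cautiously grow back
            self.clean_steps = 0
        return True
```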
1 change: 1 addition & 0 deletions configs/crossvit/crossvit_9_ascend.yaml
@@ -36,6 +36,7 @@ ckpt_save_dir: './ckpt'
epoch_size: 300
dataset_sink_mode: True
amp_level: 'O2'
val_amp_level: 'O2'
drop_path_rate: 0.1

# loss
1 change: 1 addition & 0 deletions configs/edgenext/edgenext_base_ascend.yaml
@@ -41,6 +41,7 @@ ckpt_save_dir: './ckpt'
epoch_size: 350
dataset_sink_mode: True
amp_level: 'O2'
val_amp_level: 'O2'
drop_path_rate: 0.1

# loss
1 change: 1 addition & 0 deletions configs/edgenext/edgenext_small_ascend.yaml
@@ -40,6 +40,7 @@ ckpt_save_dir: './ckpt'
epoch_size: 350
dataset_sink_mode: True
amp_level: 'O3'
val_amp_level: 'O3'
drop_path_rate: 0.1

# loss
1 change: 1 addition & 0 deletions configs/edgenext/edgenext_x_small_ascend.yaml
@@ -40,6 +40,7 @@ ckpt_save_dir: './ckpt'
epoch_size: 350
dataset_sink_mode: True
amp_level: 'O3'
val_amp_level: 'O3'
drop_path_rate: 0.1

# loss
1 change: 1 addition & 0 deletions configs/edgenext/edgenext_xx_small_ascend.yaml
@@ -39,6 +39,7 @@ ckpt_save_dir: './ckpt'
epoch_size: 350
dataset_sink_mode: True
amp_level: 'O2'
val_amp_level: 'O2'
drop_path_rate: 0.0

# loss
97 changes: 97 additions & 0 deletions configs/halonet/README.md
@@ -0,0 +1,97 @@
# HaloNet

> [Scaling Local Self-Attention for Parameter Efficient Visual Backbones](https://arxiv.org/abs/2103.12731)
## Introduction

Researchers from Google Research and UC Berkeley have developed a new model of self-attention that can outperform standard baseline models and even high-performance convolutional models.[[1](#references)]

Blocked Self-Attention: The whole input image is divided into multiple blocks, and self-attention is applied within each block. However, if each block attended only to the information inside itself, information at the block boundaries would inevitably be lost. Therefore, before self-attention is computed, a haloing operation is performed on each block: a band of pixels from the original image is padded around the block, so that each block's receptive field is appropriately enlarged and covers more information.

<p align="center">
<img src="https://github-production-user-asset-6210df.s3.amazonaws.com/50255437/257577202-3ac43b82-785a-42c5-9b6c-ca58b0fa7ab8.png" width=800 />
</p>
<p align="center">
<em>Figure 1. Architecture of Blocked Self-Attention [<a href="#references">1</a>] </em>
</p>
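To make the haloing step concrete, below is a minimal NumPy sketch of gathering a halo-padded key/value window around each non-overlapping query block (block and halo sizes here are illustrative, not the paper's settings):

```python
import numpy as np

def extract_halo_blocks(x, block=4, halo=1):
    """Split a (H, W) map into non-overlapping query blocks and gather, for
    each block, a key/value window enlarged by a `halo`-wide ring of
    neighbouring pixels (zero-padded at the image border)."""
    H, W = x.shape
    padded = np.pad(x, halo)            # zero-pad `halo` pixels on every side
    windows = []
    for i in range(0, H, block):
        for j in range(0, W, block):
            windows.append(padded[i:i + block + 2 * halo,
                                  j:j + block + 2 * halo])
    return np.stack(windows)            # (num_blocks, block+2*halo, block+2*halo)

x = np.arange(64, dtype=np.float32).reshape(8, 8)
print(extract_halo_blocks(x).shape)     # (4, 6, 6)
```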

Down Sampling: To reduce the amount of computation, each block is subsampled separately, and attention is then performed on the subsampled information, achieving the effect of down sampling.

<p align="center">
<img src="https://github-production-user-asset-6210df.s3.amazonaws.com/50255437/257578183-fe45c2c2-5006-492b-b30a-5b049a0e2531.png" width=800 />
</p>
<p align="center">
<em>Figure 2. Architecture of Down Sampling [<a href="#references">1</a>] </em>
</p>


## Results

Our reproduced model performance on ImageNet-1K is reported as follows.

<div align="center">

| Model | Context | Top-1 (%) | Top-5 (%) | Params (M) | Recipe | Download |
| ----------- | -------- | --------- | --------- | ---------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| halonet_50t | D910X8-G | 79.53 | 94.79 | 22.79 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/halonet/halonet_50t_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/halonet/halonet_50t-533da6be.ckpt) |

</div>

#### Notes

- Context: Training context denoted as {device}x{pieces}-{MS mode}, where the MindSpore mode can be G (graph mode) or F (pynative mode with ms function). For example, D910x8-G denotes training on 8 Ascend 910 NPUs in graph mode.
- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.

## Quick Start

### Preparation

#### Installation
Please refer to the [installation instruction](https://github.com/mindspore-ecosystem/mindcv#installation) in MindCV.

#### Dataset Preparation
Please download the [ImageNet-1K](https://www.image-net.org/challenges/LSVRC/2012/index.php) dataset for model training and validation.

### Training

* Distributed Training

It is easy to reproduce the reported results with the pre-defined training recipe. For distributed training on multiple Ascend 910 devices, please run

```shell
# distributed training on multiple GPU/Ascend devices
mpirun -n 8 python train.py --config configs/halonet/halonet_50t_ascend.yaml --data_dir /path/to/imagenet
```

> If the script is executed by the root user, the `--allow-run-as-root` parameter must be added to `mpirun`.

Similarly, you can train the model on multiple GPU devices with the above `mpirun` command.

For detailed illustration of all hyper-parameters, please refer to [config.py](https://github.com/mindspore-lab/mindcv/blob/main/config.py).

**Note:** As the global batch size (batch_size x num_devices) is an important hyper-parameter, it is recommended to keep the global batch size unchanged for reproduction or adjust the learning rate linearly to a new global batch size.
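For instance, here is a hedged back-of-the-envelope for the linear scaling rule, assuming the `halonet_50t` recipe (`lr: 0.00125`, `batch_size: 64`) was tuned for 8 devices:

```python
# Linear LR scaling sketch. Assumes the recipe's lr of 0.00125 was tuned
# for a global batch size of 8 devices x 64 samples = 512.
base_lr = 0.00125
base_global_bs = 8 * 64

new_global_bs = 4 * 64          # e.g. training on 4 devices instead
new_lr = base_lr * new_global_bs / base_global_bs
print(new_lr)                   # 0.000625
```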

* Standalone Training

If you want to train or finetune the model on a smaller dataset without distributed training, please run:

```shell
# standalone training on a CPU/GPU/Ascend device
python train.py --config configs/halonet/halonet_50t_ascend.yaml --data_dir /path/to/dataset --distribute False
```

### Validation

To validate the accuracy of the trained model, you can use `validate.py` and parse the checkpoint path with `--ckpt_path`.

```shell
python validate.py -c configs/halonet/halonet_50t_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
```

### Deployment

Please refer to the [deployment tutorial](https://mindspore-lab.github.io/mindcv/tutorials/deployment/) in MindCV.

## References

[1] Vaswani A, Ramachandran P, Srinivas A, et al. Scaling local self-attention for parameter efficient visual backbones[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12894-12904.
60 changes: 60 additions & 0 deletions configs/halonet/halonet_50t_ascend.yaml
@@ -0,0 +1,60 @@
# system
mode: 0
distribute: True
num_parallel_workers: 8
val_while_train: True

# dataset
dataset: 'imagenet'
data_dir: '/path/to/imagenet'
shuffle: True
dataset_download: False
batch_size: 64
drop_remainder: True
val_split: val

# augmentation
image_resize: 256
scale: [0.08, 1.0]
ratio: [0.75, 1.333]
hflip: 0.5
interpolation: 'bilinear'
crop_pct: 0.95

#color_jitter:
auto_augment: 'randaug-m9-n2-mstd0.5-inc1'
re_prob: 0.25
re_max_attempts: 1
mixup: 0.8
color_jitter: 0.4

# model
model: 'halonet_50t'
num_classes: 1000
pretrained: False
ckpt_path: ''
keep_checkpoint_max: 20
val_interval: 5
ckpt_save_dir: './ckpt'
epoch_size: 300
dataset_sink_mode: True
amp_level: 'O3'
val_amp_level: 'O2'

# optimizer
opt: 'adamw'
filter_bias_and_bn: True
weight_decay: 0.04
loss_scale: 1024
use_nesterov: False

# lr scheduler
scheduler: 'warmup_cosine_decay'
min_lr: 0.000006
lr: 0.00125
warmup_epochs: 3
decay_epochs: 297

# loss
loss: 'CE'
label_smoothing: 0.1
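The `warmup_cosine_decay` entries above imply roughly the following per-epoch schedule; this is a hedged re-derivation for intuition, not MindCV's exact implementation (which may, for example, update per step rather than per epoch):

```python
import math

# Values from the halonet_50t recipe above.
lr, min_lr = 0.00125, 0.000006
warmup_epochs, decay_epochs = 3, 297

def lr_at(epoch: int) -> float:
    if epoch < warmup_epochs:                      # linear warmup
        return lr * (epoch + 1) / warmup_epochs
    t = (epoch - warmup_epochs) / decay_epochs     # cosine decay to min_lr
    return min_lr + 0.5 * (lr - min_lr) * (1 + math.cos(math.pi * t))

print(lr_at(0), lr_at(3), lr_at(299))              # ramp, peak, near min_lr
```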
143 changes: 143 additions & 0 deletions examples/det/ssd/README.md
@@ -0,0 +1,143 @@
# SSD Based on MindCV Backbones

> [SSD: Single Shot MultiBox Detector](https://arxiv.org/abs/1512.02325)
## Introduction

SSD is a single-stage object detector. It discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, and combines predictions from multi-scale feature maps to detect objects of various sizes. At prediction time, SSD generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape.

<p align="center">
<img src="https://github.com/DexterJZ/mindcv/assets/16130861/50bc9627-c71c-4b1a-9de4-9e6040a43279" width=800 />
</p>
<p align="center">
<em>Figure 1. Architecture of SSD [<a href="#references">1</a>] </em>
</p>

In this example, by leveraging [the multi-scale feature extraction of MindCV](https://github.com/mindspore-lab/mindcv/blob/main/docs/en/how_to_guides/feature_extraction.md), we demonstrate that using backbones from MindCV greatly simplifies the implementation of SSD.
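As a sketch of what that looks like in code (following the linked feature-extraction guide; treat `features_only` support and the exact output strides as backbone-dependent):

```python
import numpy as np
import mindspore as ms
from mindcv.models import create_model

# Build a classification backbone as a multi-scale feature extractor.
backbone = create_model("resnet50", features_only=True)

x = ms.Tensor(np.zeros((1, 3, 640, 640), np.float32))
for feat in backbone(x):        # one feature map per extracted stage
    print(feat.shape)           # progressively smaller spatial sizes
```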

## Configurations

Here, we provide three configurations of SSD.
* Using [MobileNetV2](https://github.com/mindspore-lab/mindcv/tree/main/configs/mobilenetv2) as the backbone and the original detector described in the paper.
* Using [ResNet50](https://github.com/mindspore-lab/mindcv/tree/main/configs/resnet) as the backbone with an FPN and a shared-weight-based detector.
* Using [MobileNetV3](https://github.com/mindspore-lab/mindcv/tree/main/configs/mobilenetv3) as the backbone and the original detector described in the paper.

## Dataset

We train and test SSD using the [COCO 2017 Dataset](https://cocodataset.org/#download). The dataset contains
* 118,000 images (about 18 GB) for training, and
* 5,000 images (about 1 GB) for testing.

## Quick Start

### Preparation

1. Clone MindCV repository by running
```
git clone https://github.com/mindspore-lab/mindcv.git
```

2. Install dependencies as shown [here](https://mindspore-lab.github.io/mindcv/installation/).

3. Download [COCO 2017 Dataset](https://cocodataset.org/#download) and organize it as follows.
```
.
└─cocodataset
  ├─annotations
  │  ├─instances_train2017.json
  │  └─instances_val2017.json
  ├─val2017
  └─train2017
```
Run the following commands to preprocess the dataset and convert it to [MindRecord format](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.mindrecord.html), which reduces preprocessing time during training and testing.
```
cd mindcv # change directory to the root of MindCV repository
python examples/det/ssd/create_data.py coco --data_path [root of COCO 2017 Dataset] --out_path [directory for storing MindRecord files]
```
Specify the path of the preprocessed dataset at keyword `data_dir` in the config file.

4. Download the pretrained backbone weights from the table below, and specify the path to the backbone weights at keyword `backbone_ckpt_path` in the config file.
<div align="center">

| MobileNetV2 | ResNet50 | MobileNetV3 |
|:----------------:|:----------------:|:----------------:|
| [backbone weights](https://download.mindspore.cn/toolkits/mindcv/mobilenet/mobilenetv2/mobilenet_v2_100-d5532038.ckpt) | [backbone weights](https://download.mindspore.cn/toolkits/mindcv/resnet/resnet50-e0733ab8.ckpt) | [backbone weights](https://download.mindspore.cn/toolkits/mindcv/mobilenet/mobilenetv3/mobilenet_v3_large_100-1279ad5f.ckpt) |

</div>

### Train

It is highly recommended to use **distributed training** for this SSD implementation.

For distributed training using **OpenMPI's `mpirun`**, simply run
```
cd mindcv # change directory to the root of MindCV repository
mpirun -n [# of devices] python examples/det/ssd/train.py --config [the path to the config file]
```
For example, to run distributed training of SSD with the `MobileNetV2` configuration on 8 devices, run
```
cd mindcv # change directory to the root of MindCV repository
mpirun -n 8 python examples/det/ssd/train.py --config examples/det/ssd/ssd_mobilenetv2.yaml
```

For distributed training with [Ascend rank table](https://github.com/mindspore-lab/mindocr/blob/main/docs/en/tutorials/distribute_train.md#12-configure-rank_table_file-for-training), configure `ascend8p.sh` as follows
```
#!/bin/bash
export DEVICE_NUM=8
export RANK_SIZE=8
export RANK_TABLE_FILE="./hccl_8p_01234567_127.0.0.1.json"
for ((i = 0; i < ${DEVICE_NUM}; i++)); do
    export DEVICE_ID=$i
    export RANK_ID=$i
    echo "Launching rank: ${RANK_ID}, device: ${DEVICE_ID}"
    if [ $i -eq 0 ]; then
        # rank 0 writes its log to train.log for monitoring
        python examples/det/ssd/train.py --config [the path to the config file] &> ./train.log &
    else
        # the remaining ranks run silently in the background
        python -u examples/det/ssd/train.py --config [the path to the config file] &> /dev/null &
    fi
done
```
and start training by running
```
cd mindcv # change directory to the root of MindCV repository
bash ascend8p.sh
```

For single-device training, please run
```
cd mindcv # change directory to the root of MindCV repository
python examples/det/ssd/train.py --config [the path to the config file]
```

### Test

To test the trained model, first specify the path to the model checkpoint at keyword `ckpt_path` in the config file, then run
```
cd mindcv # change directory to the root of MindCV repository
python examples/det/ssd/eval.py --config [the path to the config file]
```
For example, to test SSD with the `MobileNetV2` configuration, run
```
cd mindcv # change directory to the root of MindCV repository
python examples/det/ssd/eval.py --config examples/det/ssd/ssd_mobilenetv2.yaml
```

## Performance

Here are the performance results and the pretrained model weights for each configuration.
<div align="center">

| Configuration | Mixed Precision | mAP | Config | Download |
|:-----------------:|:---------------:|:----:|:------:|:--------:|
| MobileNetV2 | O2 | 23.2 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/examples/det/ssd/ssd_mobilenetv2.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/ssd/ssd_mobilenetv2-5bbd7411.ckpt) |
| ResNet50 with FPN | O3 | 38.3 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/examples/det/ssd/ssd_resnet50_fpn.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/ssd/ssd_resnet50_fpn-ac87ddac.ckpt) |
| MobileNetV3 | O2 | 23.8 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/examples/det/ssd/ssd_mobilenetv3.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/ssd/ssd_mobilenetv3-53d9f6e9.ckpt) |

</div>

## References

[1] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). SSD: Single Shot Multibox Detector. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14 (pp. 21-37). Springer International Publishing.
