Merge pull request #171 from HaoyangLee/main
update docs
HaoyangLee authored Apr 4, 2023
2 parents ccb8dfb + c4fdabd commit 0effe3b
Showing 6 changed files with 60 additions and 41 deletions.
8 changes: 7 additions & 1 deletion README.md
@@ -52,7 +52,13 @@ For distributed training, please install [openmpi 4.0.3](https://www.open-mpi.or
| MindSpore | >=1.9 |
| Python | >=3.7 |

> Notes: If you [use MX Engine for Inference](#21-inference-with-mx-engine), the version of Python should be 3.9.
> Notes:
> - If you [use MX Engine for Inference](#21-inference-with-mx-engine), the version of Python should be 3.9.
> - If scikit_image cannot be imported, set the environment variable `$LD_PRELOAD` with the following command, as suggested [here](https://github.com/opencv/opencv/issues/14884). Replace `path/to` with your directory.
> ```shell
> export LD_PRELOAD=path/to/scikit_image.libs/libgomp-d22c30c5.so.1.0.0:$LD_PRELOAD
> ```
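A quick way to confirm the workaround took effect is to retry the import; the following is a minimal check, assuming `python` resolves to the environment where scikit-image is installed:

```shell
# After exporting LD_PRELOAD in the same shell, the import should now succeed
python -c "import skimage; print(skimage.__version__)"
```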
### Install with PyPI
10 changes: 8 additions & 2 deletions README_CN.md
@@ -48,7 +48,13 @@ pip install -r requirements.txt
| MindSpore | >=1.9 |
| Python | >=3.7 |

> Note: If you [use MX Engine for inference](#21-使用mx-engine推理), the Python version must be 3.9.

> Note:
> - If you [use MX Engine for inference](#21-使用mx-engine推理), the Python version must be 3.9.
> - If you run into a scikit_image import error, set the environment variable `$LD_PRELOAD` with the following command, as described [here](https://github.com/opencv/opencv/issues/14884). Replace `path/to` with your directory.
> ```shell
> export LD_PRELOAD=path/to/scikit_image.libs/libgomp-d22c30c5.so.1.0.0:$LD_PRELOAD
> ```
### Install with PyPI
@@ -83,7 +89,7 @@ MindOCR supports a variety of text recognition models and datasets; here we use **CRNN**

MX (short for [MindX](https://www.hiascend.com/zh/software/mindx-sdk)) is a tool that supports efficient inference and deployment on Ascend devices.

MindOCR has integrated the MX inference engine to support text detection and recognition tasks; see [mx_infer](docs/cn/inference_tutorial_cn.md).
MindOCR has integrated the MX inference engine to support text detection and recognition tasks; see [mx_infer](docs/cn/inference_tutorial_cn.md)


#### 2.2 Inference with Lite
14 changes: 9 additions & 5 deletions configs/det/dbnet/README.md
@@ -37,13 +37,17 @@ The overall architecture of DBNet is presented in _Figure 1._ It consists of mul
### ICDAR2015
<div align="center">

| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T. (s/epoch)** | **Recipe** | **Download** |
|---------------|--------------|----------------|------------|---------------|-------------|-----------------------------|-------------------------------------------|---------------------------------------------------| ----------|
| DBNet (ours) | D910x1-MS1.9-G | ResNet-50 | ImageNet | 81.70% | 85.84% | 83.72% | 35 | [yaml](db_r50_icdar15.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet50-db1df47a.ckpt) |
| DBNet (PaddleOCR) | - | ResNet50_vd | SynthText | 78.72% | 86.41% | 82.38% | - | - | -|
| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T. (s/epoch)** | **Recipe** | **Download** |
|--------------------|----------------|----------------|----------------|-------------|-------------|-------------|------------------------|-----------------------------|----------------------------------------------------------------------------------------------|
| DBNet (ours) | D910x1-MS1.9-G | ResNet-50 | ImageNet | 81.70% | 85.84% | 83.72% | 35 | [yaml](db_r50_icdar15.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet50-db1df47a.ckpt) |
| DBNet (PaddleOCR) | - | ResNet50_vd | SynthText | 78.72% | 86.41% | 82.38% | - | - | - |
| DBNet++ | D910x1-MS1.9-G | ResNet-50 | ImageNet | 82.02% | 87.38% | 84.62% | - | - | - |

</div>

> More information on DBNet++ is coming soon. The only difference between _DBNet_ and _DBNet++_ is the _Adaptive Scale Fusion_ module, which is controlled by the `use_asf` parameter in the `neck` section of the yaml config file.

#### Notes
- Context: Training context denoted as {device}x{pieces}-{MS version}{MS mode}, where the MindSpore mode can be G (graph mode) or F (pynative mode with ms function). For example, D910x8-G denotes training on 8 Ascend 910 NPUs in graph mode.
- Note that the training time of DBNet is highly affected by data processing and varies across different machines.
@@ -88,7 +92,7 @@ specifically the following parts. The `dataset_root` will be concatenated with `
...
train:
ckpt_save_dir: './tmp_det'
dataset_sink_mode: True
dataset_sink_mode: False
dataset:
type: DetDataset
dataset_root: dir/to/dataset <--- Update
13 changes: 8 additions & 5 deletions configs/det/dbnet/README_CN.md
@@ -28,13 +28,16 @@ The overall architecture of DBNet is shown in Figure 1 and consists of the following stages:
### ICDAR2015
<div align="center">

| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T. (s/epoch)** | **Recipe** | **Download** |
|------------------|------------|-------------|----------------|------------|---------------|-------------|-------------------------------|-----------------------------|-----------------------------------------------------------------|
| DBNet (ours) | D910x1-MS1.9-G | ResNet-50 | ImageNet | 81.70% | 85.84% | 83.72% | 35 | [yaml](db_r50_icdar15.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet50-db1df47a.ckpt) |
| DBNet (PaddleOCR)| - | ResNet50_vd | SynthText | 78.72% | 86.41% | 82.38% | - | - | -|
| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T. (s/epoch)** | **Recipe** | **Download** |
|-------------------|----------------|---------------|-------------|-------------|---------------|-------------|-------------------|-----------------------------|----------------------------------------------------------------------------------------------|
| DBNet (ours) | D910x1-MS1.9-G | ResNet-50 | ImageNet | 81.70% | 85.84% | 83.72% | 35 | [yaml](db_r50_icdar15.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet50-db1df47a.ckpt) |
| DBNet (PaddleOCR) | - | ResNet50_vd | SynthText | 78.72% | 86.41% | 82.38% | - | - | - |
| DBNet++ | D910x1-MS1.9-G | ResNet-50 | ImageNet | 82.02% | 87.38% | 84.62% | - | - | - |

</div>

> More details on DBNet++ are coming soon. The only difference between DBNet and DBNet++ is the _Adaptive Scale Fusion_ module, which is set via the `use_asf` parameter in the `neck` section of the yaml config file.
#### Notes:
- Context: The training context is denoted as {device}x{pieces}-{MS mode}, where the MindSpore mode can be G (graph mode) or F (pynative mode).
- The training time of DBNet is heavily affected by data processing and varies greatly across different runtime environments.
@@ -75,7 +78,7 @@ The overall architecture of DBNet is shown in Figure 1 and consists of the following stages:
...
train:
ckpt_save_dir: './tmp_det'
dataset_sink_mode: True
dataset_sink_mode: False
dataset:
type: DetDataset
dataset_root: dir/to/dataset <--- Update
28 changes: 14 additions & 14 deletions configs/rec/crnn/README.md
@@ -37,19 +37,19 @@ According to our experiments, the evaluation results on public benchmark dataset

<div align="center">

| **Model** | **Context** | **Backbone** | **Avg Accuracy** | **Train T. (s/epoch)** | **Recipe** | **Download** |
|-----------|--------------|------------------|------------|--------------| ------ |------ |
| CRNN (ours) | D910x8-MS1.8-G | VGG7 | 82.03% | 2445 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) |
| CRNN (ours) | D910x8-MS1.8-G | ResNet34_vd | 84.45% | 2118 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) |
| CRNN (PaddleOCR) | - | ResNet34_vd | 83.99% | -| -| - |
| **Model** | **Context** | **Backbone** | **Avg Accuracy** | **Train T. (s/epoch)** | **Recipe** | **Download** |
|------------------|----------------|---------------|------------------|------------------------|------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|
| CRNN (ours) | D910x8-MS1.8-G | VGG7 | 82.03% | 2445 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_vgg7.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt) |
| CRNN (ours) | D910x8-MS1.8-G | ResNet34_vd | 84.45% | 2118 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/crnn/crnn_resnet34.yaml) | [weights](https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_resnet34-83f37f07.ckpt) |
| CRNN (PaddleOCR) | - | ResNet34_vd | 83.99% | - | - | - |

</div>

**Notes:**
- Context: Training context denoted as {device}x{pieces}-{MS version}{MS mode}, where the MindSpore mode can be G (graph mode) or F (pynative mode with ms function). For example, D910x8-MS1.8-G denotes training on 8 Ascend 910 NPUs in graph mode with MindSpore version 1.8.
- To reproduce the result on other contexts, please ensure the global batch size is the same.
- Both VGG and ResNet models are trained from scratch without any pre-training.
- The above models are trained with MJSynth (MJ) and SynthText (ST) datasets. For more data details, please refer to section 3.1.2 Dataset Preparation].
- The above models are trained with MJSynth (MJ) and SynthText (ST) datasets. For more data details, please refer to [Dataset Preparation](#312-dataset-preparation) section.
- **Evaluations are tested individually on each benchmark dataset, and Avg Accuracy is the average of accuracies across all sub-datasets.**
- For the PaddleOCR version of CRNN, the performance is reported on the trained model provided on their [github](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_crnn_en.md).

@@ -62,15 +62,15 @@ Please refer to the [installation instruction](https://github.com/mindspore-lab/

#### 3.1.2 Dataset Preparation
Please download the lmdb dataset for training and evaluation from [here](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (ref: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here)). There are several zip files:
- `data_lmdb_release.zip` contains the entire datasets including train, valid and evaluation.
- `validation.zip` is the union dataset for Validation
- `data_lmdb_release.zip` contains the **entire** datasets including training.zip, validation.zip and evaluation.zip.
- `validation.zip` is the union dataset for Validation.
- `evaluation.zip` contains several benchmarking datasets.

Unzip the data; after preparation, the data structure should look like this:

``` text
.
├── train
├── training
│   ├── MJ
│   │   ├── data.mdb
│   │   ├── lock.mdb
@@ -91,8 +91,8 @@
```
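The layout above can be produced roughly as follows. This is a minimal sketch, assuming the archives were downloaded into the current directory and that `data_lmdb_release.zip` contains the nested `training.zip`, `validation.zip` and `evaluation.zip` archives; the exact nesting on your machine may differ.

```shell
# Unpack the top-level archive, then any nested archives it contains
unzip data_lmdb_release.zip -d data_lmdb_release
cd data_lmdb_release
for f in training.zip validation.zip evaluation.zip; do
    [ -f "$f" ] && unzip "$f"   # skip archives that are not present
done
```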

#### 3.1.3 Check YAML Config Files
Please check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`. Explanations of these important args:
Please check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`,
`eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`. Explanations of these important args:

```yaml
system:
@@ -106,7 +106,7 @@ common:
batch_size: &batch_size 64 # Batch size for training
...
train:
ckpt_save_dir: './tmp_rec'
ckpt_save_dir: './tmp_rec' # The training result (including checkpoints, per-epoch performance and curves) saving directory
dataset_sink_mode: False
dataset:
type: LMDBDataset
@@ -115,7 +115,7 @@ train:
# label_file: # Path of training label file, concatenated with `dataset_root` to be the complete path of training label file, not required when using LMDBDataset
...
eval:
ckpt_load_path: './tmp_rec/best.ckpt'
ckpt_load_path: './tmp_rec/best.ckpt' # checkpoint file path
dataset_sink_mode: False
dataset:
type: LMDBDataset
@@ -164,7 +164,7 @@ The training result (including checkpoints, per-epoch performance and curves) wi
To evaluate the accuracy of the trained model, you can use `eval.py`. Please set the checkpoint path in the `ckpt_load_path` arg in the `eval` section of the yaml config file, set `distribute` to False, and then run:

```
python tools/eval.py --config configs/rec/crnn/crnn_vgg7.yaml
python tools/eval.py --config configs/rec/crnn/crnn_resnet34.yaml
```

## References