# Contents

- [Contents](#contents)
- [SSD Description](#ssd-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
    - [Prepare the model](#prepare-the-model)
    - [Run the scripts](#run-the-scripts)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training on Ascend](#training-on-ascend)
        - [Training on GPU](#training-on-gpu)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation on Ascend](#evaluation-on-ascend)
        - [Evaluation on GPU](#evaluation-on-gpu)
    - [Export MindIR](#export-mindir)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
        - [Inference Performance](#inference-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
## [SSD Description](#contents)

SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes.

[Paper](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg. European Conference on Computer Vision (ECCV), 2016.
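To make "default boxes over different aspect ratios and scales" concrete, here is a short illustrative sketch of how such a grid of boxes can be laid out over one feature map. The scale and aspect-ratio values below are example numbers, not the exact settings used in `src/box_utils.py`.

```python
# Illustrative SSD-style default-box generation for a single feature map.
# Scale/aspect-ratio values are examples, not this repo's actual configuration.
import itertools
import math

def default_boxes(feature_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) boxes, normalized to [0, 1], one set per location."""
    boxes = []
    for i, j in itertools.product(range(feature_size), repeat=2):
        cx = (j + 0.5) / feature_size          # box center at each feature-map cell
        cy = (i + 0.5) / feature_size
        for ar in aspect_ratios:
            w = scale * math.sqrt(ar)          # wider boxes for ar > 1
            h = scale / math.sqrt(ar)          # taller boxes for ar < 1
            boxes.append((cx, cy, w, h))
    return boxes

# e.g. a 38x38 feature map with small boxes at three aspect ratios
anchors = default_boxes(feature_size=38, scale=0.1, aspect_ratios=[1.0, 2.0, 0.5])
print(len(anchors))  # 38 * 38 * 3 = 4332 default boxes
```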
## [Model Architecture](#contents)

The SSD approach is based on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum suppression step to produce the final detections. The early network layers are based on a standard architecture used for high-quality image classification, which is called the base network. Auxiliary structure is then added to the network to produce detections. We provide two different base architectures:

- **ssd300**, referenced from the paper, using mobilenetv2 as the backbone and the same bbox predictor as the paper presents.
- **ssd-mobilenet-v1-fpn**, using mobilenet-v1 and FPN as the feature extractor, with weight-shared box predictors.
## [Dataset](#contents)

Note that you can run the scripts based on the dataset mentioned in the original paper or one widely used in this domain/network architecture. The following sections introduce how to run the scripts using the dataset below.

Dataset used: [COCO2017](<http://images.cocodataset.org/>)

- Dataset size: 19 GB
    - Train: 18 GB, 118,000 images
    - Val: 1 GB, 5,000 images
    - Annotations: 241 MB (instances, captions, person_keypoints, etc.)
- Data format: images and JSON files
    - Note: Data will be processed in dataset.py
## [Environment Requirements](#contents)

- Install [MindSpore](https://www.mindspore.cn/install/en).
- Download the dataset COCO2017.
- We use COCO2017 as the training dataset in this example by default; you can also use your own datasets.

First, install Cython, pycocotools and opencv-python to process data and to get the evaluation result:

```shell
pip install Cython
pip install pycocotools
pip install opencv-python
```
1. If the COCO dataset is used. **Select dataset to coco when running the script.**

    Change `coco_root` and other settings you need in `src/config.py`. The directory structure is as follows:

    ```shell
    .
    └─ coco_dataset
      ├─ annotations
      │  ├─ instance_train2017.json
      │  └─ instance_val2017.json
      ├─ val2017
      └─ train2017
    ```
2. If the VOC dataset is used. **Select dataset to voc when running the script.**

    Change `classes`, `num_classes`, `voc_json` and `voc_root` in `src/config.py`. `voc_json` is the path of the COCO-format json file used for evaluation, and `voc_root` is the path of the VOC dataset. The directory structure is as follows:

    ```shell
    .
    └─ voc_dataset
      ├─ train
      │  ├─ 0001.jpg
      │  ├─ 0001.xml
      │  ├─ ...
      │  ├─ xxxx.jpg
      │  └─ xxxx.xml
      └─ eval
         ├─ 0001.jpg
         ├─ 0001.xml
         ├─ ...
         ├─ xxxx.jpg
         └─ xxxx.xml
    ```
3. If your own dataset is used. **Select dataset to other when running the script.**

    Organize the dataset information into a TXT file, in which each row is one image annotation, as follows:

    ```shell
    train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
    ```

    Each row is an image annotation split by spaces: the first column is the relative path of the image, and the remaining columns are box and class information in the format [xmin,ymin,xmax,ymax,class]. Images are read from the path formed by joining `image_dir` (the dataset directory) and the relative path in `anno_path` (the TXT file path); `image_dir` and `anno_path` are set in `src/config.py`. A parsing sketch follows this list.
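As a concrete illustration of the format above, here is a minimal, hypothetical parsing sketch; the function name is illustrative and not taken from dataset.py, only the row format and the `image_dir` joining behaviour come from the description above.

```python
# Hypothetical parser for one row of the annotation TXT file described above;
# names are illustrative, not taken from dataset.py.
import os

def parse_annotation_line(line, image_dir):
    """Split 'relative/path.jpg xmin,ymin,xmax,ymax,class ...' into a path and boxes."""
    parts = line.strip().split(" ")
    image_path = os.path.join(image_dir, parts[0])  # image_dir is set in src/config.py
    boxes = [list(map(int, box.split(","))) for box in parts[1:]]
    return image_path, boxes

path, boxes = parse_annotation_line(
    "train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2", "/data/coco2017")
print(path)   # /data/coco2017/train2017/0000001.jpg
print(boxes)  # [[0, 259, 401, 459, 7], [35, 28, 324, 201, 2]]
```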
## [Quick Start](#contents)

### Prepare the model

1. Choose the model by changing `using_model` in `src/config.py`. The optional models are: `ssd300`, `ssd_mobilenet_v1_fpn`.
2. Change the dataset settings in the corresponding config: `src/config_ssd300.py` or `src/config_ssd_mobilenet_v1_fpn.py`.
3. If you are running with `ssd_mobilenet_v1_fpn`, you need a pretrained model for `mobilenet_v1`. Set the checkpoint path to `feature_extractor_base_param` in `src/config_ssd_mobilenet_v1_fpn.py`. For more detail about training mobilenet_v1, please refer to the mobilenetv1 model.
### Run the scripts

After installing MindSpore via the official website, you can start training and evaluation as follows:

- running on Ascend

    ```shell
    # distributed training on Ascend
    sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE]

    # run eval on Ascend
    sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
    ```

- running on GPU

    ```shell
    # distributed training on GPU
    sh run_distribute_train_gpu.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET]

    # run eval on GPU
    sh run_eval_gpu.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
    ```

- running on CPU (supports Windows and Ubuntu)

    **CPU is usually used for fine-tuning, which needs a pretrained checkpoint.**

    ```shell
    # training on CPU
    python train.py --run_platform=CPU --lr=[LR] --dataset=[DATASET] --epoch_size=[EPOCH_SIZE] --batch_size=[BATCH_SIZE] --pre_trained=[PRETRAINED_CKPT] --filter_weight=True --save_checkpoint_epochs=1

    # run eval on CPU
    python eval.py --run_platform=CPU --dataset=[DATASET] --checkpoint_path=[PRETRAINED_CKPT]
    ```
## [Script Description](#contents)

### [Script and Sample Code](#contents)

```shell
.
└─ cv
  └─ ssd
    ├─ README.md                       # descriptions about SSD
    ├─ scripts
    │  ├─ run_distribute_train.sh      # shell script for distributed training on Ascend
    │  ├─ run_distribute_train_gpu.sh  # shell script for distributed training on GPU
    │  ├─ run_eval.sh                  # shell script for evaluation on Ascend
    │  └─ run_eval_gpu.sh              # shell script for evaluation on GPU
    ├─ src
    │  ├─ __init__.py                  # init file
    │  ├─ box_utils.py                 # bbox utils
    │  ├─ eval_utils.py                # metrics utils
    │  ├─ config.py                    # total config
    │  ├─ dataset.py                   # create and process the dataset
    │  ├─ init_params.py               # parameters utils
    │  ├─ lr_schedule.py               # learning rate generator
    │  └─ ssd.py                       # SSD architecture
    ├─ eval.py                         # evaluation script
    ├─ train.py                        # training script
    ├─ export.py                       # script to export MindIR
    └─ mindspore_hub_conf.py           # MindSpore Hub interface
```
### [Script Parameters](#contents)

Major parameters in train.py and config.py are as follows:

```shell
"device_num": 1                                  # Number of devices to use
"lr": 0.05                                       # Initial learning rate
"dataset": coco                                  # Dataset name
"epoch_size": 500                                # Epoch size
"batch_size": 32                                 # Batch size of input tensor
"pre_trained": None                              # Pretrained checkpoint file path
"pre_trained_epoch_size": 0                      # Pretrained epoch size
"save_checkpoint_epochs": 10                     # The epoch interval between two checkpoints. By default, the checkpoint is saved every 10 epochs
"loss_scale": 1024                               # Loss scale
"filter_weight": False                           # Whether to filter out head-layer parameters when loading. If the class number of the training dataset differs from the class number in the pretrained checkpoint, set this to True (see the sketch after this block)
"freeze_layer": "none"                           # Whether to freeze the backbone parameters; supports none and backbone
"class_num": 81                                  # Dataset class number
"image_shape": [300, 300]                        # Image height and width used as input to the model
"mindrecord_dir": "/data/MindRecord_COCO"        # MindRecord path
"coco_root": "/data/coco2017"                    # COCO2017 dataset path
"voc_root": "/data/voc_dataset"                  # VOC original dataset path
"voc_json": "annotations/voc_instances_val.json" # Path of the COCO-format json file used for VOC evaluation
"image_dir": ""                                  # Other dataset image path; unused if coco or voc is used
"anno_path": ""                                  # Other dataset annotation path; unused if coco or voc is used
```
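To make `filter_weight` concrete, here is a hedged sketch of the idea: drop the detection-head weights from a pretrained checkpoint so that the backbone can still be loaded when `class_num` differs. The head-layer key names below are assumptions for illustration, not confirmed names from this repo.

```python
# Sketch of filter_weight-style checkpoint loading; the head-layer key names
# ("multi_loc_layers", "multi_cls_layers") are assumptions for illustration.
from mindspore.train.serialization import load_checkpoint, load_param_into_net

def load_backbone_params(net, ckpt_path):
    param_dict = load_checkpoint(ckpt_path)
    for name in list(param_dict.keys()):
        # Head layers depend on class_num, so drop them when class numbers differ.
        if "multi_loc_layers" in name or "multi_cls_layers" in name:
            del param_dict[name]
    load_param_into_net(net, param_dict)
```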
### [Training Process](#contents)

To train the model, run `train.py`. If `mindrecord_dir` is empty, [MindRecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html) files will be generated from `coco_root` (COCO dataset), `voc_root` (VOC dataset), or `image_dir` and `anno_path` (own dataset). **Note that if mindrecord_dir isn't empty, the MindRecord files in it will be used instead of the raw images.** A sketch of this selection logic follows.
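The following sketch shows the data-preparation behaviour just described; the `ssd.mindrecord0` file name and the `data_to_mindrecord` helper are hypothetical stand-ins, not the actual code in train.py or src/dataset.py.

```python
# Hypothetical sketch of the MindRecord reuse/generation logic described above.
import os

def data_to_mindrecord(source, mindrecord_dir):
    """Placeholder for the conversion step; the real work lives in src/dataset.py."""
    raise NotImplementedError(f"convert {source!r} into MindRecord files in {mindrecord_dir}")

def prepare_mindrecord(config):
    mindrecord_file = os.path.join(config.mindrecord_dir, "ssd.mindrecord0")  # assumed name
    if os.path.exists(mindrecord_file):
        return mindrecord_file  # mindrecord_dir is not empty: reuse it, skip raw images
    if config.dataset == "coco":
        data_to_mindrecord(config.coco_root, config.mindrecord_dir)
    elif config.dataset == "voc":
        data_to_mindrecord(config.voc_root, config.mindrecord_dir)
    else:
        data_to_mindrecord((config.image_dir, config.anno_path), config.mindrecord_dir)
    return mindrecord_file
```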
#### Training on Ascend

- Distribute mode

```shell
sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)
```

This script needs five or seven parameters:

- `DEVICE_NUM`: the device number for distributed training.
- `EPOCH_SIZE`: the epoch size for distributed training.
- `LR`: the initial learning rate for distributed training.
- `DATASET`: the dataset mode for distributed training.
- `RANK_TABLE_FILE`: the path of [rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools); it is better to use an absolute path.
- `PRE_TRAINED`: the path of the pretrained checkpoint file; it is better to use an absolute path.
- `PRE_TRAINED_EPOCH_SIZE`: the epoch size of the pretrained checkpoint.

The training result will be stored in the current path, in a folder whose name begins with "LOG". There you can find checkpoint files together with results like the following in the log:
```shell
epoch: 1 step: 458, loss is 3.1681802
epoch time: 228752.4654865265, per step time: 499.4595316299705
epoch: 2 step: 458, loss is 2.8847265
epoch time: 38912.93382644653, per step time: 84.96273761232868
epoch: 3 step: 458, loss is 2.8398118
epoch time: 38769.184827804565, per step time: 84.64887516987896
...
epoch: 498 step: 458, loss is 0.70908034
epoch time: 38771.079778671265, per step time: 84.65301261718616
epoch: 499 step: 458, loss is 0.7974688
epoch time: 38787.413120269775, per step time: 84.68867493508685
epoch: 500 step: 458, loss is 0.5548882
epoch time: 39064.8467540741, per step time: 85.29442522723602
```
#### Training on GPU

- Distribute mode

```shell
sh run_distribute_train_gpu.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)
```

This script needs four or six parameters:

- `DEVICE_NUM`: the device number for distributed training.
- `EPOCH_SIZE`: the epoch size for distributed training.
- `LR`: the initial learning rate for distributed training.
- `DATASET`: the dataset mode for distributed training.
- `PRE_TRAINED`: the path of the pretrained checkpoint file; it is better to use an absolute path.
- `PRE_TRAINED_EPOCH_SIZE`: the epoch size of the pretrained checkpoint.

The training result will be stored in the current path, in a folder named "LOG". There you can find checkpoint files together with results like the following in the log:
```shell
epoch: 1 step: 1, loss is 420.11783
epoch: 1 step: 2, loss is 434.11032
epoch: 1 step: 3, loss is 476.802
...
epoch: 1 step: 458, loss is 3.1283689
epoch time: 150753.701, per step time: 329.157
...
```
### [Evaluation Process](#contents)

#### Evaluation on Ascend

```shell
sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
```

This script needs three parameters:

- `DATASET`: the dataset mode of the evaluation dataset.
- `CHECKPOINT_PATH`: the absolute path of the checkpoint file.
- `DEVICE_ID`: the device id for evaluation.

> The checkpoint can be produced during the training process.

The inference result will be stored in the example path, in a folder whose name begins with "eval". There you can find results like the following in the log:
```shell
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.238
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.400
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.240
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.039
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.198
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.438
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.250
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.389
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.424
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.122
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.434
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.697
========================================
mAP: 0.23808886505483504
```
#### Evaluation on GPU

```shell
sh run_eval_gpu.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
```

This script needs three parameters:

- `DATASET`: the dataset mode of the evaluation dataset.
- `CHECKPOINT_PATH`: the absolute path of the checkpoint file.
- `DEVICE_ID`: the device id for evaluation.

> The checkpoint can be produced during the training process.

The inference result will be stored in the example path, in a folder whose name begins with "eval". There you can find results like the following in the log:
```shell
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.224
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.375
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.228
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.034
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.189
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.407
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.243
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.382
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.417
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.425
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.686
========================================
mAP: 0.2244936111705981
```
### [Export MindIR](#contents)

```shell
python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
```

The `ckpt_file` parameter is required.
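Once exported, the MindIR file can be loaded back for inference. This is a minimal sketch, assuming a MindSpore version that provides `mindspore.load` and `nn.GraphCell` (check your version's documentation); the file name and the output layout are assumptions, not confirmed by this repo.

```python
# Hedged sketch: run inference from the exported MindIR graph.
# "ssd.mindir" and the output layout are assumptions for illustration.
import numpy as np
import mindspore as ms
from mindspore import Tensor, nn

graph = ms.load("ssd.mindir")                           # file produced by export.py
net = nn.GraphCell(graph)
image = Tensor(np.zeros((1, 3, 300, 300), np.float32))  # matches image_shape [300, 300]
outputs = net(image)                                    # predicted box offsets and class scores
```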
## [Model Description](#contents)

### [Performance](#contents)

#### Evaluation Performance

| Parameters          | Ascend                                                                         | GPU                                                                            |
| ------------------- | ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------ |
| Model Version       | SSD V1                                                                         | SSD V1                                                                         |
| Resource            | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB                            | NV SMX2 V100-16G                                                               |
| Uploaded Date       | 09/15/2020 (month/day/year)                                                    | 09/24/2020 (month/day/year)                                                    |
| MindSpore Version   | 1.0.0                                                                          | 1.0.0                                                                          |
| Dataset             | COCO2017                                                                       | COCO2017                                                                       |
| Training Parameters | epoch = 500, batch_size = 32                                                   | epoch = 800, batch_size = 32                                                   |
| Optimizer           | Momentum                                                                       | Momentum                                                                       |
| Loss Function       | Sigmoid Cross Entropy, SmoothL1Loss                                            | Sigmoid Cross Entropy, SmoothL1Loss                                            |
| Speed               | 8 pcs: 90 ms/step                                                              | 8 pcs: 121 ms/step                                                             |
| Total time          | 8 pcs: 4.81 hours                                                              | 8 pcs: 12.31 hours                                                             |
| Parameters (M)      | 34                                                                             | 34                                                                             |
| Scripts             | <https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/ssd> | <https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/ssd> |
#### Inference Performance

| Parameters          | Ascend                      | GPU                         |
| ------------------- | --------------------------- | --------------------------- |
| Model Version       | SSD V1                      | SSD V1                      |
| Resource            | Ascend 910                  | GPU                         |
| Uploaded Date       | 09/15/2020 (month/day/year) | 09/24/2020 (month/day/year) |
| MindSpore Version   | 1.0.0                       | 1.0.0                       |
| Dataset             | COCO2017                    | COCO2017                    |
| batch_size          | 1                           | 1                           |
| outputs             | mAP                         | mAP                         |
| Accuracy            | IoU=0.50:0.95: 23.8%        | IoU=0.50:0.95: 22.4%        |
| Model for inference | 34M (.ckpt file)            | 34M (.ckpt file)            |
## [Description of Random Situation](#contents)

In dataset.py, we set the seed inside the `create_dataset` function. We also use a random seed in train.py. A sketch of the seeding convention follows.
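For reproducibility, the seeds are typically pinned as below; a minimal sketch, assuming the common model-zoo convention of a fixed global seed of 1.

```python
# Pin both the global (graph-level) seed and the dataset pipeline seed.
import mindspore.dataset as ds
from mindspore.common import set_seed

set_seed(1)            # fixes MindSpore's global random seed
ds.config.set_seed(1)  # fixes shuffle/augmentation randomness in the data pipeline
```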
## [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).