  1. # Contents
  2. - [DeepLabV3 Description](#DeepLabV3-description)
  3. - [Model Architecture](#model-architecture)
  4. - [Dataset](#dataset)
  5. - [Features](#features)
  6. - [Mixed Precision](#mixed-precision)
  7. - [Environment Requirements](#environment-requirements)
  8. - [Quick Start](#quick-start)
  9. - [Script Description](#script-description)
  10. - [Script and Sample Code](#script-and-sample-code)
  11. - [Script Parameters](#script-parameters)
  12. - [Training Process](#training-process)
  13. - [Evaluation Process](#evaluation-process)
  14. - [Model Description](#model-description)
  15. - [Performance](#performance)
  16. - [Evaluation Performance](#evaluation-performance)
  17. - [Description of Random Situation](#description-of-random-situation)
  18. - [ModelZoo Homepage](#modelzoo-homepage)
  19. # [DeepLabV3 Description](#contents)
  20. ## Description
  21. DeepLab is a series of image semantic segmentation models, and DeepLabV3 improves significantly over previous versions. DeepLabV3 has two key points: its multi-grid atrous convolution handles objects at multiple scales better, and its augmented ASPP makes image-level features available to capture long-range information.
  22. This repository provides a script and recipe to train the DeepLabV3 model and achieve state-of-the-art performance.
  23. Refer to [this paper][1] for network details.
  24. `Chen L C, Papandreou G, Schroff F, et al. Rethinking atrous convolution for semantic image segmentation[J]. arXiv preprint arXiv:1706.05587, 2017.`
  25. [1]: https://arxiv.org/abs/1706.05587
  26. # [Model Architecture](#contents)
  27. ResNet-101 serves as the backbone, and atrous convolution is used for dense feature extraction.
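As a rough illustration of why atrous convolution helps with dense feature extraction, the sketch below computes the span covered by a kernel at a given atrous rate. The function name is hypothetical, and the rates 6, 12 and 18 are the ASPP rates reported in the referenced paper for output stride 16, not values read from this repository's code.

```python
def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Span of a kernel once (rate - 1) gaps are inserted between its taps."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


# A 3x3 kernel keeps its 9 weights but covers a wider window as the rate grows.
for rate in (1, 6, 12, 18):
    size = effective_kernel_size(3, rate)
    print(f"rate={rate:2d} -> covers {size} x {size} pixels")
```

The parameter count stays fixed while the covered window widens, which is how the network enlarges its field of view without further downsampling.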
  28. # [Dataset](#contents)
  29. Pascal VOC datasets and Semantic Boundaries Dataset
  30. - Download segmentation dataset.
  31. - Prepare the training data list file. The list file saves the relative path to image and annotation pairs. Lines are like:
  32. ```shell
  33. JPEGImages/00001.jpg SegmentationClassGray/00001.png
  34. JPEGImages/00002.jpg SegmentationClassGray/00002.png
  35. JPEGImages/00003.jpg SegmentationClassGray/00003.png
  36. JPEGImages/00004.jpg SegmentationClassGray/00004.png
  37. ......
  38. ```
  39. You can also generate the list file automatically by running the script: `python get_dataset_lst.py --data_root=/PATH/TO/DATA`
  40. - Configure and run build_data.sh to convert dataset to mindrecords. Arguments in scripts/build_data.sh:
  41. ```shell
  42. --data_root root path of training data
  43. --data_lst list of training data(prepared above)
  44. --dst_path where mindrecords are saved
  45. --num_shards number of shards of the mindrecords
  46. --shuffle shuffle or not
  47. ```
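The list file format described above can be produced with a few lines of Python. This is a minimal sketch and not the repository's `get_dataset_lst.py`; the function names are hypothetical, and it assumes every image has an annotation with the same stem and a `.png` extension.

```python
import os
import tempfile


def build_list_lines(image_names, image_dir="JPEGImages",
                     label_dir="SegmentationClassGray"):
    """Pair each image with the annotation that shares its file stem."""
    lines = []
    for name in sorted(image_names):
        stem, _ = os.path.splitext(name)
        lines.append(f"{image_dir}/{name} {label_dir}/{stem}.png")
    return lines


def write_list_file(lines, path):
    """Write one 'image annotation' pair per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")


# Example with placeholder file names; a real run would list the image folder.
example_lines = build_list_lines(["00002.jpg", "00001.jpg"])
list_path = os.path.join(tempfile.mkdtemp(), "train_lst.txt")
write_list_file(example_lines, list_path)
print(open(list_path).read())
```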
  48. # [Features](#contents)
  49. ## Mixed Precision
  50. The [mixed precision](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data types, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
  51. For FP16 operators, if the input data type is FP32, the MindSpore backend automatically handles it with reduced precision. Users can check which operators run with reduced precision by enabling the INFO log and searching for 'reduce precision'.
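The sketch below illustrates, in plain Python, why half-precision training is usually paired with loss scaling, which the s8 training scripts in this README enable via `--loss_scale=2048`. It is not MindSpore code: the helper `to_fp16` is hypothetical, and the gradient value is made up for illustration.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


LOSS_SCALE = 2048.0   # same value as --loss_scale in the s8 scripts
tiny_grad = 1e-8      # illustrative gradient, too small for FP16

unscaled = to_fp16(tiny_grad)              # underflows to zero
scaled = to_fp16(tiny_grad * LOSS_SCALE)   # survives as a nonzero FP16 value
recovered = scaled / LOSS_SCALE            # unscaled again in higher precision

print(f"fp16 without scaling: {unscaled}")
print(f"fp16 with scaling:    {scaled}")
print(f"recovered gradient:   {recovered}")
```

Multiplying the loss by a constant shifts small gradients into the range FP16 can represent, and dividing afterwards restores their magnitude.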
  52. # [Environment Requirements](#contents)
  53. - Hardware(Ascend)
  54. - Prepare a hardware environment with Ascend. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
  55. - Framework
  56. - [MindSpore](https://www.mindspore.cn/install/en)
  57. - For more information, please check the resources below:
  58. - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
  59. - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
  60. - Install python packages in requirements.txt
  61. - Generate config json file for 8pcs training
  62. ```
  63. # From the root of this project
  64. cd src/tools/
  65. python3 get_multicards_json.py 10.111.*.*
  66. # 10.111.*.* is the computer's ip address.
  67. ```
  68. # [Quick Start](#contents)
  69. After installing MindSpore via the official website, you can start training and evaluation as follows:
  70. - Running on Ascend
  71. Based on the original DeepLabV3 paper, we reproduce two training experiments on the vocaug (also known as trainaug) dataset and evaluate on the voc val dataset.
  72. For single-device training, configure the parameters first. The training script is:
  73. ```shell
  74. run_standalone_train.sh
  75. ```
  76. For 8-device training, the training steps are as follows:
  77. 1. Train s16 with the vocaug dataset, fine-tuning from the ResNet-101 pretrained model. The script is:
  78. ```shell
  79. run_distribute_train_s16_r1.sh
  80. ```
  81. 2. Train s8 with the vocaug dataset, fine-tuning from the model in the previous step. The training script is:
  82. ```shell
  83. run_distribute_train_s8_r1.sh
  84. ```
  85. 3. Train s8 with the voctrain dataset, fine-tuning from the model in the previous step. The training script is:
  86. ```shell
  87. run_distribute_train_s8_r2.sh
  88. ```
  89. The evaluation steps are as follows:
  90. 1. Eval s16 with voc val dataset, eval script is:
  91. ```shell
  92. run_eval_s16.sh
  93. ```
  94. 2. Eval s8 with voc val dataset, eval script is:
  95. ```shell
  96. run_eval_s8.sh
  97. ```
  98. 3. Eval s8 multiscale with voc val dataset, eval script is:
  99. ```shell
  100. run_eval_s8_multiscale.sh
  101. ```
  102. 4. Eval s8 multiscale and flip with voc val dataset, eval script is:
  103. ```shell
  104. run_eval_s8_multiscale_flip.sh
  105. ```
  106. # [Script Description](#contents)
  107. ## [Script and Sample Code](#contents)
  108. ```shell
  109. .
  110. └──deeplabv3
  111. ├── README.md
  112. ├── scripts
  113. ├── build_data.sh # convert raw data to mindrecord dataset
  114. ├── run_distribute_train_s16_r1.sh # launch ascend distributed training(8 pcs) with vocaug dataset in s16 structure
  115. ├── run_distribute_train_s8_r1.sh # launch ascend distributed training(8 pcs) with vocaug dataset in s8 structure
  116. ├── run_distribute_train_s8_r2.sh # launch ascend distributed training(8 pcs) with voctrain dataset in s8 structure
  117. ├── run_eval_s16.sh # launch ascend evaluation in s16 structure
  118. ├── run_eval_s8.sh # launch ascend evaluation in s8 structure
  119. ├── run_eval_s8_multiscale.sh # launch ascend evaluation with multiscale in s8 structure
  120. ├── run_eval_s8_multiscale_flip.sh # launch ascend evaluation with multiscale and flip in s8 structure
  121. ├── run_standalone_train.sh # launch ascend standalone training(1 pc)
  122. ├── run_standalone_train_cpu.sh # launch CPU standalone training
  123. ├── src
  124. ├── data
  125. ├── dataset.py # mindrecord data generator
  126. ├── build_seg_data.py # data preprocessing
  127. ├── get_dataset_lst.py # dataset list file generator
  128. ├── loss
  129. ├── loss.py # loss definition for deeplabv3
  130. ├── nets
  131. ├── deeplab_v3
  132. ├── deeplab_v3.py # DeepLabV3 network structure
  133. ├── net_factory.py # set S16 and S8 structures
  134. ├── tools
  135. ├── get_multicards_json.py # get rank table file
  136. └── utils
  137. └── learning_rates.py # generate learning rate
  138. ├── eval.py # eval net
  139. ├── train.py # train net
  140. └── requirements.txt # requirements file
  141. ```
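The S16 and S8 structures set in `net_factory.py` differ in output stride, that is, how much the backbone shrinks the input before the segmentation head. The sketch below estimates the feature map size for the 513-pixel crop used by this README's scripts. It assumes each stride-2 stage rounds up (as with "same"-style padding); the function name is hypothetical.

```python
import math


def feature_map_size(input_size: int, output_stride: int) -> int:
    """Spatial size after downsampling by output_stride, rounding up."""
    return math.ceil(input_size / output_stride)


for model_name, stride in (("deeplab_v3_s16", 16), ("deeplab_v3_s8", 8)):
    size = feature_map_size(513, stride)
    print(f"{model_name}: {size} x {size} feature map")
```

The S8 structure keeps a feature map about twice as large per side, which gives denser predictions at a higher memory and compute cost. That is consistent with the smaller batch size used in the s8 scripts.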
  142. ## [Script Parameters](#contents)
  143. Default configuration
  144. ```shell
  145. "data_file":"/PATH/TO/MINDRECORD_NAME" # dataset path
  146. "device_target":Ascend # device target
  147. "train_epochs":300 # total epochs
  148. "batch_size":32 # batch size of input tensor
  149. "crop_size":513 # crop size
  150. "base_lr":0.08 # initial learning rate
  151. "lr_type":cos # decay mode for generating learning rate
  152. "min_scale":0.5 # minimum scale of data augmentation
  153. "max_scale":2.0 # maximum scale of data augmentation
  154. "ignore_label":255 # ignore label
  155. "num_classes":21 # number of classes
  156. "model":deeplab_v3_s16 # select model
  157. "ckpt_pre_trained":"/PATH/TO/PRETRAIN_MODEL" # path to load pretrain checkpoint
  158. "is_distributed": # distributed training, it will be True if the parameter is set
  159. "save_steps":410 # steps interval for saving
  160. "keep_checkpoint_max":200 # max checkpoint for saving
  161. ```
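To give a feel for the `lr_type` value `cos`, here is a minimal sketch of a cosine decay schedule starting from `base_lr`. It is the textbook form and is an assumption about what `learning_rates.py` does, not a copy of it. The step count of 41 per epoch is taken from the s16 training log in this README.

```python
import math


def cosine_lr(base_lr: float, step: int, total_steps: int) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at total_steps."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))


total_steps = 300 * 41  # train_epochs * steps per epoch (s16, 8 devices)
for step in (0, total_steps // 4, total_steps // 2, total_steps):
    print(f"step {step:5d}: lr = {cosine_lr(0.08, step, total_steps):.5f}")
```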
  162. ## [Training Process](#contents)
  163. ### Usage
  164. #### Running on Ascend
  165. Based on the original DeepLabV3 paper, we reproduce two training experiments on the vocaug (also known as trainaug) dataset and evaluate on the voc val dataset.
  166. For single-device training, configure the parameters first. The training script is as follows:
  167. ```shell
  168. # run_standalone_train.sh
  169. python ${train_code_path}/train.py --data_file=/PATH/TO/MINDRECORD_NAME \
  170. --train_dir=${train_path}/ckpt \
  171. --train_epochs=200 \
  172. --batch_size=32 \
  173. --crop_size=513 \
  174. --base_lr=0.015 \
  175. --lr_type=cos \
  176. --min_scale=0.5 \
  177. --max_scale=2.0 \
  178. --ignore_label=255 \
  179. --num_classes=21 \
  180. --model=deeplab_v3_s16 \
  181. --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
  182. --save_steps=1500 \
  183. --keep_checkpoint_max=200 >log 2>&1 &
  184. ```
  185. For 8-device training, the training steps are as follows:
  186. 1. Train s16 with the vocaug dataset, fine-tuning from the ResNet-101 pretrained model. The script is as follows:
  187. ```shell
  188. # run_distribute_train_s16_r1.sh
  189. for((i=0;i<=$RANK_SIZE-1;i++));
  190. do
  191. export RANK_ID=${i}
  192. export DEVICE_ID=$((i + RANK_START_ID))
  193. echo 'start rank='${i}', device id='${DEVICE_ID}'...'
  194. mkdir ${train_path}/device${DEVICE_ID}
  195. cd ${train_path}/device${DEVICE_ID} || exit
  196. python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
  197. --data_file=/PATH/TO/MINDRECORD_NAME \
  198. --train_epochs=300 \
  199. --batch_size=32 \
  200. --crop_size=513 \
  201. --base_lr=0.08 \
  202. --lr_type=cos \
  203. --min_scale=0.5 \
  204. --max_scale=2.0 \
  205. --ignore_label=255 \
  206. --num_classes=21 \
  207. --model=deeplab_v3_s16 \
  208. --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
  209. --is_distributed \
  210. --save_steps=410 \
  211. --keep_checkpoint_max=200 >log 2>&1 &
  212. done
  213. ```
  214. 2. Train s8 with the vocaug dataset, fine-tuning from the model in the previous step. The training script is as follows:
  215. ```shell
  216. # run_distribute_train_s8_r1.sh
  217. for((i=0;i<=$RANK_SIZE-1;i++));
  218. do
  219. export RANK_ID=${i}
  220. export DEVICE_ID=$((i + RANK_START_ID))
  221. echo 'start rank='${i}', device id='${DEVICE_ID}'...'
  222. mkdir ${train_path}/device${DEVICE_ID}
  223. cd ${train_path}/device${DEVICE_ID} || exit
  224. python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
  225. --data_file=/PATH/TO/MINDRECORD_NAME \
  226. --train_epochs=800 \
  227. --batch_size=16 \
  228. --crop_size=513 \
  229. --base_lr=0.02 \
  230. --lr_type=cos \
  231. --min_scale=0.5 \
  232. --max_scale=2.0 \
  233. --ignore_label=255 \
  234. --num_classes=21 \
  235. --model=deeplab_v3_s8 \
  236. --loss_scale=2048 \
  237. --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
  238. --is_distributed \
  239. --save_steps=820 \
  240. --keep_checkpoint_max=200 >log 2>&1 &
  241. done
  242. ```
  243. 3. Train s8 with the voctrain dataset, fine-tuning from the model in the previous step. The training script is as follows:
  244. ```shell
  245. # run_distribute_train_s8_r2.sh
  246. for((i=0;i<=$RANK_SIZE-1;i++));
  247. do
  248. export RANK_ID=${i}
  249. export DEVICE_ID=$((i + RANK_START_ID))
  250. echo 'start rank='${i}', device id='${DEVICE_ID}'...'
  251. mkdir ${train_path}/device${DEVICE_ID}
  252. cd ${train_path}/device${DEVICE_ID} || exit
  253. python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
  254. --data_file=/PATH/TO/MINDRECORD_NAME \
  255. --train_epochs=300 \
  256. --batch_size=16 \
  257. --crop_size=513 \
  258. --base_lr=0.008 \
  259. --lr_type=cos \
  260. --min_scale=0.5 \
  261. --max_scale=2.0 \
  262. --ignore_label=255 \
  263. --num_classes=21 \
  264. --model=deeplab_v3_s8 \
  265. --loss_scale=2048 \
  266. --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
  267. --is_distributed \
  268. --save_steps=110 \
  269. --keep_checkpoint_max=200 >log 2>&1 &
  270. done
  271. ```
  272. #### Running on CPU
  273. For CPU training, configure the parameters first. The training script is as follows:
  274. ```shell
  275. # run_standalone_train_cpu.sh
  276. python ${train_code_path}/train.py --data_file=/PATH/TO/MINDRECORD_NAME \
  277. --device_target=CPU \
  278. --train_dir=${train_path}/ckpt \
  279. --train_epochs=200 \
  280. --batch_size=32 \
  281. --crop_size=513 \
  282. --base_lr=0.015 \
  283. --lr_type=cos \
  284. --min_scale=0.5 \
  285. --max_scale=2.0 \
  286. --ignore_label=255 \
  287. --num_classes=21 \
  288. --model=deeplab_v3_s16 \
  289. --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
  290. --save_steps=1500 \
  291. --keep_checkpoint_max=200 >log 2>&1 &
  292. ```
  293. ### Result
  294. #### Running on Ascend
  295. - Training vocaug in s16 structure
  296. ```shell
  297. # distribute training result(8p)
  298. epoch: 1 step: 41, loss is 0.8319108
  299. epoch time: 213856.477 ms, per step time: 5216.012 ms
  300. epoch: 2 step: 41, loss is 0.46052963
  301. epoch time: 21233.183 ms, per step time: 517.883 ms
  302. epoch: 3 step: 41, loss is 0.45012417
  303. epoch time: 21231.951 ms, per step time: 517.852 ms
  304. epoch: 4 step: 41, loss is 0.30687785
  305. epoch time: 21199.911 ms, per step time: 517.071 ms
  306. epoch: 5 step: 41, loss is 0.22769661
  307. epoch time: 21240.281 ms, per step time: 518.056 ms
  308. epoch: 6 step: 41, loss is 0.25470978
  309. ...
  310. ```
  311. - Training vocaug in s8 structure
  312. ```shell
  313. # distribute training result(8p)
  314. epoch: 1 step: 82, loss is 0.024167
  315. epoch time: 322663.456 ms, per step time: 3934.920 ms
  316. epoch: 2 step: 82, loss is 0.019832281
  317. epoch time: 43107.238 ms, per step time: 525.698 ms
  318. epoch: 3 step: 82, loss is 0.021008959
  319. epoch time: 43109.519 ms, per step time: 525.726 ms
  320. epoch: 4 step: 82, loss is 0.01912349
  321. epoch time: 43177.287 ms, per step time: 526.552 ms
  322. epoch: 5 step: 82, loss is 0.022886964
  323. epoch time: 43095.915 ms, per step time: 525.560 ms
  324. epoch: 6 step: 82, loss is 0.018708453
  325. epoch time: 43107.458 ms, per step time: 525.701 ms
  326. ...
  327. ```
  328. - Training voctrain in s8 structure
  329. ```shell
  330. # distribute training result(8p)
  331. epoch: 1 step: 11, loss is 0.00554624
  332. epoch time: 199412.913 ms, per step time: 18128.447 ms
  333. epoch: 2 step: 11, loss is 0.007181881
  334. epoch time: 6119.375 ms, per step time: 556.307 ms
  335. epoch: 3 step: 11, loss is 0.004980865
  336. epoch time: 5996.978 ms, per step time: 545.180 ms
  337. epoch: 4 step: 11, loss is 0.0047651967
  338. epoch time: 5987.412 ms, per step time: 544.310 ms
  339. epoch: 5 step: 11, loss is 0.006262637
  340. epoch time: 5956.682 ms, per step time: 541.517 ms
  341. epoch: 6 step: 11, loss is 0.0060750707
  342. epoch time: 5962.164 ms, per step time: 542.015 ms
  343. ...
  344. ```
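The step counts in the Ascend logs above (41, 82 and 11 per epoch) can be reproduced with simple arithmetic. The sketch assumes the commonly cited dataset sizes of 10582 images for vocaug (trainaug) and 1464 for voctrain, and assumes incomplete batches are dropped. Neither is stated in this README, and the function name is hypothetical.

```python
def steps_per_epoch(num_images: int, batch_size: int, num_devices: int) -> int:
    """Full global batches per epoch, dropping the incomplete remainder."""
    return num_images // (batch_size * num_devices)


# (name, assumed dataset size, per-device batch size, save_steps from the script)
runs = [
    ("s16_r1", 10582, 32, 410),
    ("s8_r1", 10582, 16, 820),
    ("s8_r2", 1464, 16, 110),
]
for name, num_images, batch_size, save_steps in runs:
    steps = steps_per_epoch(num_images, batch_size, num_devices=8)
    print(f"{name}: {steps} steps/epoch, "
          f"checkpoint every {save_steps // steps} epochs")
```

Under these assumptions, each `save_steps` value in the distributed scripts corresponds to one checkpoint every 10 epochs.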
  345. #### Running on CPU
  346. - Training voctrain in s16 structure
  347. ```bash
  348. epoch: 1 step: 1, loss is 3.655448
  349. epoch: 2 step: 1, loss is 1.5531876
  350. epoch: 3 step: 1, loss is 1.5099041
  351. ...
  352. ```
  353. ## [Evaluation Process](#contents)
  354. ### Usage
  355. #### Running on Ascend
  356. Configure the checkpoint with --ckpt_path and set the dataset path, then run the script. The mIOU will be printed in eval_path/eval_log.
  357. ```shell
  358. ./run_eval_s16.sh # test s16
  359. ./run_eval_s8.sh # test s8
  360. ./run_eval_s8_multiscale.sh # test s8 + multiscale
  361. ./run_eval_s8_multiscale_flip.sh # test s8 + multiscale + flip
  362. ```
  363. Example of test script is as follows:
  364. ```shell
  365. python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
  366. --data_lst=/PATH/TO/DATA_lst.txt \
  367. --batch_size=16 \
  368. --crop_size=513 \
  369. --ignore_label=255 \
  370. --num_classes=21 \
  371. --model=deeplab_v3_s8 \
  372. --scales=0.5 \
  373. --scales=0.75 \
  374. --scales=1.0 \
  375. --scales=1.25 \
  376. --scales=1.75 \
  377. --flip \
  378. --freeze_bn \
  379. --ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
  380. ```
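The `--flip` option in the script above refers to a common test-time augmentation: predict on the image and on its mirror, flip the second prediction back, and average the two. The sketch below shows the idea on a toy grid. It is not taken from `eval.py`; `toy_predict` is a hypothetical stand-in for the network.

```python
def hflip(rows):
    """Mirror a 2-D grid (list of rows) left to right."""
    return [row[::-1] for row in rows]


def predict_with_flip(predict, image):
    """Average the prediction on the image with the one on its mirror."""
    direct = predict(image)
    mirrored = hflip(predict(hflip(image)))  # flip back to align pixels
    return [[(a + b) / 2.0 for a, b in zip(row_d, row_m)]
            for row_d, row_m in zip(direct, mirrored)]


def toy_predict(image):
    """Hypothetical model whose score depends on pixel value and column."""
    return [[pixel + 0.1 * col for col, pixel in enumerate(row)]
            for row in image]


averaged = predict_with_flip(toy_predict, [[1.0, 2.0, 3.0]])
print(averaged)
```

Multi-scale evaluation follows the same pattern: the image is resized to each value passed via `--scales`, each prediction is resized back to the original resolution, and the scores are averaged.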
  381. ### Result
  382. Our results were obtained by running the applicable training script. To achieve the same results, follow the steps in the Quick Start section.
  383. #### Training accuracy
  384. | **Network** | OS=16 | OS=8 | MS | Flip | mIOU | mIOU in paper |
  385. | :----------: | :-----: | :----: | :----: | :-----: | :-----: | :-------------: |
  386. | deeplab_v3 | √ | | | | 77.37 | 77.21 |
  387. | deeplab_v3 | | √ | | | 78.84 | 78.51 |
  388. | deeplab_v3 | | √ | √ | | 79.70 |79.45 |
  389. | deeplab_v3 | | √ | √ | √ | 79.89 | 79.77 |
  390. Note: Here OS is output stride, and MS is multiscale.
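For readers unfamiliar with the metric in the table, mIOU (mean intersection over union) can be computed from a confusion matrix as sketched below. This is a generic illustration rather than the code in `eval.py`. It skips pixels labelled 255, matching `ignore_label` in this README, and the sample labels are made up.

```python
def confusion_matrix(labels, preds, num_classes, ignore_label=255):
    """Count (true class, predicted class) pairs, skipping ignored pixels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth, pred in zip(labels, preds):
        if truth == ignore_label:
            continue
        matrix[truth][pred] += 1
    return matrix


def mean_iou(matrix):
    """Average of TP / (TP + FP + FN) over classes that appear at all."""
    ious = []
    for cls in range(len(matrix)):
        true_pos = matrix[cls][cls]
        false_neg = sum(matrix[cls]) - true_pos
        false_pos = sum(row[cls] for row in matrix) - true_pos
        denom = true_pos + false_pos + false_neg
        if denom > 0:
            ious.append(true_pos / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Flattened toy label and prediction maps with three classes.
labels = [0, 0, 1, 1, 255, 2]
preds = [0, 1, 1, 1, 0, 2]
print(f"mIOU = {mean_iou(confusion_matrix(labels, preds, 3)):.4f}")
```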
  391. # [Model Description](#contents)
  392. ## [Performance](#contents)
  393. ### Evaluation Performance
  394. | Parameters | Ascend 910 |
  395. | -------------------------- | -------------------------------------- |
  396. | Model Version | DeepLabV3 |
  397. | Resource | Ascend 910 |
  398. | Uploaded Date | 09/04/2020 (month/day/year) |
  399. | MindSpore Version | 0.7.0-alpha |
  400. | Dataset | PASCAL VOC2012 + SBD |
  401. | Training Parameters | epoch = 300, batch_size = 32 (s16_r1) <br> epoch = 800, batch_size = 16 (s8_r1) <br> epoch = 300, batch_size = 16 (s8_r2) |
  402. | Optimizer | Momentum |
  403. | Loss Function | Softmax Cross Entropy |
  404. | Outputs | probability |
  405. | Loss | 0.0065883575 |
  406. | Speed | 60 fps(1pc, s16)<br> 480 fps(8pcs, s16) <br> 244 fps (8pcs, s8) |
  407. | Total time | 8pcs: 706 mins |
  408. | Parameters (M) | 58.2 |
  409. | Checkpoint for Fine tuning | 443M (.ckpt file) |
  410. | Model for inference | 223M (.air file) |
  411. | Scripts | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/deeplabv3) |
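The Speed row can be sanity-checked against the per-step times in the training logs. The sketch below converts a step time into images per second. It assumes throughput equals the global batch divided by the step time, and the function name is hypothetical. The results land near the table's 8pcs figures without matching them exactly, since the table values were presumably measured separately.

```python
def images_per_second(batch_size: int, num_devices: int,
                      step_time_ms: float) -> float:
    """Global batch size divided by the time one step takes."""
    return batch_size * num_devices / (step_time_ms / 1000.0)


# Step times copied from the Ascend training logs in this README.
s16_fps = images_per_second(32, 8, 517.883)
s8_fps = images_per_second(16, 8, 525.698)
print(f"s16, 8 devices: {s16_fps:.0f} images/s")
print(f"s8,  8 devices: {s8_fps:.0f} images/s")
```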
  412. ### Inference Performance
  413. | Parameters | Ascend |
  414. | ------------------- | --------------------------- |
  415. | Model Version | DeepLabV3 V1 |
  416. | Resource | Ascend 910 |
  417. | Uploaded Date | 09/04/2020 (month/day/year) |
  418. | MindSpore Version | 0.7.0-alpha |
  419. | Dataset | VOC datasets |
  420. | batch_size | 32 (s16); 16 (s8) |
  421. | outputs | probability |
  422. | Accuracy | 8pcs: <br> s16: 77.37% <br> s8: 78.84% <br> s8_multiscale: 79.70% <br> s8_Flip: 79.89% |
  423. | Model for inference | 443M (.ckpt file) |
  424. # [Description of Random Situation](#contents)
  425. In dataset.py, we set the seed inside the `create_dataset` function. We also set a random seed in train.py.
  426. # [ModelZoo Homepage](#contents)
  427. Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).