# Contents

- [DeepLabV3 Description](#DeepLabV3-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
    - [Evaluation Process](#evaluation-process)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)

# [DeepLabV3 Description](#contents)

## Description

DeepLab is a series of image semantic segmentation models, and DeepLabV3 improves significantly over previous versions. DeepLabV3 has two key points: its multi-grid atrous convolution handles objects at multiple scales better, and its augmented ASPP adds image-level features to capture long-range information.

This repository provides scripts and a recipe to train the DeepLabV3 model and reach accuracy in line with the original paper.

Refer to [this paper][1] for network details.

`Chen L C, Papandreou G, Schroff F, et al. Rethinking atrous convolution for semantic image segmentation[J]. arXiv preprint arXiv:1706.05587, 2017.`

[1]: https://arxiv.org/abs/1706.05587

# [Model Architecture](#contents)

The network uses ResNet-101 as its backbone and atrous convolution for dense feature extraction.
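
To make the idea of atrous (dilated) convolution concrete, here is a minimal 1-D sketch in plain Python. It is only an illustration and not code from this repository: the function name `atrous_conv1d` is hypothetical, and it computes the cross-correlation form that deep learning frameworks usually call convolution. The `rate` argument spaces the kernel taps apart, so the same three weights cover a wider input span without extra parameters.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid (no padding) 1-D atrous convolution with the given dilation rate."""
    span = (len(kernel) - 1) * rate  # distance between the first and last tap
    out = []
    for start in range(len(signal) - span):
        # Tap k reads the input at offset k * rate from the window start.
        out.append(sum(w * signal[start + k * rate] for k, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)   # taps at i, i+1, i+2
atrous = atrous_conv1d(signal, kernel, rate=2)  # taps at i, i+2, i+4

print(dense)   # each value is signal[i] - signal[i + 2]
print(atrous)  # each value is signal[i] - signal[i + 4]
```

With `rate=2` the kernel spans five input positions instead of three, which is the property that lets the backbone keep a denser feature map while still seeing a large context.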

# [Dataset](#contents)

Pascal VOC datasets and the Semantic Boundaries Dataset (SBD)

- Download the segmentation dataset.

- Prepare the training data list file. The list file stores the relative paths of image and annotation pairs, one pair per line:

    ```
    JPEGImages/00001.jpg SegmentationClassGray/00001.png
    JPEGImages/00002.jpg SegmentationClassGray/00002.png
    JPEGImages/00003.jpg SegmentationClassGray/00003.png
    JPEGImages/00004.jpg SegmentationClassGray/00004.png
    ......
    ```

- Configure and run build_data.sh to convert the dataset to MindRecord files. Arguments in scripts/build_data.sh:

    ```
    --data_root                 root path of training data
    --data_lst                  list of training data (prepared above)
    --dst_path                  where mindrecords are saved
    --num_shards                number of shards of the mindrecords
    --shuffle                   shuffle or not
    ```
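
The list file described above could be generated with a short helper such as the following. This is a sketch under assumptions: the function name `build_list_lines` is hypothetical, the directory names are taken from the example lines, and images are paired with annotations purely by matching file stems. It is not the repository's own preprocessing code.

```python
import os
import tempfile


def build_list_lines(data_root, image_dir="JPEGImages", label_dir="SegmentationClassGray"):
    """Pair every .jpg image with the .png annotation that shares its file stem."""
    lines = []
    for name in sorted(os.listdir(os.path.join(data_root, image_dir))):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".jpg":
            continue
        label = f"{label_dir}/{stem}.png"
        # Skip images that have no annotation file.
        if os.path.exists(os.path.join(data_root, label)):
            lines.append(f"{image_dir}/{name} {label}")
    return lines


# Tiny self-contained demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "JPEGImages"))
    os.makedirs(os.path.join(root, "SegmentationClassGray"))
    for stem in ("00001", "00002"):
        open(os.path.join(root, "JPEGImages", stem + ".jpg"), "w").close()
        open(os.path.join(root, "SegmentationClassGray", stem + ".png"), "w").close()
    # This image has no annotation, so it is left out of the list.
    open(os.path.join(root, "JPEGImages", "00003.jpg"), "w").close()
    demo_lines = build_list_lines(root)

print("\n".join(demo_lines))
```

Writing `"\n".join(...)` to a text file would give a list in the same layout as the example, which can then be passed as `--data_lst`.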

# [Features](#contents)

## Mixed Precision

The [mixed precision](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/enable_mixed_precision.html) training method accelerates the training of deep neural networks by using both single-precision and half-precision data types, while maintaining the network accuracy achieved by single-precision training. Mixed precision training can accelerate computation, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.

For FP16 operators, if the input data type is FP32, the MindSpore backend automatically handles it with reduced precision. Users can check the reduced-precision operators by enabling the INFO log and then searching for ‘reduce precision’.
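
Half precision has a much narrower range than single precision, which is one reason mixed precision training is commonly paired with loss scaling (the s8 training scripts in this README pass `--loss_scale=2048`). The sketch below uses only the Python standard library to round a value to FP16 and back. The gradient value is a made-up example and the helper `to_fp16` is hypothetical; it illustrates the numeric effect only, not how MindSpore implements it.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


grad = 1e-8          # hypothetical tiny gradient
loss_scale = 2048.0  # same value as --loss_scale in the s8 scripts

# Stored directly in FP16, the gradient is below the smallest subnormal and becomes 0.
unscaled = to_fp16(grad)

# Scaled up before the FP16 round trip and scaled back afterwards, it survives.
rescued = to_fp16(grad * loss_scale) / loss_scale

print(unscaled, rescued)
```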

# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare the hardware environment with Ascend. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
- Install python packages in requirements.txt
- Generate config json file for 8pcs training

    ```
    # From the root of this project
    cd src/tools/
    python3 get_multicards_json.py 10.111.*.*
    # 10.111.*.* is the computer's ip address.
    ```

# [Quick Start](#contents)

After installing MindSpore via the official website, you can start training and evaluation as follows:

- Running on Ascend

Based on the original DeepLabV3 paper, we reproduce two training experiments on the vocaug (also known as trainaug) dataset and evaluate on the voc val dataset.

For single-device training, configure the parameters first. The training script is:

```
run_standalone_train.sh
```

For 8-device training, the training steps are as follows:

1. Train s16 with the vocaug dataset, fine-tuning from the ResNet-101 pretrained model. The script is:

    ```
    run_distribute_train_s16_r1.sh
    ```

2. Train s8 with the vocaug dataset, fine-tuning from the model in the previous step. The training script is:

    ```
    run_distribute_train_s8_r1.sh
    ```

3. Train s8 with the voctrain dataset, fine-tuning from the model in the previous step. The training script is:

    ```
    run_distribute_train_s8_r2.sh
    ```

For evaluation, the steps are as follows:

1. Eval s16 with the voc val dataset. The eval script is:

    ```
    run_eval_s16.sh
    ```

2. Eval s8 with the voc val dataset. The eval script is:

    ```
    run_eval_s8.sh
    ```

3. Eval s8 multiscale with the voc val dataset. The eval script is:

    ```
    run_eval_s8_multiscale.sh
    ```

4. Eval s8 multiscale and flip with the voc val dataset. The eval script is:

    ```
    run_eval_s8_multiscale_flip.sh
    ```

# [Script Description](#contents)

## [Script and Sample Code](#contents)

```shell
.
└──deeplabv3
  ├── README.md
  ├── script
    ├── build_data.sh                             # convert raw data to mindrecord dataset
    ├── run_distribute_train_s16_r1.sh            # launch ascend distributed training(8 pcs) with vocaug dataset in s16 structure
    ├── run_distribute_train_s8_r1.sh             # launch ascend distributed training(8 pcs) with vocaug dataset in s8 structure
    ├── run_distribute_train_s8_r2.sh             # launch ascend distributed training(8 pcs) with voctrain dataset in s8 structure
    ├── run_eval_s16.sh                           # launch ascend evaluation in s16 structure
    ├── run_eval_s8.sh                            # launch ascend evaluation in s8 structure
    ├── run_eval_s8_multiscale.sh                 # launch ascend evaluation with multiscale in s8 structure
    ├── run_eval_s8_multiscale_flip.sh            # launch ascend evaluation with multiscale and flip in s8 structure
    ├── run_standalone_train.sh                   # launch ascend standalone training(1 pc)
  ├── src
    ├── data
      ├── dataset.py                              # mindrecord data generator
      ├── build_seg_data.py                       # data preprocessing
    ├── loss
      ├── loss.py                                 # loss definition for deeplabv3
    ├── nets
      ├── deeplab_v3
        ├── deeplab_v3.py                         # DeepLabV3 network structure
      ├── net_factory.py                          # set S16 and S8 structures
    ├── tools
      ├── get_multicards_json.py                  # get rank table file
    └── utils
      └── learning_rates.py                       # generate learning rate
  ├── eval.py                                     # eval net
  ├── train.py                                    # train net
  └── requirements.txt                            # requirements file
```

## [Script Parameters](#contents)

Default Configuration

```
"data_file":"/PATH/TO/MINDRECORD_NAME"            # dataset path
"train_epochs":300                                # total epochs
"batch_size":32                                   # batch size of input tensor
"crop_size":513                                   # crop size
"base_lr":0.08                                    # initial learning rate
"lr_type":cos                                     # decay mode for generating learning rate
"min_scale":0.5                                   # minimum scale of data augmentation
"max_scale":2.0                                   # maximum scale of data augmentation
"ignore_label":255                                # ignore label
"num_classes":21                                  # number of classes
"model":deeplab_v3_s16                            # select model
"ckpt_pre_trained":"/PATH/TO/PRETRAIN_MODEL"      # path to load pretrain checkpoint
"is_distributed":                                 # distributed training, it will be True if the parameter is set
"save_steps":410                                  # steps interval for saving
"freeze_bn":                                      # freeze_bn, it will be True if the parameter is set
"keep_checkpoint_max":200                         # max checkpoint for saving
```
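
One way to read the table above is as a set of command-line options with defaults, where `is_distributed` and `freeze_bn` are switches that become True only when passed. The following `argparse` sketch mirrors those names and defaults. It is an assumed, illustrative parser (`build_parser` is a hypothetical name) and is not the argument handling of the repository's train.py, which may define more options or different types.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the defaults listed above."""
    parser = argparse.ArgumentParser(description="DeepLabV3 training options (sketch)")
    parser.add_argument("--data_file", type=str, default="", help="MindRecord path")
    parser.add_argument("--train_epochs", type=int, default=300)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--crop_size", type=int, default=513)
    parser.add_argument("--base_lr", type=float, default=0.08)
    parser.add_argument("--lr_type", type=str, default="cos")
    parser.add_argument("--min_scale", type=float, default=0.5)
    parser.add_argument("--max_scale", type=float, default=2.0)
    parser.add_argument("--ignore_label", type=int, default=255)
    parser.add_argument("--num_classes", type=int, default=21)
    parser.add_argument("--model", type=str, default="deeplab_v3_s16")
    parser.add_argument("--ckpt_pre_trained", type=str, default="")
    # Flags without a value: absent means False, present means True.
    parser.add_argument("--is_distributed", action="store_true")
    parser.add_argument("--save_steps", type=int, default=410)
    parser.add_argument("--freeze_bn", action="store_true")
    parser.add_argument("--keep_checkpoint_max", type=int, default=200)
    return parser


args = build_parser().parse_args(["--model=deeplab_v3_s8", "--is_distributed"])
print(args.model, args.is_distributed, args.freeze_bn)
```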

## [Training Process](#contents)

### Usage

#### Running on Ascend

Based on the original DeepLabV3 paper, we reproduce two training experiments on the vocaug (also known as trainaug) dataset and evaluate on the voc val dataset.

For single-device training, configure the parameters first. The training script is as follows:

```
# run_standalone_train.sh
python ${train_code_path}/train.py --data_file=/PATH/TO/MINDRECORD_NAME  \
                    --train_dir=${train_path}/ckpt  \
                    --train_epochs=200  \
                    --batch_size=32  \
                    --crop_size=513  \
                    --base_lr=0.015  \
                    --lr_type=cos  \
                    --min_scale=0.5  \
                    --max_scale=2.0  \
                    --ignore_label=255  \
                    --num_classes=21  \
                    --model=deeplab_v3_s16  \
                    --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL  \
                    --save_steps=1500  \
                    --keep_checkpoint_max=200 >log 2>&1 &
```

For 8-device training, the training steps are as follows:

1. Train s16 with the vocaug dataset, fine-tuning from the ResNet-101 pretrained model. The script is as follows:

    ```
    # run_distribute_train_s16_r1.sh
    for((i=0;i<=$RANK_SIZE-1;i++));
    do
        export RANK_ID=$i
        export DEVICE_ID=`expr $i + $RANK_START_ID`
        echo 'start rank='$i', device id='$DEVICE_ID'...'
        mkdir ${train_path}/device$DEVICE_ID
        cd ${train_path}/device$DEVICE_ID
        python ${train_code_path}/train.py --train_dir=${train_path}/ckpt  \
                                           --data_file=/PATH/TO/MINDRECORD_NAME  \
                                           --train_epochs=300  \
                                           --batch_size=32  \
                                           --crop_size=513  \
                                           --base_lr=0.08  \
                                           --lr_type=cos  \
                                           --min_scale=0.5  \
                                           --max_scale=2.0  \
                                           --ignore_label=255  \
                                           --num_classes=21  \
                                           --model=deeplab_v3_s16  \
                                           --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL  \
                                           --is_distributed  \
                                           --save_steps=410  \
                                           --keep_checkpoint_max=200 >log 2>&1 &
    done
    ```

2. Train s8 with the vocaug dataset, fine-tuning from the model in the previous step. The training script is as follows:

    ```
    # run_distribute_train_s8_r1.sh
    for((i=0;i<=$RANK_SIZE-1;i++));
    do
        export RANK_ID=$i
        export DEVICE_ID=`expr $i + $RANK_START_ID`
        echo 'start rank='$i', device id='$DEVICE_ID'...'
        mkdir ${train_path}/device$DEVICE_ID
        cd ${train_path}/device$DEVICE_ID
        python ${train_code_path}/train.py --train_dir=${train_path}/ckpt  \
                                           --data_file=/PATH/TO/MINDRECORD_NAME  \
                                           --train_epochs=800  \
                                           --batch_size=16  \
                                           --crop_size=513  \
                                           --base_lr=0.02  \
                                           --lr_type=cos  \
                                           --min_scale=0.5  \
                                           --max_scale=2.0  \
                                           --ignore_label=255  \
                                           --num_classes=21  \
                                           --model=deeplab_v3_s8  \
                                           --loss_scale=2048  \
                                           --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL  \
                                           --is_distributed  \
                                           --save_steps=820  \
                                           --keep_checkpoint_max=200 >log 2>&1 &
    done
    ```

3. Train s8 with the voctrain dataset, fine-tuning from the model in the previous step. The training script is as follows:

    ```
    # run_distribute_train_s8_r2.sh
    for((i=0;i<=$RANK_SIZE-1;i++));
    do
        export RANK_ID=$i
        export DEVICE_ID=`expr $i + $RANK_START_ID`
        echo 'start rank='$i', device id='$DEVICE_ID'...'
        mkdir ${train_path}/device$DEVICE_ID
        cd ${train_path}/device$DEVICE_ID
        python ${train_code_path}/train.py --train_dir=${train_path}/ckpt  \
                                           --data_file=/PATH/TO/MINDRECORD_NAME  \
                                           --train_epochs=300  \
                                           --batch_size=16  \
                                           --crop_size=513  \
                                           --base_lr=0.008  \
                                           --lr_type=cos  \
                                           --min_scale=0.5  \
                                           --max_scale=2.0  \
                                           --ignore_label=255  \
                                           --num_classes=21  \
                                           --model=deeplab_v3_s8  \
                                           --loss_scale=2048  \
                                           --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL  \
                                           --is_distributed  \
                                           --save_steps=110  \
                                           --keep_checkpoint_max=200 >log 2>&1 &
    done
    ```
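
All of the training commands above pass `--lr_type=cos`. As a rough illustration of what a cosine schedule looks like, the sketch below generates one learning rate per step using the common half-cosine formula. This is an assumption about the general technique, not a copy of `src/utils/learning_rates.py`, whose exact formula may differ. The function name `cosine_lr` is hypothetical, and the 41 steps per epoch are taken from the s16 training log in the Result section.

```python
import math


def cosine_lr(base_lr, total_steps):
    """Per-step learning rates that decay from base_lr towards 0 along a half cosine."""
    return [
        base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
        for step in range(total_steps)
    ]


# 300 epochs x 41 steps per epoch with base_lr=0.08, as in the s16 distributed run.
schedule = cosine_lr(base_lr=0.08, total_steps=300 * 41)

print(schedule[0], schedule[len(schedule) // 2], schedule[-1])
```

The schedule starts at `base_lr`, passes through half of it at the midpoint, and approaches zero at the end of training.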

### Result

- Training vocaug in s16 structure

    ```
    # distribute training result(8p)
    epoch: 1 step: 41, loss is 0.8319108
    Epoch time: 213856.477, per step time: 5216.012
    epoch: 2 step: 41, loss is 0.46052963
    Epoch time: 21233.183, per step time: 517.883
    epoch: 3 step: 41, loss is 0.45012417
    Epoch time: 21231.951, per step time: 517.852
    epoch: 4 step: 41, loss is 0.30687785
    Epoch time: 21199.911, per step time: 517.071
    epoch: 5 step: 41, loss is 0.22769661
    Epoch time: 21240.281, per step time: 518.056
    epoch: 6 step: 41, loss is 0.25470978
    ...
    ```

- Training vocaug in s8 structure

    ```
    # distribute training result(8p)
    epoch: 1 step: 82, loss is 0.024167
    Epoch time: 322663.456, per step time: 3934.920
    epoch: 2 step: 82, loss is 0.019832281
    Epoch time: 43107.238, per step time: 525.698
    epoch: 3 step: 82, loss is 0.021008959
    Epoch time: 43109.519, per step time: 525.726
    epoch: 4 step: 82, loss is 0.01912349
    Epoch time: 43177.287, per step time: 526.552
    epoch: 5 step: 82, loss is 0.022886964
    Epoch time: 43095.915, per step time: 525.560
    epoch: 6 step: 82, loss is 0.018708453
    Epoch time: 43107.458, per step time: 525.701
    ...
    ```

- Training voctrain in s8 structure

    ```
    # distribute training result(8p)
    epoch: 1 step: 11, loss is 0.00554624
    Epoch time: 199412.913, per step time: 18128.447
    epoch: 2 step: 11, loss is 0.007181881
    Epoch time: 6119.375, per step time: 556.307
    epoch: 3 step: 11, loss is 0.004980865
    Epoch time: 5996.978, per step time: 545.180
    epoch: 4 step: 11, loss is 0.0047651967
    Epoch time: 5987.412, per step time: 544.310
    epoch: 5 step: 11, loss is 0.006262637
    Epoch time: 5956.682, per step time: 541.517
    epoch: 6 step: 11, loss is 0.0060750707
    Epoch time: 5962.164, per step time: 542.015
    ...
    ```
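
The step counts in these logs (41, 82 and 11 per epoch) can be related to the batch sizes in the training scripts with a little arithmetic. The sketch below assumes that `batch_size` is per device, that incomplete batches are dropped, and that the datasets have their commonly cited sizes (10582 images for vocaug and 1464 for the VOC 2012 train split). None of these assumptions is stated in this README, but under them the numbers line up with the logs, and each `save_steps` value corresponds to 10 epochs.

```python
def steps_per_epoch(num_samples, batch_size, num_devices):
    """Full batches per epoch when each device takes batch_size samples per step."""
    return num_samples // (batch_size * num_devices)


# Image counts are assumptions (commonly cited sizes), not taken from this README.
VOCAUG_IMAGES = 10582    # augmented training set (trainaug)
VOCTRAIN_IMAGES = 1464   # PASCAL VOC 2012 train split

runs = {
    "s16_r1": steps_per_epoch(VOCAUG_IMAGES, batch_size=32, num_devices=8),
    "s8_r1": steps_per_epoch(VOCAUG_IMAGES, batch_size=16, num_devices=8),
    "s8_r2": steps_per_epoch(VOCTRAIN_IMAGES, batch_size=16, num_devices=8),
}

# save_steps values from the three distributed training scripts.
save_steps = {"s16_r1": 410, "s8_r1": 820, "s8_r2": 110}
epochs_between_saves = {name: save_steps[name] / runs[name] for name in runs}

print(runs)
print(epochs_between_saves)
```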

## [Evaluation Process](#contents)

### Usage

#### Running on Ascend

Set the checkpoint with --ckpt_path and run the script. The mIOU is printed in eval_path/eval_log.

```
./run_eval_s16.sh                     # test s16
./run_eval_s8.sh                      # test s8
./run_eval_s8_multiscale.sh           # test s8 + multiscale
./run_eval_s8_multiscale_flip.sh      # test s8 + multiscale + flip
```

An example test script is as follows:

```
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA  \
                    --data_lst=/PATH/TO/DATA_lst.txt  \
                    --batch_size=16  \
                    --crop_size=513  \
                    --ignore_label=255  \
                    --num_classes=21  \
                    --model=deeplab_v3_s8  \
                    --scales=0.5  \
                    --scales=0.75  \
                    --scales=1.0  \
                    --scales=1.25  \
                    --scales=1.75  \
                    --flip  \
                    --freeze_bn  \
                    --ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
```
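
The `--flip` option refers to flip test-time augmentation. A minimal sketch of the usual idea follows: run the model on the image and on its horizontal mirror, mirror the second prediction back, and average the two. Everything here is illustrative. `toy_predict` is a hypothetical stand-in for a network, the helper names are not from eval.py, and the multiscale part (resizing the input for each `--scales` value and averaging as well) is left out.

```python
def hflip(rows):
    """Mirror a 2-D list left to right."""
    return [list(reversed(row)) for row in rows]


def predict_with_flip(predict, image):
    """Average predict(image) with the un-flipped prediction of the mirrored image."""
    direct = predict(image)
    mirrored = hflip(predict(hflip(image)))
    return [
        [(a + b) / 2.0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(direct, mirrored)
    ]


def toy_predict(image):
    """Hypothetical model: each score is the pixel plus its left neighbour."""
    return [
        [row[x] + (row[x - 1] if x > 0 else 0.0) for x in range(len(row))]
        for row in image
    ]


image = [[1.0, 2.0, 4.0]]
averaged = predict_with_flip(toy_predict, image)
print(averaged)
```

Because the toy model is not left-right symmetric, the direct and mirrored predictions differ, and the average smooths out that directional bias.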

### Result

Our results were obtained by running the applicable training scripts. To achieve the same results, follow the steps in Quick Start.

#### Training accuracy

| **Network**  | OS=16   | OS=8   | MS     | Flip    | mIOU    | mIOU in paper   |
| :----------: | :-----: | :----: | :----: | :-----: | :-----: | :-------------: |
| deeplab_v3   | √       |        |        |         | 77.37   | 77.21           |
| deeplab_v3   |         | √      |        |         | 78.84   | 78.51           |
| deeplab_v3   |         | √      | √      |         | 79.70   | 79.45           |
| deeplab_v3   |         | √      | √      | √       | 79.89   | 79.77           |

Note: OS is output stride, and MS is multiscale.
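
For readers unfamiliar with the metric in the table, the following sketch computes mean intersection-over-union (mIOU) from flat lists of predicted and ground-truth class ids, skipping pixels marked with the ignore label (255 in the configuration above). It is a plain-Python illustration under assumptions: the function name `mean_iou` is hypothetical, and it averages only over classes that occur, which may differ from how the repository's eval.py aggregates classes.

```python
def mean_iou(preds, labels, num_classes=21, ignore_label=255):
    """Mean intersection-over-union over classes that appear in preds or labels."""
    # confusion[t][p] counts pixels of true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, labels):
        if t == ignore_label:
            continue  # ignored pixels do not contribute
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


labels = [0, 0, 1, 1, 255, 2]
preds = [0, 1, 1, 1, 0, 2]
score = mean_iou(preds, labels, num_classes=3)
print(score)
```

In this toy case the per-class IoU values are 1/2, 2/3 and 1, and the pixel labelled 255 is excluded before counting.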

# [Model Description](#contents)

## [Performance](#contents)

### Evaluation Performance

| Parameters                 | Ascend 910                             |
| -------------------------- | -------------------------------------- |
| Model Version              | DeepLabV3                              |
| Resource                   | Ascend 910                             |
| Uploaded Date              | 09/04/2020 (month/day/year)            |
| MindSpore Version          | 0.7.0-alpha                            |
| Dataset                    | PASCAL VOC2012 + SBD                   |
| Training Parameters        | epoch = 300, batch_size = 32 (s16_r1) <br> epoch = 800, batch_size = 16 (s8_r1) <br> epoch = 300, batch_size = 16 (s8_r2) |
| Optimizer                  | Momentum                               |
| Loss Function              | Softmax Cross Entropy                  |
| Outputs                    | probability                            |
| Loss                       | 0.0065883575                           |
| Speed                      | 31 ms/step (1pc, s8) <br> 234 ms/step (8pcs, s8) |
| Checkpoint for Fine tuning | 443M (.ckpt file)                      |
| Scripts                    | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/deeplabv3) |

# [Description of Random Situation](#contents)

In dataset.py, we set the seed inside the "create_dataset" function. We also use a random seed in train.py.
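
The effect of fixing a seed can be shown with the Python standard library alone. This sketch is generic and does not reproduce what dataset.py or train.py do: the function name `shuffled_indices` is hypothetical, and it only demonstrates that a generator built from the same seed yields the same shuffle order on every run.

```python
import random


def shuffled_indices(num_samples, seed):
    """Return a shuffled index list that depends only on the seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    indices = list(range(num_samples))
    rng.shuffle(indices)
    return indices


first = shuffled_indices(10, seed=1)
second = shuffled_indices(10, seed=1)
print(first == second)
```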

# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).