  1. # Contents
  2. - [Contents](#contents)
  3. - [YOLOv3-DarkNet53 Description](#yolov3-darknet53-description)
  4. - [Model Architecture](#model-architecture)
  5. - [Dataset](#dataset)
  6. - [Environment Requirements](#environment-requirements)
  7. - [Quick Start](#quick-start)
  8. - [Script Description](#script-description)
  9. - [Script and Sample Code](#script-and-sample-code)
  10. - [Script Parameters](#script-parameters)
  11. - [Training Process](#training-process)
  12. - [Training](#training)
  13. - [Distributed Training](#distributed-training)
  14. - [Evaluation Process](#evaluation-process)
  15. - [Evaluation](#evaluation)
  16. - [Model Description](#model-description)
  17. - [Performance](#performance)
  18. - [Evaluation Performance](#evaluation-performance)
  19. - [Inference Performance](#inference-performance)
  20. - [Description of Random Situation](#description-of-random-situation)
  21. - [ModelZoo Homepage](#modelzoo-homepage)
  22. ## [YOLOv3-DarkNet53 Description](#contents)
  23. You only look once (YOLO) is a state-of-the-art, real-time object detection system. YOLOv3 is extremely fast and accurate.
  24. Prior detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales. High scoring regions of the image are considered detections.
  25. YOLOv3 uses a totally different approach. It applies a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities.
  26. YOLOv3 uses a few tricks to improve training and increase performance, including: multi-scale predictions, a better backbone classifier, and more. The full details are in the paper!
  27. [Paper](https://pjreddie.com/media/files/papers/YOLOv3.pdf): YOLOv3: An Incremental Improvement. Joseph Redmon, Ali Farhadi,
  28. University of Washington
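To make the multi-scale prediction idea concrete, here is a minimal sketch of how many boxes a single forward pass proposes. It assumes the three detection strides (32, 16, 8) and 3 anchors per scale described in the paper; `prediction_grid` is a hypothetical helper, not part of this repository.

```python
# Sketch only: strides and anchors-per-scale follow the YOLOv3 paper,
# the helper name is hypothetical.
STRIDES = (32, 16, 8)
ANCHORS_PER_SCALE = 3


def prediction_grid(input_size: int) -> dict:
    """Return {stride: (grid_size, num_boxes)} for a square input image."""
    if input_size % max(STRIDES) != 0:
        raise ValueError("input size must be a multiple of 32")
    grids = {}
    for stride in STRIDES:
        grid = input_size // stride  # cells per side at this scale
        grids[stride] = (grid, grid * grid * ANCHORS_PER_SCALE)
    return grids


grids = prediction_grid(416)
total_boxes = sum(num_boxes for _, num_boxes in grids.values())
print(grids)        # grid size and box count per stride
print(total_boxes)  # candidate boxes before score filtering and NMS
```

For a 416 × 416 input this gives 13, 26 and 52 cells per side, which is why the region-based prediction stays cheap compared with sliding a classifier over the image.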
  29. ## [Model Architecture](#contents)
  30. YOLOv3 uses DarkNet53 for feature extraction. DarkNet53 is a hybrid between Darknet-19, the network used in YOLOv2, and residual networks. It uses successive 3 × 3 and 1 × 1 convolutional layers with shortcut connections, is significantly larger than Darknet-19, and has 53 convolutional layers.
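The layer count can be checked with a little arithmetic. The sketch below assumes the stage layout from the paper's Darknet-53 table (1, 2, 8, 8, 4 residual blocks, each a 1 × 1 convolution followed by a 3 × 3 convolution) and counts the final connected layer toward the 53, as that table does. The function name is hypothetical.

```python
# Residual blocks per stage, as listed in the YOLOv3 paper's Darknet-53 table.
RESIDUAL_BLOCKS_PER_STAGE = (1, 2, 8, 8, 4)


def count_darknet53_layers(blocks=RESIDUAL_BLOCKS_PER_STAGE) -> int:
    stem = 1                    # initial 3x3 convolution
    downsample = len(blocks)    # one stride-2 3x3 convolution before each stage
    residual = 2 * sum(blocks)  # each block: 1x1 conv followed by 3x3 conv
    connected = 1               # final connected (classifier) layer
    return stem + downsample + residual + connected


print(count_darknet53_layers())
```

When DarkNet53 is used as a detection backbone, the classifier layer is typically dropped and only the convolutional stages are kept.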
  31. ## [Dataset](#contents)
  32. You can run the scripts with the dataset used in the original paper or with other datasets widely used for this domain and network architecture. The following sections describe how to run the scripts with the dataset below.
  33. Dataset used: [COCO2014](https://cocodataset.org/#download)
  34. - Dataset size: 19G, 123,287 images, 80 object categories.
  35. - Train: 13G, 82,783 images
  36. - Val: 6G, 40,504 images
  37. - Annotations: 241M, Train/Val annotations
  38. - The directory structure is as follows.
  39. ```text
  40. ├── dataset
  41.     ├── coco2014
  42.         ├── annotations
  43.         │   ├─ train.json
  44.         │   └─ val.json
  45.         ├─ train
  46.         │   ├─picture1.jpg
  47.         │   ├─ ...
  48.         │   └─picturen.jpg
  49.         └─ val
  50.             ├─picture1.jpg
  51.             ├─ ...
  52.             └─picturen.jpg
  53. ```
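Before launching a long training run, it can help to confirm that the dataset folder matches the layout above. The following is a minimal sketch using only the standard library; the helper name and the idea of a pre-flight check are assumptions, and the entry names are taken from the directory tree shown above.

```python
from pathlib import Path

# Entries expected under the dataset root, taken from the tree above.
EXPECTED_ENTRIES = (
    "annotations/train.json",
    "annotations/val.json",
    "train",
    "val",
)


def missing_dataset_entries(root) -> list:
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_dataset_entries("./dataset/coco2014")
    print("dataset layout OK" if not missing else f"missing: {missing}")
```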
  54. ## [Environment Requirements](#contents)
  55. - Hardware(Ascend/GPU)
  56. - Prepare a hardware environment with an Ascend or GPU processor.
  57. - Framework
  58. - [MindSpore](https://www.mindspore.cn/install/en)
  59. - For more information, please check the resources below:
  60. - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
  61. - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
  62. ## [Quick Start](#contents)
  63. - After installing MindSpore via the official website, you can start training and evaluation as follows. If running on GPU, add `--device_target=GPU` to the python command or use the "_gpu" shell script ("xxx_gpu.sh").
  64. - Prepare the backbone_darknet53.ckpt and hccl_8p.json files before running the network.
  65. - The pretrained backbone can be produced with src/convert_weight.py, which converts darknet53.conv.74 to a MindSpore ckpt.
  66. ```
  67. python convert_weight.py --input_file ./darknet53.conv.74
  68. ```
  69. darknet53.conv.74 can be obtained from [download](https://pjreddie.com/media/files/darknet53.conv.74).
  70. On Linux, you can fetch it with the following command.
  71. ```
  72. wget https://pjreddie.com/media/files/darknet53.conv.74
  73. ```
  74. - To generate hccl_8p.json, run the script model_zoo/utils/hccl_tools/hccl_tools.py.
  75. The parameter "[0,8)" below indicates that the hccl_8p.json file is generated for cards 0 to 7.
  76. ```
  77. python hccl_tools.py --device_num "[0,8)"
  78. ```
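The `"[0,8)"` argument is a half-open interval: the start is included and the end is excluded. The sketch below shows one way such a string could be expanded into device ids. It illustrates the notation only and is not the actual parsing code of hccl_tools.py; `parse_device_range` is a hypothetical name.

```python
import re


def parse_device_range(spec: str) -> list:
    """Expand a half-open range string such as "[0,8)" into [0, 1, ..., 7]."""
    match = re.fullmatch(r"\[(\d+),(\d+)\)", spec.replace(" ", ""))
    if match is None:
        raise ValueError(f"expected a range like '[0,8)', got {spec!r}")
    start, stop = int(match.group(1)), int(match.group(2))
    if start >= stop:
        raise ValueError("range start must be smaller than range end")
    return list(range(start, stop))  # stop is excluded, as in Python's range


print(parse_device_range("[0,8)"))
```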
  79. ```network
  80. # The training_shape parameter defines the input image shape for the network; the default is "".
  81. # The default means 10 different shapes are used as input shapes; alternatively, set it to one fixed shape.
  82. # run training example(1p) by python command.
  83. python train.py \
  84. --data_dir=./dataset/coco2014 \
  85. --pretrained_backbone=darknet53_backbone.ckpt \
  86. --is_distributed=0 \
  87. --lr=0.001 \
  88. --loss_scale=1024 \
  89. --weight_decay=0.016 \
  90. --T_max=320 \
  91. --max_epoch=320 \
  92. --warmup_epochs=4 \
  93. --training_shape=416 \
  94. --lr_scheduler=cosine_annealing > log.txt 2>&1 &
  95. # standalone training example(1p) by shell script
  96. sh run_standalone_train.sh dataset/coco2014 darknet53_backbone.ckpt
  97. # For Ascend device, distributed training example(8p) by shell script
  98. sh run_distribute_train.sh dataset/coco2014 darknet53_backbone.ckpt rank_table_8p.json
  99. # For GPU device, distributed training example(8p) by shell script
  100. sh run_distribute_train_gpu.sh dataset/coco2014 darknet53_backbone.ckpt
  101. # run evaluation by python command
  102. python eval.py \
  103. --data_dir=./dataset/coco2014 \
  104. --pretrained=yolov3.ckpt \
  105. --testing_shape=416 > log.txt 2>&1 &
  106. # run evaluation by shell script
  107. sh run_eval.sh dataset/coco2014/ checkpoint/0-319_102400.ckpt
  108. ```
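The comment about `training_shape` in the block above describes multi-scale training: with the default value, the input shape is drawn from 10 candidates, while a value such as 416 fixes it. The sketch below illustrates that choice. The candidate list (multiples of 32 from 320 to 608) is an assumption based on the common YOLO multi-scale setup, and the function names are hypothetical; the actual shapes are defined by the repository's configuration.

```python
import random


def candidate_shapes(low=320, high=608, step=32) -> list:
    """Assumed multi-scale candidates: square shapes at multiples of 32."""
    return [(size, size) for size in range(low, high + 1, step)]


def pick_training_shape(rng: random.Random, fixed=None) -> tuple:
    """Return the fixed shape if given, else a random candidate shape."""
    if fixed is not None:
        return (fixed, fixed)
    return rng.choice(candidate_shapes())


rng = random.Random(0)
print(len(candidate_shapes()))             # number of candidate shapes
print(pick_training_shape(rng, fixed=416)) # fixed shape, as in --training_shape=416
print(pick_training_shape(rng))            # one randomly chosen candidate
```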
  109. ## [Script Description](#contents)
  110. ### [Script and Sample Code](#contents)
  111. ```contents
  112. .
  113. └─yolov3_darknet53
  114.   ├─README.md
  115.   ├─mindspore_hub_conf.md             # config for mindspore hub
  116.   ├─scripts
  117.   │ ├─run_standalone_train.sh         # launch standalone training(1p) in ascend
  118.   │ ├─run_distribute_train.sh         # launch distributed training(8p) in ascend
  119.   │ ├─run_eval.sh                     # launch evaluating in ascend
  120.   │ ├─run_standalone_train_gpu.sh     # launch standalone training(1p) in gpu
  121.   │ ├─run_distribute_train_gpu.sh     # launch distributed training(8p) in gpu
  122.   │ └─run_eval_gpu.sh                 # launch evaluating in gpu
  123.   ├─src
  124.   │ ├─__init__.py                     # python init file
  125.   │ ├─config.py                       # parameter configuration
  126.   │ ├─darknet.py                      # backbone of network
  127.   │ ├─distributed_sampler.py          # iterator of dataset
  128.   │ ├─initializer.py                  # initializer of parameters
  129.   │ ├─logger.py                       # log function
  130.   │ ├─loss.py                         # loss function
  131.   │ ├─lr_scheduler.py                 # generate learning rate
  132.   │ ├─transforms.py                   # preprocess data
  133.   │ ├─util.py                         # util function
  134.   │ ├─yolo.py                         # yolov3 network
  135.   │ └─yolo_dataset.py                 # create dataset for YOLOV3
  136.   ├─eval.py                           # eval net
  137.   └─train.py                          # train net
  138. ```
  139. ### [Script Parameters](#contents)
  140. ```parameters
  141. Major parameters in train.py are as follows.
  142. optional arguments:
  143. -h, --help show this help message and exit
  144. --device_target device where the code will be implemented: "Ascend" | "GPU", default is "Ascend"
  145. --data_dir DATA_DIR Train dataset directory.
  146. --per_batch_size PER_BATCH_SIZE
  147. Batch size for Training. Default: 32.
  148. --pretrained_backbone PRETRAINED_BACKBONE
  149. The ckpt file of DarkNet53. Default: "".
  150. --resume_yolov3 RESUME_YOLOV3
  151. The ckpt file of YOLOv3, which used to fine tune.
  152. Default: ""
  153. --lr_scheduler LR_SCHEDULER
  154. Learning rate scheduler, options: exponential,
  155. cosine_annealing. Default: exponential
  156. --lr LR Learning rate. Default: 0.001
  157. --lr_epochs LR_EPOCHS
  158. Epoch of changing of lr changing, split with ",".
  159. Default: 220,250
  160. --lr_gamma LR_GAMMA Decrease lr by a factor of exponential lr_scheduler.
  161. Default: 0.1
  162. --eta_min ETA_MIN Eta_min in cosine_annealing scheduler. Default: 0
  163. --T_max T_MAX T-max in cosine_annealing scheduler. Default: 320
  164. --max_epoch MAX_EPOCH
  165. Max epoch num to train the model. Default: 320
  166. --warmup_epochs WARMUP_EPOCHS
  167. Warmup epochs. Default: 0
  168. --weight_decay WEIGHT_DECAY
  169. Weight decay factor. Default: 0.0005
  170. --momentum MOMENTUM Momentum. Default: 0.9
  171. --loss_scale LOSS_SCALE
  172. Static loss scale. Default: 1024
  173. --label_smooth LABEL_SMOOTH
  174. Whether to use label smooth in CE. Default:0
  175. --label_smooth_factor LABEL_SMOOTH_FACTOR
  176. Smooth strength of original one-hot. Default: 0.1
  177. --log_interval LOG_INTERVAL
  178. Logging interval steps. Default: 100
  179. --ckpt_path CKPT_PATH
  180. Checkpoint save location. Default: outputs/
  181. --ckpt_interval CKPT_INTERVAL
  182. Save checkpoint interval. Default: None
  183. --is_save_on_master IS_SAVE_ON_MASTER
  184. Save ckpt on master or all rank, 1 for master, 0 for
  185. all ranks. Default: 1
  186. --is_distributed IS_DISTRIBUTED
  187. Distribute train or not, 1 for yes, 0 for no. Default:
  188. 1
  189. --rank RANK Local rank of distributed. Default: 0
  190. --group_size GROUP_SIZE
  191. World size of device. Default: 1
  192. --need_profiler NEED_PROFILER
  193. Whether use profiler. 0 for no, 1 for yes. Default: 0
  194. --training_shape TRAINING_SHAPE
  195. Fix training shape. Default: ""
  196. --resize_rate RESIZE_RATE
  197. Resize rate for multi-scale training. Default: None
  198. ```
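To show how `--lr`, `--eta_min`, `--T_max` and `--warmup_epochs` interact under the `cosine_annealing` scheduler, here is a minimal per-epoch sketch using the standard cosine annealing formula with a linear warmup. It is an illustration under stated assumptions, not the code in src/lr_scheduler.py, which may compute the rate per step and handle warmup differently.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.001, eta_min=0.0,
                        t_max=320, warmup_epochs=4) -> float:
    """Learning rate for a given epoch (assumed per-epoch schedule)."""
    if epoch < warmup_epochs:
        # Linear warmup: ramp up to base_lr over the first warmup_epochs.
        return base_lr * (epoch + 1) / warmup_epochs
    # Standard cosine annealing from base_lr down to eta_min at t_max.
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / t_max))
    return eta_min + (base_lr - eta_min) * cosine


for epoch in (0, 3, 160, 319):
    print(epoch, cosine_annealing_lr(epoch))
```

With `T_max` equal to `max_epoch`, the rate decays smoothly toward `eta_min` by the end of training, which matches the very small learning rates seen at the end of a 320-epoch run.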
  199. ### [Training Process](#contents)
  200. #### Training
  201. ```command
  202. python train.py \
  203. --data_dir=./dataset/coco2014 \
  204. --pretrained_backbone=darknet53_backbone.ckpt \
  205. --is_distributed=0 \
  206. --lr=0.001 \
  207. --loss_scale=1024 \
  208. --weight_decay=0.016 \
  209. --T_max=320 \
  210. --max_epoch=320 \
  211. --warmup_epochs=4 \
  212. --training_shape=416 \
  213. --lr_scheduler=cosine_annealing > log.txt 2>&1 &
  214. ```
  215. The python command above runs in the background; you can view the results in the file `log.txt`. If running on GPU, add `--device_target=GPU` to the python command.
  216. After training, checkpoint files are saved under the outputs folder by default. The loss values are logged as follows:
  217. ```log
  218. # grep "loss:" train/log.txt
  219. 2020-08-20 14:14:43,640:INFO:epoch[0], iter[0], loss:7809.262695, 0.15 imgs/sec, lr:9.746589057613164e-06
  220. 2020-08-20 14:15:05,142:INFO:epoch[0], iter[100], loss:2778.349033, 133.92 imgs/sec, lr:0.0009844054002314806
  221. 2020-08-20 14:15:31,796:INFO:epoch[0], iter[200], loss:535.517361, 130.54 imgs/sec, lr:0.0019590642768889666
  222. ...
  223. ```
  224. The model checkpoint will be saved in the outputs directory.
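To track the loss over time rather than reading the log by eye, the log lines can be parsed with a regular expression. This is a minimal sketch written against the line format shown in the sample log above; the function name is hypothetical and the pattern would need adjusting if the log format changes.

```python
import re

# Pattern written against the sample log lines shown above.
LOG_LINE = re.compile(
    r"epoch\[(?P<epoch>\d+)\], iter\[(?P<iter>\d+)\], "
    r"loss:(?P<loss>[\d.]+), (?P<speed>[\d.]+) imgs/sec, "
    r"lr:(?P<lr>[\d.eE+-]+)"
)


def parse_log_line(line: str):
    """Return a dict of metrics for a training log line, or None."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match.group("epoch")),
        "iter": int(match.group("iter")),
        "loss": float(match.group("loss")),
        "imgs_per_sec": float(match.group("speed")),
        "lr": float(match.group("lr")),
    }


sample = ("2020-08-20 14:15:05,142:INFO:epoch[0], iter[100], "
          "loss:2778.349033, 133.92 imgs/sec, lr:0.0009844054002314806")
print(parse_log_line(sample))
```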
  225. #### Distributed Training
  226. On Ascend devices, run distributed training (8p) with the shell script:
  227. ```command
  228. sh run_distribute_train.sh dataset/coco2014 darknet53_backbone.ckpt rank_table_8p.json
  229. ```
  230. On GPU devices, run distributed training (8p) with the shell script:
  231. ```command
  232. sh run_distribute_train_gpu.sh dataset/coco2014 darknet53_backbone.ckpt
  233. ```
  234. The above shell scripts run distributed training in the background. You can view the results in the file `train_parallel[X]/log.txt`. The loss values are logged as follows:
  235. ```log
  236. # distribute training result(8p)
  237. epoch[0], iter[0], loss:14623.384766, 1.23 imgs/sec, lr:7.812499825377017e-07
  238. epoch[0], iter[100], loss:746.253051, 22.01 imgs/sec, lr:7.890690624925494e-05
  239. epoch[0], iter[200], loss:101.579535, 344.41 imgs/sec, lr:0.00015703124925494192
  240. epoch[0], iter[300], loss:85.136754, 341.99 imgs/sec, lr:0.00023515624925494185
  241. epoch[1], iter[400], loss:79.429322, 405.14 imgs/sec, lr:0.00031328126788139345
  242. ...
  243. epoch[318], iter[102000], loss:30.504046, 458.03 imgs/sec, lr:9.63797575082026e-08
  244. epoch[319], iter[102100], loss:31.599150, 341.08 imgs/sec, lr:2.409552052995423e-08
  245. epoch[319], iter[102200], loss:31.652273, 372.57 imgs/sec, lr:2.409552052995423e-08
  246. epoch[319], iter[102300], loss:31.952403, 496.02 imgs/sec, lr:2.409552052995423e-08
  247. ...
  248. ```
  249. ### [Evaluation Process](#contents)
  250. #### Evaluation
  251. If running on GPU, add `--device_target=GPU` to the python command below or use the "_gpu" shell script ("xxx_gpu.sh").
  252. ```command
  253. python eval.py \
  254. --data_dir=./dataset/coco2014 \
  255. --pretrained=yolov3.ckpt \
  256. --testing_shape=416 > log.txt 2>&1 &
  257. # OR
  258. sh run_eval.sh dataset/coco2014/ checkpoint/0-319_102400.ckpt
  259. ```
  260. The above python command will run in the background. You can view the results in the file "log.txt".
  261. The mAP on the test dataset is reported as follows, in the standard `pycocotools` format; refer to [cocodataset](https://cocodataset.org/#detection-eval) for more detail.
  262. ```eval log
  263. # log.txt
  264. =============coco eval reulst=========
  265. Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.311
  266. Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.528
  267. Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.322
  268. Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.127
  269. Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.323
  270. Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.428
  271. Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.259
  272. Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.398
  273. Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.423
  274. Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.224
  275. Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.442
  276. Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.551
  277. ```
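The `IoU=0.50:0.95` entries in the log average precision over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, while `IoU=0.50` and `IoU=0.75` use a single threshold. As a reminder of what is being thresholded, here is a minimal IoU sketch for two axis-aligned boxes in `(x1, y1, x2, y2)` form. It is an illustration only; `pycocotools` performs the actual evaluation.

```python
def box_iou(a, b) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind the "IoU=0.50:0.95" rows.
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(COCO_IOU_THRESHOLDS)
```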
  278. ## [Model Description](#contents)
  279. ### [Performance](#contents)
  280. #### Evaluation Performance
  281. | Parameters | YOLOv3 (Ascend) | YOLOv3 (GPU) |
  282. | -------------------------- | ----------------------------------------------------------- |------------------------------------------------------------ |
  283. | Model Version | YOLOv3 |YOLOv3 |
  284. | Resource | Ascend 910; CPU 2.60GHz, 192cores; Memory, 755G | NV SXM2 V100-16G; CPU 2.10GHz, 96cores; Memory, 251G |
  285. | Uploaded Date | 09/15/2020 (month/day/year) | 09/02/2020 (month/day/year) |
  286. | MindSpore Version | 1.1.1 | 1.1.1 |
  287. | Dataset | COCO2014 | COCO2014 |
  288. | Training Parameters | epoch=320, batch_size=32, lr=0.001, momentum=0.9 | epoch=320, batch_size=32, lr=0.1, momentum=0.9 |
  289. | Optimizer | Momentum | Momentum |
  290. | Loss Function | Sigmoid Cross Entropy with logits | Sigmoid Cross Entropy with logits |
  291. | outputs | boxes and label | boxes and label |
  292. | Loss | 34 | 34 |
  293. | Speed | 1pc: 350 ms/step; | 1pc: 600 ms/step; |
  294. | Total time | 8pc: 13 hours | 8pc: 18 hours(shape=416) |
  295. | Parameters (M) | 62.1 | 62.1 |
  296. | Checkpoint for Fine tuning | 474M (.ckpt file) | 474M (.ckpt file) |
  297. | Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53 | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53 |
  298. #### Inference Performance
  299. | Parameters | YOLOv3 (Ascend) | YOLOv3 (GPU) |
  300. | ------------------- | --------------------------- |------------------------------|
  301. | Model Version | YOLOv3 | YOLOv3 |
  302. | Resource | Ascend 910 | NV SXM2 V100-16G |
  303. | Uploaded Date | 09/15/2020 (month/day/year) | 08/20/2020 (month/day/year) |
  304. | MindSpore Version | 1.1.1 | 1.1.1 |
  305. | Dataset | COCO2014, 40,504 images | COCO2014, 40,504 images |
  306. | batch_size | 1 | 1 |
  307. | outputs | mAP | mAP |
  308. | Accuracy | 8pcs: 31.1% | 8pcs: 29.7%~30.3% (shape=416)|
  309. | Model for inference | 474M (.ckpt file) | 474M (.ckpt file) |
  310. ## [Description of Random Situation](#contents)
  311. Random seeds are set in the distributed_sampler.py, transforms.py, and yolo_dataset.py files.
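Fixing a seed makes a random operation such as dataset shuffling repeatable across runs. The sketch below demonstrates the idea with the standard library only; it is not the code from the files listed above, and `shuffled_indices` is a hypothetical helper.

```python
import random


def shuffled_indices(num_samples: int, seed: int) -> list:
    """Return a shuffled index list that depends only on `seed`."""
    rng = random.Random(seed)  # private generator, no global state touched
    indices = list(range(num_samples))
    rng.shuffle(indices)
    return indices


first = shuffled_indices(10, seed=1)
second = shuffled_indices(10, seed=1)
print(first == second)  # same seed, same order
```

Using a private `random.Random` instance rather than the module-level functions keeps the sampling order independent of other code that also draws random numbers.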
  312. ## [ModelZoo Homepage](#contents)
  313. Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).