# Contents

- [Contents](#contents)
- [Tiny-DarkNet Description](#tiny-darknet-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training](#training)
        - [Distributed Training](#distributed-training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Training Performance](#training-performance)
        - [Inference Performance](#inference-performance)
- [ModelZoo Homepage](#modelzoo-homepage)
# [Tiny-DarkNet Description](#contents)

Tiny-DarkNet is a 16-layer image classification network proposed by Joseph Chet Redmon et al. for the classic ImageNet dataset. It is a simplified version of DarkNet, designed to minimize the model size for users who need smaller models: it achieves better image classification accuracy than AlexNet and SqueezeNet while using fewer parameters than either. To keep the model small, Tiny-DarkNet uses no fully connected layers and consists only of convolutional layers, max pooling layers, and an average pooling layer.

For more detailed information on Tiny-DarkNet, please refer to the [official introduction](https://pjreddie.com/darknet/tiny-darknet/).
# [Model Architecture](#contents)

Specifically, the Tiny-DarkNet network consists of 1×1 convolutions, 3×3 convolutions, 2×2 max pooling layers, and a global average pooling layer. Stacked together, these modules transform the input image into a 1×1000 vector of class scores.
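As a rough illustration of these building blocks, here is a hedged MindSpore sketch of one conv unit and the opening layers of such a network. The channel counts and the leaky-ReLU activation follow the general Darknet convention and are assumptions here; src/tinydarknet.py is authoritative.

```python
import mindspore.nn as nn

def conv_block(in_channels, out_channels, kernel_size):
    """One Darknet-style building block: conv -> batch norm -> leaky ReLU."""
    return nn.SequentialCell([
        nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, pad_mode='same'),
        nn.BatchNorm2d(out_channels),
        nn.LeakyReLU(alpha=0.1),
    ])

# e.g. the stem of the network: 3x3 convs interleaved with 2x2 max pooling
stem = nn.SequentialCell([
    conv_block(3, 16, 3),
    nn.MaxPool2d(kernel_size=2, stride=2),
    conv_block(16, 32, 3),
    nn.MaxPool2d(kernel_size=2, stride=2),
])
```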
# [Dataset](#contents)

In the following sections, we will introduce how to run the scripts using the related dataset below.
For the dataset used, refer to the [paper](<https://ieeexplore.ieee.org/abstract/document/5206848>).

- Dataset size: 125G, 1250k colorful images in 1000 classes
    - Train: 120G, 1200k images
    - Test: 5G, 50k images
- Data format: RGB images
- Note: Data will be processed in src/dataset.py (a sketch of a typical pipeline follows below)
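The details live in src/dataset.py; as a rough illustration only, an ImageNet-style training pipeline in MindSpore typically looks like the sketch below. The function name, column names, and normalization constants here are common defaults, not values confirmed by this repository.

```python
import mindspore.common.dtype as mstype
import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C2
import mindspore.dataset.vision.c_transforms as C

def create_dataset(data_path, batch_size=128, training=True):
    """Sketch of an ImageNet pipeline; src/dataset.py is authoritative."""
    data_set = ds.ImageFolderDataset(data_path, shuffle=training)
    if training:
        # decode + random crop/resize + flip for training
        trans = [C.RandomCropDecodeResize(224), C.RandomHorizontalFlip()]
    else:
        # deterministic decode/resize/crop for evaluation
        trans = [C.Decode(), C.Resize(256), C.CenterCrop(224)]
    trans += [
        C.Normalize(mean=[123.675, 116.28, 103.53],  # common ImageNet constants,
                    std=[58.395, 57.12, 57.375]),    # not confirmed by this repo
        C.HWC2CHW(),
    ]
    data_set = data_set.map(operations=trans, input_columns="image")
    data_set = data_set.map(operations=C2.TypeCast(mstype.int32), input_columns="label")
    return data_set.batch(batch_size, drop_remainder=True)
```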
# [Environment Requirements](#contents)

- Hardware (Ascend)
    - Prepare hardware environment with Ascend or GPU processor.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)

After installing MindSpore via the official website, you can start training and evaluation as follows:

- running on Ascend:

```bash
# run training example
bash ./scripts/run_standalone_train.sh 0

# run distributed training example
bash ./scripts/run_distribute_train.sh rank_table.json

# run evaluation example
python eval.py > eval.log 2>&1 &
# or
bash ./scripts/run_eval.sh
```
For distributed training, an hccl configuration file in JSON format needs to be created in advance.

Please follow the instructions in the link below:

<https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools>

For more details, please refer to the specific script; an example of generating the rank table file is sketched below.
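As an illustration only: the invocation below assumes the hccl_tools.py script from the utils link above and its --device_num option; verify against the script's own help for the exact interface.

```bash
# generate a rank table covering devices 0-7 on this machine (assumed interface)
python hccl_tools.py --device_num "[0,8)"
# the generated rank table JSON (name will vary) is then passed to the
# distributed training script:
bash ./scripts/run_distribute_train.sh <generated_rank_table>.json
```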
# [Script Description](#contents)

## [Script and Sample Code](#contents)

```bash
├── tinydarknet
    ├── README.md                             // description of Tiny-Darknet in English
    ├── README_CN.md                          // description of Tiny-Darknet in Chinese
    ├── scripts
        ├── run_standalone_train.sh           // shell script for single-device training on Ascend
        ├── run_distribute_train.sh           // shell script for distributed training on Ascend
        ├── run_eval.sh                       // shell script for evaluation on Ascend
    ├── src
        ├── lr_scheduler                      // learning rate schedulers
            ├── __init__.py                   // init
            ├── linear_warmup.py              // linear_warmup
            ├── warmup_cosine_annealing_lr.py // warmup_cosine_annealing_lr
            ├── warmup_step_lr.py             // warmup_step_lr
        ├── dataset.py                        // creating dataset
        ├── CrossEntropySmooth.py             // loss function
        ├── tinydarknet.py                    // Tiny-Darknet architecture
        ├── config.py                         // parameter configuration
    ├── train.py                              // training script
    ├── eval.py                               // evaluation script
    ├── export.py                             // export checkpoint file into air/onnx
    ├── mindspore_hub_conf.py                 // hub config
```
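Among these scripts, export.py turns a trained checkpoint into an AIR/ONNX model. A minimal sketch of what such an export typically does in MindSpore follows; the network constructor name and the hard-coded paths here are assumptions, and export.py itself is authoritative.

```python
import numpy as np
from mindspore import Tensor, context
from mindspore.train.serialization import export, load_checkpoint, load_param_into_net

from src.tinydarknet import TinyDarkNet  # constructor name is an assumption

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")

net = TinyDarkNet(num_classes=1000)
param_dict = load_checkpoint("/path/to/train_tinydarknet.ckpt")
load_param_into_net(net, param_dict)

# dummy input matching the configured 224x224 RGB input shape
input_data = Tensor(np.ones([1, 3, 224, 224], np.float32))
export(net, input_data, file_name="tinydarknet", file_format="AIR")  # or "ONNX"
```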
## [Script Parameters](#contents)

Parameters for both training and evaluation can be set in config.py.

- config for Tiny-Darknet

```python
'pre_trained': 'False'                       # whether to train from a pre-trained model
'num_classes': 1000                          # number of classes in the dataset
'lr_init': 0.1                               # initial learning rate
'batch_size': 128                            # training batch size
'epoch_size': 500                            # total number of training epochs
'momentum': 0.9                              # momentum
'weight_decay': 1e-4                         # weight decay value
'image_height': 224                          # image height used as input to the model
'image_width': 224                           # image width used as input to the model
'data_path': './ImageNet_Original/train/'    # absolute full path to the training dataset
'val_data_path': './ImageNet_Original/val/'  # absolute full path to the evaluation dataset
'device_target': 'Ascend'                    # device running the program
'keep_checkpoint_max': 10                    # keep only the last keep_checkpoint_max checkpoints
'checkpoint_path': '/train_tinydarknet.ckpt' # absolute full path for saving the checkpoint file
'onnx_filename': 'tinydarknet.onnx'          # file name of the onnx model used in export.py
'air_filename': 'tinydarknet.air'            # file name of the air model used in export.py
'lr_scheduler': 'exponential'                # learning rate scheduler
'lr_epochs': [70, 140, 210, 280]             # epochs at which lr changes
'lr_gamma': 0.3                              # lr decay factor for the exponential lr_scheduler
'eta_min': 0.0                               # eta_min in the cosine_annealing scheduler
'T_max': 150                                 # T_max in the cosine_annealing scheduler
'warmup_epochs': 0                           # number of warmup epochs
'is_dynamic_loss_scale': 0                   # whether to use dynamic loss scaling
'loss_scale': 1024                           # loss scale
'label_smooth_factor': 0.1                   # label smoothing factor
'use_label_smooth': True                     # whether to use label smoothing
```

For more configuration details, please refer to the script config.py.
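To make the scheduler parameters concrete: with the cosine-annealing scheduler selected, the learning rate typically follows the standard warmup-plus-cosine formula sketched below, expressed with the config names above. This is a sketch of the usual formula under that assumption, not a copy of src/lr_scheduler.

```python
import math

def warmup_cosine_annealing_lr(lr_init, eta_min, T_max, warmup_epochs,
                               epoch_size, steps_per_epoch):
    """Per-step learning rates: linear warmup, then cosine annealing."""
    lr_each_step = []
    for epoch in range(epoch_size):
        if warmup_epochs and epoch < warmup_epochs:
            lr = lr_init * (epoch + 1) / warmup_epochs   # linear warmup
        else:
            t = (epoch - warmup_epochs) % T_max          # position in the cosine cycle
            lr = eta_min + (lr_init - eta_min) * (1 + math.cos(math.pi * t / T_max)) / 2
        lr_each_step.extend([lr] * steps_per_epoch)
    return lr_each_step

# e.g. with the defaults above: lr_init=0.1, eta_min=0.0, T_max=150,
# warmup_epochs=0, 500 epochs of 1251 steps each
lrs = warmup_cosine_annealing_lr(0.1, 0.0, 150, 0, 500, 1251)
```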
## [Training Process](#contents)

### [Training](#contents)

- running on Ascend:

```bash
bash scripts/run_standalone_train.sh 0
```

The command above runs in the background; you can view the results through the file train.log.

After training, you'll get some checkpoint files under the script folder by default. The loss values will be reported as follows:

```bash
# grep "loss is " train.log
epoch: 498 step: 1251, loss is 2.7798953
Epoch time: 130690.544, per step time: 104.469
epoch: 499 step: 1251, loss is 2.9261637
Epoch time: 130511.081, per step time: 104.325
epoch: 500 step: 1251, loss is 2.69412
Epoch time: 127067.548, per step time: 101.573
...
```

The model checkpoint file will be saved in the current folder.
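Checkpoint retention is governed by keep_checkpoint_max in config.py, and the loss/time lines above are the standard output of MindSpore's monitor callbacks. A hedged sketch of how a training script typically wires this up (the prefix, directory, and per-epoch save interval here are assumptions; train.py is authoritative):

```python
from mindspore.train.callback import (CheckpointConfig, LossMonitor,
                                      ModelCheckpoint, TimeMonitor)

# keep at most 10 checkpoints (config.py: keep_checkpoint_max),
# saving once per epoch (1251 steps, matching the logs above)
ck_config = CheckpointConfig(save_checkpoint_steps=1251, keep_checkpoint_max=10)
ckpt_cb = ModelCheckpoint(prefix="train_tinydarknet", directory="./", config=ck_config)

# LossMonitor/TimeMonitor emit the "loss is" / "Epoch time" lines in train.log
callbacks = [ckpt_cb, LossMonitor(), TimeMonitor()]
# model.train(epoch_size, train_dataset, callbacks=callbacks)
```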
### [Distributed Training](#contents)

- running on Ascend:

```bash
bash ./scripts/run_distribute_train.sh rank_table.json
```

The above shell script runs distributed training in the background. You can view the results through the file train_parallel[X]/log. The loss values will be reported as follows:

```bash
# grep "loss is " train_parallel*/log
epoch: 498 step: 1251, loss is 2.7798953
Epoch time: 130690.544, per step time: 104.469
epoch: 499 step: 1251, loss is 2.9261637
Epoch time: 130511.081, per step time: 104.325
epoch: 500 step: 1251, loss is 2.69412
Epoch time: 127067.548, per step time: 101.573
...
```
## [Evaluation Process](#contents)

### [Evaluation](#contents)

- evaluation on the ImageNet dataset when running on Ascend:

Before running the commands below, please check the checkpoint path used for evaluation. Please set the checkpoint path to an absolute full path, e.g., "/username/tinydarknet/train_tinydarknet.ckpt".

```bash
python eval.py > eval.log 2>&1 &
# or
bash scripts/run_eval.sh
```

The above python command runs in the background. You can view the results through the file "eval.log". The accuracy on the test dataset will be reported as follows:

```bash
# grep "accuracy: " eval.log
accuracy: {'top_1_accuracy': 0.5871979166666667, 'top_5_accuracy': 0.8175280448717949}
```

Note that for evaluation after distributed training, please set checkpoint_path to the last saved checkpoint file. The accuracy on the test dataset will be reported as follows:

```bash
# grep "accuracy: " eval.log
accuracy: {'top_1_accuracy': 0.5871979166666667, 'top_5_accuracy': 0.8175280448717949}
```
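For reference, the accuracy dict above is the standard output format of MindSpore's Model.eval when metrics are named top_1_accuracy/top_5_accuracy. A hedged sketch of the load-and-evaluate pattern follows; the network constructor and dataset helper names are assumptions, and eval.py is authoritative.

```python
from mindspore import Model
from mindspore.train.serialization import load_checkpoint, load_param_into_net

from src.dataset import create_dataset   # helper name is an assumption
from src.tinydarknet import TinyDarkNet  # constructor name is an assumption

net = TinyDarkNet(num_classes=1000)
# point this at the last checkpoint saved during (distributed) training
load_param_into_net(net, load_checkpoint("/username/tinydarknet/train_tinydarknet.ckpt"))
net.set_train(False)

# the metric names below produce exactly the dict shown in eval.log
model = Model(net, metrics={'top_1_accuracy', 'top_5_accuracy'})
acc = model.eval(create_dataset('./ImageNet_Original/val/', training=False))
print("accuracy:", acc)
```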
# [Model Description](#contents)

## [Performance](#contents)

### [Training Performance](#contents)

| Parameters          | Ascend                                                      |
| ------------------- | ----------------------------------------------------------- |
| Model Version       | V1                                                          |
| Resource            | Ascend 910; CPU 2.60GHz, 56 cores; Memory 314G; OS Euler2.8 |
| Uploaded Date       | 2020/12/22                                                  |
| MindSpore Version   | 1.1.0                                                       |
| Dataset             | 1200k images                                                |
| Training Parameters | epoch=500, steps=1251, batch_size=128, lr=0.1               |
| Optimizer           | Momentum                                                    |
| Loss Function       | Softmax Cross Entropy                                       |
| Speed               | 8 pcs: 104 ms/step                                          |
| Total Time          | 8 pcs: 17.8 hours                                           |
| Parameters (M)      | 4.0M                                                        |
| Scripts             | [Tiny-Darknet Scripts](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/tinydarknet) |
### [Inference Performance](#contents)

| Parameters          | Ascend                           |
| ------------------- | -------------------------------- |
| Model Version       | V1                               |
| Resource            | Ascend 910; OS Euler2.8          |
| Uploaded Date       | 2020/12/22                       |
| MindSpore Version   | 1.1.0                            |
| Dataset             | 200k images                      |
| batch_size          | 128                              |
| Outputs             | probability                      |
| Accuracy            | 8 pcs Top-1: 58.7%; Top-5: 81.7% |
| Model for inference | 11.6M (.ckpt file)               |
# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).