@@ -91,7 +91,7 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil

 After installing MindSpore via the official website, you can start training and evaluation as follows:

-- Runing on Ascend
+- Running on Ascend
 ```
 # distributed training
 Usage: sh run_distribute_train.sh [resnet50|resnet101|se-resnet50] [cifar10|imagenet2012] [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_CKPT_PATH](optional)
@@ -104,7 +104,7 @@ Usage: sh run_standalone_train.sh [resnet50|resnet101|se-resnet50] [cifar10|imag
 Usage: sh run_eval.sh [resnet50|resnet101|se-resnet50] [cifar10|imagenet2012] [DATASET_PATH] [CHECKPOINT_PATH]
 ```

-- Runing on GPU
+- Running on GPU
 ```
 # distributed training example
 sh run_distribute_train_gpu.sh [resnet50|resnet101] [cifar10|imagenet2012] [DATASET_PATH] [PRETRAINED_CKPT_PATH](optional)
@@ -124,7 +124,7 @@ sh run_eval_gpu.sh [resnet50|resnet101] [cifar10|imagenet2012] [DATASET_PATH] [C
 .
 └──resnet
 ├── README.md
-├── script
+├── scripts
 ├── run_distribute_train.sh # launch ascend distributed training(8 pcs)
 ├── run_parameter_server_train.sh # launch ascend parameter server training(8 pcs)
 ├── run_eval.sh # launch ascend evaluation
@@ -136,7 +136,7 @@ sh run_eval_gpu.sh [resnet50|resnet101] [cifar10|imagenet2012] [DATASET_PATH] [C
 ├── src
 ├── config.py # parameter configuration
 ├── dataset.py # data preprocessing
-├── crossentropy.py # loss definition for ImageNet2012 dataset
+├── CrossEntropySmooth.py # loss definition for ImageNet2012 dataset
 ├── lr_generator.py # generate learning rate for each step
 └── resnet.py # resnet backbone, including resnet50 and resnet101 and se-resnet50
 ├── eval.py # eval net
@@ -172,7 +172,7 @@ Parameters for both training and evaluation can be set in config.py.

 ```
 "class_num": 1001, # dataset class number
-"batch_size": 32, # batch size of input tensor
+"batch_size": 256, # batch size of input tensor
 "loss_scale": 1024, # loss scale
 "momentum": 0.9, # momentum optimizer
 "weight_decay": 1e-4, # weight decay
@@ -184,10 +184,10 @@ Parameters for both training and evaluation can be set in config.py.
 "save_checkpoint_path": "./", # path to save checkpoint relative to the executed path
 "warmup_epochs": 0, # number of warmup epoch
 "lr_decay_mode": "Linear", # decay mode for generating learning rate
-"label_smooth": True, # label smooth
+"use_label_smooth": True, # label smooth
 "label_smooth_factor": 0.1, # label smooth factor
 "lr_init": 0, # initial learning rate
-"lr_max": 0.1, # maximum learning rate
+"lr_max": 0.8, # maximum learning rate
 "lr_end": 0.0, # minimum learning rate
 ```

@@ -207,7 +207,7 @@ Parameters for both training and evaluation can be set in config.py.
 "save_checkpoint_path": "./", # path to save checkpoint relative to the executed path
 "warmup_epochs": 0, # number of warmup epoch
 "lr_decay_mode": "cosine" # decay mode for generating learning rate
-"label_smooth": 1, # label_smooth
+"use_label_smooth": True, # label_smooth
 "label_smooth_factor": 0.1, # label_smooth_factor
 "lr": 0.1 # base learning rate
 ```
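
The `use_label_smooth` and `label_smooth_factor` options above control label smoothing in the loss. As a rough illustration of what a factor of 0.1 does to the training target, here is a minimal pure-Python sketch. It is not the code in `CrossEntropySmooth.py`; the helper names `smooth_labels` and `cross_entropy` are hypothetical, and the sketch assumes the convention where the true class gets `1 - factor` and the remaining mass is spread evenly over the other `num_classes - 1` classes (other implementations spread it over all classes).

```python
import math


def smooth_labels(label, num_classes, factor=0.1):
    """Build a smoothed target distribution for an integer class label.

    Assumed convention: the true class gets 1 - factor, every other class
    gets factor / (num_classes - 1), so the distribution still sums to 1.
    """
    off_value = factor / (num_classes - 1)
    target = [off_value] * num_classes
    target[label] = 1.0 - factor
    return target


def cross_entropy(logits, target):
    """Cross entropy between softmax(logits) and a target distribution."""
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_sum) for x, t in zip(logits, target))


logits = [1.0, 2.0, 3.0, 0.5]
target = smooth_labels(label=2, num_classes=4, factor=0.1)
loss_smooth = cross_entropy(logits, target)
loss_hard = cross_entropy(logits, smooth_labels(label=2, num_classes=4, factor=0.0))
print(target)
print(loss_smooth, loss_hard)
```

With `factor=0.0` the target reduces to a one-hot vector, which is the behaviour one would expect when `use_label_smooth` is disabled. For a prediction that already favours the true class, the smoothed loss is higher than the hard-label loss, which is how smoothing discourages over-confident outputs.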
@@ -229,7 +229,7 @@ Parameters for both training and evaluation can be set in config.py.
 "save_checkpoint_path": "./", # path to save checkpoint relative to the executed path
 "warmup_epochs": 3, # number of warmup epoch
 "lr_decay_mode": "cosine" # decay mode for generating learning rate
-"label_smooth": True, # label_smooth
+"use_label_smooth": True, # label_smooth
 "label_smooth_factor": 0.1, # label_smooth_factor
 "lr_init": 0.0, # initial learning rate
 "lr_max": 0.3, # maximum learning rate
@@ -421,13 +421,13 @@ result: {'top_5_accuracy': 0.9342589628681178, 'top_1_accuracy': 0.7680657810499
 | uploaded Date | 04/01/2020 (month/day/year) ; | 08/01/2020 (month/day/year)
 | MindSpore Version | 0.1.0-alpha |0.6.0-alpha |
 | Dataset | ImageNet2012 | ImageNet2012|
-| Training Parameters | epoch=90, steps per epoch=5004, batch_size = 32 |epoch=90, steps per epoch=5004, batch_size = 32 |
+| Training Parameters | epoch=90, steps per epoch=626, batch_size = 256 |epoch=90, steps per epoch=5004, batch_size = 32 |
 | Optimizer | Momentum |Momentum|
 | Loss Function | Softmax Cross Entropy |Softmax Cross Entropy |
 | outputs | probability | probability |
 | Loss | 1.8464266 | 1.9023 |
-| Speed | 18.4ms/step(8pcs) |67.1ms/step(8pcs)|
-| Total time | 139 mins | 500 mins|
+| Speed | 118ms/step(8pcs) |67.1ms/step(8pcs)|
+| Total time | 114 mins | 500 mins|
 | Parameters (M) | 25.5 | 25.5 |
 | Checkpoint for Fine tuning | 197M (.ckpt file) |197M (.ckpt file) |
 | Scripts | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet) | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/resnet)
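
The Speed and Total time rows in the table above can be cross-checked against the Training Parameters row, since total time should be close to epochs × steps per epoch × time per step. The following is a small sanity-check sketch, not part of the repository. The function name `total_minutes` and the row labels are hypothetical, and the pairing of speed values with step counts is read off the table rows as they appear in this change.

```python
def total_minutes(epochs, steps_per_epoch, ms_per_step):
    """Estimate pure training time in minutes from per-step latency."""
    return epochs * steps_per_epoch * ms_per_step / 1000.0 / 60.0


# (epochs, steps per epoch, ms per step, reported total minutes)
rows = {
    "first column, before the change": (90, 5004, 18.4, 139),
    "first column, after the change": (90, 626, 118.0, 114),
    "second column": (90, 5004, 67.1, 500),
}

estimates = {name: total_minutes(e, s, ms) for name, (e, s, ms, _) in rows.items()}
for name, (_, _, _, reported) in rows.items():
    print(f"{name}: estimated {estimates[name]:.1f} min, reported {reported} min")
```

All three estimates land within roughly 3% of the reported totals, so the updated speed and total time are consistent with the new batch size and step count. The estimate ignores evaluation, checkpointing, and startup time, so small differences are expected.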