@@ -183,7 +183,7 @@ For example, you can run the shell command below to launch the training procedur
 sh scripts/train_standalone.sh 0 /data/dataset/imagenet/ scripts/pretrian/ 0
 ```
-If eval_each_epoch is 1, it will evaluate after each epoch and save the parameters with the max accurracy. But in this case, the time of one epoch will be longer.
+If eval_each_epoch is 1, it will evaluate after each epoch and save the parameters with the max accuracy. But in this case, the time of one epoch will be longer.
 If eval_each_epoch is 0, it will save parameters every some epochs instead of evaluating in the training process.
@@ -91,7 +91,7 @@ def dpn_train(args):
 context.set_auto_parallel_context(device_num=args.group_size, parallel_mode=ParallelMode.DATA_PARALLEL,
 gradients_mean=True)
-# select for master rank save ckpt or all rank save, compatiable for model parallel
+# select for master rank save ckpt or all rank save, compatible for model parallel
 args.rank_save_ckpt_flag = 0
 if args.is_save_on_master:
 if args.rank == 0:
@@ -148,7 +148,7 @@ sh scripts/run_standalone_train.sh DEVICE_ID DATA_PATH
 ### Result
-Training result will be stored in the example path. Checkpoints will be stored at `. /ckpt_0` by default, and training log will be redirected to `log.txt` like followings.
+Training result will be stored in the example path. Checkpoints will be stored at `. /ckpt_0` by default, and training log will be redirected to `log.txt` like following.
 ``` shell
 epoch: 1 step: 1251, loss is 4.8427444
@@ -182,7 +182,7 @@ sh scripts/run_eval.sh DEVICE_ID DATA_DIR PATH_CHECKPOINT
 ### Result
-Evaluation result will be stored in the example path, you can find result like the followings in `eval.log`.
+Evaluation result will be stored in the example path, you can find result like the following in `eval.log`.
 ```shell
 result: {'Loss': 1.7797744848789312, 'Top_1_Acc': 0.7985777243589743, 'Top_5_Acc': 0.9485777243589744}