
add distribute train README for vgg16

tags/v0.3.0-alpha
caojian05, 5 years ago
parent commit c3807c17c9
1 changed file with 29 additions and 1 deletion:
  example/vgg16_cifar10/README.md (+29, -1)

example/vgg16_cifar10/README.md

@@ -49,6 +49,24 @@ You will get the accuracy as follows:
result: {'acc': 0.92}
```

### Distributed Training
```
sh run_distribute_train.sh rank_table.json your_data_path
```
The above shell script will run distributed training in the background; you can view the results in the file `train_parallel[X]/log`.
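
While training runs in the background, you can also follow the log of a single device, for example (using the standard `tail` utility; the path is illustrative):

```
tail -f train_parallel0/log
```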

You will get the loss values as follows:
```
# grep "loss is " train_parallel*/log
train_parallel0/log:epoch: 1 step: 97, loss is 1.9060308
train_parallel0/log:epoch: 2 step: 97, loss is 1.6003821
...
train_parallel1/log:epoch: 1 step: 97, loss is 1.7095519
train_parallel1/log:epoch: 2 step: 97, loss is 1.7133579
...
...
```
> For details about `rank_table.json`, refer to the [distributed training tutorial](https://www.mindspore.cn/tutorial/en/master/advanced_use/distributed_training.html).
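
As an illustration only, a single-server rank table typically has the shape sketched below; the field names and values here are assumptions, and the exact format depends on your MindSpore/Ascend version, so treat the tutorial above as authoritative:

```
{
    "version": "1.0",
    "server_count": "1",
    "server_list": [
        {
            "server_id": "10.155.111.140",
            "device": [
                {"device_id": "0", "device_ip": "192.1.27.6", "rank_id": "0"},
                {"device_id": "1", "device_ip": "192.2.27.6", "rank_id": "1"}
            ],
            "host_nic_ip": "reserve"
        }
    ],
    "status": "completed"
}
```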

## Usage:

@@ -75,4 +93,14 @@ parameters/options:
--data_path          the storage path of dataset.
--device_id          the device used to evaluate the model.
--checkpoint_path    the checkpoint file path used to evaluate the model.
```

### Distributed Training

```
Usage: sh run_distribute_train.sh [MINDSPORE_HCCL_CONFIG_PATH] [DATA_PATH]

parameters/options:
MINDSPORE_HCCL_CONFIG_PATH    HCCL configuration file path.
DATA_PATH                     the storage path of dataset.
```
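
For example, with hypothetical paths (adjust them to your environment):

```
sh run_distribute_train.sh /path/to/rank_table.json /path/to/cifar10
```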
