- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [Environment Requirements](#contents)
- Hardware(Ascend)
    - Prepare the hardware environment with an Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get access to the resources.
- For more information, please check the resources below:

```shell
├─README.md
├─scripts
├─run_standalone_train.sh # launch standalone training with Ascend platform(1p)
├─run_distribute_train.sh # launch distributed training with Ascend platform(8p)
└─run_eval.sh # launch evaluation with Ascend platform
├─src
├─config.py # parameter configuration
├─dataset.py # data preprocessing
```

You can start training using Python or shell scripts. The usage of the shell scripts is as follows:
- Ascend:
```shell
# distributed training (8p)
sh scripts/run_distribute_train.sh RANK_TABLE_FILE DATA_PATH
# standalone training
sh scripts/run_standalone_train.sh DEVICE_ID DATA_PATH
```
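`RANK_TABLE_FILE` is a JSON file describing the Ascend devices on the host. Below is a minimal sketch of one way to produce and use it, assuming the `hccl_tools.py` helper from the MindSpore models repository; the file name and dataset path are illustrative, not fixed by this README:

```shell
# Generate a rank table for devices 0-7 (hccl_tools.py ships with the
# MindSpore models repository; its location may differ in your checkout).
python utils/hccl_tools/hccl_tools.py --device_num "[0,8)"

# Launch distributed training with the generated rank table and your dataset
# (both paths below are illustrative).
sh scripts/run_distribute_train.sh ./hccl_8p.json /path/to/dataset/train
```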
>
> This is a processor core binding operation based on `device_num` and the total number of processor cores. If you do not want this binding, remove the `taskset` operations in `scripts/run_distribute_train.sh`.
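For reference, the binding typically follows the pattern sketched below. This is a minimal sketch with hypothetical variable names and a `train.py` entry point; the authoritative logic is the `taskset` code inside `scripts/run_distribute_train.sh` itself.

```shell
#!/bin/bash
# Minimal sketch of per-process core binding (variable names are hypothetical).
device_num=8
cores=$(nproc)                 # total processor cores on the host
avg=$((cores / device_num))    # cores assigned to each training process
for ((i = 0; i < device_num; i++)); do
    start=$((i * avg))
    end=$((start + avg - 1))
    # Pin process i to its core slice, then launch training on device i.
    taskset -c ${start}-${end} python train.py --device_id=${i} > log_${i}.log 2>&1 &
done
```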