
!5974 add notes about bs and update frequency in the resnet_thor readme file

Merge pull request !5974 from wangmin0104/master
tags/v1.0.0
mindspore-ci-bot Gitee 5 years ago
parent commit 9971298763
1 changed file with 4 additions and 3 deletions
  1. +4
    -3
      model_zoo/official/cv/resnet_thor/README.md

model_zoo/official/cv/resnet_thor/README.md

@@ -107,7 +107,7 @@ Parameters for both training and inference can be set in config.py.
- Parameters for Ascend 910
```
"class_num": 1001, # dataset class number
"batch_size": 32, # batch size of input tensor
"batch_size": 32, # batch size of input tensor(only supports 32)
"loss_scale": 128, # loss scale
"momentum": 0.9, # momentum of THOR optimizer
"weight_decay": 5e-4, # weight decay
@@ -123,7 +123,7 @@ Parameters for both training and inference can be set in config.py.
"lr_end_epoch": 70, # learning rate end epoch value
"damping_init": 0.03, # damping init value for Fisher information matrix
"damping_decay": 0.87, # damping decay rate
"frequency": 834, # the step interval to update second-order information matrix
"frequency": 834, # the step interval to update second-order information matrix(should be divisor of the steps of per epoch)
```
- Parameters for GPU
```
@@ -144,8 +144,9 @@ Parameters for both training and inference can be set in config.py.
"lr_end_epoch": 50, # learning rate end epoch value
"damping_init": 0.02345, # damping init value for Fisher information matrix
"damping_decay": 0.5467, # damping decay rate
"frequency": 834, # the step interval to update second-order information matrix
"frequency": 834, # the step interval to update second-order information matrix(should be divisor of the steps of per epoch)
```
+> Due to operator limitations, the batch size currently only supports 32 on Ascend. In addition, the update frequency of the second-order information matrix must be set to a divisor of the number of steps per epoch (for example, 834 is a divisor of 5004). In short, the algorithm is not very flexible in setting these parameters because of the limitations of the framework and operators, but we will address these problems in future versions.
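The divisibility constraint in the note above can be verified with a short script. This is a minimal sketch, assuming the ImageNet-1k train split (1,281,167 images) and a global batch of 32 per device across 8 Ascend devices (256 total), which yields the 5004 steps per epoch mentioned in the note:

```python
# Sketch: check that THOR's "frequency" evenly divides the steps per epoch.
# Assumed setup (not stated in the diff): ImageNet-1k train set,
# batch_size 32 per device, 8 Ascend devices -> global batch 256.
DATASET_SIZE = 1281167
BATCH_SIZE = 32
NUM_DEVICES = 8

steps_per_epoch = DATASET_SIZE // (BATCH_SIZE * NUM_DEVICES)  # 5004

def valid_frequencies(steps: int) -> list[int]:
    """Return every step interval that evenly divides an epoch."""
    return [f for f in range(1, steps + 1) if steps % f == 0]

# 834 * 6 == 5004, so frequency=834 satisfies the constraint.
assert steps_per_epoch % 834 == 0
print(f"{steps_per_epoch} steps/epoch; valid frequencies: "
      f"{valid_frequencies(steps_per_epoch)}")
```

Any value from this list (e.g. 417, 834, 1251) would satisfy the divisor requirement; other values would leave a partial update interval at the end of each epoch.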
### Training Process

#### Ascend 910

