# EfficientNet-B0 Example

## Description

This is an example of training EfficientNet-B0 in MindSpore.

## Requirements

- Install [MindSpore](http://www.mindspore.cn/install/en).
- Download the dataset (an expected layout is sketched below).
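
The training and evaluation scripts take the dataset root as `DATA_DIR`. The exact format is defined by `src/dataset.py`; as a rough sketch only, an ImageNet-style folder layout (one sub-directory per class, assumed here rather than taken from the repository) would look like:

```shell
DATA_DIR
├─train
│ ├─class_0001
│ │ ├─image_0001.JPEG
│ │ └─...
│ └─...
└─val
  ├─class_0001
  └─...
```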

## Structure

```shell
.
└─efficientnet
  ├─README.md
  ├─scripts
  │ ├─run_standalone_train_for_gpu.sh # launch standalone training with gpu platform(1p)
  │ ├─run_distribute_train_for_gpu.sh # launch distributed training with gpu platform(8p)
  │ └─run_eval_for_gpu.sh             # launch evaluating with gpu platform
  ├─src
  │ ├─config.py                       # parameter configuration
  │ ├─dataset.py                      # data preprocessing
  │ ├─efficientnet.py                 # network definition
  │ ├─loss.py                         # customized loss function
  │ ├─transform_utils.py              # random augment utils
  │ └─transform.py                    # random augment class
  ├─eval.py                           # eval net
  └─train.py                          # train net
```

## Parameter Configuration

Parameters for both training and evaluating can be set in config.py.

```
'random_seed': 1, # fix random seed
'model': 'efficientnet_b0', # model name
'drop': 0.2, # dropout rate
'drop_connect': 0.2, # drop connect rate
'opt_eps': 0.001, # optimizer epsilon
'lr': 0.064, # learning rate LR
'batch_size': 128, # batch size
'decay_epochs': 2.4, # epoch interval to decay LR
'warmup_epochs': 5, # epochs to warmup LR
'decay_rate': 0.97, # LR decay rate
'weight_decay': 1e-5, # weight decay
'epochs': 600, # number of epochs to train
'workers': 8, # number of data processing processes
'amp_level': 'O0', # amp level
'opt': 'rmsprop', # optimizer
'num_classes': 1000, # number of classes
'gp': 'avg', # type of global pool, "avg", "max", "avgmax", "avgmaxc"
'momentum': 0.9, # optimizer momentum
'warmup_lr_init': 0.0001, # init warmup LR
'smoothing': 0.1, # label smoothing factor
'bn_tf': False, # use Tensorflow BatchNorm defaults
'keep_checkpoint_max': 10, # max number ckpts to keep
'loss_scale': 1024, # loss scale
'resume_start_epoch': 0, # resume start epoch
```
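
The warmup and decay entries together describe the learning-rate schedule: a warmup from `warmup_lr_init` towards `lr` over `warmup_epochs` epochs, followed by an exponential decay of `decay_rate` every `decay_epochs` epochs. The Python sketch below only illustrates that arithmetic; the function name is hypothetical and the actual schedule is implemented in the training code.

```python
# Illustration of the schedule implied by the config values above (not the repository code).
def epoch_lr(epoch, lr=0.064, warmup_lr_init=0.0001, warmup_epochs=5,
             decay_rate=0.97, decay_epochs=2.4):
    if epoch < warmup_epochs:
        # linear warmup from warmup_lr_init up to the base LR
        return warmup_lr_init + (lr - warmup_lr_init) * (epoch + 1) / warmup_epochs
    # exponential decay: multiply by decay_rate once every decay_epochs epochs
    return lr * decay_rate ** ((epoch - warmup_epochs) // decay_epochs)

for e in (0, 4, 5, 100, 599):
    print(e, round(epoch_lr(e), 6))
```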

## Running the example

### Train

#### Usage

```bash
# distributed training example (8p)
sh run_distribute_train_for_gpu.sh DATA_DIR
# standalone training
sh run_standalone_train_for_gpu.sh DATA_DIR DEVICE_ID
```

#### Launch

```bash
# distributed training example (8p) for GPU
sh scripts/run_distribute_train_for_gpu.sh /dataset
# standalone training example for GPU
sh scripts/run_standalone_train_for_gpu.sh /dataset 0
```

#### Result

You can find the checkpoint files together with the training results in the log.
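
In MindSpore, checkpoint files are normally produced by a `ModelCheckpoint` callback passed to `model.train`; the following is only a sketch of that pattern (the function name, directory, and step count are assumptions, not values read from `train.py`):

```python
# Sketch of the usual MindSpore checkpoint/logging setup; not the repository's train.py.
from mindspore.train import Model
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor

def train_with_checkpoints(net, loss, optimizer, train_dataset, epochs=600):
    """Train `net` and keep up to 10 checkpoints named efficientnet_b0-*.ckpt."""
    ckpt_cfg = CheckpointConfig(save_checkpoint_steps=1251,  # assumed steps per epoch
                                keep_checkpoint_max=10)      # mirrors 'keep_checkpoint_max'
    ckpt_cb = ModelCheckpoint(prefix="efficientnet_b0",
                              directory="./checkpoint", config=ckpt_cfg)
    model = Model(net, loss_fn=loss, optimizer=optimizer, amp_level='O0')
    model.train(epochs, train_dataset, callbacks=[ckpt_cb, LossMonitor()])
```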

### Evaluation

#### Usage

```bash
# Evaluation
sh run_eval_for_gpu.sh DATA_DIR DEVICE_ID PATH_CHECKPOINT
```

#### Launch

```bash
# Evaluation with checkpoint
sh scripts/run_eval_for_gpu.sh /dataset 0 ./checkpoint/efficientnet_b0-600_1251.ckpt
```

> The checkpoint is produced during the training process.
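
Evaluation loads the saved parameters back into the network before running the metric pass; a minimal sketch of that flow (the helper name is hypothetical, the actual logic lives in `eval.py`):

```python
# Sketch of a typical MindSpore evaluation flow; not the repository's eval.py.
from mindspore import load_checkpoint, load_param_into_net
from mindspore.train import Model

def evaluate(net, val_dataset, ckpt_path):
    """Load `ckpt_path` into `net` and compute top-1 accuracy on `val_dataset`."""
    param_dict = load_checkpoint(ckpt_path)
    load_param_into_net(net, param_dict)
    model = Model(net, metrics={"acc"})
    return model.eval(val_dataset)   # returns a dict such as {'acc': ...}
```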

#### Result

The evaluation result will be stored in the scripts path, where you can find results like the following in the log.