# WarpCTC Example
## Description
This is an example of training WarpCTC on a self-generated captcha image dataset in MindSpore.
## Requirements
- Install [MindSpore](https://www.mindspore.cn/install/en).
- Generate captcha images.
> The [captcha](https://github.com/lepture/captcha) library can be used to generate captcha images. You can generate the train and test datasets yourself, or simply run the script `scripts/run_process_data.sh`. By default, the shell script generates 10000 test images and 50000 train images. A minimal Python sketch of the generation step is shown after the directory listing below.
> ```
> $ cd scripts
> $ sh run_process_data.sh
>
> # after execution, you will find the dataset laid out as follows:
> .
> └─warpctc
>   └─data
>     ├─ train # train dataset
>     └─ test  # evaluate dataset
> ...
> ```
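If you would rather generate the images directly from Python, the snippet below is a minimal sketch of what such a generation step can look like with the `captcha` library. The file-naming scheme, label lengths, and output directories here are illustrative assumptions and do not necessarily match `process_data.py` exactly.
```python
import os
import random
from captcha.image import ImageCaptcha  # pip install captcha

def generate_captcha(output_dir, num_images, width=160, height=64, max_digits=4):
    """Write num_images captcha PNGs with random digit labels into output_dir (assumed layout)."""
    os.makedirs(output_dir, exist_ok=True)
    generator = ImageCaptcha(width=width, height=height)
    for i in range(num_images):
        # random label of 1..max_digits digits; the label is kept in the file name
        label = "".join(random.choice("0123456789")
                        for _ in range(random.randint(1, max_digits)))
        generator.write(label, os.path.join(output_dir, "{:06d}-{}.png".format(i, label)))

if __name__ == "__main__":
    generate_captcha("../data/train", 50000)  # 50000 train images by default
    generate_captcha("../data/test", 10000)   # 10000 test images by default
```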
## Structure
```shell
.
└──warpctc
  ├── README.md
  ├── scripts
    ├── run_distribute_train.sh # launch distributed training on Ascend (8 pcs)
    ├── run_distribute_train_for_gpu.sh # launch distributed training on GPU
    ├── run_eval.sh # launch evaluation
    ├── run_process_data.sh # launch dataset generation
    └── run_standalone_train.sh # launch standalone training (1 pcs)
  ├── src
    ├── config.py # parameter configuration
    ├── dataset.py # data preprocessing
    ├── loss.py # CTCLoss definition
    ├── lr_generator.py # generate learning rate for each step
    ├── metric.py # accuracy metric for the WarpCTC network
    ├── warpctc.py # WarpCTC network definition
    └── warpctc_for_train.py # WarpCTC network wrapped with loss and gradient clipping for training
  ├── eval.py # evaluation script
  ├── process_data.py # dataset generation script
  └── train.py # training script
```
## Parameter configuration
Parameters for both training and evaluation can be set in `src/config.py`.
```
"max_captcha_digits": 4, # max number of digits in each captcha image
"captcha_width": 160, # width of captcha images
"captcha_height": 64, # height of captcha images
"batch_size": 64, # batch size of input tensor
"epoch_size": 30, # only valid for training; it is always 1 for inference
"hidden_size": 512, # hidden size in LSTM layers
"learning_rate": 0.01, # initial learning rate
"momentum": 0.9, # momentum of SGD optimizer
"save_checkpoint": True, # whether to save checkpoints or not
"save_checkpoint_steps": 97, # the step interval between two checkpoints; by default, the last checkpoint is saved after the final step
"keep_checkpoint_max": 30, # keep at most keep_checkpoint_max checkpoints
"save_checkpoint_path": "./checkpoint", # path to save checkpoints
```
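For reference, parameters like these are commonly collected in an attribute-style dictionary so that the training and evaluation scripts can read them as `config.batch_size`, `config.epoch_size`, and so on. The sketch below shows one plausible shape for `src/config.py`, assuming an `easydict.EasyDict` named `config`; the actual file in this repository may differ slightly.
```python
# src/config.py (sketch, assuming the parameters live in an EasyDict named `config`)
from easydict import EasyDict

config = EasyDict({
    "max_captcha_digits": 4,
    "captcha_width": 160,
    "captcha_height": 64,
    "batch_size": 64,
    "epoch_size": 30,
    "hidden_size": 512,
    "learning_rate": 0.01,
    "momentum": 0.9,
    "save_checkpoint": True,
    "save_checkpoint_steps": 97,
    "keep_checkpoint_max": 30,
    "save_checkpoint_path": "./checkpoint",
})
```
Note that with the default 50000 training images, a batch size of 64, and 8 devices, each epoch has roughly 50000 / (64 × 8) ≈ 97 steps, which is why `save_checkpoint_steps` is 97 (about one checkpoint per epoch in the 8-device distributed run).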
## Running the example
### Train
#### Usage
```
# distributed training on Ascend
Usage: bash run_distribute_train.sh [RANK_TABLE_FILE] [DATASET_PATH]
# distributed training on GPU
Usage: bash run_distribute_train_for_gpu.sh [RANK_SIZE] [DATASET_PATH]
# standalone training
Usage: bash run_standalone_train.sh [DATASET_PATH] [PLATFORM]
```
#### Launch
```
# distributed training example on Ascend
bash run_distribute_train.sh rank_table.json ../data/train
# distributed training example on GPU
bash run_distribute_train_for_gpu.sh 8 ../data/train
# standalone training example on Ascend
bash run_standalone_train.sh ../data/train Ascend
# standalone training example on GPU
bash run_standalone_train.sh ../data/train GPU
```
> For details about `rank_table.json`, refer to the [distributed training tutorial](https://www.mindspore.cn/tutorial/en/master/advanced_use/distributed_training.html).
#### Result
Training results are stored in the `scripts` folder, in a sub-folder whose name begins with "train" or "train_parallel". There you can find the checkpoint files together with results like the following in the log.
```
# distributed training result (8 pcs)
Epoch: [ 1/ 30], step: [ 97/ 97], loss: [0.5853/0.5853], time: [376813.7944]
Epoch: [ 2/ 30], step: [ 97/ 97], loss: [0.4007/0.4007], time: [75882.0951]
Epoch: [ 3/ 30], step: [ 97/ 97], loss: [0.0921/0.0921], time: [75150.9385]
Epoch: [ 4/ 30], step: [ 97/ 97], loss: [0.1472/0.1472], time: [75135.0193]
Epoch: [ 5/ 30], step: [ 97/ 97], loss: [0.0186/0.0186], time: [75199.5809]
...
```
### Evaluation
#### Usage
```
# evaluation
Usage: bash run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] [PLATFORM]
```
#### Launch
```
# evaluation example on Ascend
bash run_eval.sh ../data/test warpctc-30-97.ckpt Ascend
# evaluation example on GPU
bash run_eval.sh ../data/test warpctc-30-97.ckpt GPU
```
> The checkpoint file is produced during the training process.
#### Result
Evaluation results are stored in the example path, in a folder named "eval". There you can find results like the following in the log.
```
result: {'WarpCTCAccuracy': 0.9901472929936306}
```