# Content

<!-- TOC -->

- [Overview](#overview)
- [Transfer Learning Scheme](#transfer-learning-scheme)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Detailed Description](#script-detailed-description)

<!-- /TOC -->

# Overview

This folder holds example code for on-device transfer learning. Part of the code runs on a server using MindSpore infrastructure, another part uses the MindSpore Lite conversion utility, and the last part is the actual training of the model on an Android-based device.
# Transfer Learning Scheme

In this example of transfer learning we use a fixed backbone, which holds a pretrained [efficientNet](https://arxiv.org/abs/1905.11946), and an adjustable head, which is a simple Dense layer. During training only the head updates its weights. Such a training scheme is very efficient in computation power and is therefore well suited to training-on-device scenarios.
# Dataset

In this example we use a subset of the [Places dataset](http://places2.csail.mit.edu/) of scenes.
The whole dataset is composed of high-resolution as well as small images and sums up to more than 100 GB.
For this demo we will use only the [validation data of small images](http://places2.csail.mit.edu/download.html), which is approximately 500 MB.

- Dataset size: 501 MB, 36,500 224x224 images in 365 classes
- Dataset format: jpg files
- Note: In the current release, data is loaded using a custom DataSet class (provided in dataset.cc). In upcoming releases loading will be done using the MindSpore MindData infrastructure. To fit the data to the model, it is preprocessed using the [ImageMagick convert tool](https://imagemagick.org/), namely cropping and converting to bmp format.
- Note: Only 10 classes out of the 365 will be used in this demo
- Note: 60% of the data will be used for training, 20% for testing, and the remaining 20% for validation
- The original dataset directory structure is as follows:
```text
places
├── val_256
│   ├── Places365_val_00000001.jpg
│   ├── Places365_val_00000002.jpg
│   ├── ...
│   ├── Places365_val_00036499.jpg
│   └── Places365_val_00036500.jpg
```
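The ImageMagick preprocessing mentioned above can be sketched for a single image as follows. This is a hypothetical illustration, not the contents of `prepare_dataset.sh`: the source and destination paths are made up, and the snippet skips itself when ImageMagick or the image is not available.

```shell
# Hypothetical preprocessing of one image: center-crop/resize the
# 256x256 JPEG to 224x224 and convert it to bmp format.
# Paths are illustrative; prepare_dataset.sh organizes the real folders.
src="places/val_256/Places365_val_00000001.jpg"
dst="dataset/0/0.bmp"
if command -v convert >/dev/null 2>&1 && [ -f "$src" ]; then
  mkdir -p "$(dirname "$dst")"
  convert "$src" -resize 224x224^ -gravity center -extent 224x224 "$dst"
  status="converted"
else
  status="skipped"
fi
```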
# Environment Requirements

- Server side
  - [MindSpore Framework](https://www.mindspore.cn/install/en) - it is recommended to install a docker image
  - MindSpore ToD Framework
    - [Downloads](https://www.mindspore.cn/tutorial/lite/en/master/use/downloads.html)
    - [Build](https://www.mindspore.cn/tutorial/lite/en/master/use/build.html)
  - [Android NDK r20b](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip)
  - [Android SDK](https://developer.android.com/studio?hl=zh-cn#cmdline-tools)
  - [ImageMagick convert tool](https://imagemagick.org/)
- A connected Android device

# Quick Start
After installing all of the above, run the script in the home directory with the following arguments:

```bash
sh ./prepare_and_run.sh -D DATASET_PATH [-d MINDSPORE_DOCKER] [-r RELEASE.tar.gz] [-t arm64|x86]
```

where:

- DATASET_PATH is the path to the [dataset](#dataset)
- MINDSPORE_DOCKER is the image name of the docker that runs [MindSpore](#environment-requirements). If not provided, MindSpore will be run locally
- RELEASE.tar.gz is a pointer to the MindSpore ToD release tar ball. If not provided, the script will attempt to find the MindSpore ToD compilation output
- target defaults to arm64, i.e., on-device. If x86 is provided, the demo will be run locally. Note that the infrastructure is not optimized for running on x86. Also note that the user needs to call "make clean" when switching between targets
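A typical invocation might look like the following. The dataset path and docker image name are illustrative, and the existence check makes the snippet a no-op when it is not run from the example's home directory.

```shell
# Example: train on-device, running the server-side steps inside a
# docker image (dataset path and image name are hypothetical)
if [ -f ./prepare_and_run.sh ]; then
  sh ./prepare_and_run.sh -D ~/datasets/places -d mindspore_dev:latest -t arm64
  launched="yes"
else
  launched="no"
fi
```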
# Script Detailed Description

The provided `prepare_and_run.sh` script performs the following:

- Prepares the trainable transfer learning model in `.ms` format
- Prepares the folder that should be pushed into the device
- Copies this folder onto the device and runs the scripts on the device

See how to run the script and the parameter definitions in the [Quick Start Section](#quick-start).
## Preparing the model

Within the model folder, a `prepare_model.sh` script uses the MindSpore infrastructure to export the model into a `.mindir` file. The user can specify a docker image on which MindSpore is installed; otherwise, the python script will be run locally. As explained above, the backbone of the network is pretrained, and a `.ckpt` file should be loaded into the backbone network. The first time the script is run, it attempts to download the `.ckpt` file using the `wget` command.
The script then converts the `.mindir` to `.ms` format using the MindSpore ToD converter.
The script accepts a tar ball in which the converter resides; otherwise, it will attempt to find the converter in the MindSpore ToD build output directory.
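The conversion step can be sketched as follows. This is an assumption-laden illustration, not the actual contents of `prepare_model.sh`: the converter location and file names are made up, and the snippet skips itself when the converter or the `.mindir` file is missing.

```shell
# Hypothetical sketch of the conversion step: converter_lite turns the
# exported .mindir into a trainable .ms file. Paths and file names are
# illustrative; prepare_model.sh locates the real ones.
CONVERTER=./converter_lite
if [ -x "$CONVERTER" ] && [ -f transfer_learning.mindir ]; then
  "$CONVERTER" --fmk=MINDIR --trainModel=true \
    --modelFile=transfer_learning.mindir \
    --outputFile=transfer_learning_tod   # writes transfer_learning_tod.ms
  converted="yes"
else
  converted="no"
fi
```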
## Preparing the Folder

The `transfer_learning_tod.ms` model file is then copied into the `package` folder, as well as the scripts, the MindSpore ToD library, and a subset of the Places dataset. This dataset undergoes preprocessing on the server prior to packaging.
Finally, the code (in src) is compiled for the target and the binary is copied into the `package` folder.

### Running the code on the device

To run the code on the device, the script first uses the `adb` tool to push the `package` folder onto the device.
It first runs evaluation on the untrained model (to check its initial accuracy).
It then runs training (which takes some time) and finally runs evaluation of the trained model using the test data.
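The push-and-run sequence described above can be sketched roughly as follows. The on-device path is an assumption, and the guard skips everything when `adb` or a connected device is not available.

```shell
# Hypothetical on-device run sequence mirroring the description above.
# The device path is illustrative; prepare_and_run.sh does the real work.
PKG=package-arm64
DEV=/data/local/tmp/transfer_learning
if command -v adb >/dev/null 2>&1 && adb get-state >/dev/null 2>&1; then
  adb push "$PKG" "$DEV"
  adb shell "cd $DEV && sh eval_untrained.sh && sh train.sh && sh eval.sh"
  pushed="yes"
else
  pushed="no"
fi
```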
# Folder Directory tree

```text
transfer_learning/
├── Makefile # Makefile of src code
├── model
│   ├── effnet.py # Python implementation of efficientNet
│   ├── transfer_learning_export.py # Python script that exports the transfer learning model to .mindir
│   ├── prepare_model.sh # script that exports the model (using docker) then converts it
│   └── train_utils.py # utility functions used during the export
├── prepare_and_run.sh # main script that creates the model, compiles it and sends it to the device for running
├── prepare_dataset.sh # prepares the Places dataset (crop/convert/organize folders)
├── README.md # this manual
├── scripts
│   ├── eval.sh # script that loads the trained model and evaluates its accuracy
│   ├── eval_untrained.sh # script that loads the untrained model and evaluates its accuracy
│   ├── places365_val.txt # association of images to classes within the Places365 dataset
│   └── train.sh # script that loads the initial model and trains it
├── src
│   ├── dataset.cc # dataset handler
│   ├── dataset.h # dataset class header
│   ├── net_runner.cc # program that runs training/evaluation of models
│   └── net_runner.h # net_runner header
```
When the `prepare_and_run.sh` script is run, the following folder is prepared. It is pushed to the device and then training runs.

```text
package-arm64/
├── bin
│   └── net_runner # the executable that performs the training/evaluation
├── dataset
│   ├── 0 # folder containing images 0-99 belonging to the 0'th class
│   │   ├── 0.bmp
│   │   ├── 1.bmp
│   │   ├── ...
│   │   ├── 98.bmp
│   │   └── 99.bmp
│   ├── ... # folders containing images 0-99 belonging to the 1'st-8'th classes
│   ├── 9 # folder containing images 0-99 belonging to the 9'th class
│   │   ├── 0.bmp
│   │   ├── 1.bmp
│   │   ├── ...
│   │   ├── 98.bmp
│   │   └── 99.bmp
├── lib
│   └── libmindspore-lite.so # MindSpore Lite library
├── model
│   └── transfer_learning_tod.ms # model to train
├── eval.sh # script that loads the trained model and evaluates its accuracy
├── eval_untrained.sh # script that loads the untrained model and evaluates its accuracy
└── train.sh # script that loads the initial model and trains it
```
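The 60/20/20 split described in the [Dataset](#dataset) section works out as follows, assuming 100 images per class as in the package tree above. This is just the arithmetic; the actual split is performed by `prepare_dataset.sh`.

```shell
# Sketch of the 60/20/20 train/test/validation split arithmetic,
# assuming 100 images per class (a hypothetical round number).
images_per_class=100
train=$(( images_per_class * 60 / 100 ))
test_count=$(( images_per_class * 20 / 100 ))
val=$(( images_per_class - train - test_count ))
echo "train=$train test=$test_count val=$val"   # prints: train=60 test=20 val=20
```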