# MindSpore Lite On-Device Skeleton Detection Demo (Android)

This sample application demonstrates how to use the MindSpore Lite API and a MindSpore Lite skeleton detection model to perform on-device inference: it runs detection on the content captured by the device camera and displays continuous detection results in the app's image preview.
## Prerequisites

- Android Studio >= 3.2 (version 4.0 or later recommended)
## Building and Running

1. Load the sample source code into Android Studio.

![start_home](images/home.png)

After starting Android Studio, click `File->Settings->System Settings->Android SDK` and check the required `SDK Tools`. As shown below, after checking them, click `OK` and Android Studio installs the SDK automatically.

![start_sdk](images/sdk_management.jpg)

> Android SDK Tools is installed by default; it becomes visible once the `Hide Obsolete Packages` checkbox is cleared.
>
> If any problem occurs during use, see item 4 for solutions.

2. Connect an Android device and run the application.

Connect an Android phone via USB. Once the device is recognized successfully, click `Run 'app'` to run the sample project on your phone.

> During the build, Android Studio automatically downloads MindSpore Lite, the model file, and other dependencies; please wait patiently while this completes.
>
> For debugging on a device connected to Android Studio, see <https://developer.android.com/studio/run/device?hl=zh-cn>.
>
> "USB debugging" must be enabled on the phone before Android Studio can recognize it. On Huawei phones this is usually enabled under Settings -> System & updates -> Developer options -> USB debugging.

![run_app](images/run_app.PNG)

3. On the Android device, tap "Continue installation". Once installed, you can view the content captured by the device camera and the inference result.

![install](images/install.jpg)

The detection result is displayed in the preview, as shown below.

![result](images/app_result.jpg)
4. Troubleshooting the demo deployment.

4.1 Problems with NDK, CMake, JDK and other tools:

If a tool installed inside Android Studio is not recognized properly, you can re-download and install it from the corresponding official site and configure its path.

- NDK >= 21.3 [NDK](https://developer.android.google.cn/ndk/downloads?hl=zh-cn)
- CMake >= 3.10.2 [CMake](https://cmake.org/download)
- Android SDK >= 26 [SDK](https://developer.android.google.cn/studio)
- JDK >= 1.8 [JDK](https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html)

![project_structure](images/project_structure.png)

4.2 NDK version mismatch:

Open `Android SDK`, click `Show Package Details`, and install the NDK version that matches the error message.

![NDK_version](images/NDK_version.jpg)

4.3 Android Studio version problems:

Update Android Studio via `Toolbar -> Help -> Check for Updates`.

4.4 Slow dependency downloads under Gradle:

As shown below, open the `build.gradle` file in the demo root directory and add the Huawei mirror repository `maven {url 'https://developer.huawei.com/repo/'}`; change the classpath to 4.0.0 and click `sync` to synchronize. Once the download completes, restore the original classpath version and sync again.

![maven](images/maven.jpg)
## Sample Walkthrough

The skeleton detection Android sample uses the Android Camera2 API to capture image frames from the camera and perform the corresponding image processing; model inference is completed in the MindSpore Lite [Runtime](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/runtime.html).

### Project Structure
```text
├── app
│   ├── build.gradle          # Other Android configuration
│   ├── download.gradle       # Downloads the dependency libraries and model file from the Huawei server automatically at build time
│   ├── proguard-rules.pro
│   └── src
│       ├── main
│       │   ├── AndroidManifest.xml # Android configuration file
│       │   ├── java # Java-layer application code
│       │   │   └── com
│       │   │       └── mindspore
│       │   │           └── posenetdemo # Image processing and inference pipeline
│       │   │               ├── CameraDataDealListener.java
│       │   │               ├── ImageUtils.java
│       │   │               ├── MainActivity.java
│       │   │               ├── PoseNetFragment.java
│       │   │               ├── Posenet.java # Model loading and inference
│       │   │               └── TestActivity.java
│       │   └── res # Android resource files
│       └── test
└── ...
```
### Downloading and Deploying the Model File

The model file is downloaded from the MindSpore Model Hub. The skeleton detection model used in this sample is `posenet_model.ms`; it is likewise downloaded automatically at build time by the `download.gradle` script and placed in the `app/src/main/assets` directory.

> If the download fails, download the model file manually: posenet_model.ms [download link](https://download.mindspore.cn/model_zoo/official/lite/posenet_lite/posenet_model.ms).
### Writing the On-Device Inference Code

The skeleton detection demo performs on-device inference through the Java API. Compared with the C++ API, the Java API can be called directly from Java classes without implementing any JNI-layer code, which makes it more convenient.

- The sample achieves skeleton detection by recognizing body features such as the nose and eyes, locating those features, and computing confidence scores for the results.
```java
public enum BodyPart {
    NOSE,
    LEFT_EYE,
    RIGHT_EYE,
    LEFT_EAR,
    RIGHT_EAR,
    LEFT_SHOULDER,
    RIGHT_SHOULDER,
    LEFT_ELBOW,
    RIGHT_ELBOW,
    LEFT_WRIST,
    RIGHT_WRIST,
    LEFT_HIP,
    RIGHT_HIP,
    LEFT_KNEE,
    RIGHT_KNEE,
    LEFT_ANKLE,
    RIGHT_ANKLE
}

public class Position {
    int x;
    int y;
}

public class KeyPoint {
    BodyPart bodyPart = BodyPart.NOSE;
    Position position = new Position();
    float score = 0.0f;
}

public class Person {
    List<KeyPoint> keyPoints;
    float score = 0.0f;
}
```
The inference flow of the skeleton detection demo is as follows; for the complete code, see `src/main/java/com/mindspore/posenetdemo/Posenet.java`.

1. Load the MindSpore Lite model file, and build the context, session, and computational graph for inference.

- Load the model: read the MindSpore Lite model from the file system and parse it.
```java
// Load the .ms model.
model = new Model();
if (!model.loadModel(mContext, "posenet_model.ms")) {
    Log.e("MS_LITE", "Load Model failed");
    return false;
}
```
- Create the configuration context: create `MSConfig`, which holds the basic configuration parameters the session needs to guide graph compilation and execution.
```java
// Create and init config.
msConfig = new MSConfig();
if (!msConfig.init(DeviceType.DT_CPU, NUM_THREADS, CpuBindMode.MID_CPU)) {
    Log.e("MS_LITE", "Init context failed");
    return false;
}
```
- Create the session: create a `LiteSession` and call `init` to apply the `MSConfig` obtained in the previous step to the session.
```java
// Create the MindSpore lite session.
session = new LiteSession();
if (!session.init(msConfig)) {
    Log.e("MS_LITE", "Create session failed");
    msConfig.free();
    return false;
}
msConfig.free();
```
- Load the model file and build the computational graph for inference.
```java
// Compile graph.
if (!session.compileGraph(model)) {
    Log.e("MS_LITE", "Compile graph failed");
    model.freeBuffer();
    return false;
}

// Note: after model.freeBuffer() is called, the model cannot be compiled again.
model.freeBuffer();
```
2. Input data: the Java API currently supports `byte[]` and `ByteBuffer` for setting the data of the input tensor.

- Before feeding the data, the Bitmap holding the image needs to be parsed and converted.
```java
/**
 * Scale the image to a byteBuffer of [-1,1] values.
 */
private ByteBuffer initInputArray(Bitmap bitmap) {
    final int bytesPerChannel = 4;
    final int inputChannels = 3;
    final int batchSize = 1;
    ByteBuffer inputBuffer = ByteBuffer.allocateDirect(
            batchSize * bytesPerChannel * bitmap.getHeight() * bitmap.getWidth() * inputChannels
    );
    inputBuffer.order(ByteOrder.nativeOrder());
    inputBuffer.rewind();

    final float mean = 128.0f;
    final float std = 128.0f;
    int[] intValues = new int[bitmap.getWidth() * bitmap.getHeight()];
    bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());

    int pixel = 0;
    for (int y = 0; y < bitmap.getHeight(); y++) {
        for (int x = 0; x < bitmap.getWidth(); x++) {
            int value = intValues[pixel++];
            inputBuffer.putFloat(((float) (value >> 16 & 0xFF) - mean) / std);
            inputBuffer.putFloat(((float) (value >> 8 & 0xFF) - mean) / std);
            inputBuffer.putFloat(((float) (value & 0xFF) - mean) / std);
        }
    }
    return inputBuffer;
}
```
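The loop above unpacks each ARGB pixel into R, G, and B channels and normalizes them with mean = std = 128, mapping [0, 255] to roughly [-1, 1]. The per-pixel step can be sketched in isolation; the class and method names below are illustrative, not part of the demo:

```java
public class NormalizeDemo {
    // Unpack one ARGB int into R, G, B channels and normalize each to
    // roughly [-1, 1] with mean = std = 128, as in initInputArray above.
    static float[] normalizePixel(int argb) {
        final float mean = 128.0f;
        final float std = 128.0f;
        return new float[] {
                ((argb >> 16 & 0xFF) - mean) / std, // R
                ((argb >> 8 & 0xFF) - mean) / std,  // G
                ((argb & 0xFF) - mean) / std        // B
        };
    }

    public static void main(String[] args) {
        // A pure white pixel: every channel becomes (255 - 128) / 128.
        float[] white = normalizePixel(0xFFFFFFFF);
        System.out.println(white[0]); // prints 0.9921875
        // A pure black pixel: every channel becomes -1.0.
        float[] black = normalizePixel(0xFF000000);
        System.out.println(black[0]); // prints -1.0
    }
}
```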
- Feed the data through a `ByteBuffer`.
```java
long estimationStartTimeNanos = SystemClock.elapsedRealtimeNanos();
ByteBuffer inputArray = this.initInputArray(bitmap);

List<MSTensor> inputs = session.getInputs();
if (inputs.size() != 1) {
    return null;
}

Log.i("posenet", String.format("Scaling to [-1,1] took %.2f ms",
        1.0f * (SystemClock.elapsedRealtimeNanos() - estimationStartTimeNanos) / 1_000_000));

MSTensor inTensor = inputs.get(0);
inTensor.setData(inputArray);
long inferenceStartTimeNanos = SystemClock.elapsedRealtimeNanos();
```
3. Run inference on the input tensor according to the model, obtain the output tensors, and post-process them.

- Run inference with `runGraph`.
```java
// Run graph to infer results.
if (!session.runGraph()) {
    Log.e("MS_LITE", "Run graph failed");
    return null;
}

lastInferenceTimeNanos = SystemClock.elapsedRealtimeNanos() - inferenceStartTimeNanos;
Log.i(
        "posenet",
        String.format("Interpreter took %.2f ms", 1.0f * lastInferenceTimeNanos / 1_000_000)
);
```
- Obtain the inference result from the output tensors.
```java
// Get output tensor values.
List<MSTensor> heatmaps_list = session.getOutputsByNodeName("Conv2D-27");
if (heatmaps_list == null) {
    return null;
}
MSTensor heatmaps_tensors = heatmaps_list.get(0);

float[] heatmaps_results = heatmaps_tensors.getFloatData();
int[] heatmapsShape = heatmaps_tensors.getShape(); // 1, 9, 9, 17

float[][][][] heatmaps = new float[heatmapsShape[0]][][][];
for (int x = 0; x < heatmapsShape[0]; x++) {  // heatmapsShape[0] = 1
    float[][][] arrayThree = new float[heatmapsShape[1]][][];
    for (int y = 0; y < heatmapsShape[1]; y++) {  // heatmapsShape[1] = 9
        float[][] arrayTwo = new float[heatmapsShape[2]][];
        for (int z = 0; z < heatmapsShape[2]; z++) {  // heatmapsShape[2] = 9
            float[] arrayOne = new float[heatmapsShape[3]];  // heatmapsShape[3] = 17
            for (int i = 0; i < heatmapsShape[3]; i++) {
                int n = i + z * heatmapsShape[3] + y * heatmapsShape[2] * heatmapsShape[3] + x * heatmapsShape[1] * heatmapsShape[2] * heatmapsShape[3];
                arrayOne[i] = heatmaps_results[n];  // 1*9*9*17
            }
            arrayTwo[z] = arrayOne;
        }
        arrayThree[y] = arrayTwo;
    }
    heatmaps[x] = arrayThree;
}

List<MSTensor> offsets_list = session.getOutputsByNodeName("Conv2D-28");
if (offsets_list == null) {
    return null;
}
MSTensor offsets_tensors = offsets_list.get(0);
float[] offsets_results = offsets_tensors.getFloatData();
int[] offsetsShapes = offsets_tensors.getShape();

float[][][][] offsets = new float[offsetsShapes[0]][][][];
for (int x = 0; x < offsetsShapes[0]; x++) {
    float[][][] offsets_arrayThree = new float[offsetsShapes[1]][][];
    for (int y = 0; y < offsetsShapes[1]; y++) {
        float[][] offsets_arrayTwo = new float[offsetsShapes[2]][];
        for (int z = 0; z < offsetsShapes[2]; z++) {
            float[] offsets_arrayOne = new float[offsetsShapes[3]];
            for (int i = 0; i < offsetsShapes[3]; i++) {
                int n = i + z * offsetsShapes[3] + y * offsetsShapes[2] * offsetsShapes[3] + x * offsetsShapes[1] * offsetsShapes[2] * offsetsShapes[3];
                offsets_arrayOne[i] = offsets_results[n];
            }
            offsets_arrayTwo[z] = offsets_arrayOne;
        }
        offsets_arrayThree[y] = offsets_arrayTwo;
    }
    offsets[x] = offsets_arrayThree;
}
```
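The nested loops above recover a 4-D tensor from the flat `getFloatData()` buffer using the linear NHWC index `n = i + z*d3 + y*d2*d3 + x*d1*d2*d3`. The same index arithmetic can be sketched in isolation; the class name and the 1×2×2×3 tensor below are made up for the example:

```java
public class ReshapeDemo {
    // Convert a flat NHWC buffer into a 4-D array, using the same linear
    // index formula as the demo: n = i + z*d3 + y*d2*d3 + x*d1*d2*d3.
    static float[][][][] reshape(float[] flat, int d0, int d1, int d2, int d3) {
        float[][][][] out = new float[d0][d1][d2][d3];
        for (int x = 0; x < d0; x++) {
            for (int y = 0; y < d1; y++) {
                for (int z = 0; z < d2; z++) {
                    for (int i = 0; i < d3; i++) {
                        int n = i + z * d3 + y * d2 * d3 + x * d1 * d2 * d3;
                        out[x][y][z][i] = flat[n];
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A 1x2x2x3 tensor flattened in NHWC order: flat[n] = n.
        float[] flat = new float[12];
        for (int n = 0; n < flat.length; n++) flat[n] = n;
        float[][][][] t = reshape(flat, 1, 2, 2, 3);
        // Element [0][1][0][2] sits at n = 2 + 0*3 + 1*2*3 = 8.
        System.out.println(t[0][1][0][2]); // prints 8.0
    }
}
```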
- Process the output-node data to obtain the demo's return value `person`.

From `Conv2D-27`, `heatmaps` provides the `height`, `width`, and `numKeypoints` dimensions used to find each `keypointPosition`.
From `Conv2D-28`, `offsets` holds the coordinate offsets; combined with each `keypointPosition`, they yield the `confidenceScores` used to assess the inference result.
From `keypointPosition` and `confidenceScores`, `person.keyPoints` and `person.score` are filled in to produce the model's return value `person`.
```java
int height = ((Object[]) heatmaps[0]).length;     // 9
int width = ((Object[]) heatmaps[0][0]).length;   // 9
int numKeypoints = heatmaps[0][0][0].length;      // 17

// Finds the (row, col) locations of where the keypoints are most likely to be.
Pair[] keypointPositions = new Pair[numKeypoints];
for (int i = 0; i < numKeypoints; i++) {
    keypointPositions[i] = new Pair(0, 0);
}

for (int keypoint = 0; keypoint < numKeypoints; keypoint++) {
    float maxVal = heatmaps[0][0][0][keypoint];
    int maxRow = 0;
    int maxCol = 0;
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            if (heatmaps[0][row][col][keypoint] > maxVal) {
                maxVal = heatmaps[0][row][col][keypoint];
                maxRow = row;
                maxCol = col;
            }
        }
    }
    keypointPositions[keypoint] = new Pair(maxRow, maxCol);
}

// Calculating the x and y coordinates of the keypoints with offset adjustment.
int[] xCoords = new int[numKeypoints];
int[] yCoords = new int[numKeypoints];
float[] confidenceScores = new float[numKeypoints];
for (int i = 0; i < keypointPositions.length; i++) {
    Pair position = keypointPositions[i];
    int positionY = (int) position.first;
    int positionX = (int) position.second;
    yCoords[i] = (int) ((float) positionY / (float) (height - 1) * bitmap.getHeight() + offsets[0][positionY][positionX][i]);
    xCoords[i] = (int) ((float) positionX / (float) (width - 1) * bitmap.getWidth() + offsets[0][positionY][positionX][i + numKeypoints]);
    confidenceScores[i] = sigmoid(heatmaps[0][positionY][positionX][i]);
}

Person person = new Person();
KeyPoint[] keypointList = new KeyPoint[numKeypoints];
for (int i = 0; i < numKeypoints; i++) {
    keypointList[i] = new KeyPoint();
}

float totalScore = 0.0f;
for (int i = 0; i < keypointList.length; i++) {
    keypointList[i].position.x = xCoords[i];
    keypointList[i].position.y = yCoords[i];
    keypointList[i].score = confidenceScores[i];
    totalScore += confidenceScores[i];
}

person.keyPoints = Arrays.asList(keypointList);
person.score = totalScore / numKeypoints;
return person;
```
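The `confidenceScores` computation above calls a `sigmoid` helper that is defined elsewhere in `Posenet.java`. A minimal version consistent with that call is the standard logistic function; the sketch below is not copied from the demo:

```java
public class SigmoidDemo {
    // Standard logistic function: maps a raw heatmap logit to a (0, 1) score.
    static float sigmoid(float x) {
        return (float) (1.0 / (1.0 + Math.exp(-x)));
    }

    public static void main(String[] args) {
        System.out.println(sigmoid(0.0f)); // prints 0.5
        System.out.println(sigmoid(4.0f)); // ≈ 0.982
    }
}
```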