# MindSpore-based Inference Service Deployment

<!-- TOC -->

- [MindSpore-based Inference Service Deployment](#mindspore-based-inference-service-deployment)
    - [Overview](#overview)
    - [Starting Serving](#starting-serving)
    - [Application Example](#application-example)
        - [Exporting Model](#exporting-model)
        - [Starting Serving Inference](#starting-serving-inference)
        - [Client Samples](#client-samples)
            - [Python Client Sample](#python-client-sample)
            - [C++ Client Sample](#cpp-client-sample)

<!-- /TOC -->

<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_en/advanced_use/serving.md" target="_blank"><img src="../_static/logo_source.png"></a>

## Overview

MindSpore Serving is a lightweight and high-performance service module that helps MindSpore developers efficiently deploy online inference services in the production environment. After completing model training using MindSpore, you can export the MindSpore model and use MindSpore Serving to create an inference service for the model. Currently, only Ascend 910 is supported.

## Starting Serving

After MindSpore is installed using `pip`, the Serving executable program is stored in `/{your python path}/lib/python3.7/site-packages/mindspore/ms_serving`.

Run the following command to start Serving:

```bash
ms_serving [--help] [--model_path <MODEL_PATH>] [--model_name <MODEL_NAME>]
           [--port <PORT>] [--device_id <DEVICE_ID>]
```

Parameters are described as follows:

|Parameter|Attribute|Function|Parameter Type|Default Value|Value Range|
|---|---|---|---|---|---|
|`--help`|Optional|Displays the help information about the startup command.|-|-|-|
|`--model_path=<MODEL_PATH>`|Mandatory|Path for storing the model to be loaded.|String|Null|-|
|`--model_name=<MODEL_NAME>`|Mandatory|Name of the model file to be loaded.|String|Null|-|
|`--port=<PORT>`|Optional|Specifies the external Serving port number.|Integer|5500|1–65535|
|`--device_id=<DEVICE_ID>`|Optional|Specifies the device ID to be used.|Integer|0|0–7|

> Before running the startup command, add the path `/{your python path}/lib:/{your python path}/lib/python3.7/site-packages/mindspore/lib` to the environment variable `LD_LIBRARY_PATH`.

## Application Example

The following uses a simple network as an example to describe how to use MindSpore Serving.

### Exporting Model

Use [add_model.py](https://gitee.com/mindspore/mindspore/blob/master/serving/example/export_model/add_model.py) to build a network with only the Add operator and export the MindSpore inference deployment model.

```bash
python add_model.py
```

Execute the script to generate the `tensor_add.mindir` file. The input of the model is two tensors with shape `[2,2]`, and the output is the sum of the two input tensors.

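For reference, the following is a minimal sketch of what such an export script can look like. It is an assumption based on the description above, not the verbatim contents of `add_model.py`; the network structure, input shapes, and file name follow this tutorial.

```python
import numpy as np
import mindspore.context as context
import mindspore.nn as nn
import mindspore.ops.operations as P
from mindspore import Tensor
from mindspore.train.serialization import export

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")

class Net(nn.Cell):
    """A network containing only the Add operator."""
    def __init__(self):
        super(Net, self).__init__()
        self.add = P.TensorAdd()

    def construct(self, x, y):
        return self.add(x, y)

if __name__ == "__main__":
    net = Net()
    # Dummy inputs fix the [2,2] float32 input shape and type of the exported graph.
    x = Tensor(np.ones([2, 2]).astype(np.float32))
    y = Tensor(np.ones([2, 2]).astype(np.float32))
    export(net, x, y, file_name='tensor_add.mindir', file_format='MINDIR')
```
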
### Starting Serving Inference

```bash
ms_serving --model_path={model directory} --model_name=tensor_add.mindir
```

If the server prints the `MS Serving Listening on 0.0.0.0:5500` log, the Serving has loaded the inference model.

### Client Samples

#### <span name="python-client-sample">Python Client Sample</span>

Obtain [ms_client.py](https://gitee.com/mindspore/mindspore/blob/master/serving/example/python_client/ms_client.py) and start the Python client.

```bash
python ms_client.py
```

If the following information is displayed, the Serving has correctly executed the inference of the Add network.

```
ms client received:
[[2. 2.]
 [2. 2.]]
```

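As a rough sketch of what the Python client does, the snippet below builds a `PredictRequest` over gRPC in the same way as the C++ sample later in this document. The module names `ms_service_pb2` and `ms_service_pb2_grpc` (stubs generated from the Serving proto file) and the exact message field names are assumptions; refer to `ms_client.py` for the authoritative version.

```python
import grpc
import numpy as np
import ms_service_pb2        # assumed name of the generated proto module
import ms_service_pb2_grpc   # assumed name of the generated gRPC stub module

def run():
    # Connect to the Serving port started above (default 5500).
    channel = grpc.insecure_channel('localhost:5500')
    stub = ms_service_pb2_grpc.MSServiceStub(channel)

    request = ms_service_pb2.PredictRequest()
    # The Add network takes two inputs, so fill in the same [2,2] tensor twice.
    for _ in range(2):
        tensor = request.data.add()
        tensor.tensor_shape.dims.extend([2, 2])
        tensor.tensor_type = ms_service_pb2.MS_FLOAT32
        tensor.data = np.ones([2, 2]).astype(np.float32).tobytes()

    reply = stub.Predict(request)
    print("ms client received:")
    print(np.frombuffer(reply.result[0].data, dtype=np.float32).reshape([2, 2]))

if __name__ == '__main__':
    run()
```
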
#### <span name="cpp-client-sample">C++ Client Sample</span>

1. Obtain an executable client sample program.

    Download the [MindSpore source code](https://gitee.com/mindspore/mindspore). You can use either of the following methods to compile and obtain the client sample program:

    + When MindSpore is compiled from the source code, the Serving C++ client sample program is generated. You can find the `ms_client` executable program in the `build/mindspore/serving/example/cpp_client` directory.
    + Independent compilation

        Preinstall [gRPC](https://grpc.io).

        Run the following command in the MindSpore source code path to compile a client sample program:

        ```bash
        cd mindspore/serving/example/cpp_client
        mkdir build && cd build
        cmake -D GRPC_PATH={grpc_install_dir} ..
        make
        ```

        In the preceding command, `{grpc_install_dir}` indicates the gRPC installation path. Replace it with the actual gRPC installation path.

2. Start the client.

    Execute `ms_client` to send an inference request to the Serving.

    ```bash
    ./ms_client --target=localhost:5500
    ```

    If the following information is displayed, the Serving has correctly executed the inference of the Add network.

    ```
    Compute [[1, 2], [3, 4]] + [[1, 2], [3, 4]]
    Add result is 2 4 6 8
    client received: RPC OK
    ```

The client code consists of the following parts:

1. Implement the client based on `MSService::Stub` and create a client instance.

    ```
    class MSClient {
     public:
      explicit MSClient(std::shared_ptr<Channel> channel) : stub_(MSService::NewStub(channel)) {}

     private:
      std::unique_ptr<MSService::Stub> stub_;
    };

    MSClient client(grpc::CreateChannel(target_str, grpc::InsecureChannelCredentials()));
    ```

2. Build the request input parameter `Request`, output parameter `Reply`, and gRPC client `Context` based on the actual network input.

    ```
    PredictRequest request;
    PredictReply reply;
    ClientContext context;

    // construct the input tensor
    Tensor data;

    // set the shape
    TensorShape shape;
    shape.add_dims(4);
    *data.mutable_tensor_shape() = shape;

    // set the data type
    data.set_tensor_type(ms_serving::MS_FLOAT32);

    // set the data; set_data takes the size in bytes, hence the sizeof(float) factor
    std::vector<float> input_data{1, 2, 3, 4};
    data.set_data(input_data.data(), input_data.size() * sizeof(float));

    // the Add network takes two inputs, so add the tensor to the request twice
    *request.add_data() = data;
    *request.add_data() = data;
    ```

3. Call the gRPC API to communicate with the Serving that has been started, and obtain the return value. The returned `Status` can be checked with `status.ok()` to tell whether the request succeeded.

    ```
    Status status = stub_->Predict(&context, request, &reply);
    ```

For details about the complete code, see [ms_client](https://gitee.com/mindspore/mindspore/blob/master/serving/example/cpp_client/ms_client.cc).