MindSpore Serving is a lightweight and high-performance service module that helps MindSpore developers efficiently deploy online inference services in the production environment. After completing model training on MindSpore, you can export the MindSpore model and use MindSpore Serving to create an inference service for the model.
MindSpore Serving architecture:
Currently, MindSpore Serving nodes include the client, master, and worker. On a client node, you can deliver inference service commands directly through the gRPC or RESTful API. A servable model service is deployed on a worker node; a servable is a single model or a combination of multiple models that provides different services through various methods. A master node manages all worker nodes and their model information, and also manages and distributes tasks. The master and worker nodes can be deployed in the same process or in different processes. Currently, the client and master nodes do not depend on specific hardware platforms. The worker node supports the GPU, Ascend 310, and Ascend 910 platforms; CPUs will be supported in the future.
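As an illustration of the client-side RESTful API described above, the sketch below builds the JSON body of an inference request. The servable name "add", the input tensor names, and the endpoint path in the comment are assumptions for illustration only, not details from this guide:

```python
import json

# Hypothetical servable name and input tensors, for illustration only.
servable = "add"
payload = {
    "instances": [
        {"x1": [[1.0, 1.0]], "x2": [[2.0, 2.0]]}
    ]
}

# The request body would be POSTed to the master's RESTful endpoint,
# e.g. http://127.0.0.1:1500/model/{servable}:predict (path assumed).
body = json.dumps(payload)
print(body)
```

The gRPC interface serves the same role but exchanges serialized protocol-buffer messages instead of JSON.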

MindSpore Serving provides functions such as batch processing, which splits and combines multiple instance requests to meet the batch size requirement of the model.

MindSpore Serving depends on the MindSpore training and inference framework. Therefore, install MindSpore first and then MindSpore Serving.
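The batch processing mentioned earlier can be sketched in plain Python: incoming instances are grouped into chunks no larger than the model's batch size. The helper below is a hypothetical illustration of the idea, not Serving's actual implementation:

```python
def batch_instances(instances, batch_size):
    # Split a list of inference instances into chunks that each
    # satisfy the model's maximum batch size.
    return [instances[i:i + batch_size]
            for i in range(0, len(instances), batch_size)]

# Seven requests served by a model with batch size 4 -> batches of 4 and 3.
batches = batch_instances(list(range(7)), 4)
print(batches)  # [[0, 1, 2, 3], [4, 5, 6]]
```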
Perform the following steps to install Serving:
If you use the pip command, download the .whl package from the MindSpore Serving page and install it.
pip install mindspore_serving-{version}-cp37-cp37m-linux_{arch}.whl
{version} denotes the version of MindSpore Serving. For example, if you are downloading MindSpore Serving 1.1.0, {version} should be 1.1.0. {arch} denotes the system architecture. For example, if the Linux system you are using is 64-bit x86 architecture, {arch} should be x86_64; if the system is 64-bit ARM architecture, it should be aarch64.
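As a quick sanity check, the expected wheel filename can be composed from these fields. The helper below is hypothetical and shown only to make the naming convention concrete:

```python
def serving_wheel_name(version, arch, py_tag="cp37-cp37m"):
    # Compose the expected .whl filename for MindSpore Serving,
    # e.g. version 1.1.0 on a 64-bit x86 Linux system.
    return f"mindspore_serving-{version}-{py_tag}-linux_{arch}.whl"

print(serving_wheel_name("1.1.0", "x86_64"))
# mindspore_serving-1.1.0-cp37-cp37m-linux_x86_64.whl
```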
Install Serving using the source code.
Download the source code and go to the serving directory.
Method 1: Specify the path of the installed or built MindSpore package on which Serving depends and install Serving.
sh build.sh -p $MINDSPORE_LIB_PATH
In the preceding information, build.sh is the build script file in the serving directory, and $MINDSPORE_LIB_PATH is the lib directory in the installation path of the MindSpore software package, for example, softwarepath/mindspore/lib. This path contains the library files on which MindSpore depends.
Method 2: Directly build Serving. The MindSpore package is built together with Serving. You need to configure the environment variables for MindSpore building.
# GPU
sh build.sh -e gpu
# Ascend 310
sh build.sh -e ascend -V 310
# Ascend 910
sh build.sh -e ascend -V 910
In the preceding information, build.sh is the build script file in the serving directory. After the build is complete, find the .whl installation package of MindSpore in the serving/third_party/mindspore/build/package/ directory and install it.
pip install mindspore_ascend-{version}-cp37-cp37m-linux_{arch}.whl
Find the .whl installation package of Serving in the serving/build/package/ directory and install it.
pip install mindspore_serving-{version}-cp37-cp37m-linux_{arch}.whl
Run the following commands to verify the installation. Import the Python module. If no error is reported, the installation is successful.
from mindspore_serving import master
from mindspore_serving import worker
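A scripted version of this check might look as follows. This is a hypothetical helper: it only reports whether the two modules can be located, returning False when Serving (or either module) is not installed:

```python
import importlib.util

def serving_installed():
    # Try to locate both Serving modules without fully importing them.
    # find_spec raises ModuleNotFoundError if the parent package is absent.
    try:
        return all(
            importlib.util.find_spec(name) is not None
            for name in ("mindspore_serving.master", "mindspore_serving.worker")
        )
    except ModuleNotFoundError:
        return False

print(serving_installed())
```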
To run MindSpore Serving, configure the required environment variables.
The tutorial MindSpore-based Inference Service Deployment demonstrates how to use MindSpore Serving.
For more details about the installation guide, tutorials, and APIs, see MindSpore Python API.
Contributions to MindSpore are welcome.