!52 add ms lite

Merge pull request !52 from lz/master
5 years ago · b72e95bb3d
--- a/.sigs.md.swp
+++ b/.sigs.md.swp
--- a/design/meps/MEP-MSLITE.md
+++ b/design/meps/MEP-MSLITE.md
@@ -0,0 +1,145 @@

 | title   | authors                          | owning-sig | participating-sigs | status      | creation-date | reviewers | approvers | stage | milestone     |
 | ------- | -------------------------------- | ---------- | ------------------ | ----------- | ------------- | --------- | --------- | ----- | ------------- |
 | MEP-mslite | @zhengli  @zhiqiangzhai @chaijun | mslite |  | provisional | 2020-08-18    |  | TBD       | beta  | beta : "v0.7" |

 # MEP-mslite: MindSpore Lite

 ## Table of Contents

 <!-- toc -->

 - [MEP-mslite: MindSpore Lite](#mep-mslite-mindspore-lite)
  - [Table of Contents](#table-of-contents)
  - [Summary](#summary)
  - [Motivation](#motivation)
    - [Goals](#goals)
    - [Non-Goals](#non-goals)
  - [Proposal](#proposal)
    - [User Stories](#user-stories)
      - [Generate a compact target model and low latency and low consumption runtime](#generate-a-compact-target-model-and-low-latency-and-low-consumption-runtime)
  - [Design Details](#design-details)
    - [Test Plan](#test-plan)
  - [Implementation History](#implementation-history)
  - [Drawbacks](#drawbacks)
  - [Alternatives](#alternatives)
  - [References](#references)
  
  <!-- /toc -->
 ## Summary
 MindSpore (MS) lite is an extremely lightweight deep learning inference framework
 designed for smartphones and embedded devices, such as watches, headsets, and various IoT devices.
 It supports Android, iOS, and HarmonyOS, and has industry-leading performance.
                                
 ## Motivation
 With growing computing power and sensor data, intelligence is moving towards edge devices.
 Improved AI algorithms are driving the trend of running machine learning on
 the end device, such as smartphones or automobiles, rather than in the cloud.
 On-device AI can dramatically reduce latency, conserve bandwidth,
 improve privacy, and enable smarter applications.

 ### Goals
 - Compatibility: supports MindSpore models as well as mainstream third-party models, such as TensorFlow Lite, Caffe 1.0, and ONNX.
 - High performance: generates small, fast, low-power target models for various hardware backends.
 - Versatility: supports HarmonyOS, Android, and iOS.
 - Light weight: the shared library should be smaller than 1 MB so that it can be easily deployed on
 resource-limited devices.

 ### Non-Goals
 - None

 ## Proposal

 MS lite consists of a converter and a runtime library.
 The converter is an offline tool that handles most of the model translation work.
 The runtime library is deployed to the device and executes online.
 It has two modes, Lite RT and Lite Micro.
 Lite RT is for moderately resource-limited devices, such as smartphones,
 while Lite Micro is for extremely resource-limited devices, such as watches and headsets.

 - Compatibility

    Provides a rich set of operator parsers for MindSpore, TensorFlow Lite, Caffe, and ONNX,
    and supports common neural networks in CV and NLP, with 208+ CPU operators and 60+ GPU operators.
   
 - High performance

    Many optimization methods, including graph optimizations and post-training quantization,
    are applied to the model in the offline converter, so the generated target model is more compact.
    Graph optimizations, such as operator fusion and constant folding, make the model more compact.
    Post-training quantization converts an fp32 model into a fixed-point int8 model.
    This yields a nearly 4x smaller model, with lower latency and lower power consumption during inference.

    MS lite also applies a variety of optimization schemes to NN operations, including the Winograd
    algorithm in convolution and deconvolution and the Strassen algorithm in matrix multiplication.
    Operations support fp64, fp32, fp16, and int8, and are highly optimized with
    NEON instructions, hand-written assembly, multi-threading, memory reuse, heterogeneous computing, etc.

 - Versatility   

    Supports HarmonyOS, iOS, and Android, on smartphones, watches, headsets, and various IoT devices.
   
 - Light weight

    MS lite is highly optimized under graph high level optimization (GHLO) and graph low level
    optimization (GLLO), and has a small footprint:
    the MS lite runtime is about 800 KB, and MS Micro is less than 200 KB.
    It is flexible and can easily be deployed to mobile and a variety of embedded devices.
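
The size reduction from post training quantization mentioned above can be illustrated with a small sketch. It assumes symmetric per-tensor quantization, which is only one possible scheme and not necessarily the one the MS lite converter uses. The names `quantize_int8` and `dequantize` are hypothetical and not part of any MS lite API.

```python
import struct


def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme, for illustration only).

    Maps floats to integers in [-127, 127] using a single scale factor.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from int8 values and the scale."""
    return [q * scale for q in quantized]


# Hypothetical fp32 weights of a tiny layer.
weights = [0.5, -1.27, 0.03, 1.0, -0.75, 0.0]
quantized, scale = quantize_int8(weights)

# Serialized size: 4 bytes per fp32 value versus 1 byte per int8 value.
fp32_bytes = struct.pack(f"<{len(weights)}f", *weights)
int8_bytes = struct.pack(f"<{len(quantized)}b", *quantized)

# Rounding to the nearest integer bounds the error by half a quantization step.
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(len(fp32_bytes), len(int8_bytes), max_error <= scale / 2 + 1e-9)
```

The payload shrinks by exactly 4x here. A real model file also stores scales and graph metadata, which is consistent with the "nearly 4x" figure quoted above.
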
 ### User Stories

 #### Generate a compact target model and low latency and low consumption runtime

 Because devices have limited resources, with little ROM, RAM, and power, deploying an AI model to a
 device is very challenging. MS lite aims to solve this challenge by providing a user-friendly,
 flexible tool that helps users make their own models slimmer and more efficient.
 
 ## Design Details

 MS lite consists of a converter and a runtime.
 The converter is an offline tool with three parts: frontend, IR, and backend.
 The runtime is deployed to the device and executes online.

 - **Frontend.** The frontend parses models from MindSpore, TensorFlow Lite, Caffe, and ONNX in protobuf.
 - **IR.** The IR defines ANF, including tensors, operations, and graphs.
 - **Backend.** The backend is an optimizer based on the ANF graph, including GHLO, GLLO, and quantization.
               GHLO is short for "graph high level optimization" and includes common optimization methods
               such as operator fusion, operator substitution, and constant folding.
               GLLO is short for "graph low level optimization" and covers hardware-related
               optimization methods such as layout adjustment and mixed precision.
                
 - **Runtime.** The runtime has two modes, Lite RT and Lite Micro.
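
Two of the GHLO methods named above, constant folding and operator fusion, can be sketched on a toy expression graph. This is a minimal illustration under stated assumptions, not the MS lite backend: the tuple-based graph encoding, the op names (`add`, `mul`, `relu`, `add_relu`), and the functions `fold_constants` and `fuse_add_relu` are all hypothetical.

```python
import operator

# Toy operator table; the op names are illustrative only.
OPS = {
    "add": operator.add,
    "mul": operator.mul,
    "relu": lambda v: max(v, 0),
}


def is_const(node):
    """Numbers are constants; strings are runtime inputs; tuples are operations."""
    return isinstance(node, (int, float))


def fold_constants(node):
    """Replace every operation whose inputs are all constants by its value."""
    if not isinstance(node, tuple):
        return node
    op, *args = node
    args = [fold_constants(a) for a in args]
    if all(is_const(a) for a in args):
        return OPS[op](*args)
    return (op, *args)


def fuse_add_relu(node):
    """Fuse relu(add(a, b)) into a single hypothetical add_relu(a, b) operator."""
    if not isinstance(node, tuple):
        return node
    op, *args = node
    args = [fuse_add_relu(a) for a in args]
    if op == "relu" and isinstance(args[0], tuple) and args[0][0] == "add":
        return ("add_relu", *args[0][1:])
    return (op, *args)


# relu(x * (2 + 3) + 4 * 0.5), where "x" is only known at run time.
graph = ("relu", ("add", ("mul", "x", ("add", 2, 3)), ("mul", 4, 0.5)))

folded = fold_constants(graph)
fused = fuse_add_relu(folded)
print(folded)
print(fused)
```

After both passes the graph has two operators instead of five, which is the sense in which such optimizations make the target model more compact.
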
  
  <img src="./ms-lite-arch.jpg" style="zoom:80%" div align=center/>


 ### Test Plan

 MS lite uses pytest and nosetests to run the tests,
 and there are two types of testing strategies in MS lite:

 - **Unit test.** Every operation, optimization, or pass in MS has its own unit test.

 - **System test.** The MS lite module has its own component testing.
 We classify the testing into compilation verification,
 function verification, and performance testing.

 ## Implementation History
 - Support high and low level graph optimization.
 - Support post training quantization.
 - Support Arm CPU and Mali GPU. 
 - Support fp64, fp32, fp16, int8 operations.

 ## Drawbacks
 - MS lite does not support on-device training yet, but it is coming soon.

 ## Alternatives
 - MNN[1], TF Lite[2], and TNN[3] are outstanding on-device AI frameworks.
 MS lite is for on-device AI and MS cloud is for on-cloud AI;
 both are within the scope of Huawei's MindSpore AI framework.
 They share the same IR and optimization passes. MS lite is more flexible.

 ## References
 - [1] https://github.com/alibaba/MNN 
 - [2] https://www.tensorflow.org/lite
 - [3] https://github.com/Tencent/TNN 
--- a/design/meps/ms-lite-arch.jpg
+++ b/design/meps/ms-lite-arch.jpg
--- a/sigs.md
+++ b/sigs.md
@@ -25,3 +25,4 @@ in the mailing list. SIG artifacts can be found in the [sigs repository](sigs).
 | Visualization | This SIG is responsible for the development of MindSpore visualization tools. |
 | Security | This SIG is responsible for the development of MindSpore security related tools. |
 | AKG | This SIG is responsible for the development of MindSpore auto kernel generator. |
 | MSLITE | This SIG is responsible for the development of MindSpore lite. |
--- a/sigs/mslite/README.md
+++ b/sigs/mslite/README.md
@@ -0,0 +1,24 @@
 # MindSpore Lite Special Interest Group (SIG)

 This is the working repo for the mslite Special Interest Group (SIG). This repo contains all the artifacts, materials, meeting notes, and proposals regarding the **MS Lite Converter** and the **MS Lite Runtime**. Feedback and contributions are welcome.
 1. **Converter**: an offline tool with three parts (frontend, IR, and backend) that generates a compact model by applying graph optimizations and post-training quantization.
 2. **Runtime**: deployed to the device and executes online; it has two modes, Lite RT and Lite Micro.

 # SIG Leads

 * Zheng Li (Huawei)

 # Logistics

 * SIG leads will drive the meeting.
 * Meeting announcements will be posted on our gitee channel: https://gitee.com/mindspore/community/tree/master/sigs/mslite
 * Feedback and topic requests are welcome from all.

 # Discussion

 * Slack channel: https://app.slack.com/client/TUKCY4QDR/CUZ3FESNS?cdn_fallback=2
 * Documents and artifacts: https://gitee.com/mindspore/community/tree/master/sigs/mslite

 # Meeting notes


--- a/sigs/mslite/docs/design-template.md
+++ b/sigs/mslite/docs/design-template.md
--- a/sigs/mslite/meetings/meeting-template.md
+++ b/sigs/mslite/meetings/meeting-template.md
@@ -0,0 +1,14 @@
 # Thursday Aug 20, 2020 at 21:30 GMT+8

 ## Agenda

 ## Conference links

 ## Attendees 
 * Tom (Huawei)

 ## Notes
 * TODO

 ## Action items
 * TODO