* [to #42322933] refine quick_start.md and pipeline.md

4 years ago · 31b203acfb
--- a/docs/source/quick_start.md
+++ b/docs/source/quick_start.md
@@ -1,72 +1,55 @@
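 # Quick Start

 ModelScope Library currently supports the TensorFlow and PyTorch deep learning frameworks for model training and inference. It has been tested on Python 3.7+, PyTorch 1.8+, TensorFlow 1.15+, and TensorFlow 2.6.
 Note: the current (630) release supports only Python 3.7 on Linux; support for other environments (macOS, Windows, etc.) is expected in the 730 release.
 ## Python environment setup
 First, follow the [documentation](https://docs.anaconda.com/anaconda/install/) to install and configure Anaconda.

 Once it is installed, run the following commands to create a Python environment for ModelScope Library.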
 ```shell
 conda create -n modelscope python=3.6
 conda create -n modelscope python=3.7
 conda activate modelscope
 ```
 Check that the `python` and `pip` commands now resolve to the conda environment.
 ## Install a deep learning framework
 * Install PyTorch: [reference link](https://pytorch.org/get-started/locally/)
 ```shell
 which python
 # ~/workspace/anaconda3/envs/modelscope/bin/python

 which pip
 # ~/workspace/anaconda3/envs/modelscope/bin/pip
 pip install torch torchvision
 ```
 Note: this project supports only `python3`; do not use a Python 2 environment.
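The version requirement can also be checked from inside the interpreter. The following is a minimal sketch, not part of ModelScope: the helper name `python_is_supported` is hypothetical, and the `(3, 7)` minimum is an assumption taken from the version note in this guide.

```python
import sys

# Assumed minimum, taken from the version note in this guide.
MINIMUM_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Shows which interpreter is active (it should live inside the conda env)
# and whether its version meets the assumed minimum.
print(sys.executable)
print(python_is_supported())
```

Printing `sys.executable` gives the same information as `which python`, but for the interpreter that is actually running the check.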

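 ## Installing third-party dependencies

 ModelScope Library currently supports the two major deep learning frameworks, TensorFlow and PyTorch, for model training and inference, and has been tested on Python 3.6+, PyTorch 1.8+, and TensorFlow 2.6. Install the framework required by the model you choose; the following links give installation instructions:

 * [PyTorch installation guide](https://pytorch.org/get-started/locally/)
 * [TensorFlow installation guide](https://www.tensorflow.org/install/pip)

 Some third-party dependencies require numpy to be installed beforehand.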
 ```
 pip install numpy
 * Install TensorFlow: [reference link](https://www.tensorflow.org/install/pip)
 ```shell
 pip install --upgrade tensorflow
 ```
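After installing a framework, it can be useful to confirm which ones the active environment can actually see. The sketch below uses only the standard library; the helper name `installed_frameworks` is hypothetical, and the module names `torch` and `tensorflow` are the usual import names for the two frameworks.

```python
import importlib.util


def installed_frameworks(candidates=('torch', 'tensorflow')):
    """Return the candidate top-level modules that can be located."""
    found = []
    for name in candidates:
        # find_spec returns None when a top-level module cannot be located;
        # it does not import the module, so the check is cheap.
        if importlib.util.find_spec(name) is not None:
            found.append(name)
    return found


print(installed_frameworks())
```

Because nothing is imported, this only shows that the package is discoverable, not that it loads correctly (for example, with a matching CUDA build).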

 ## Installing ModelScope Library

 Note: if you run into errors during installation, see the [FAQ](faq.md) for solutions.

 ### Install with pip
 Run the following commands:
 ```shell
 pip install -r http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/release/maas/modelscope.txt
 pip install model_scope[all] -f https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/release/maas/repo.html
 ```

 After a successful installation, run the following command to verify it:
 ```shell
 python -c "from modelscope.pipelines import pipeline;print(pipeline('image-matting',model='damo/image-matting-person')('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png'))"
 ```


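 ### Install from source

 This is suited to local development and debugging: changes to the source take effect immediately.
 Before downloading the source, contact 临在, 谦言, 颖达, or 一耘 to request access to the repository, then clone the code locally.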
 ```shell
 git clone git@gitlab.alibaba-inc.com:Ali-MaaS/MaaS-lib.git modelscope
 git fetch origin master
 git checkout master

 cd modelscope

 # Install dependencies
 ```
 Install the dependencies and set PYTHONPATH
 ```shell
 pip install -r requirements.txt

 # Set PYTHONPATH
 export PYTHONPATH=`pwd`
 ```

 ### Verifying the installation
 After a successful installation, run the following command to verify it:
 ```shell
 python -c "from modelscope.pipelines import pipeline;print(pipeline('image-matting',model='damo/image-matting-person')('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png'))"
 python -c "from modelscope.pipelines import pipeline;print(pipeline('word-segmentation')('今天天气不错，适合 出去游玩'))"
 {'output': '今天 天气 不错 ， 适合 出去 游玩'}
 ```
 ## Inference

 The `pipeline` function provides a concise inference interface; for an introduction and examples, see the [pipeline tutorial](tutorials/pipeline.md).

 ## Training

@@ -75,46 +58,3 @@ to be done
 ## Evaluation

 to be done

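 ## Inference

 The `pipeline` function provides a concise inference interface, as in the example that follows. For more on pipelines and further examples, see the [pipeline tutorial](tutorials/pipeline.md).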

 ```python
 import cv2
 import os.path as osp
 from modelscope.pipelines import pipeline
 from modelscope.utils.constant import Tasks

 # Create a pipeline from the task name
 img_matting = pipeline(Tasks.image_matting, model='damo/image-matting-person')

 # Pass the URL of an image file directly as the pipeline input
 result = img_matting(
    'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png'
 )
 cv2.imwrite('result.png', result['output_png'])
 print(f'Output written to {osp.abspath("result.png")}')

 ```

 The pipeline interface can also take a Dataset as input, so the code above can equally be written as

 ```python
 import cv2
 import os.path as osp
 from modelscope.pipelines import pipeline
 from modelscope.utils.constant import Tasks
 from modelscope.msdatasets import MsDataset

 # Build an MsDataset from image URLs; a local folder also works via input_location = '/dir/to/images'
 input_location = [
    'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png'
 ]
 dataset = MsDataset.load(input_location, target='image')
 img_matting = pipeline(Tasks.image_matting, model='damo/image-matting-person')
 # When the input is an MsDataset, the result is an iterator
 result = img_matting(dataset)
 cv2.imwrite('result.png', next(result)['output_png'])
 print(f'Output written to {osp.abspath("result.png")}')
 ```
--- a/docs/source/tutorials/pipeline.md
+++ b/docs/source/tutorials/pipeline.md
@@ -1,84 +1,62 @@
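 # Pipeline Tutorial

 This document briefly describes how to use the `pipeline` function to load a model for inference. The `pipeline` function can pull a model from the model hub by task type and model name
 and run inference with it. The currently supported tasks are

 * Portrait matting (image-matting)
 * BERT-based sentiment analysis (bert-sentiment-analysis)

 This document explains how to use the Pipeline module from the following angles:
 This document briefly describes how to use the `pipeline` function to load a model for inference. The `pipeline` function can pull a model from the model hub by task type and model name and run inference with it. It covers the following:
 * Running inference with the pipeline() function
 * Running inference with a specific preprocessor and model
 * Inference examples for different scenarios

 ## Environment setup
 For detailed steps, see [Quick Start](../quick_start.md)

 ## Basic pipeline usage
 The Chinese word segmentation task is used below to illustrate the basic usage of the pipeline function.

 1. The pipeline function accepts a task name, loads the default model for that task, and creates the corresponding Pipeline object
 1. The pipeline function accepts a task name, loads the default model for that task, and creates the corresponding pipeline object
   Run the following Python code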
   ```python
   >>> from modelscope.pipelines import pipeline
   >>> img_matting = pipeline(task='image-matting', model='damo/image-matting-person')
   from modelscope.pipelines import pipeline
   word_segmentation = pipeline('word-segmentation')
   ```

 2. Pass in a single image URL for processing
 2. Input text
   ``` python
   >>> import cv2
   >>> result = img_matting('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png')
   >>> cv2.imwrite('result.png', result['output_png'])
   >>> import os.path as osp
   >>> print(f'result file path is {osp.abspath("result.png")}')
   input = '今天天气不错，适合出去游玩'
   print(word_segmentation(input))
   {'output': '今天 天气 不错 ， 适合 出去 游玩'}
   ```

   A pipeline object also accepts a list as input and returns a list of outputs, each element being the result for the corresponding input sample
   ```python
   >>> results = img_matting(
       [
           'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png',
           'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png',
           'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png',
       ])
   ```
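 3. Input multiple samples

 A pipeline object also accepts a list of samples as input and returns a list of outputs, each element being the result for the corresponding input sample

   If a pipeline has post-processing parameters, they can also be passed in at call time.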
   ```python
   >>> pipe = pipeline(task_name)
   >>> result = pipe(input, post_process_args)
   inputs =  ['今天天气不错，适合出去游玩','这本书很好，建议你看看']
   print(word_segmentation(inputs))
   [{'output': '今天 天气 不错 ， 适合 出去 游玩'}, {'output': '这 本 书 很 好 ， 建议 你 看看'}]
   ```
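The word-segmentation results shown above are dictionaries whose `'output'` value is a space-separated string. A minimal sketch of turning such a result into a token list follows; the helper name `tokens_from_result` is hypothetical, and the result shape is assumed from the sample output in this tutorial rather than from any documented contract.

```python
def tokens_from_result(result):
    """Split the space-separated 'output' string of one result into tokens."""
    return result['output'].split()


# Sample results copied from the outputs shown in this tutorial.
single = {'output': '今天 天气 不错 ， 适合 出去 游玩'}
batch = [single, {'output': '这 本 书 很 好 ， 建议 你 看看'}]

print(tokens_from_result(single))
print([tokens_from_result(r) for r in batch])
```

For list input, the same helper is applied to each element, mirroring the way the pipeline returns one result per input sample.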

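 ## Inference with a specified preprocessor and model
 The pipeline function accepts instantiated preprocessor and model objects, which lets users customize preprocessing and the model during inference.
 Text sentiment classification is used as the example below.

 Because the demo model is provided by EasyNLP, first install EasyNLP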
 ```shell
 pip install https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/package/whl/easynlp-0.0.4-py2.py3-none-any.whl
 ```


 Download the model files
 ```shell
 wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/easynlp_modelzoo/alibaba-pai/bert-base-sst2.zip && unzip bert-base-sst2.zip
 ```

 Create the tokenizer and model
 1. First, create the preprocessor and model
 ```python
 >>> from modelscope.models import Model
 >>> from modelscope.preprocessors import SequenceClassificationPreprocessor
 >>> model = Model.from_pretrained('damo/bert-base-sst2')
 >>> tokenizer = SequenceClassificationPreprocessor(
            model.model_dir, first_sequence='sentence', second_sequence=None)
 from modelscope.models import Model
 from modelscope.preprocessors import TokenClassifcationPreprocessor
 model = Model.from_pretrained('damo/nlp_structbert_word-segmentation_chinese-base')
 tokenizer = TokenClassifcationPreprocessor(model.model_dir)
 ```

 Create a pipeline from the tokenizer and model objects
 2. Create a pipeline from the tokenizer and model objects
 ```python
 >>> from modelscope.pipelines import pipeline
 >>> semantic_cls = pipeline('text-classification', model=model,   preprocessor=tokenizer)
 >>> semantic_cls("Hello world!")
 from modelscope.pipelines import pipeline
 word_seg = pipeline('word-segmentation', model=model, preprocessor=tokenizer)
 input = '今天天气不错，适合出去游玩'
 print(word_seg(input))
 {'output': '今天 天气 不错 ， 适合 出去 游玩'}
 ```
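To make the idea of passing in a preprocessor and a model more concrete, here is a toy sketch of the composition pattern. It is not the ModelScope implementation: `ToyPipeline`, `strip_preprocessor`, and `char_model` are hypothetical stand-ins that only illustrate how a preprocessing step and a model can be chained behind one callable that accepts either a single sample or a list.

```python
class ToyPipeline:
    """Hypothetical stand-in for illustration; not the ModelScope Pipeline class."""

    def __init__(self, preprocessor, model):
        self.preprocessor = preprocessor
        self.model = model

    def __call__(self, inputs):
        # A list of samples yields a list of results, one per sample.
        if isinstance(inputs, list):
            return [self._run_one(sample) for sample in inputs]
        return self._run_one(inputs)

    def _run_one(self, sample):
        # Preprocess first, then hand the result to the model.
        return {'output': self.model(self.preprocessor(sample))}


def strip_preprocessor(text):
    """Toy preprocessor: remove surrounding whitespace."""
    return text.strip()


def char_model(text):
    """Toy 'model': separate every character with a space."""
    return ' '.join(text)


toy = ToyPipeline(strip_preprocessor, char_model)
print(toy(' 你好 '))
print(toy(['ab', 'c']))
```

Swapping in a different preprocessor or model changes the behavior without touching the calling code, which is the customization point that the `model=` and `preprocessor=` arguments expose.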

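 ## Inference examples for different scenarios

 For portrait matting and semantic classification, see the two examples above. Other examples will be added later.
 An image task, portrait matting ('image-matting'), is used below to further illustrate pipeline usage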
 ```python
 import cv2
 import os.path as osp
 from modelscope.pipelines import pipeline
 img_matting = pipeline('image-matting')
 result = img_matting('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png')
 cv2.imwrite('result.png', result['output_png'])
 ```