1. remove pai-easynlp temporarily due to its hard dependency on scipy==1.5.4
2. fix sentiment classification output
3. update quickstart and trainer doc

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9646399master
| @@ -1,7 +1,7 @@ | |||||
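| # Quick Start | # Quick Start | ||||
| ModelScope Library currently supports the TensorFlow and PyTorch deep learning frameworks for model training and inference, and has been tested on Python 3.7+, PyTorch 1.8+, TensorFlow 1.13-1.15, and TensorFlow 2.x. | |||||
| ModelScope Library currently supports the TensorFlow and PyTorch deep learning frameworks for model training and inference, and has been tested on Python 3.7+, PyTorch 1.8+, TensorFlow 1.15, and TensorFlow 2.x. | |||||
| Note: in the current (630) release, `speech-related` features are supported only in a `linux` environment with python3.7 and tensorflow1.13-1.15. Other features can be installed and used on Windows and Mac. | |||||
| Note: `speech-related` features are supported only in a `linux` environment with python3.7 and tensorflow1.15. Other features can be installed and used on Windows and Mac. | |||||
| ## Python environment setup | ## Python environment setup | ||||
| First, follow the [documentation](https://docs.anaconda.com/anaconda/install/) to install and configure an Anaconda environment | First, follow the [documentation](https://docs.anaconda.com/anaconda/install/) to install and configure an Anaconda environment | ||||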
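| @@ -8,22 +8,10 @@ Modelscope provides many pretrained models; you can use any one of them, | |||||
| Before starting finetuning, you need to prepare a dataset for training and evaluation; see the dataset usage tutorial for details. | Before starting finetuning, you need to prepare a dataset for training and evaluation; see the dataset usage tutorial for details. | ||||
| `Temporary approach`: we create a fake dataset through the dataset interface | |||||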
| ```python | ```python | ||||
| from datasets import Dataset | from datasets import Dataset | ||||
| dataset_dict = { | |||||
| 'sentence1': [ | |||||
| 'This is test sentence1-1', 'This is test sentence2-1', | |||||
| 'This is test sentence3-1' | |||||
| ], | |||||
| 'sentence2': [ | |||||
| 'This is test sentence1-2', 'This is test sentence2-2', | |||||
| 'This is test sentence3-2' | |||||
| ], | |||||
| 'label': [0, 1, 1] | |||||
| } | |||||
| train_dataset = MsDataset.from_hf_dataset(Dataset.from_dict(dataset_dict)) | |||||
| eval_dataset = MsDataset.from_hf_dataset(Dataset.from_dict(dataset_dict)) | |||||
| train_dataset = MsDataset.load('afqmc_small', namespace='modelscope', split='train') | |||||
| eval_dataset = MsDataset.load('afqmc_small', namespace='modelscope', split='validation') | |||||
| ``` | ``` | ||||
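| ### Training | ### Training | ||||
| ModelScope keeps all training-related configuration in the `configuration.json` file of the model repository, so we only need to create a Trainer, load the configuration file, and pass in the datasets to complete training. | ModelScope keeps all training-related configuration in the `configuration.json` file of the model repository, so we only need to create a Trainer, load the configuration file, and pass in the datasets to complete training. | ||||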
| @@ -141,7 +141,7 @@ class Trainers(object): | |||||
| Holds the standard trainer name to use for identifying different trainer. | Holds the standard trainer name to use for identifying different trainer. | ||||
| This should be used to register trainers. | This should be used to register trainers. | ||||
| For a general Trainer, you can use easynlp-trainer/ofa-trainer. | |||||
| For a general Trainer, you can use EpochBasedTrainer. | |||||
| For a model specific Trainer, you can use ${ModelName}-${Task}-trainer. | For a model specific Trainer, you can use ${ModelName}-${Task}-trainer. | ||||
| """ | """ | ||||
| @@ -214,10 +214,10 @@ TASK_OUTPUTS = { | |||||
| Tasks.nli: [OutputKeys.SCORES, OutputKeys.LABELS], | Tasks.nli: [OutputKeys.SCORES, OutputKeys.LABELS], | ||||
| # sentiment classification result for single sample | # sentiment classification result for single sample | ||||
| # { | |||||
| # "labels": ["happy", "sad", "calm", "angry"], | |||||
| # "scores": [0.9, 0.1, 0.05, 0.05] | |||||
| # } | |||||
| # { | |||||
| # 'scores': [0.07183828949928284, 0.9281617403030396], | |||||
| # 'labels': ['1', '0'] | |||||
| # } | |||||
| Tasks.sentiment_classification: [OutputKeys.SCORES, OutputKeys.LABELS], | Tasks.sentiment_classification: [OutputKeys.SCORES, OutputKeys.LABELS], | ||||
| # zero-shot classification result for single sample | # zero-shot classification result for single sample | ||||
| @@ -1,6 +1,8 @@ | |||||
| en_core_web_sm>=2.3.5 | en_core_web_sm>=2.3.5 | ||||
| fairseq>=0.10.2 | fairseq>=0.10.2 | ||||
| pai-easynlp | |||||
| # temporarily remove pai-easynlp due to its hard dependency on scipy==1.5.4 | |||||
| # will be added back | |||||
| # pai-easynlp | |||||
| # rouge-score was just recently updated from 0.0.4 to 0.0.7 | # rouge-score was just recently updated from 0.0.4 to 0.0.7 | ||||
| # which introduced compatibility issues that are being investigated | # which introduced compatibility issues that are being investigated | ||||
| rouge_score<=0.0.4 | rouge_score<=0.0.4 | ||||
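
The revised sentiment classification comment in the `TASK_OUTPUTS` hunk documents a result with parallel `scores` and `labels` lists, where the entries are not necessarily sorted by score. As a rough illustration of how a caller might consume that shape, here is a minimal sketch. The helper `top_label` is hypothetical and not part of ModelScope; the sample dict is copied from the comment in the diff, and the sketch assumes the two lists are always the same length.

```python
def top_label(result):
    """Return the (label, score) pair with the highest score.

    Hypothetical helper: assumes `result` has parallel 'scores' and
    'labels' lists, as shown in the TASK_OUTPUTS comment in the diff.
    """
    scores = result['scores']
    labels = result['labels']
    if len(scores) != len(labels):
        raise ValueError('scores and labels must have the same length')
    # Pick the index of the largest score, then look up its label.
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


# Sample result copied from the updated comment in the diff.
sample = {
    'scores': [0.07183828949928284, 0.9281617403030396],
    'labels': ['1', '0'],
}

label, score = top_label(sample)
print(label, score)
```

Selecting by index rather than taking the first element avoids relying on any particular ordering of the output, which the diff does not specify.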