1. Update requirements and README.md
2. Update DataLoader
3. Update loss
4. Update model/bert.py and the test code adapted to it
5. Update reproduction/README.md
6. Fix errors in other test code

tags/v0.4.10
| @@ -6,13 +6,14 @@ | |||
|  | |||
| [](http://fastnlp.readthedocs.io/?badge=latest) | |||
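| fastNLP is a lightweight NLP toolkit. You can use it to quickly complete a named entity recognition (NER), Chinese word segmentation, or text classification task; you can also use it to build many complex network models for research. It has the following features: | |||
| fastNLP is a lightweight NLP toolkit. You can use it to quickly complete sequence labeling ([NER](reproduction/seqence_labelling/ner/), POS-Tagging, etc.), Chinese word segmentation, text classification, [Matching](reproduction/matching/), coreference resolution, summarization, and other tasks; you can also use it to build many complex network models for research. It has the following features: | |||
| - A unified tabular data container that keeps data preprocessing simple and clear. Built-in DataSet Loaders for many datasets save you from writing preprocessing code. | |||
| - Various convenient NLP utilities, such as embedding loading during preprocessing and caching of intermediate data; | |||
| - Detailed Chinese documentation for reference; | |||
| - A unified tabular data container that keeps data preprocessing simple and clear. Built-in DataSet Loaders for many datasets save you from writing preprocessing code; | |||
| - A variety of training and testing components, such as the Trainer, the Tester, and various evaluation metrics; | |||
| - Various convenient NLP utilities, such as embedding loading during preprocessing (including ELMo and BERT) and caching of intermediate data; | |||
| - Detailed Chinese [documentation](https://fastnlp.readthedocs.io/) and tutorials for reference; | |||
| - Many advanced modules, such as Variational LSTM, Transformer, and CRF; | |||
| - Packaged models such as CNNText and Biaffine, ready for direct use; | |||
| - Packaged models ready for direct use on sequence labeling, Chinese word segmentation, text classification, Matching, coreference resolution, summarization, and other tasks; [details](reproduction/) | |||
| - A convenient and extensible trainer, with many built-in callback functions for experiment logging, exception catching, and more. | |||
| @@ -20,13 +21,14 @@ fastNLP is a lightweight NLP toolkit. You can use it to quickly | |||
| fastNLP depends on the following packages: | |||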
| + numpy | |||
| + torch>=0.4.0 | |||
| + tqdm | |||
| + nltk | |||
| + numpy>=1.14.2 | |||
| + torch>=1.0.0 | |||
| + tqdm>=4.28.1 | |||
| + nltk>=3.4.1 | |||
| + requests | |||
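| Installing torch may depend on your operating system and CUDA version; see the PyTorch website. | |||
| With the dependencies installed, you can run the following command on the command line to complete the installation | |||
| Installing torch may depend on your operating system and CUDA version; see the [PyTorch website](https://pytorch.org/). | |||
| Once the dependencies are installed, you can run the following command on the command line to complete the installation | |||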
| ```shell | |||
| pip install fastNLP | |||
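| @@ -77,8 +79,8 @@ fastNLP ships many components of three kinds in its modules package, which can help | |||
| fastNLP implements many complete models for different NLP tasks, all of which have been trained and tested. | |||
| You can find the relevant information in the following two places | |||
| - [Introduction](reproduction/) | |||
| - [Source code](fastNLP/models/) | |||
| - [Model introduction](reproduction/) | |||
| - [Model source code](fastNLP/models/) | |||
| ## Project structure | |||
| @@ -93,7 +95,7 @@ The general workflow of fastNLP is shown in the figure above, and the project structure is as follows: | |||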
| </tr> | |||
| <tr> | |||
| <td><b> fastNLP.core </b></td> | |||
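| <td> Implements the core functionality, including data processing components, the trainer, the speed tester, etc. </td> | |||
| <td> Implements the core functionality, including data processing components, the trainer, the tester, etc. </td> | |||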
| </tr> | |||
| <tr> | |||
| <td><b> fastNLP.models </b></td> | |||
| @@ -20,6 +20,7 @@ from collections import defaultdict | |||
| import torch | |||
| import torch.nn.functional as F | |||
| from ..core.const import Const | |||
| from .utils import _CheckError | |||
| from .utils import _CheckRes | |||
| from .utils import _build_args | |||
| @@ -28,6 +29,7 @@ from .utils import _check_function_or_method | |||
| from .utils import _get_func_signature | |||
| from .utils import seq_len_to_mask | |||
| class LossBase(object): | |||
| """ | |||
| Base class for all losses. To understand how it works, read the source code. | |||
| @@ -95,22 +97,7 @@ class LossBase(object): | |||
| # if func_spect.varargs: | |||
| # raise NameError(f"Delete `*{func_spect.varargs}` in {get_func_signature(self.get_loss)}(Do not use " | |||
| # f"positional argument.).") | |||
| def _fast_param_map(self, pred_dict, target_dict): | |||
| """Only used as inner function. When the pred_dict, target is unequivocal. Don't need users to pass key_map. | |||
| such as pred_dict has one element, target_dict has one element | |||
| :param pred_dict: | |||
| :param target_dict: | |||
| :return: dict, if dict is not {}, pass it to self.evaluate. Otherwise do mapping. | |||
| """ | |||
| fast_param = {} | |||
| if len(self._param_map) == 2 and len(pred_dict) == 1 and len(target_dict) == 1: | |||
| fast_param['pred'] = list(pred_dict.values())[0] | |||
| fast_param['target'] = list(target_dict.values())[0] | |||
| return fast_param | |||
| return fast_param | |||
| def __call__(self, pred_dict, target_dict, check=False): | |||
| """ | |||
| :param dict pred_dict: the dict returned by the model's forward function | |||
| @@ -118,11 +105,7 @@ class LossBase(object): | |||
| :param Boolean check: whether to check the mapping table every time the mapping function runs; defaults to no check | |||
| :return: | |||
| """ | |||
| fast_param = self._fast_param_map(pred_dict, target_dict) | |||
| if fast_param: | |||
| loss = self.get_loss(**fast_param) | |||
| return loss | |||
| if not self._checked: | |||
| # 1. check consistence between signature and _param_map | |||
| func_spect = inspect.getfullargspec(self.get_loss) | |||
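The hunks above remove `_fast_param_map` and the early return in `__call__` that relied on it. As a minimal sketch of what that shortcut did, here is a standalone restatement of the removed logic (the free function `fast_param_map` and its `param_map` argument, which stands in for `self._param_map`, are written for illustration and are not part of fastNLP's API):

```python
def fast_param_map(param_map, pred_dict, target_dict):
    """Standalone restatement of the removed shortcut (illustrative only)."""
    # The shortcut only fired when the loss took exactly two parameters and
    # both dicts held a single entry, so the pairing was unambiguous.
    if len(param_map) == 2 and len(pred_dict) == 1 and len(target_dict) == 1:
        return {
            "pred": next(iter(pred_dict.values())),
            "target": next(iter(target_dict.values())),
        }
    # Anything else falls through to the regular key mapping.
    return {}


# One prediction and one target: key names are ignored, values are paired up.
shortcut = fast_param_map({"pred": "pred", "target": "target"},
                          {"output": [0.1, 0.9]}, {"label": [1]})

# Two entries in pred_dict: the shortcut declines and returns an empty dict.
declined = fast_param_map({"pred": "pred", "target": "target"},
                          {"output": [0.1, 0.9], "hidden": [0.0]}, {"label": [1]})
```

With the shortcut gone, `__call__` appears to always go through the signature and key-map checks that follow, so field names presumably have to match the loss's parameter map or be mapped explicitly.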
| @@ -212,7 +195,6 @@ class LossFunc(LossBase): | |||
| if not isinstance(key_map, dict): | |||
| raise RuntimeError(f"Loss error: key_map except a {type({})} but got a {type(key_map)}") | |||
| self._init_param_map(key_map, **kwargs) | |||
| class CrossEntropyLoss(LossBase): | |||
| @@ -226,7 +208,7 @@ class CrossEntropyLoss(LossBase): | |||
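| :param seq_len: sentence lengths; tokens beyond the length are excluded from the loss. | |||
| :param padding_idx: index of the padding; entries in target equal to padding_idx are ignored when computing the loss. This value can be used instead of | |||
| passing seq_len. | |||
| :param str reduction: supports 'elementwise_mean' and 'sum'. | |||
| :param str reduction: supports 'mean', 'sum', and 'none'. | |||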
| Example:: | |||
| @@ -234,16 +216,16 @@ class CrossEntropyLoss(LossBase): | |||
| """ | |||
| def __init__(self, pred=None, target=None, seq_len=None, padding_idx=-100, reduction='elementwise_mean'): | |||
| def __init__(self, pred=None, target=None, seq_len=None, padding_idx=-100, reduction='mean'): | |||
| super(CrossEntropyLoss, self).__init__() | |||
| self._init_param_map(pred=pred, target=target, seq_len=seq_len) | |||
| self.padding_idx = padding_idx | |||
| assert reduction in ('elementwise_mean', 'sum') | |||
| assert reduction in ('mean', 'sum', 'none') | |||
| self.reduction = reduction | |||
| def get_loss(self, pred, target, seq_len=None): | |||
| if pred.dim()>2: | |||
| if pred.size(1)!=target.size(1): | |||
| if pred.dim() > 2: | |||
| if pred.size(1) != target.size(1): | |||
| pred = pred.transpose(1, 2) | |||
| pred = pred.reshape(-1, pred.size(-1)) | |||
| target = target.reshape(-1) | |||
| @@ -263,15 +245,18 @@ class L1Loss(LossBase): | |||
| :param pred: mapping for `pred` in the parameter map; None means `pred` -> `pred` | |||
| :param target: mapping for `target` in the parameter map; None means `target` -> `target` | |||
| :param str reduction: supports 'mean', 'sum', and 'none'. | |||
| """ | |||
| def __init__(self, pred=None, target=None): | |||
| def __init__(self, pred=None, target=None, reduction='mean'): | |||
| super(L1Loss, self).__init__() | |||
| self._init_param_map(pred=pred, target=target) | |||
| assert reduction in ('mean', 'sum', 'none') | |||
| self.reduction = reduction | |||
| def get_loss(self, pred, target): | |||
| return F.l1_loss(input=pred, target=target) | |||
| return F.l1_loss(input=pred, target=target, reduction=self.reduction) | |||
| class BCELoss(LossBase): | |||
| @@ -282,14 +267,17 @@ class BCELoss(LossBase): | |||
| :param pred: mapping for `pred` in the parameter map; None means `pred`->`pred` | |||
| :param target: mapping for `target` in the parameter map; None means `target`->`target` | |||
| :param str reduction: supports 'mean', 'sum', and 'none'. | |||
| """ | |||
| def __init__(self, pred=None, target=None): | |||
| def __init__(self, pred=None, target=None, reduction='mean'): | |||
| super(BCELoss, self).__init__() | |||
| self._init_param_map(pred=pred, target=target) | |||
| assert reduction in ('mean', 'sum', 'none') | |||
| self.reduction = reduction | |||
| def get_loss(self, pred, target): | |||
| return F.binary_cross_entropy(input=pred, target=target) | |||
| return F.binary_cross_entropy(input=pred, target=target, reduction=self.reduction) | |||
| class NLLLoss(LossBase): | |||
| @@ -300,14 +288,20 @@ class NLLLoss(LossBase): | |||
| :param pred: mapping for `pred` in the parameter map; None means `pred`->`pred` | |||
| :param target: mapping for `target` in the parameter map; None means `target`->`target` | |||
| :param ignore_idx: index to ignore; entries in target equal to ignore_idx are ignored when computing the loss. This value can be used instead of | |||
| passing seq_len. | |||
| :param str reduction: supports 'mean', 'sum', and 'none'. | |||
| """ | |||
| def __init__(self, pred=None, target=None): | |||
| def __init__(self, pred=None, target=None, ignore_idx=-100, reduction='mean'): | |||
| super(NLLLoss, self).__init__() | |||
| self._init_param_map(pred=pred, target=target) | |||
| assert reduction in ('mean', 'sum', 'none') | |||
| self.reduction = reduction | |||
| self.ignore_idx = ignore_idx | |||
| def get_loss(self, pred, target): | |||
| return F.nll_loss(input=pred, target=target) | |||
| return F.nll_loss(input=pred, target=target, ignore_index=self.ignore_idx, reduction=self.reduction) | |||
| class LossInForward(LossBase): | |||
| @@ -319,7 +313,7 @@ class LossInForward(LossBase): | |||
| :param str loss_key: key name of the loss in the forward function's output; defaults to loss | |||
| """ | |||
| def __init__(self, loss_key='loss'): | |||
| def __init__(self, loss_key=Const.LOSS): | |||
| super().__init__() | |||
| if not isinstance(loss_key, str): | |||
| raise TypeError(f"Only str allowed for loss_key, got {type(loss_key)}.") | |||
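The loss classes in this diff now accept `reduction` values of 'mean', 'sum', and 'none' and forward them to the matching `torch.nn.functional` call. A pure-Python sketch of what the three modes mean for an L1 loss, assuming flat lists of floats rather than tensors (`l1_loss_sketch` is a hypothetical helper, not part of fastNLP or PyTorch):

```python
from typing import List, Union


def l1_loss_sketch(pred: List[float], target: List[float],
                   reduction: str = "mean") -> Union[float, List[float]]:
    """Absolute error between two equal-length lists under a reduction mode."""
    # Same guard the diff adds to each loss constructor.
    assert reduction in ("mean", "sum", "none")
    per_element = [abs(p - t) for p, t in zip(pred, target)]
    if reduction == "none":
        return per_element           # one loss value per element, unreduced
    total = sum(per_element)
    if reduction == "sum":
        return total                 # summed over all elements
    return total / len(per_element)  # averaged over all elements


pred = [1.0, 2.0, 5.0]
target = [1.0, 0.0, 1.0]
unreduced = l1_loss_sketch(pred, target, reduction="none")  # [0.0, 2.0, 4.0]
summed = l1_loss_sketch(pred, target, reduction="sum")      # 6.0
averaged = l1_loss_sketch(pred, target, reduction="mean")   # 2.0
```

The rename from 'elementwise_mean' to 'mean' lines up with the `torch>=1.0.0` requirement bumped in the same commit, since PyTorch 1.0 deprecated the older name.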
| @@ -10,6 +10,7 @@ from typing import Union, Dict | |||
| import os | |||
| from ..core.dataset import DataSet | |||
| class BaseLoader(object): | |||
| """ | |||
| Base class for all Loaders; serves as an API reference. | |||
| @@ -55,8 +56,6 @@ class BaseLoader(object): | |||
| return obj | |||
| def _download_from_url(url, path): | |||
| try: | |||
| from tqdm.auto import tqdm | |||
| @@ -115,13 +114,11 @@ class DataInfo: | |||
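| Processed data information, including a collection of datasets (e.g. separate training, validation, and test sets) together with the vocabularies and word embeddings they use. | |||
| :param vocabs: a dict mapping names (strings) to :class:`~fastNLP.Vocabulary` objects | |||
| :param embeddings: a dict mapping names (strings) to a collection of embeddings; see :class:`~fastNLP.io.EmbedLoader` | |||
| :param datasets: a dict mapping names (strings) to :class:`~fastNLP.DataSet` objects | |||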
| """ | |||
| def __init__(self, vocabs: dict = None, embeddings: dict = None, datasets: dict = None): | |||
| def __init__(self, vocabs: dict = None, datasets: dict = None): | |||
| self.vocabs = vocabs or {} | |||
| self.embeddings = embeddings or {} | |||
| self.datasets = datasets or {} | |||
| def __repr__(self): | |||
| @@ -133,6 +130,7 @@ class DataInfo: | |||
| _str += '\t{} has {} entries.\n'.format(name, len(vocab)) | |||
| return _str | |||
| class DataSetLoader: | |||
| """ | |||
| Alias: :class:`fastNLP.io.DataSetLoader` :class:`fastNLP.io.dataset_loader.DataSetLoader` | |||
| @@ -213,7 +211,6 @@ class DataSetLoader: | |||
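| The returned :class:`DataInfo` object has the following attributes: | |||
| - vocabs: a dict of the vocabularies obtained from the datasets, each vocabulary | |||
| - embeddings: (optional) word embeddings for the datasets | |||
| - datasets: a dict containing a collection of :class:`~fastNLP.DataSet` objects. Field names follow :mod:`~fastNLP.core.const` | |||
| :param paths: path(s) from which the raw data is read | |||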
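After this change `DataInfo` carries only `vocabs` and `datasets`. A minimal stand-in that mirrors the constructor shown in the diff, with plain lists substituting for `Vocabulary` and `DataSet` objects (the class name `DataInfoSketch` and its `summary` method are assumptions made for illustration):

```python
class DataInfoSketch:
    """Illustrative stand-in for the slimmed-down DataInfo container."""

    def __init__(self, vocabs: dict = None, datasets: dict = None):
        # `or {}` gives every instance its own dict instead of sharing a
        # mutable default argument between instances.
        self.vocabs = vocabs or {}
        self.datasets = datasets or {}

    def summary(self) -> str:
        """One line per dataset and per vocabulary, in insertion order."""
        lines = []
        for name, dataset in self.datasets.items():
            lines.append("{} has {} instances.".format(name, len(dataset)))
        for name, vocab in self.vocabs.items():
            lines.append("{} has {} entries.".format(name, len(vocab)))
        return "\n".join(lines)


info = DataInfoSketch(
    vocabs={"words": ["<pad>", "<unk>", "good", "bad"]},
    datasets={"train": [["good"], ["bad"]], "dev": [["good"]]},
)
report = info.summary()
```

Embeddings are no longer stored on the object, so code that previously read an `embeddings` attribute would presumably need to load them separately, for example through the `fastNLP.io.EmbedLoader` named in the removed docstring line.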
| @@ -0,0 +1,19 @@ | |||
| """ | |||
| Modules for reading datasets, specifically: | |||
| These modules are used as follows: | |||
| """ | |||
| __all__ = [ | |||
| 'SSTLoader', | |||
| 'MatchingLoader', | |||
| 'SNLILoader', | |||
| 'MNLILoader', | |||
| 'QNLILoader', | |||
| 'QuoraLoader', | |||
| 'RTELoader', | |||
| ] | |||
| from .sst import SSTLoader | |||
| from .matching import MatchingLoader, SNLILoader, \ | |||
| MNLILoader, QNLILoader, QuoraLoader, RTELoader | |||
| @@ -8,35 +8,7 @@ from torch import nn | |||
| from .base_model import BaseModel | |||
| from ..core.const import Const | |||
| from ..modules.encoder import BertModel | |||
| class BertConfig: | |||
| def __init__( | |||
| self, | |||
| vocab_size=30522, | |||
| hidden_size=768, | |||
| num_hidden_layers=12, | |||
| num_attention_heads=12, | |||
| intermediate_size=3072, | |||
| hidden_act="gelu", | |||
| hidden_dropout_prob=0.1, | |||
| attention_probs_dropout_prob=0.1, | |||
| max_position_embeddings=512, | |||
| type_vocab_size=2, | |||
| initializer_range=0.02 | |||
| ): | |||
| self.vocab_size = vocab_size | |||
| self.hidden_size = hidden_size | |||
| self.num_hidden_layers = num_hidden_layers | |||
| self.num_attention_heads = num_attention_heads | |||
| self.intermediate_size = intermediate_size | |||
| self.hidden_act = hidden_act | |||
| self.hidden_dropout_prob = hidden_dropout_prob | |||
| self.attention_probs_dropout_prob = attention_probs_dropout_prob | |||
| self.max_position_embeddings = max_position_embeddings | |||
| self.type_vocab_size = type_vocab_size | |||
| self.initializer_range = initializer_range | |||
| from ..modules.encoder._bert import BertConfig | |||
| class BertForSequenceClassification(BaseModel): | |||
| @@ -84,11 +56,17 @@ class BertForSequenceClassification(BaseModel): | |||
| self.bert = BertModel.from_pretrained(bert_dir) | |||
| else: | |||
| if config is None: | |||
| config = BertConfig() | |||
| self.bert = BertModel(**config.__dict__) | |||
| config = BertConfig(30522) | |||
| self.bert = BertModel(config) | |||
| self.dropout = nn.Dropout(config.hidden_dropout_prob) | |||
| self.classifier = nn.Linear(config.hidden_size, num_labels) | |||
| @classmethod | |||
| def from_pretrained(cls, num_labels, pretrained_model_dir): | |||
| config = BertConfig(pretrained_model_dir) | |||
| model = cls(num_labels=num_labels, config=config, bert_dir=pretrained_model_dir) | |||
| return model | |||
| def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None): | |||
| _, pooled_output = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False) | |||
| pooled_output = self.dropout(pooled_output) | |||
| @@ -151,11 +129,17 @@ class BertForMultipleChoice(BaseModel): | |||
| self.bert = BertModel.from_pretrained(bert_dir) | |||
| else: | |||
| if config is None: | |||
| config = BertConfig() | |||
| self.bert = BertModel(**config.__dict__) | |||
| config = BertConfig(30522) | |||
| self.bert = BertModel(config) | |||
| self.dropout = nn.Dropout(config.hidden_dropout_prob) | |||
| self.classifier = nn.Linear(config.hidden_size, 1) | |||
| @classmethod | |||
| def from_pretrained(cls, num_choices, pretrained_model_dir): | |||
| config = BertConfig(pretrained_model_dir) | |||
| model = cls(num_choices=num_choices, config=config, bert_dir=pretrained_model_dir) | |||
| return model | |||
| def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None): | |||
| flat_input_ids = input_ids.view(-1, input_ids.size(-1)) | |||
| flat_token_type_ids = token_type_ids.view(-1, token_type_ids.size(-1)) | |||
| @@ -224,11 +208,17 @@ class BertForTokenClassification(BaseModel): | |||
| self.bert = BertModel.from_pretrained(bert_dir) | |||
| else: | |||
| if config is None: | |||
| config = BertConfig() | |||
| self.bert = BertModel(**config.__dict__) | |||
| config = BertConfig(30522) | |||
| self.bert = BertModel(config) | |||
| self.dropout = nn.Dropout(config.hidden_dropout_prob) | |||
| self.classifier = nn.Linear(config.hidden_size, num_labels) | |||
| @classmethod | |||
| def from_pretrained(cls, num_labels, pretrained_model_dir): | |||
| config = BertConfig(pretrained_model_dir) | |||
| model = cls(num_labels=num_labels, config=config, bert_dir=pretrained_model_dir) | |||
| return model | |||
| def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None): | |||
| sequence_output, _ = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False) | |||
| sequence_output = self.dropout(sequence_output) | |||
| @@ -302,12 +292,18 @@ class BertForQuestionAnswering(BaseModel): | |||
| self.bert = BertModel.from_pretrained(bert_dir) | |||
| else: | |||
| if config is None: | |||
| config = BertConfig() | |||
| self.bert = BertModel(**config.__dict__) | |||
| config = BertConfig(30522) | |||
| self.bert = BertModel(config) | |||
| # TODO check with Google if it's normal there is no dropout on the token classifier of SQuAD in the TF version | |||
| # self.dropout = nn.Dropout(config.hidden_dropout_prob) | |||
| self.qa_outputs = nn.Linear(config.hidden_size, 2) | |||
| @classmethod | |||
| def from_pretrained(cls, pretrained_model_dir): | |||
| config = BertConfig(pretrained_model_dir) | |||
| model = cls(config=config, bert_dir=pretrained_model_dir) | |||
| return model | |||
| def forward(self, input_ids, token_type_ids=None, attention_mask=None, start_positions=None, end_positions=None): | |||
| sequence_output, _ = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False) | |||
| logits = self.qa_outputs(sequence_output) | |||
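The model classes now build `BertModel` from a config object (`BertModel(config)`) instead of unpacking `config.__dict__` into keyword arguments, and each gains a `from_pretrained` classmethod. The pattern can be sketched without torch as follows. Every name here (`ConfigSketch`, `EncoderSketch`, `ClassifierSketch`, `from_config_file`) is hypothetical, and accepting either a vocabulary size or a JSON file path is an assumption about how `BertConfig` behaves, not something this diff shows:

```python
import json
import os
import tempfile


class ConfigSketch:
    """Holds hyperparameters; built from a vocab size or a JSON config file."""

    def __init__(self, vocab_size_or_json_path, hidden_size=768,
                 hidden_dropout_prob=0.1):
        if isinstance(vocab_size_or_json_path, str):
            # Assumed behavior: a string is treated as a path to a JSON file.
            with open(vocab_size_or_json_path, "r", encoding="utf-8") as f:
                self.__dict__.update(json.load(f))
        else:
            self.vocab_size = vocab_size_or_json_path
            self.hidden_size = hidden_size
            self.hidden_dropout_prob = hidden_dropout_prob


class EncoderSketch:
    """Takes the whole config object, as BertModel(config) does in the diff."""

    def __init__(self, config):
        self.config = config


class ClassifierSketch:
    def __init__(self, num_labels, config=None):
        if config is None:
            config = ConfigSketch(30522)  # same default vocab size as the diff
        self.encoder = EncoderSketch(config)
        self.num_labels = num_labels
        self.hidden_size = config.hidden_size

    @classmethod
    def from_config_file(cls, num_labels, json_path):
        # Alternate constructor: read the config first, then build as usual.
        return cls(num_labels=num_labels, config=ConfigSketch(json_path))


# Write a small config file and build a model from it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bert_config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"vocab_size": 32000, "hidden_size": 128,
                   "hidden_dropout_prob": 0.1}, f)
    from_file = ClassifierSketch.from_config_file(2, path)

default = ClassifierSketch(2)
```

Passing the config object keeps the encoder's constructor signature stable when hyperparameters are added, which is one plausible motivation for the change.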
| @@ -15,7 +15,8 @@ class MLP(nn.Module): | |||
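| Multilayer perceptron | |||
| :param List[int] size_layer: a list of ints defining the layers of the MLP; each number is the hidden size of that layer. The MLP has len(size_layer) - 1 layers | |||
| :param Union[str,func,List[str]] activation: a string or a list of functions defining the activation of each hidden layer; supported strings are relu, tanh, and sigmoid; defaults to relu | |||
| :param Union[str,func,List[str]] activation: a string or a list of functions defining the activation of each hidden layer; supported strings are relu, tanh, and | |||
| sigmoid; defaults to relu | |||
| :param Union[str,func] output_activation: a string or a function defining the activation of the output layer; defaults to None, meaning the output layer has no activation | |||
| :param str initial_method: parameter initialization method | |||
| :param float dropout: dropout probability; defaults to 0 | |||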
| @@ -26,6 +26,7 @@ import sys | |||
| CONFIG_FILE = 'bert_config.json' | |||
| class BertConfig(object): | |||
| """Configuration class to store the configuration of a `BertModel`. | |||
| """ | |||
| @@ -339,13 +340,19 @@ class BertModel(nn.Module): | |||
| To use pretrained weights, download them from the following URLs. | |||
| sources:: | |||
| 'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz", | |||
| 'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased.tar.gz", | |||
| 'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased.tar.gz", | |||
| 'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased.tar.gz", | |||
| 'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased.tar.gz", | |||
| 'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased.tar.gz", | |||
| 'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese.tar.gz", | |||
| 'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin", | |||
| 'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin", | |||
| 'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin", | |||
| 'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin", | |||
| 'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-pytorch_model.bin", | |||
| 'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin", | |||
| 'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-pytorch_model.bin", | |||
| 'bert-base-german-cased': "https://int-deepset-models-bert.s3.eu-central-1.amazonaws.com/pytorch/bert-base-german-cased-pytorch_model.bin", | |||
| 'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin", | |||
| 'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-pytorch_model.bin", | |||
| 'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin", | |||
| 'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin", | |||
| 'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin" | |||
| Build a BERT model from pretrained weights:: | |||
| @@ -562,6 +569,7 @@ class WordpieceTokenizer(object): | |||
| output_tokens.extend(sub_tokens) | |||
| return output_tokens | |||
| def load_vocab(vocab_file): | |||
| """Loads a vocabulary file into a dictionary.""" | |||
| vocab = collections.OrderedDict() | |||
| @@ -692,6 +700,7 @@ class BasicTokenizer(object): | |||
| output.append(char) | |||
| return "".join(output) | |||
| def _is_whitespace(char): | |||
| """Checks whether `chars` is a whitespace character.""" | |||
| # \t, \n, and \r are technically contorl characters but we treat them | |||
| @@ -3,6 +3,8 @@ | |||
| Reproduced models: | |||
| - [Star-Transformer](Star_transformer/) | |||
| - [Biaffine](https://github.com/fastnlp/fastNLP/blob/999a14381747068e9e6a7cc370037b320197db00/fastNLP/models/biaffine_parser.py#L239) | |||
| - [CNNText](https://github.com/fastnlp/fastNLP/blob/999a14381747068e9e6a7cc370037b320197db00/fastNLP/models/cnn_text_classification.py#L12) | |||
| - ... | |||
| # Task reproduction | |||
| @@ -11,11 +13,11 @@ | |||
| ## Matching (natural language inference / sentence matching) | |||
| - [Matching task reproduction](matching/) | |||
| - [Matching task reproduction](matching) | |||
| ## Sequence Labeling | |||
| - still in progress | |||
| - [NER](seqence_labelling/ner) | |||
| ## Coreference resolution | |||
| @@ -2,7 +2,8 @@ import torch | |||
| import json | |||
| import os | |||
| from fastNLP import Vocabulary | |||
| from fastNLP.io.dataset_loader import ConllLoader, SSTLoader, SNLILoader | |||
| from fastNLP.io.dataset_loader import ConllLoader | |||
| from fastNLP.io.data_loader import SSTLoader, SNLILoader | |||
| from fastNLP.core import Const as C | |||
| import numpy as np | |||
| @@ -1,5 +1,5 @@ | |||
| numpy | |||
| torch>=0.4.0 | |||
| tqdm | |||
| nltk | |||
| numpy>=1.14.2 | |||
| torch>=1.0.0 | |||
| tqdm>=4.28.1 | |||
| nltk>=3.4.1 | |||
| requests | |||
| @@ -1,7 +1,7 @@ | |||
| import unittest | |||
| import os | |||
| from fastNLP.io import Conll2003Loader, PeopleDailyCorpusLoader, CSVLoader, SNLILoader, JsonLoader | |||
| from fastNLP.io.dataset_loader import SSTLoader | |||
| from fastNLP.io import Conll2003Loader, PeopleDailyCorpusLoader, CSVLoader, JsonLoader | |||
| from fastNLP.io.dataset_loader import SSTLoader, SNLILoader | |||
| from reproduction.text_classification.data.yelpLoader import yelpLoader | |||
| @@ -8,8 +8,9 @@ from fastNLP.models.bert import * | |||
| class TestBert(unittest.TestCase): | |||
| def test_bert_1(self): | |||
| from fastNLP.core.const import Const | |||
| from fastNLP.modules.encoder._bert import BertConfig | |||
| model = BertForSequenceClassification(2) | |||
| model = BertForSequenceClassification(2, BertConfig(32000)) | |||
| input_ids = torch.LongTensor([[31, 51, 99], [15, 5, 0]]) | |||
| input_mask = torch.LongTensor([[1, 1, 1], [1, 1, 0]]) | |||
| @@ -22,8 +23,9 @@ class TestBert(unittest.TestCase): | |||
| def test_bert_2(self): | |||
| from fastNLP.core.const import Const | |||
| from fastNLP.modules.encoder._bert import BertConfig | |||
| model = BertForMultipleChoice(2) | |||
| model = BertForMultipleChoice(2, BertConfig(32000)) | |||
| input_ids = torch.LongTensor([[31, 51, 99], [15, 5, 0]]) | |||
| input_mask = torch.LongTensor([[1, 1, 1], [1, 1, 0]]) | |||
| @@ -36,8 +38,9 @@ class TestBert(unittest.TestCase): | |||
| def test_bert_3(self): | |||
| from fastNLP.core.const import Const | |||
| from fastNLP.modules.encoder._bert import BertConfig | |||
| model = BertForTokenClassification(7) | |||
| model = BertForTokenClassification(7, BertConfig(32000)) | |||
| input_ids = torch.LongTensor([[31, 51, 99], [15, 5, 0]]) | |||
| input_mask = torch.LongTensor([[1, 1, 1], [1, 1, 0]]) | |||
| @@ -50,8 +53,9 @@ class TestBert(unittest.TestCase): | |||
| def test_bert_4(self): | |||
| from fastNLP.core.const import Const | |||
| from fastNLP.modules.encoder._bert import BertConfig | |||
| model = BertForQuestionAnswering() | |||
| model = BertForQuestionAnswering(BertConfig(32000)) | |||
| input_ids = torch.LongTensor([[31, 51, 99], [15, 5, 0]]) | |||
| input_mask = torch.LongTensor([[1, 1, 1], [1, 1, 0]]) | |||
| @@ -8,8 +8,9 @@ from fastNLP.models.bert import BertModel | |||
| class TestBert(unittest.TestCase): | |||
| def test_bert_1(self): | |||
| model = BertModel(vocab_size=32000, hidden_size=768, | |||
| num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072) | |||
| from fastNLP.modules.encoder._bert import BertConfig | |||
| config = BertConfig(32000) | |||
| model = BertModel(config) | |||
| input_ids = torch.LongTensor([[31, 51, 99], [15, 5, 0]]) | |||
| input_mask = torch.LongTensor([[1, 1, 1], [1, 1, 0]]) | |||
| @@ -18,4 +19,4 @@ class TestBert(unittest.TestCase): | |||
| all_encoder_layers, pooled_output = model(input_ids, token_type_ids, input_mask) | |||
| for layer in all_encoder_layers: | |||
| self.assertEqual(tuple(layer.shape), (2, 3, 768)) | |||
| self.assertEqual(tuple(pooled_output.shape), (2, 768)) | |||
| self.assertEqual(tuple(pooled_output.shape), (2, 768)) | |||