FastText is a fast text classification algorithm that is simple and efficient. It was proposed by Armand Joulin, Tomas Mikolov et al. in the 2016 paper "Bag of Tricks for Efficient Text Classification". Its model architecture is similar to CBOW, with the middle word replaced by a label. FastText adopts n-gram features as additional features to capture partial information about local word order. It speeds up training and testing while maintaining high accuracy, and is widely used in various text classification tasks.
Paper: "Bag of Tricks for Efficient Text Classification", 2016, A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov
The FastText model mainly consists of an input layer, a hidden layer, and an output layer. The input is a sequence of words (a text or sentence), and the output layer gives the probability that the word sequence belongs to each category. The hidden layer is formed by averaging the word vectors: the features are mapped to the hidden layer through a linear transformation and then mapped from the hidden layer to the labels.
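The whole forward pass can be summarized in a few lines. The snippet below is a minimal NumPy sketch of the computation described above, not the repository's implementation (which lives in src/fasttext_model.py); the vocabulary size, embedding size, class count, and token ids are illustrative assumptions.

```python
# Minimal sketch of the FastText forward pass (NumPy only, illustrative values).
import numpy as np

vocab_size, embedding_dims, num_class = 20000, 16, 4  # placeholder sizes

rng = np.random.default_rng(0)
embedding = rng.normal(size=(vocab_size, embedding_dims))  # input layer: word/n-gram embeddings
fc_weight = rng.normal(size=(embedding_dims, num_class))   # hidden -> output linear map
fc_bias = np.zeros(num_class)

def fasttext_forward(token_ids):
    """token_ids: indices of the words and n-grams of one sentence."""
    hidden = embedding[token_ids].mean(axis=0)             # hidden layer: average of word vectors
    logits = hidden @ fc_weight + fc_bias                  # linear transformation to the labels
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()                             # output layer: class probabilities

print(fasttext_forward(np.array([3, 57, 1024, 7, 88])))    # toy sentence of 5 token ids
```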
Note that you can run the scripts with the datasets mentioned in the original paper or with datasets widely used in the relevant domain/network architecture. In the following sections, we introduce how to run the scripts with the datasets listed below.
After dataset preparation, you can start training and evaluation as follows:
```bash
# run training example
cd ./scripts
sh run_standalone_train.sh [TRAIN_DATASET] [DEVICEID]

# run distributed training example
sh run_distributed_train.sh [TRAIN_DATASET] [RANK_TABLE_PATH]

# run evaluation example
sh run_eval.sh [EVAL_DATASET_PATH] [DATASET_NAME] [MODEL_CKPT] [DEVICEID]
```
The FastText network scripts and code structure are as follows:
```text
├── fasttext
    ├── README.md                        // Introduction of FastText model.
    ├── src
    │   ├── config.py                    // Configuration instance definition.
    │   ├── create_dataset.py            // Dataset preparation.
    │   ├── fasttext_model.py            // FastText model architecture.
    │   ├── fasttext_train.py            // Training wrapper using the FastText model architecture.
    │   ├── load_dataset.py              // Dataset loader to feed into the model.
    │   ├── lr_scheduler.py              // Learning rate scheduler.
    ├── scripts
    │   ├── run_distributed_train.sh     // Shell script for distributed training on Ascend.
    │   ├── run_eval.sh                  // Shell script for standalone evaluation on Ascend.
    │   ├── run_standalone_train.sh      // Shell script for standalone training on Ascend.
    │   ├── run_distributed_train_gpu.sh // Shell script for distributed training on GPU.
    │   ├── run_eval_gpu.sh              // Shell script for standalone evaluation on GPU.
    │   ├── run_standalone_train_gpu.sh  // Shell script for standalone training on GPU.
    ├── eval.py                          // Infer API entry.
    ├── requirements.txt                 // Requirements of third-party packages.
    ├── train.py                         // Train API entry.
```
Download the AG's News Topic Classification Dataset, the DBPedia Ontology Classification Dataset, and the Yelp Review Polarity Dataset, and unzip them to any path you want.
Run the following script to preprocess the data and convert the original datasets to MindRecord format for training and evaluation:

```bash
cd scripts
sh creat_dataset.sh [SOURCE_DATASET_PATH] [DATASET_NAME]
```
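The conversion itself is implemented in src/create_dataset.py. For orientation only, the sketch below shows the general shape of writing tokenized samples to a MindRecord file with mindspore.mindrecord.FileWriter; the field names, the toy hashing tokenizer, the label value, and the output file name are assumptions for illustration, not the repository's exact preprocessing.

```python
# Schematic MindRecord conversion (illustrative field names and tokenizer).
import numpy as np
from mindspore.mindrecord import FileWriter

VOCAB_SIZE = 20000  # placeholder value, not the repository's setting

def to_token_ids(text):
    """Toy tokenizer: hash words and bigrams into a fixed vocabulary."""
    words = text.lower().split()
    feats = words + [" ".join(pair) for pair in zip(words, words[1:])]
    return np.array([hash(f) % VOCAB_SIZE for f in feats], dtype=np.int32)

writer = FileWriter(file_name="ag_news.train.mindrecord", shard_num=1)
writer.add_schema({"src_tokens": {"type": "int32", "shape": [-1]},
                   "label_idx": {"type": "int32"}}, "assumed fasttext schema")
writer.write_raw_data([{"src_tokens": to_token_ids("wall st. bears claw back into the black"),
                        "label_idx": 2}])
writer.commit()
```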
Parameters for both training and evaluation can be set in config.py. All datasets use the same parameter names; the parameter values can be changed according to your needs. An illustrative configuration sketch follows the list below.
Network Parameters

```text
vocab_size                # vocabulary size.
buckets                   # bucket sequence lengths.
test_buckets              # test dataset bucket sequence lengths.
batch_size                # batch size of the input dataset.
embedding_dims            # size of each embedding vector.
num_class                 # number of labels.
epoch                     # total training epochs.
lr                        # initial learning rate.
min_lr                    # minimum learning rate.
warmup_steps              # warm-up steps.
poly_lr_scheduler_power   # power used to compute the polynomially decayed learning rate.
pretrain_ckpt_dir         # pretrained checkpoint directory.
keep_ckpt_max             # maximum number of checkpoint files to keep.
```
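For orientation, the sketch below groups these parameters into a single configuration object. The actual definitions live in src/config.py and differ per dataset; every value here is a placeholder assumption, not a default from this repository.

```python
# Placeholder configuration mirroring the parameter names listed above.
from dataclasses import dataclass, field
from typing import List

@dataclass
class FastTextConfig:
    vocab_size: int = 20000                                              # vocabulary size
    buckets: List[int] = field(default_factory=lambda: [64, 128, 256])   # bucket sequence lengths
    test_buckets: List[int] = field(default_factory=lambda: [256])       # test bucket sequence lengths
    batch_size: int = 512                                                # batch size of the input dataset
    embedding_dims: int = 16                                             # size of each embedding vector
    num_class: int = 4                                                   # number of labels
    epoch: int = 5                                                       # total training epochs
    lr: float = 0.2                                                      # initial learning rate
    min_lr: float = 1e-6                                                 # minimum learning rate
    warmup_steps: int = 400                                              # warm-up steps
    poly_lr_scheduler_power: float = 0.5                                 # polynomial decay power
    pretrain_ckpt_dir: str = ""                                          # pretrained checkpoint directory
    keep_ckpt_max: int = 10                                              # max checkpoint files to keep

config = FastTextConfig()
```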
Running on Ascend

To start training on a single device, run the shell script:

```bash
cd ./scripts
sh run_standalone_train.sh [DATASET_PATH]
```

To run distributed training of FastText on multiple devices, execute the following command in scripts/:

```bash
cd ./scripts
sh run_distributed_train.sh [DATASET_PATH] [RANK_TABLE_PATH]
```
Running on GPU

To start training on a single device, run the shell script:

```bash
cd ./scripts
sh run_standalone_train_gpu.sh [DATASET_PATH]
```

To run distributed training of FastText on multiple devices, execute the following command in scripts/:

```bash
cd ./scripts
sh run_distributed_train_gpu.sh [DATASET_PATH] [NUM_OF_DEVICES]
```
Running on Ascend

To run evaluation of FastText, use the command below:

```bash
cd ./scripts
sh run_eval.sh [DATASET_PATH] [DATASET_NAME] [MODEL_CKPT]
```

Note: DATASET_PATH is the path to the mindrecord files, e.g. /dataset_path/*.mindrecord
Running on GPU

To run evaluation of FastText, use the command below:

```bash
cd ./scripts
sh run_eval_gpu.sh [DATASET_PATH] [DATASET_NAME] [MODEL_CKPT]
```

Note: DATASET_PATH is the path to the mindrecord files, e.g. /dataset_path/*.mindrecord
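Before running evaluation it can be useful to sanity-check a converted MindRecord file. The snippet below is a hedged example using mindspore.dataset.MindDataset; the file path is a placeholder, and the column names depend on how the dataset was created.

```python
# Inspect a converted MindRecord file (placeholder path) before evaluation.
import mindspore.dataset as ds

data_set = ds.MindDataset("/dataset_path/ag_news.test.mindrecord")
print("number of samples:", data_set.get_dataset_size())
for item in data_set.create_dict_iterator(output_numpy=True):
    # Print the column names and shapes of the first record, then stop.
    print({name: value.shape for name, value in item.items()})
    break
```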
| Parameters | Ascend | GPU |
|---|---|---|
| Resource | Ascend 910; OS Euler2.8 | NV SMX3 V100-32G |
| Uploaded Date | 12/21/2020 (month/day/year) | 1/29/2021 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.1.0 |
| Dataset | AG's News Topic Classification Dataset | AG's News Topic Classification Dataset |
| Training Parameters | epoch=5, batch_size=512 | epoch=5, batch_size=512 |
| Optimizer | Adam | Adam |
| Loss Function | Softmax Cross Entropy | Softmax Cross Entropy |
| outputs | probability | probability |
| Speed | 10ms/step (1pcs) | 11.91ms/step(1pcs) |
| Epoch Time | 2.36s (1pcs) | 2.815s(1pcs) |
| Loss | 0.0067 | 0.0085 |
| Params (M) | 22 | 22 |
| Checkpoint for inference | 254M (.ckpt file) | 254M (.ckpt file) |
| Scripts | fasttext | fasttext |
| Parameters | Ascend | GPU |
|---|---|---|
| Resource | Ascend 910; OS Euler2.8 | NV SMX3 V100-32G |
| Uploaded Date | 11/21/2020 (month/day/year) | 1/29/2020 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.1.0 |
| Dataset | DBPedia Ontology Classification Dataset | DBPedia Ontology Classification Dataset |
| Training Parameters | epoch=5, batch_size=4096 | epoch=5, batch_size=4096 |
| Optimizer | Adam | Adam |
| Loss Function | Softmax Cross Entropy | Softmax Cross Entropy |
| outputs | probability | probability |
| Speed | 58ms/step (1pcs) | 34.82ms/step(1pcs) |
| Epoch Time | 8.15s (1pcs) | 4.87s(1pcs) |
| Loss | 2.6e-4 | 0.0004 |
| Params (M) | 106 | 106 |
| Checkpoint for inference | 1.2G (.ckpt file) | 1.2G (.ckpt file) |
| Scripts | fasttext | fasttext |
| Parameters | Ascend | GPU |
|---|---|---|
| Resource | Ascend 910; OS Euler2.8 | NV SMX3 V100-32G |
| Uploaded Date | 11/21/2020 (month/day/year) | 1/29/2020 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.1.0 |
| Dataset | Yelp Review Polarity Dataset | Yelp Review Polarity Dataset |
| Training Parameters | epoch=5, batch_size=2048 | epoch=5, batch_size=2048 |
| Optimizer | Adam | Adam |
| Loss Function | Softmax Cross Entropy | Softmax Cross Entropy |
| outputs | probability | probability |
| Speed | 101ms/step (1pcs) | 30.54ms/step(1pcs) |
| Epoch Time | 28s (1pcs) | 8.46s(1pcs) |
| Loss | 0.062 | 0.002 |
| Params (M) | 103 | 103 |
| Checkpoint for inference | 1.2G (.ckpt file) | 1.2G (.ckpt file) |
| Scripts | fasttext | fasttext |
| Parameters | Ascend | GPU |
|---|---|---|
| Resource | Ascend 910; OS Euler2.8 | NV SMX3 V100-32G |
| Uploaded Date | 12/21/2020 (month/day/year) | 1/29/2020 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.1.0 |
| Dataset | AG's News Topic Classification Dataset | AG's News Topic Classification Dataset |
| batch_size | 512 | 128 |
| Epoch Time | 2.36s | 2.815s(1pcs) |
| outputs | label index | label index |
| Accuracy | 92.53 | 92.58 |
| Model for inference | 254M (.ckpt file) | 254M (.ckpt file) |
| Parameters | Ascend | GPU |
|---|---|---|
| Resource | Ascend 910; OS Euler2.8 | NV SMX3 V100-32G |
| Uploaded Date | 12/21/2020 (month/day/year) | 1/29/2020 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.1.0 |
| Dataset | DBPedia Ontology Classification Dataset | DBPedia Ontology Classification Dataset |
| batch_size | 4096 | 4096 |
| Epoch Time | 8.15s | 4.87s |
| outputs | label index | label index |
| Accuracy | 98.6 | 98.49 |
| Model for inference | 1.2G (.ckpt file) | 1.2G (.ckpt file) |
| Parameters | Ascend | GPU |
|---|---|---|
| Resource | Ascend 910; OS Euler2.8 | NV SMX3 V100-32G |
| Uploaded Date | 12/21/2020 (month/day/year) | 12/29/2020 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.1.0 |
| Dataset | Yelp Review Polarity Dataset | Yelp Review Polarity Dataset |
| batch_size | 2048 | 2048 |
| Epoch Time | 28s | 8.46s |
| outputs | label index | label index |
| Accuracy | 95.7 | 95.7 |
| Model for inference | 1.2G (.ckpt file) | 1.2G (.ckpt file) |
There is only one source of randomness.
Seeds have already been set in train.py to avoid randomness in weight initialization.
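For reference, seeding typically involves calls like the following; this is only an illustration of the idea, not a copy of what train.py does, and the seed value is arbitrary.

```python
# Fix the Python, NumPy, and MindSpore global seeds so that weight
# initialization is reproducible across runs (illustrative seed value).
import random
import numpy as np
import mindspore

random.seed(1)
np.random.seed(1)
mindspore.set_seed(1)
```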
This model has been validated in the Ascend environment and has not been validated on CPU.
Please check the official homepage.