FastText is a simple and efficient text classification algorithm, proposed by Armand Joulin, Tomas Mikolov et al. in the 2016 paper "Bag of Tricks for Efficient Text Classification". It is widely used in various tasks of text classification.
[Paper](https://arxiv.org/pdf/1607.01759.pdf): "Bag of Tricks for Efficient Text Classification", 2016, A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov
## [Model Structure](#contents)
The FastText model consists mainly of an input layer, a hidden layer and an output layer, where the input is a sequence of words (a text or sentence).
The output layer gives the probability that the word sequence belongs to each category. The hidden layer is formed by averaging the input word vectors:
the features are mapped to the hidden layer through a linear transformation, and then mapped from the hidden layer to the labels.
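For intuition, here is a minimal NumPy sketch of this structure; the class and parameter names are illustrative, not the repository's MindSpore implementation:

```python
import numpy as np

class FastTextSketch:
    """Illustrative FastText classifier: embed, average, linear map, softmax."""

    def __init__(self, vocab_size, embedding_dims, num_class, seed=0):
        rng = np.random.default_rng(seed)
        # input layer: one embedding vector per word/ngram id
        self.embedding = rng.normal(0.0, 0.1, (vocab_size, embedding_dims))
        # output layer: linear map from the hidden layer to class scores
        self.weight = rng.normal(0.0, 0.1, (embedding_dims, num_class))

    def forward(self, token_ids):
        # hidden layer: the average of the input word vectors
        hidden = self.embedding[token_ids].mean(axis=0)
        logits = hidden @ self.weight
        # softmax turns the scores into per-category probabilities
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()

model = FastTextSketch(vocab_size=1000, embedding_dims=16, num_class=4)
print(model.forward([3, 17, 256, 42]))  # a toy token-id sequence
```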
## [Dataset](#contents)
Note that you can run the scripts with the datasets mentioned in the original paper or with datasets widely used in the relevant domain/network
architecture. The following sections describe how to run the scripts using the datasets below.
- AG's News Topic Classification Dataset
- DBPedia Ontology Classification Dataset
- Yelp Review Polarity Dataset
## [Environment Requirements](#content)
- Hardware(Ascend/GPU)
    - Prepare a hardware environment with an Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get access to the resources.
After dataset preparation, you can start training and evaluation as follows:
```bash
cd ./scripts

# distributed training example on Ascend
sh run_distributed_train.sh [TRAIN_DATASET] [RANK_TABLE_PATH]

# evaluation example
sh run_eval.sh [EVAL_DATASET_PATH] [DATASET_NAME] [MODEL_CKPT] [DEVICEID]
```
## [Script Description](#content)
The structure of the FastText scripts and code is as follows:
```text
├── scripts
│   ├──run_distributed_train.sh      // shell script for distributed training on Ascend.
│   ├──run_eval.sh                   // shell script for standalone evaluation on Ascend.
│   ├──run_standalone_train.sh       // shell script for standalone training on Ascend.
│   ├──run_distributed_train_gpu.sh  // shell script for distributed training on GPU.
│   ├──run_eval_gpu.sh               // shell script for standalone evaluation on GPU.
│   ├──run_standalone_train_gpu.sh   // shell script for standalone training on GPU.
├── eval.py // Infer API entry.
├── requirements.txt // Requirements of third party package.
├── train.py // Train API entry.
```
### [Dataset Preparation](#content)
- Download the AG's News Topic Classification Dataset, the DBPedia Ontology Classification Dataset and the Yelp Review Polarity Dataset, and unzip them to any path you want.
- Run the following script to preprocess the data and convert it to mindrecord format for training and evaluation:

```bash
sh creat_dataset.sh [SOURCE_DATASET_PATH] [DATASET_NAME]
```
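For example (the dataset name `ag` below is an assumption; check `creat_dataset.sh` for the accepted values):

```bash
cd ./scripts
sh creat_dataset.sh /path/to/ag_news_csv ag
```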
### [Configuration File](#content)
Parameters for both training and evaluation can be set in `config.py`. All datasets use the same parameter names; parameter values can be changed according to your needs.
- Network Parameters
```text
vocab_size # vocabulary size.
buckets # bucket sequence length.
test_buckets # test dataset bucket sequence length.
batch_size # batch size of input dataset.
embedding_dims # The size of each embedding vector.
num_class # number of labels.
epoch # total training epochs.
lr # initial learning rate.
min_lr # minimum learning rate.
warmup_steps # warm up steps.
poly_lr_scheduler_power # a value used to calculate the decayed learning rate.
```
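For intuition, here is a minimal Python sketch of how these learning-rate parameters could interact in a warmup-plus-polynomial-decay schedule. The exact formula is an assumption; see the training code for the authoritative implementation:

```python
def poly_decay_lr(step, total_steps, lr, min_lr, warmup_steps, power):
    """Hedged sketch: linear warmup, then polynomial decay down to min_lr."""
    if step < warmup_steps:
        # linear warmup from ~0 up to the initial learning rate
        return lr * (step + 1) / warmup_steps
    # polynomial decay from lr to min_lr over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return (lr - min_lr) * (1.0 - progress) ** power + min_lr
```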
### [Training Process](#content)
- Running on Ascend
- Start task training on a single device by running the shell script, as in the sketch below:
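A sketch of the likely command, assuming the Ascend script mirrors its GPU counterpart below; check `run_standalone_train.sh` for the exact arguments:

```bash
cd ./scripts
# assumed arguments, by analogy with run_standalone_train_gpu.sh
sh run_standalone_train.sh [DATASET_PATH]
```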
Note: `DATASET_PATH` is the path to the mindrecord files, e.g. `/dataset_path/*.mindrecord`.
- Run the script for distributed training of FastText. To train the model on multiple devices, execute the following commands in bash from `scripts/`:
```bash
cd ./scripts
sh run_distributed_train.sh [DATASET_PATH] [RANK_TABLE_PATH]
```
- Running on GPU
- Start task training on a single device by running the shell script:
```bash
cd ./scripts
sh run_standalone_train_gpu.sh [DATASET_PATH]
```
- Run the script for distributed training of FastText. To train the model on multiple devices, execute the following commands in bash from `scripts/`:
```bash
cd ./scripts
sh run_distributed_train_gpu.sh [DATASET_PATH] [NUM_OF_DEVICES]
```
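For example, to train with 8 GPU devices (the dataset path below is a placeholder):

```bash
cd ./scripts
sh run_distributed_train_gpu.sh /dataset_path/ag_train.mindrecord 8
```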
### [Inference Process](#content)
- Running on Ascend
- Run the script for evaluation of FastText. The command is as follows:
```bash
cd ./scripts
sh run_eval.sh [DATASET_PATH] [DATASET_NAME] [MODEL_CKPT]
```
Note: `DATASET_PATH` is the path to the mindrecord files, e.g. `/dataset_path/*.mindrecord`.
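For example (the paths and the dataset name `ag` below are placeholders; check the scripts for the accepted names):

```bash
cd ./scripts
sh run_eval.sh /dataset_path/ag_test.mindrecord ag ./ckpt/fasttext.ckpt
```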
- Running on GPU
- Run the script for evaluation of FastText. The command is as follows:
```bash
cd ./scripts
sh run_eval_gpu.sh [DATASET_PATH] [DATASET_NAME] [MODEL_CKPT]
```
Note: `DATASET_PATH` is the path to the mindrecord files, e.g. `/dataset_path/*.mindrecord`.