
DenseNet121 Description

DenseNet121 is a convolution-based neural network for image classification. The paper describing the model can be found here. Huawei's DenseNet121 is an implementation on MindSpore.

The repository also contains scripts to launch training and inference routines.

Model Architecture

DenseNet121 is built from 4 densely connected blocks. In every dense block, each layer obtains additional inputs from all preceding layers and passes its own feature maps on to all subsequent layers. The feature maps are combined by concatenation, so each layer receives the "collective knowledge" of all preceding layers.
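
The effect of concatenation can be illustrated by tracking channel counts through the network. The sketch below is not taken from src/network/densenet.py. It assumes the standard DenseNet-121 configuration from the original paper (64 initial channels, growth rate 32, block sizes 6/12/24/16, and transition layers that halve the channel count), none of which this README states, and the function names are hypothetical.

```python
def dense_block_channels(in_channels, num_layers, growth_rate):
    """Return the input width seen by each layer and the block's output width.

    Every layer emits `growth_rate` new feature maps, which are concatenated
    onto everything produced so far, so the width grows linearly.
    """
    widths = []
    channels = in_channels
    for _ in range(num_layers):
        widths.append(channels)   # this layer sees all preceding feature maps
        channels += growth_rate   # its output is concatenated onto them
    return widths, channels


def densenet_channel_trace(block_sizes=(6, 12, 24, 16), init_channels=64,
                           growth_rate=32):
    """Trace the output width of each dense block (assumed configuration)."""
    trace = []
    channels = init_channels
    for i, num_layers in enumerate(block_sizes):
        _, channels = dense_block_channels(channels, num_layers, growth_rate)
        trace.append(channels)
        if i < len(block_sizes) - 1:
            channels //= 2        # assumed transition layer halves the width
    return trace
```

Under these assumptions, densenet_channel_trace() returns [256, 512, 1024, 1024], so the final dense block hands 1024 channels to the classifier.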

Dataset

Dataset used: ImageNet
The default dataset configuration is as follows:

Training Dataset preprocess:
- Input size of images is 224*224
- Range (min, max) of the cropped size relative to the original size is (0.08, 1.0)
- Range (min, max) of the aspect ratio of the crop is (0.75, 1.333)
- Probability of flipping the image is 0.5
- Randomly adjust brightness, contrast, and saturation with factors (0.4, 0.4, 0.4)
- Normalize the input image with respect to mean and standard deviation
Test Dataset preprocess:
- Input size of images is 224*224 (resize to 256*256, then crop at the center)
- Normalize the input image with respect to mean and standard deviation
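
The crop geometry described above can be sketched in plain Python. This is an illustration only, not the repository's dataset code: it assumes that the (0.08, 1.0) range is a fraction of the original image area and that the aspect ratio is width/height, as in common random-resized-crop implementations. Both function names are hypothetical.

```python
import math
import random


def sample_random_resized_crop(width, height, scale=(0.08, 1.0),
                               ratio=(0.75, 1.333), max_attempts=10,
                               rng=random):
    """Sample a training crop box as (left, top, crop_width, crop_height).

    Assumption: `scale` is a fraction of the image area and `ratio` is
    width / height. The crop would afterwards be resized to 224*224.
    """
    area = width * height
    for _ in range(max_attempts):
        target_area = area * rng.uniform(scale[0], scale[1])
        # Sample the aspect ratio log-uniformly so wide and tall crops
        # are equally likely.
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            left = rng.randint(0, width - w)
            top = rng.randint(0, height - h)
            return left, top, w, h
    # Fallback: keep the whole image if no valid box was found.
    return 0, 0, width, height


def center_crop_box(size=256, crop=224):
    """Test-time crop box (left, top, right, bottom) after resizing to `size`."""
    offset = (size - crop) // 2
    return offset, offset, offset + crop, offset + crop
```

For the test pipeline, center_crop_box() gives (16, 16, 240, 240), which is the central 224*224 window of the 256*256 resized image.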

Features

Mixed Precision

The mixed precision training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the MindSpore backend automatically handles it with reduced precision. Users can check which operators run at reduced precision by enabling INFO logging and searching the log for 'reduce precision'.
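
To make the precision trade-off concrete, the sketch below rounds a few values to half and single precision using only the Python standard library. It illustrates the number formats themselves, not how MindSpore casts operators internally, and the helper names are hypothetical.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_fp32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


if __name__ == "__main__":
    # Compare how each format represents an exact, an inexact and a tiny value.
    for x in (1.0, 0.1, 1e-8):
        print(f"value={x!r} fp32={to_fp32(x)!r} fp16={to_fp16(x)!r}")
```

A value as small as 1e-8 survives in FP32 but rounds to zero in FP16, which is one reason mixed precision keeps part of the computation in single precision.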

Environment Requirements

Hardware (Ascend)
- Prepare a hardware environment with an Ascend AI processor. If you want to try Ascend, please send the application form to ascend@huawei.com. Once approved, you can get the resources.
Framework
- MindSpore
For more information, please check the resources below:
- MindSpore Tutorials
- MindSpore Python API

Quick Start

After installing MindSpore via the official website, you can start training and evaluation as follows:

# run training example
python train.py --data_dir /PATH/TO/DATASET --pretrained /PATH/TO/PRETRAINED_CKPT --is_distributed 0 > train.log 2>&1 & 

# run distributed training example
sh scripts/run_distribute_train.sh 8 rank_table.json /PATH/TO/DATASET /PATH/TO/PRETRAINED_CKPT

# run evaluation example
python eval.py --data_dir /PATH/TO/DATASET --pretrained /PATH/TO/CHECKPOINT > eval.log 2>&1 & 
OR
sh scripts/run_distribute_eval.sh 8 rank_table.json /PATH/TO/DATASET /PATH/TO/CHECKPOINT

For distributed training, an HCCL configuration file in JSON format needs to be created in advance.

Please follow the instructions in the link below:

https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools

Script Description

Script and Sample Code

├── model_zoo
    ├── README.md                          // descriptions about all the models
    ├── densenet121        
        ├── README.md                    // descriptions about densenet121
        ├── scripts 
        │   ├── run_distribute_train.sh             // shell script for distributed training on Ascend
        │   ├── run_distribute_eval.sh              // shell script for evaluation on Ascend
        ├── src 
        │   ├── datasets             // dataset processing function
        │   ├── losses          
        │       ├──crossentropy.py            // densenet loss function
        │   ├── lr_scheduler           
        │       ├──lr_scheduler.py            // densenet learning rate schedule function
        │   ├── network            
        │       ├──densenet.py            // densenet architecture
        │   ├──optimizers            // densenet optimizer function
        │   ├──utils            
        │       ├──logging.py            // logging function
        │       ├──var_init.py            // densenet variable init function
        │   ├── config.py             // network config
        ├── train.py               // training script 
        ├── eval.py               //  evaluation script

Script Parameters

You can modify the training behaviour through the various flags in the train.py script. Flags in the train.py script are as follows:

  --data_dir              train data dir
  --num_classes           num of classes in dataset (default: 1000)
  --image_size            image size of the dataset
  --per_batch_size        mini-batch size per device (default: 256)
  --pretrained            path of pretrained model
  --lr_scheduler          type of LR schedule: exponential, cosine_annealing
  --lr                    initial learning rate
  --lr_epochs             epoch milestones at which lr changes
  --lr_gamma              factor by which the exponential lr_scheduler decreases lr
  --eta_min               eta_min in cosine_annealing scheduler
  --T_max                 T_max in cosine_annealing scheduler
  --max_epoch             max epoch num to train the model
  --warmup_epochs         warmup epochs (when batch size is large)
  --weight_decay          weight decay (default: 1e-4)
  --momentum              momentum (default: 0.9)
  --label_smooth          whether to use label smoothing in CE
  --label_smooth_factor   smoothing strength applied to the original one-hot
  --log_interval          logging interval (default: 100)
  --ckpt_path             path to save checkpoint
  --ckpt_interval         the interval to save checkpoint
  --is_save_on_master     save checkpoint on master or on all ranks
  --is_distributed        whether to use multiple devices (default: 1)
  --rank                  local rank of distributed (default: 0)
  --group_size            world size of distributed (default: 1)
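
The --lr, --eta_min, --T_max and --warmup_epochs flags suggest a linear warmup followed by cosine annealing. The sketch below uses the standard cosine annealing formula at per-epoch granularity. Whether src/lr_scheduler/lr_scheduler.py updates per step or per epoch, and the exact warmup shape, are assumptions here, and warmup_cosine_lr is a hypothetical helper.

```python
import math


def warmup_cosine_lr(epoch, base_lr, t_max, eta_min=0.0, warmup_epochs=0):
    """Learning rate for a given zero-based epoch (illustrative only).

    During warmup the rate rises linearly to `base_lr`. Afterwards it follows
    eta_min + (base_lr - eta_min) * (1 + cos(pi * epoch / t_max)) / 2.
    """
    if epoch < warmup_epochs:
        # Assumed linear warmup that reaches base_lr at the last warmup epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


if __name__ == "__main__":
    # base_lr=0.1 is an illustrative value; t_max=120 matches the 120 epochs
    # listed in the accuracy table.
    for epoch in (0, 30, 60, 90, 120):
        print(epoch, warmup_cosine_lr(epoch, base_lr=0.1, t_max=120))
```

With no warmup, the rate starts at the base value, reaches the midpoint between base_lr and eta_min halfway through T_max, and decays to eta_min at epoch T_max.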

Training Process

Training

running on Ascend

python train.py --data_dir /PATH/TO/DATASET --pretrained /PATH/TO/PRETRAINED_CKPT --is_distributed 0 > train.log 2>&1 &

The python command above will run in the background. The log and model checkpoint will be generated in output/202x-xx-xx_time_xx_xx_xx/. The loss values will look like the following:

2020-08-22 16:58:56,617:INFO:epoch[0], iter[5003], loss:4.367, mean_fps:0.00 imgs/sec
2020-08-22 16:58:56,619:INFO:local passed
2020-08-22 17:02:19,920:INFO:epoch[1], iter[10007], loss:3.193, mean_fps:6301.11 imgs/sec
2020-08-22 17:02:19,921:INFO:local passed
2020-08-22 17:05:43,112:INFO:epoch[2], iter[15011], loss:3.096, mean_fps:6304.53 imgs/sec
2020-08-22 17:05:43,113:INFO:local passed
...
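
To track training progress, log lines in the format shown above can be parsed with a few lines of Python. The pattern is derived only from the sample lines in this README, so other log formats are not covered, and parse_train_log is a hypothetical helper rather than part of the repository.

```python
import re

# Matches lines such as:
# 2020-08-22 17:02:19,920:INFO:epoch[1], iter[10007], loss:3.193, mean_fps:6301.11 imgs/sec
LOG_PATTERN = re.compile(
    r"epoch\[(?P<epoch>\d+)\], iter\[(?P<iter>\d+)\], "
    r"loss:(?P<loss>[\d.]+), mean_fps:(?P<fps>[\d.]+) imgs/sec"
)


def parse_train_log(lines):
    """Extract epoch, iteration, loss and throughput from training log lines."""
    records = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines without metrics, e.g. "local passed"
        records.append({
            "epoch": int(match.group("epoch")),
            "iter": int(match.group("iter")),
            "loss": float(match.group("loss")),
            "fps": float(match.group("fps")),
        })
    return records
```

Passing the lines of train.log to parse_train_log yields one dictionary per epoch summary, which can then be plotted or checked for a decreasing loss.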

Distributed Training

running on Ascend

sh scripts/run_distribute_train.sh 8 rank_table.json /PATH/TO/DATASET /PATH/TO/PRETRAINED_CKPT

The above shell script will run distributed training in the background. You can find the log and model checkpoints under train[X]/output/202x-xx-xx_time_xx_xx_xx/. The loss values will look like the following:

2020-08-22 16:58:54,556:INFO:epoch[0], iter[5003], loss:3.857, mean_fps:0.00 imgs/sec
2020-08-22 17:02:19,188:INFO:epoch[1], iter[10007], loss:3.18, mean_fps:6260.18 imgs/sec
2020-08-22 17:05:42,490:INFO:epoch[2], iter[15011], loss:2.621, mean_fps:6301.11 imgs/sec
2020-08-22 17:09:05,686:INFO:epoch[3], iter[20015], loss:3.113, mean_fps:6304.37 imgs/sec
2020-08-22 17:12:28,925:INFO:epoch[4], iter[25019], loss:3.29, mean_fps:6303.07 imgs/sec
2020-08-22 17:15:52,167:INFO:epoch[5], iter[30023], loss:2.865, mean_fps:6302.98 imgs/sec
...

Evaluation Process

Evaluation

evaluation on Ascend

Run the command below for evaluation.

python eval.py --data_dir /PATH/TO/DATASET --pretrained /PATH/TO/CHECKPOINT > eval.log 2>&1 & 
OR
sh scripts/run_distribute_eval.sh 8 rank_table.json /PATH/TO/DATASET /PATH/TO/CHECKPOINT

The above python command will run in the background. You can view the results in the file "output/202x-xx-xx_time_xx_xx_xx/202x_xxxx.log". The accuracy on the test dataset will be reported as follows:

2020-08-24 09:21:50,551:INFO:after allreduce eval: top1_correct=37657, tot=49920, acc=75.43%
2020-08-24 09:21:50,551:INFO:after allreduce eval: top5_correct=46224, tot=49920, acc=92.60%
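
The percentages in the log follow directly from the correct and total counts. The sketch below shows one way to count top-k hits and convert them to a percentage. It is an illustration in plain Python, not the logic of eval.py, and both function names are hypothetical.

```python
def topk_correct(scores, labels, k):
    """Count samples whose true label is among the k highest-scoring classes."""
    correct = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            correct += 1
    return correct


def accuracy_percent(correct, total):
    """Convert a correct/total pair into a percentage."""
    return 100.0 * correct / total


if __name__ == "__main__":
    # Counts copied from the evaluation log above.
    print(f"top1: {accuracy_percent(37657, 49920):.2f}%")
    print(f"top5: {accuracy_percent(46224, 49920):.2f}%")
```

Applied to the counts in the log, this reproduces the reported 75.43% top-1 and 92.60% top-5 accuracy.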

Model Description

Performance

Training accuracy results

Parameters	Densenet
Model Version	Inception V1
Resource	Ascend 910
Uploaded Date	09/15/2020 (month/day/year)
MindSpore Version	1.0.0
Dataset	ImageNet
epochs	120
outputs	probability
accuracy	Top1:75.13%; Top5:92.57%

Training performance results

Parameters	Densenet
Model Version	Inception V1
Resource	Ascend 910
Uploaded Date	09/15/2020 (month/day/year)
MindSpore Version	1.0.0
Dataset	ImageNet
batch_size	32
outputs	probability
speed	1pc:760 img/s;8pc:6000 img/s
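
The two throughput figures can be turned into a scaling efficiency. The short calculation below assumes that "1pc" and "8pc" refer to 1 and 8 Ascend devices, which the table does not spell out, and scaling_efficiency is a hypothetical helper.

```python
def scaling_efficiency(single_device_fps, multi_device_fps, num_devices):
    """Ratio of measured multi-device throughput to ideal linear scaling."""
    return multi_device_fps / (single_device_fps * num_devices)


if __name__ == "__main__":
    # Throughput figures copied from the table above.
    efficiency = scaling_efficiency(760, 6000, 8)
    print(f"scaling efficiency: {efficiency:.1%}")
```

With 760 img/s on one device and 6000 img/s on eight, the efficiency is 6000 / 6080, or roughly 98.7% of linear scaling.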

Description of Random Situation

In dataset.py, we set the seed inside the "create_dataset" function. We also use a random seed in train.py.
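
The effect of fixing a seed can be shown with the Python standard library alone. This sketch does not reproduce how dataset.py or train.py seed MindSpore; it only illustrates that a fixed seed gives a repeatable shuffle order, and shuffled_indices is a hypothetical helper.

```python
import random


def shuffled_indices(num_samples, seed):
    """Return a shuffled index list that depends only on `seed`."""
    rng = random.Random(seed)  # private generator, leaves global state untouched
    indices = list(range(num_samples))
    rng.shuffle(indices)
    return indices


if __name__ == "__main__":
    # The same seed reproduces the same order across calls and runs.
    print(shuffled_indices(8, seed=1) == shuffled_indices(8, seed=1))
```

Two calls with the same seed return the same order, while the set of indices is unchanged by shuffling.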

ModelZoo Homepage

Please check the official homepage.
