From 8eae3262f3f184dbdf8549ec43ab7f76a90210cd Mon Sep 17 00:00:00 2001
From: chenhaozhe <chenhaozhe1@huawei.com>
Date: Tue, 19 Jan 2021 10:49:10 +0800
Subject: [PATCH] Fix some descriptions of BERT and YOLOv3

---
 .../official/cv/yolov3_darknet53/README.md    | 87 ++++++++++---------
 model_zoo/official/nlp/bert/README.md         |  7 +-
 .../nlp/bert/src/bert_for_finetune.py         | 10 +++
 3 files changed, 59 insertions(+), 45 deletions(-)
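
Note on the bert_for_finetune.py docstring: it says that, unlike the built-in
loss_scale wrapper cell, BertFinetuneCell applies grad_clip before the
optimization. The ordering can be pictured with the plain-Python sketch below.
This is only an illustration under stated assumptions, not the MindSpore
implementation: `clip_by_norm`, `finetune_step`, the SGD update, and the
per-tensor norm clipping are all hypothetical stand-ins.

```python
import math


def clip_by_norm(grad, max_norm=1.0):
    """Scale one gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    return [g * max_norm / norm for g in grad]


def finetune_step(weights, scaled_grads, loss_scale, lr=0.1, max_norm=1.0):
    """One hypothetical update: unscale, check overflow, clip, then plain SGD.

    Returns (new_weights, overflow). On overflow the weights are unchanged.
    """
    # Undo the loss scaling that was applied before back-propagation.
    grads = [[g / loss_scale for g in grad] for grad in scaled_grads]
    # Skip the update entirely if any gradient is not finite.
    overflow = any(not math.isfinite(g) for grad in grads for g in grad)
    if overflow:
        return weights, True
    # Clip first, then hand the clipped gradients to the (assumed SGD) optimizer.
    grads = [clip_by_norm(grad, max_norm) for grad in grads]
    new_weights = [
        [w - lr * g for w, g in zip(weight, grad)]
        for weight, grad in zip(weights, grads)
    ]
    return new_weights, False
```

With a loss scale of 1024, a scaled gradient of [3072.0, 4096.0] unscales to
[3.0, 4.0], is clipped to unit norm, and only then reaches the update step.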

diff --git a/model_zoo/official/cv/yolov3_darknet53/README.md b/model_zoo/official/cv/yolov3_darknet53/README.md
index 9181e11607..b366a9398b 100644
--- a/model_zoo/official/cv/yolov3_darknet53/README.md
+++ b/model_zoo/official/cv/yolov3_darknet53/README.md
@@ -1,26 +1,27 @@
 # Contents
 
-- [YOLOv3-DarkNet53 Description](#yolov3-darknet53-description)
-- [Model Architecture](#model-architecture)
-- [Dataset](#dataset)
-- [Environment Requirements](#environment-requirements)
-- [Quick Start](#quick-start)
-- [Script Description](#script-description)
-    - [Script and Sample Code](#script-and-sample-code)
-    - [Script Parameters](#script-parameters)
-    - [Training Process](#training-process)
-        - [Training](#training)
-        - [Distributed Training](#distributed-training)  
-    - [Evaluation Process](#evaluation-process)
-        - [Evaluation](#evaluation)
-- [Model Description](#model-description)
-    - [Performance](#performance)  
-        - [Evaluation Performance](#evaluation-performance)
-        - [Inference Performance](#evaluation-performance)
-- [Description of Random Situation](#description-of-random-situation)
-- [ModelZoo Homepage](#modelzoo-homepage)
-
-# [YOLOv3-DarkNet53 Description](#contents)
+- [Contents](#contents)
+    - [YOLOv3-DarkNet53 Description](#yolov3-darknet53-description)
+    - [Model Architecture](#model-architecture)
+    - [Dataset](#dataset)
+    - [Environment Requirements](#environment-requirements)
+    - [Quick Start](#quick-start)
+    - [Script Description](#script-description)
+        - [Script and Sample Code](#script-and-sample-code)
+        - [Script Parameters](#script-parameters)
+        - [Training Process](#training-process)
+            - [Training](#training)
+            - [Distributed Training](#distributed-training)
+        - [Evaluation Process](#evaluation-process)
+            - [Evaluation](#evaluation)
+    - [Model Description](#model-description)
+        - [Performance](#performance)
+            - [Evaluation Performance](#evaluation-performance)
+            - [Inference Performance](#inference-performance)
+    - [Description of Random Situation](#description-of-random-situation)
+    - [ModelZoo Homepage](#modelzoo-homepage)
+
+## [YOLOv3-DarkNet53 Description](#contents)
 
 You only look once (YOLO) is a state-of-the-art, real-time object detection system. YOLOv3 is extremely fast and accurate.
 
@@ -32,11 +33,11 @@ YOLOv3 uses a few tricks to improve training and increase performance, including
 [Paper](https://pjreddie.com/media/files/papers/YOLOv3.pdf):  YOLOv3: An Incremental Improvement. Joseph Redmon, Ali Farhadi,
 University of Washington
 
-# [Model Architecture](#contents)
+## [Model Architecture](#contents)
 
 YOLOv3 use DarkNet53 for performing feature extraction, which is a hybrid approach between the network used in YOLOv2, Darknet-19, and that newfangled residual network stuff. DarkNet53 uses successive 3 × 3 and 1 × 1 convolutional layers and has some shortcut connections as well and is significantly larger. It has 53 convolutional layers.
 
-# [Dataset](#contents)
+## [Dataset](#contents)
 
 Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.
 
@@ -47,19 +48,19 @@ Dataset used: [COCO2014](https://cocodataset.org/#download)
     - Val：6GM, 40,504 images
     - Annotations: 241M, Train/Val annotations
 - Data format：zip files
-  - Note：Data will be processed in yolo_dataset.py, and unzip files before uses it.
+    - Note：Data will be processed in yolo_dataset.py. Unzip the files before using them.
 
-# [Environment Requirements](#contents)
+## [Environment Requirements](#contents)
 
 - Hardware（Ascend/GPU）
 - Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend  , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
 - Framework
     - [MindSpore](https://www.mindspore.cn/install/en)
 - For more information, please check the resources below：
-  - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
-  - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
+    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
+    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
 
-# [Quick Start](#contents)
+## [Quick Start](#contents)
 
 After installing MindSpore via the official website, you can start training and evaluation in as follows. If running on GPU, please add `--device_target=GPU` in the python command or use the "_gpu" shell script ("xxx_gpu.sh").
 
@@ -101,9 +102,9 @@ python eval.py \
 sh run_eval.sh dataset/coco2014/ checkpoint/0-319_102400.ckpt
 ```
 
-# [Script Description](#contents)
+## [Script Description](#contents)
 
-## [Script and Sample Code](#contents)
+### [Script and Sample Code](#contents)
 
 ```contents
 .
@@ -134,7 +135,7 @@ sh run_eval.sh dataset/coco2014/ checkpoint/0-319_102400.ckpt
   └─train.py                          # train net
 ```
 
-## [Script Parameters](#contents)
+### [Script Parameters](#contents)
 
 ```parameters
 Major parameters in train.py as follow.
@@ -197,9 +198,9 @@ optional arguments:
                         Resize rate for multi-scale training. Default: None
 ```
 
-## [Training Process](#contents)
+### [Training Process](#contents)
 
-### Training
+#### Training
 
 ```command
 python train.py \
@@ -230,7 +231,7 @@ After training, you'll get some checkpoint files under the outputs folder by def
 
 The model checkpoint will be saved in outputs directory.
 
-### Distributed Training
+#### Distributed Training
 
 For Ascend device, distributed training example(8p) by shell script
 
@@ -261,9 +262,9 @@ epoch[319], iter[102300], loss:31.952403, 496.02 imgs/sec, lr:2.409552052995423e
 ...
 ```
 
-## [Evaluation Process](#contents)
+### [Evaluation Process](#contents)
 
-### Evaluation
+#### Evaluation
 
 Before running the command below. If running on GPU, please add `--device_target=GPU` in the python command or use the "_gpu" shell script ("xxx_gpu.sh").
 
@@ -278,6 +279,8 @@ sh run_eval.sh dataset/coco2014/ checkpoint/0-319_102400.ckpt
 
 The above python command will run in the background. You can view the results through the file "log.txt". The mAP of the test dataset will be as follows:
 
+This is the standard format from `pycocotools`; refer to [cocodataset](https://cocodataset.org/#detection-eval) for more detail.
+
 ```eval log
 # log.txt
 =============coco eval reulst=========
@@ -295,11 +298,11 @@ The above python command will run in the background. You can view the results th
  Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.551
 ```
 
-# [Model Description](#contents)
+## [Model Description](#contents)
 
-## [Performance](#contents)
+### [Performance](#contents)
 
-### Evaluation Performance
+#### Evaluation Performance
 
 | Parameters                 | YOLO                                                        |YOLO                                                         |
 | -------------------------- | ----------------------------------------------------------- |------------------------------------------------------------ |
@@ -319,7 +322,7 @@ The above python command will run in the background. You can view the results th
 | Checkpoint for Fine tuning | 474M (.ckpt file)                                           | 474M (.ckpt file)                                           |
 | Scripts                    | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53 | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53 |
 
-### Inference Performance
+#### Inference Performance
 
 | Parameters          | YOLO                        |YOLO                          |
 | ------------------- | --------------------------- |------------------------------|
@@ -333,10 +336,10 @@ The above python command will run in the background. You can view the results th
 | Accuracy            | 8pcs: 31.1%                 | 8pcs: 29.7%~30.3% (shape=416)|
 | Model for inference | 474M (.ckpt file)           | 474M (.ckpt file)            |
 
-# [Description of Random Situation](#contents)
+## [Description of Random Situation](#contents)
 
 There are random seeds in distributed_sampler.py, transforms.py, yolo_dataset.py files.
 
-# [ModelZoo Homepage](#contents)
+## [ModelZoo Homepage](#contents)
 
  Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
diff --git a/model_zoo/official/nlp/bert/README.md b/model_zoo/official/nlp/bert/README.md
index 1fb895933f..41ce5e40c0 100644
--- a/model_zoo/official/nlp/bert/README.md
+++ b/model_zoo/official/nlp/bert/README.md
@@ -12,8 +12,8 @@
         - [Pre-Training](#pre-training)
         - [Fine-Tuning and Evaluation](#fine-tuning-and-evaluation)
     - [Options and Parameters](#options-and-parameters)
-        - [Options:](#options)
-        - [Parameters:](#parameters)
+        - [Options](#options)
+        - [Parameters](#parameters)
     - [Training Process](#training-process)
         - [Training](#training)
             - [Running on Ascend](#running-on-ascend)
@@ -380,7 +380,8 @@ config for lossscale and etc.
 ```text
 Parameters for dataset and network (Pre-Training/Fine-Tuning/Evaluation):
     seq_length                      length of input sequence: N, default is 128
-    vocab_size                      size of each embedding vector: N, must be consistant with the dataset you use. Default is 21136
+    vocab_size                      size of each embedding vector: N, must be consistent with the dataset you use. Default is 21128.
+                                    Usually, we use 21128 for CN vocabs and 30522 for EN vocabs according to the original paper.
     hidden_size                     size of bert encoder layers: N, default is 768
     num_hidden_layers               number of hidden layers: N, default is 12
     num_attention_heads             number of attention heads: N, default is 12
diff --git a/model_zoo/official/nlp/bert/src/bert_for_finetune.py b/model_zoo/official/nlp/bert/src/bert_for_finetune.py
index 3ef32cbc17..d8b37da579 100644
--- a/model_zoo/official/nlp/bert/src/bert_for_finetune.py
+++ b/model_zoo/official/nlp/bert/src/bert_for_finetune.py
@@ -50,6 +50,16 @@ def _tensor_grad_overflow(grad):
 class BertFinetuneCell(nn.Cell):
     """
     Especifically defined for finetuning where only four inputs tensor are needed.
+
+    Append an optimizer to the training network, after which the construct
+    function can be called to create the backward graph.
+
+    Unlike the built-in loss_scale wrapper cell, this cell applies grad_clip before the optimization.
+
+    Args:
+        network (Cell): The training network. Note that the loss function should already have been added.
+        optimizer (Optimizer): Optimizer for updating the weights.
+        scale_update_cell (Cell): Cell that updates the loss scale. Default: None.
     """
     def __init__(self, network, optimizer, scale_update_cell=None):