# Contents

- [Thinking Path Re-Ranker](#thinking-path-re-ranker)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training](#training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
- [Model Description](#model-description)
    - [Performance](#performance)
- [Description of random situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [Thinking Path Re-Ranker](#contents)

Thinking Path Re-Ranker (TPRR) was proposed in 2021 by Huawei Poisson Lab & Parallel Distributed Computing Lab. By combining retriever, reranker and reader modules, TPRR achieves excellent performance on open-domain multi-hop question answering, and it won first place on the official HotpotQA leaderboard. This is an example of evaluating TPRR on the HotPotQA dataset in MindSpore. More importantly, this is the first open-source version of TPRR.
# [Model Architecture](#contents)

Specifically, TPRR contains three main modules. The first is the retriever, which iteratively generates document sequences for each hop. The second is the reranker, which selects the best path from the candidate paths produced by the retriever. The last is the reader, which extracts answer spans.
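As an illustration only, the retriever → reranker → reader flow can be sketched with a toy word-overlap scorer standing in for TPRR's BERT encoders. Every function and the scoring rule below are assumptions for illustration, not the actual TPRR code (which lives in `src/onehop.py`, `src/twohop.py` and `retriever_eval.py`):

```python
# Toy sketch of the three-stage pipeline; word overlap stands in for
# the BERT-based relevance scores used by the real model.

def overlap(question, text):
    """Count shared words between question and text (toy relevance score)."""
    return len(set(question.split()) & set(text.split()))

def retrieve(question, corpus, hops=2, topk=8):
    """Iteratively expand document paths, keeping the top-k docs per hop."""
    paths = [()]
    for _ in range(hops):
        expanded = []
        for path in paths:
            ranked = sorted((d for d in corpus if d not in path),
                            key=lambda d: overlap(question, d), reverse=True)
            expanded.extend(path + (d,) for d in ranked[:topk])
        paths = expanded
    return paths

def rerank(question, paths):
    """Pick the candidate path whose concatenated text best matches the question."""
    return max(paths, key=lambda p: overlap(question, " ".join(p)))

def read(question, path):
    """Trivial 'reader': return the best-matching sentence of the chosen path."""
    return max(" ".join(path).split(". "), key=lambda s: overlap(question, s))
```

For example, `rerank(q, retrieve(q, corpus))` yields the best two-hop path for a question `q`, and `read` then extracts the answer text from it.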
# [Dataset](#contents)

The retriever dataset consists of three parts:

- Wikipedia data: the 2017 English Wikipedia dump with bidirectional hyperlinks.
- dev data: HotPotQA full-wiki-setting dev data with 7398 question-answer pairs.
- dev tf-idf data: the candidates for each question in the dev data, i.e. the top-500 paragraphs retrieved from the 5M Wikipedia paragraphs by TF-IDF.
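The tf-idf candidate file is a preprocessing artifact. A minimal pure-Python sketch of the idea (the function name and scoring details are hypothetical; the real pipeline ranks 5M paragraphs and keeps the top 500 per question):

```python
import math
from collections import Counter

def tfidf_topk(question, paragraphs, k=500):
    """Rank paragraphs for a question by a simple TF-IDF dot product
    and return the indices of the top-k candidates."""
    docs = [p.lower().split() for p in paragraphs]
    n = len(docs)
    # Document frequency and inverse document frequency per word.
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(n / df[w]) for w in df}
    q_terms = Counter(question.lower().split())

    def score(doc):
        tf = Counter(doc)
        return sum(q_terms[w] * tf[w] * idf.get(w, 0.0) for w in q_terms)

    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
    return ranked[:k]
```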
# [Features](#contents)

## [Mixed Precision](#contents)

To utilize the strong computation power of the Ascend chip and accelerate evaluation, the mixed-precision method is used. MindSpore can handle FP32 inputs together with FP16 operators. In the TPRR example, the model is set to FP16 mode for the matmul calculation part.
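The effect can be illustrated outside MindSpore with NumPy (a conceptual sketch, not the model's actual code): the tensors stay FP32 at the interface while the matmul itself runs in half precision.

```python
import numpy as np

def fp16_matmul(a, b):
    """Run the matmul in FP16 but keep the FP32 interface,
    mirroring the mixed-precision setup described above."""
    return (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float32)

a = np.random.rand(4, 8).astype(np.float32)
b = np.random.rand(8, 4).astype(np.float32)
out = fp16_matmul(a, b)
assert out.dtype == np.float32
# The half-precision result stays close to the full-precision one.
assert np.allclose(out, a @ b, rtol=1e-2)
```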
# [Environment Requirements](#contents)

- Hardware (Ascend)
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)

After installing MindSpore via the official website and generating the dataset correctly, you can start training and evaluation as follows.

- running on Ascend

```shell
# run the evaluation example with the HotPotQA dev dataset
sh run_eval_ascend.sh
```
# [Script Description](#contents)

## [Script and Sample Code](#contents)

```shell
.
└─tprr
  ├─README.md
  ├─scripts
  | ├─run_eval_ascend.sh          # Launch evaluation in Ascend
  |
  ├─src
  | ├─config.py                   # Evaluation configurations
  | ├─onehop.py                   # One-hop model
  | ├─onehop_bert.py              # One-hop BERT model
  | ├─process_data.py             # Data preprocessing
  | ├─twohop.py                   # Two-hop model
  | ├─twohop_bert.py              # Two-hop BERT model
  | └─utils.py                    # Utils for evaluation
  |
  └─retriever_eval.py             # Evaluation net for retriever
```
## [Script Parameters](#contents)

Parameters for evaluation can be set in config.py.

- config for the TPRR retriever dataset

```python
"q_len": 64,        # Max query length
"d_len": 192,       # Max doc length
"s_len": 448,       # Max sequence length
"in_len": 768,      # Input dim
"out_len": 1,       # Output dim
"num_docs": 500,    # Number of candidate docs
"topk": 8,          # Top k
"onehop_num": 8     # Number of one-hop docs used as two-hop neighbors
```

See config.py for more configuration details.
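These values fit together: one 64-token query plus two 192-token documents exactly fills the 448-token sequence budget (the arithmetic is from the numbers above; the interpretation is an assumption):

```python
# The documented retriever settings from src/config.py.
cfg = {
    "q_len": 64, "d_len": 192, "s_len": 448,
    "in_len": 768, "out_len": 1,
    "num_docs": 500, "topk": 8, "onehop_num": 8,
}

# One query plus two documents fills the sequence budget exactly.
assert cfg["q_len"] + 2 * cfg["d_len"] == cfg["s_len"]
# The scorer maps a 768-dim encoding to a single relevance score.
assert (cfg["in_len"], cfg["out_len"]) == (768, 1)
```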
## [Evaluation Process](#contents)

### Evaluation

- Evaluation on Ascend

```shell
sh run_eval_ascend.sh
```

The evaluation result will be stored in the scripts path, in a folder whose name begins with "eval". You can find results like the following in the log.
```text
###step###: 0
val: 0
count: 1
true count: 0
PEM: 0.0
...
###step###: 7396
val: 6796
count: 7397
true count: 6924
PEM: 0.9187508449371367
true top8 PEM: 0.9815135759676488
evaluation time (h): 20.155506462653477
```
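The final metrics are plain ratios of the logged counters (reading `val` as the number of matched predictions, which is how the numbers above reproduce):

```python
# Counters from the last log step above.
val, count, true_count = 6796, 7397, 6924

pem = val / count                  # over all evaluated questions
true_top8_pem = val / true_count   # over questions whose gold docs
                                   # appear in the top-8 candidates

assert abs(pem - 0.9187508449371367) < 1e-9
assert abs(true_top8_pem - 0.9815135759676488) < 1e-9
```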
# [Model Description](#contents)

## [Performance](#contents)

### Inference Performance

| Parameter         | TPRR Ascend                 |
| ----------------- | --------------------------- |
| Model Version     | TPRR                        |
| Resource          | Ascend 910                  |
| Uploaded Date     | 03/12/2021 (month/day/year) |
| MindSpore Version | 1.2.0                       |
| Dataset           | HotPotQA                    |
| Batch_size        | 1                           |
| Output            | inference path              |
| PEM               | 0.9188                      |
# [Description of random situation](#contents)

There is no random situation in evaluation.

# [ModelZoo Homepage](#contents)

Please check the official [homepage](http://gitee.com/mindspore/mindspore/tree/master/model_zoo).