<!--TOC -->

- [Bayesian Graph Collaborative Filtering](#bayesian-graph-collaborative-filtering)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training](#training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
- [Model Description](#model-description)
    - [Performance](#performance)
- [Description of random situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)

<!--TOC -->
# [Bayesian Graph Collaborative Filtering](#contents)

Bayesian Graph Collaborative Filtering (BGCF) was proposed in 2020 by Sun J, Guo W, Zhang D, et al. By naturally incorporating the uncertainty in the user-item interaction graph, it shows excellent performance on the Amazon recommendation dataset. This is an example of training BGCF with the Amazon-Beauty dataset in MindSpore. More importantly, this is the first open-source version of BGCF.

[Paper](https://dl.acm.org/doi/pdf/10.1145/3394486.3403254): Sun J, Guo W, Zhang D, et al. A Framework for Recommending Accurate and Diverse Items Using Bayesian Graph Convolutional Neural Networks[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020: 2030-2039.
# [Model Architecture](#contents)

Specifically, BGCF contains two main modules. The first is sampling, which produces sample graphs by node copying. The second aggregates the sampled neighbors of each node, and consists of a mean aggregator and an attention aggregator. A sketch of the node-copying idea follows.
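The sketch below is illustrative only; the function name and data layout are assumptions, and the repo's actual sampling logic lives in its src/ code. The idea of node copying is that each node either keeps its own neighbor list or, with some probability, copies the neighbor list of a similar node, yielding a perturbed sample graph.

```python
import random

# Illustrative sketch of node copying, not the repo's implementation.
# adj: node id -> list of neighbor ids; similar: node id -> candidate donor nodes.
def node_copying_sample(adj, similar, epsilon=0.5):
    sampled = {}
    for node, neighbors in adj.items():
        donors = similar.get(node)
        if donors and random.random() > epsilon:
            donor = random.choice(donors)
            sampled[node] = list(adj[donor])  # copy a similar node's neighbors
        else:
            sampled[node] = list(neighbors)   # keep the original neighbors
    return sampled
```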
# [Dataset](#contents)

- Dataset size:

  Statistics of the dataset used are summarized below:

  |                 |           Amazon-Beauty |
  | --------------- | ----------------------: |
  | Task            |          Recommendation |
  | # Users         |          7068 (1 graph) |
  | # Items         |                    3570 |
  | # Interactions  |                   79506 |
  | # Training Data |                   60818 |
  | # Test Data     |                   18688 |
  | # Density       |                   0.315% |
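
  As a quick sanity check, the density row is just the number of interactions divided by the number of possible user-item pairs:

  ```python
  # Density of the Amazon-Beauty interaction graph (numbers from the table above).
  users, items, interactions = 7068, 3570, 79506
  density = interactions / (users * items)  # ≈ 0.00315, i.e. about 0.315%
  ```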
- Data Preparation
  - Place the dataset at any path you want; the folder should include the following files (we use the Amazon-Beauty dataset as an example):

    ```
    .
    └─data
      ├─ratings_Beauty.csv
    ```
  - Generate the dataset in MindRecord format for Amazon-Beauty:

    ```shell
    cd ./scripts
    # SRC_PATH is the path of the dataset file you downloaded.
    sh run_process_data_ascend.sh [SRC_PATH]
    ```
  - Launch:

    ```shell
    # Generate the dataset in MindRecord format for Amazon-Beauty.
    sh ./run_process_data_ascend.sh ./data
    ```

# [Features](#contents)
## [Mixed Precision](#contents)

To utilize the strong computation power of the Ascend chip and accelerate the training process, the mixed precision training method is used. MindSpore can cope with FP32 inputs and FP16 operators. In the BGCF example, the model is set to FP16 mode except for the loss calculation part.
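As a rough illustration of this pattern (a minimal sketch with toy cells, not the repo's exact wiring), a MindSpore cell can be cast to FP16 with `Cell.to_float` while the loss cell stays in FP32:

```python
import mindspore.nn as nn
import mindspore.common.dtype as mstype

# Toy cells standing in for the BGCF network and its loss.
net = nn.Dense(64, 64)
net.to_float(mstype.float16)      # run this cell's operators in FP16
loss_fn = nn.MSELoss()
loss_fn.to_float(mstype.float32)  # keep the loss calculation in FP32
```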
# [Environment Requirements](#contents)

- Hardware (Ascend)
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)

After installing MindSpore via the official website and generating the dataset correctly, you can start training and evaluation as follows.

- Running on Ascend

  ```shell
  # run the training example with the Amazon-Beauty dataset
  sh run_train_ascend.sh

  # run the evaluation example with the Amazon-Beauty dataset
  sh run_eval_ascend.sh
  ```
# [Script Description](#contents)

## [Script and Sample Code](#contents)

```shell
.
└─bgcf
  ├─README.md
  ├─scripts
  | ├─run_eval_ascend.sh          # Launch evaluation
  | ├─run_process_data_ascend.sh  # Generate dataset in MindRecord format
  | └─run_train_ascend.sh         # Launch training
  |
  ├─src
  | ├─bgcf.py                     # BGCF model
  | ├─callback.py                 # Callback function
  | ├─config.py                   # Training configurations
  | ├─dataset.py                  # Data preprocessing
  | ├─metrics.py                  # Recommendation metrics
  | └─utils.py                    # Utils for training BGCF
  |
  ├─eval.py                       # Evaluation net
  └─train.py                      # Train net
```
## [Script Parameters](#contents)

Parameters for both training and evaluation can be set in config.py.

- Config for the BGCF dataset

  ```python
  "learning_rate": 0.001,               # Learning rate
  "num_epochs": 600,                    # Number of training epochs
  "num_neg": 10,                        # Negative sampling rate
  "raw_neighs": 40,                     # Number of neighbors sampled in the raw graph
  "gnew_neighs": 20,                    # Number of neighbors sampled in each sample graph
  "input_dim": 64,                      # User and item embedding dimension
  "l2_coeff": 0.03,                     # L2 coefficient
  "neighbor_dropout": [0.0, 0.2, 0.3],  # Dropout ratio for each aggregation layer
  "num_graphs": 5                       # Number of sample graphs
  ```

  See config.py for more configuration details.
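For orientation, `num_neg` controls how many non-interacted items are drawn per observed user-item pair. A minimal sketch of that style of sampler follows; the helper is hypothetical, and the actual logic lives in src/dataset.py.

```python
import random

# Hypothetical helper illustrating negative sampling with num_neg = 10.
def sample_negatives(pos_items, num_items, num_neg=10):
    """Draw num_neg item ids the user has not interacted with."""
    negatives = []
    while len(negatives) < num_neg:
        item = random.randrange(num_items)
        if item not in pos_items:
            negatives.append(item)
    return negatives
```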
## [Training Process](#contents)

### Training

- Running on Ascend

  ```shell
  sh run_train_ascend.sh
  ```

  The training result will be stored in the scripts path, in a folder whose name begins with "train". You can find results like the following in the log.

  ```text
  Epoch 001 iter 12 loss 34696.242
  Epoch 002 iter 12 loss 34275.508
  Epoch 003 iter 12 loss 30620.635
  Epoch 004 iter 12 loss 21628.908
  ...
  Epoch 597 iter 12 loss 3662.3152
  Epoch 598 iter 12 loss 3640.7612
  Epoch 599 iter 12 loss 3654.9087
  Epoch 600 iter 12 loss 3632.4585
  ...
  ```
## [Evaluation Process](#contents)

### Evaluation

- Evaluation on Ascend

  ```shell
  sh run_eval_ascend.sh
  ```

  The evaluation result will be stored in the scripts path, in a folder whose name begins with "eval". You can find results like the following in the log.

  ```text
  epoch:020, recall_@10:0.07345, recall_@20:0.11193, ndcg_@10:0.05293, ndcg_@20:0.06613,
  sedp_@10:0.01393, sedp_@20:0.01126, nov_@10:6.95106, nov_@20:7.22280
  epoch:040, recall_@10:0.07410, recall_@20:0.11537, ndcg_@10:0.05387, ndcg_@20:0.06801,
  sedp_@10:0.01445, sedp_@20:0.01168, nov_@10:7.34799, nov_@20:7.58883
  epoch:060, recall_@10:0.07654, recall_@20:0.11987, ndcg_@10:0.05530, ndcg_@20:0.07015,
  sedp_@10:0.01474, sedp_@20:0.01206, nov_@10:7.46553, nov_@20:7.69436
  ...
  epoch:560, recall_@10:0.09825, recall_@20:0.14877, ndcg_@10:0.07176, ndcg_@20:0.08883,
  sedp_@10:0.01882, sedp_@20:0.01501, nov_@10:7.58045, nov_@20:7.79586
  epoch:580, recall_@10:0.09917, recall_@20:0.14970, ndcg_@10:0.07337, ndcg_@20:0.09037,
  sedp_@10:0.01896, sedp_@20:0.01504, nov_@10:7.57995, nov_@20:7.79439
  epoch:600, recall_@10:0.09926, recall_@20:0.15080, ndcg_@10:0.07283, ndcg_@20:0.09016,
  sedp_@10:0.01890, sedp_@20:0.01517, nov_@10:7.58277, nov_@20:7.80038
  ...
  ```
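The recall_@K and ndcg_@K metrics above follow the standard top-K definitions; src/metrics.py contains the versions actually used. A minimal sketch for orientation:

```python
import math

def recall_at_k(ranked, ground_truth, k):
    """Fraction of the user's test items that appear in the top-k list."""
    return len(set(ranked[:k]) & set(ground_truth)) / len(ground_truth)

def ndcg_at_k(ranked, ground_truth, k):
    """DCG of the top-k list, normalized by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in ground_truth)
    idcg = sum(1.0 / math.log2(i + 2)
               for i in range(min(len(ground_truth), k)))
    return dcg / idcg
```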
# [Model Description](#contents)

## [Performance](#contents)

| Parameter           | BGCF          |
| ------------------- | ------------- |
| Resource            | Ascend 910    |
| Uploaded Date       |               |
| MindSpore Version   |               |
| Dataset             | Amazon-Beauty |
| Training Parameters | epoch=600     |
| Optimizer           | Adam          |
| Loss Function       | BPR loss      |
| Recall@20           | 0.1534        |
| NDCG@20             | 0.0912        |
| Training Cost       | 25 min        |
| Scripts             |               |
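The BPR (Bayesian Personalized Ranking) loss listed in the table has the standard pairwise form -log σ(s_pos - s_neg). A minimal NumPy sketch, not the repo's implementation:

```python
import numpy as np

def bpr_loss(pos_scores, neg_scores):
    """Mean of -log(sigmoid(pos - neg)) over positive/negative score pairs."""
    diff = np.asarray(neg_scores) - np.asarray(pos_scores)
    return np.logaddexp(0.0, diff).mean()  # softplus(neg - pos), numerically stable
```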
# [Description of random situation](#contents)

The BGCF model contains many dropout operations. If you want to disable dropout, set neighbor_dropout to [0.0, 0.0, 0.0] in src/config.py.
# [ModelZoo Homepage](#contents)

Please check the official [homepage](http://gitee.com/mindspore/mindspore/tree/master/model_zoo).