  1. # Release 0.3.0-alpha
  2. ## Major Features and Improvements
  3. ### Ascend 910 Training and Inference Framework
  4. * New models
  5. * DeepFM: a factorization-machine based neural network for CTR prediction on Criteo dataset.
  6. * DeepLabV3: significantly improves over our previous DeepLab versions without DenseCRF post-processing and attains performance comparable to other state-of-the-art models on the PASCAL VOC 2007 semantic image segmentation benchmark.
  7. * Faster-RCNN: towards real-time object detection with region proposal networks on COCO 2017 dataset.
  8. * GoogLeNet: a deep convolutional neural network architecture codenamed Inception V1 for classification and detection on CIFAR-10 dataset.
  9. * Wide&Deep: jointly trained wide linear models and deep neural networks for recommender systems on Criteo dataset.
  10. * Frontend and User Interface
  11. * Complete NumPy-style advanced indexing. Supports reading and assigning values through tensor indices.
  12. * Some optimizers support separating parameter groups. Different parameter groups can set different `learning_rate` and `weight_decay`.
  13. * Support setting submodule's logging level independently, e.g. you can set logging level of module `A` to warning and set logging level of module `B` to info.
  14. * Support weights to be compiled according to shape to solve the problem of large memory overhead.
  15. * Add operator implementations and syntax support in PyNative mode for consistency with graph mode.
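
The parameter-group feature above can be pictured with a minimal, framework-free sketch. It is not MindSpore's optimizer API: `ParamGroup` and `sgd_step` are hypothetical names, and plain SGD with L2 weight decay is assumed purely for illustration. Only the idea of per-group `learning_rate` and `weight_decay` comes from the release note.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ParamGroup:
    """Hypothetical container: parameter names plus per-group hyperparameters."""
    names: List[str]
    learning_rate: float
    weight_decay: float


def sgd_step(params: Dict[str, float],
             grads: Dict[str, float],
             groups: List[ParamGroup]) -> Dict[str, float]:
    """Apply one SGD update, using each group's own learning_rate and weight_decay."""
    updated = dict(params)
    for group in groups:
        for name in group.names:
            # L2 weight decay adds weight_decay * w to the raw gradient.
            grad = grads[name] + group.weight_decay * params[name]
            updated[name] = params[name] - group.learning_rate * grad
    return updated


params = {"conv.weight": 1.0, "bias": 1.0}
grads = {"conv.weight": 0.5, "bias": 0.5}
groups = [
    ParamGroup(["conv.weight"], learning_rate=0.1, weight_decay=0.01),
    ParamGroup(["bias"], learning_rate=0.01, weight_decay=0.0),
]
new_params = sgd_step(params, grads, groups)
```

Here the convolution weight takes a larger, decayed step while the bias takes a smaller step with no decay, which is a common reason to separate the two groups.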
  16. * User interfaces change log
  17. * Support grouping parameters by learning rate and weight decay([!637](https://gitee.com/mindspore/mindspore/pulls/637))
  18. * Support weights to be compiled according to shape([!1015](https://gitee.com/mindspore/mindspore/pulls/1015))
  19. * Delete some context parameters([!1100](https://gitee.com/mindspore/mindspore/pulls/1100))
  20. * ImageSummary/ScalarSummary/TensorSummary/HistogramSummary([!1329](https://gitee.com/mindspore/mindspore/pulls/1329))([!1425](https://gitee.com/mindspore/mindspore/pulls/1425))
  21. * Executor and Performance Optimization
  22. * Support evaluation during training, so that training accuracy can be easily obtained.
  23. * Enable second-order optimization for ResNet50, which can achieve 75.9% accuracy in 45 epochs (ResNet50 @ImageNet).
  24. * Optimize the PyNative implementation and improve its execution performance.
  25. * Optimize summary record implementation and improve its performance.
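
Evaluation during training, as listed above, amounts to interleaving an evaluation pass with the training epochs. The sketch below is a generic illustration under assumptions, not MindSpore's implementation: `train_with_eval`, `train_step`, `evaluate`, and `eval_interval` are hypothetical names, and the toy "model" is a single number.

```python
from typing import Callable, List, Tuple


def train_with_eval(train_step: Callable[[int], None],
                    evaluate: Callable[[], float],
                    num_epochs: int,
                    eval_interval: int = 1) -> List[Tuple[int, float]]:
    """Run training epochs and record a metric every `eval_interval` epochs."""
    history = []
    for epoch in range(1, num_epochs + 1):
        train_step(epoch)
        if epoch % eval_interval == 0:
            history.append((epoch, evaluate()))
    return history


# Toy stand-in for a model: one weight that moves halfway to 1.0 each epoch.
state = {"w": 0.0}


def toy_train_step(epoch: int) -> None:
    state["w"] += 0.5 * (1.0 - state["w"])


def toy_evaluate() -> float:
    # Pretend "accuracy" is how close the weight is to its target.
    return 1.0 - abs(1.0 - state["w"])


history = train_with_eval(toy_train_step, toy_evaluate, num_epochs=4, eval_interval=2)
```

The returned history pairs each evaluated epoch with its metric, so accuracy can be tracked without a separate evaluation run after training.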
  26. * Data processing, augmentation, and save format
  27. * Support simple text processing, such as tokenizer/buildvocab/lookup.
  28. * Support padding batch.
  29. * Support splitting or concatenating datasets.
  30. * Support MindDataset reading from file list.
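
Padded batching, mentioned above, groups variable-length samples and pads each one to the longest sample in its batch. The following is a pure-Python sketch of that idea only. `padded_batch` and `pad_value` are hypothetical names and do not reflect the MindSpore dataset API.

```python
from typing import List, Sequence


def padded_batch(samples: Sequence[Sequence[int]],
                 batch_size: int,
                 pad_value: int = 0) -> List[List[List[int]]]:
    """Group samples into batches, padding each sample to its batch's max length."""
    batches = []
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        width = max(len(sample) for sample in chunk)
        batches.append(
            [list(sample) + [pad_value] * (width - len(sample)) for sample in chunk]
        )
    return batches


samples = [[1, 2, 3], [4], [5, 6]]
batches = padded_batch(samples, batch_size=2)
```

Padding per batch rather than to a global maximum keeps short batches small, at the cost of batch shapes that differ from one batch to the next.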
  31. ### Other Hardware Support
  32. * GPU platform
  33. * New models supported: MobileNetV2, MobileNetV3.
  34. * Support mixed precision training.
  35. * Support device memory swapping.
  36. ## Bugfixes
  37. * Python API
  38. * Fix an exception in the broadcast input data type check([!712](https://gitee.com/mindspore/mindspore/pulls/712))
  39. * Fix the issue of assignsub returning the value 0([!1036](https://gitee.com/mindspore/mindspore/pulls/1036))
  40. * Fix Conv2dBackpropInput bprop to return 3 items instead of 2([!1001](https://gitee.com/mindspore/mindspore/pulls/1001))
  41. * Fix sens shape error of TrainOneStepWithLossScaleCell([!1050](https://gitee.com/mindspore/mindspore/pulls/1050))
  42. * Fix BatchNormGrad operator([!1344](https://gitee.com/mindspore/mindspore/pulls/1344))
  43. * Executor
  44. * Fix dropout, topK and addn errors in PyNative mode ([!1285](https://gitee.com/mindspore/mindspore/pulls/1285), [!1138](https://gitee.com/mindspore/mindspore/pulls/1138), [!1033](https://gitee.com/mindspore/mindspore/pulls/1033)).
  45. * Fix memory leaks after execution in PyNative mode ([!1201](https://gitee.com/mindspore/mindspore/pulls/1201)).
  46. * Fix HCCL failure in some special scenes ([!1204](https://gitee.com/mindspore/dashboard/projects/mindspore/mindspore/pulls/1204), [!1252](https://gitee.com/mindspore/dashboard/projects/mindspore/mindspore/pulls/1252)).
  47. * Fix SSD network failure when Select cannot find kernel info([!1449](https://gitee.com/mindspore/dashboard/projects/mindspore/mindspore/pulls/1449)).
  48. * Fix Topk operator selection strategy bug between aicore and aicpu([!1367](https://gitee.com/mindspore/dashboard/projects/mindspore/mindspore/pulls/1367)).
  49. * Fix unequal input memory sizes of the 'assign' op in control sink mode when assigning data from one child graph to another([!802](https://gitee.com/mindspore/dashboard/projects/mindspore/mindspore/pulls/802)).
  50. * Fix allreduce ir inconsistency([!989](https://gitee.com/mindspore/dashboard/projects/mindspore/mindspore/pulls/989)).
  51. * GPU platform
  52. * Fix summary for gradient collection ([!1364](https://gitee.com/mindspore/mindspore/pulls/1364))
  53. * Fix the slice operator ([!1489](https://gitee.com/mindspore/mindspore/pulls/1489))
  54. * Data processing
  55. * Fix memory problems of GeneratorDataset of sub-process ([!907](https://gitee.com/mindspore/mindspore/pulls/907))
  56. * Fix data fetch timeout when training LeNet on the CIFAR-10 dataset([!1391](https://gitee.com/mindspore/mindspore/pulls/1391))
  57. ## Contributors
  58. Thanks go to these wonderful people:
  59. Alexey Shevlyakov, Amir Lashkari, anthony, baihuawei, biffex, buxue, caifubi, candanzg, caojian05, Cathy Wong, changzherui, chenfei, chengxianbin, chenhaozhe, chenzomi, chujinjin, cristoval, dengwentao, eric, etone-chan, fary86, gaojing, gengdongjie, gongchen, guohongzilong, guozhijian, heleiwang, hesham, He Wei, Hoai Linh Tran h00472437, hongxing, huangdongrun, huanghui, Jamie Nisbet, Jesse Lee, jiangjinsheng, jiangzhiwen, jinyaohui, jjfeing, jonwe, jonyguo, Junhan Hu, Kang, kingfo, kswang, laiyongqiang, leopz, lichenever, lihongkang, limingqi107, liubuyu, liuliyan2, liuwenhao4, liuxiao, liuxiao, liyong, lizhenyu, lvliang, Margaret_wangrui, meixiaowei, ms_yan, Nat Sutyanyong, ougongchang, panfengfeng, panyifeng, Peilin Wang, peixu_ren, qianlong, rick_sanchez, seatea, sheng, shijianning, simson, sunsuodong, Tinazhang, VectorSL, wandongdong, wangcong, wanghua, wangnan39, Wei Luning, wenchunjiang, wilfChen, WilliamLian, wsc, wukesong, wuxuejian, Xiaoda Zhang, xiefangqi, xulei2020, Yang, yangjie159, yangruoqi713, yangyongjie, yangzhenzhang, Yanjun Peng, yanzhenxiang2020, yao_yf, Yi Huaijie, yoonlee666, yujianfeng, YuJianfeng, yvetteliu, z00478463, zhangdengcheng, Zhang Qinghua, zhangz0911gm, zhaojichen, zhaoting, zhaozhenlong, zhoufeng, zhouneng, zhousiyi, zhouyuanshen, Zirui Wu, Ziyan, zjun, ZPaC, lihongzhang
  60. Contributions of any kind are welcome!
  61. # Release 0.2.0-alpha
  62. ## Major Features and Improvements
  63. ### Ascend 910 Training and Inference Framework
  64. * New models
  65. * MobileNetV2: Inverted Residuals and Linear Bottlenecks.
  66. * ResNet101: Deep Residual Learning for Image Recognition.
  67. * Frontend and User Interface
  68. * Support for all Python comparison operators.
  69. * Support for the math operators `**`, `//`, and `%`, and for other Python operators such as `and`, `or`, `not`, `is`, `is not`, `in`, and `not in`.
  70. * Support for the gradients of functions with variable arguments.
  71. * Support for tensor indexing assignment for certain index types.
  72. * Support for dynamic learning rate.
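
A dynamic learning rate, as listed above, is often expressed as a precomputed list with one value per training step. The sketch below assumes a staircase exponential decay as one possible schedule. `exponential_decay_schedule` and its parameters are hypothetical names, not functions confirmed by this release note.

```python
from typing import List


def exponential_decay_schedule(base_lr: float,
                               decay_rate: float,
                               total_steps: int,
                               steps_per_epoch: int) -> List[float]:
    """Return one learning rate per step, decayed once per completed epoch."""
    return [
        base_lr * decay_rate ** (step // steps_per_epoch)
        for step in range(total_steps)
    ]


# Two steps per epoch, halving the rate after every epoch.
schedule = exponential_decay_schedule(
    base_lr=0.1, decay_rate=0.5, total_steps=6, steps_per_epoch=2
)
```

With these inputs the rate stays at 0.1 for the first epoch, then drops to 0.05 and 0.025 for the next two.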
  73. * User interfaces change log
  74. * DepthwiseConv2dNative, DepthwiseConv2dNativeBackpropFilter, DepthwiseConv2dNativeBackpropInput([!424](https://gitee.com/mindspore/mindspore/pulls/424))
  75. * ReLU6, ReLU6Grad([!224](https://gitee.com/mindspore/mindspore/pulls/224))
  76. * GeneratorDataset([!183](https://gitee.com/mindspore/mindspore/pulls/183))
  77. * VOCDataset([!477](https://gitee.com/mindspore/mindspore/pulls/477))
  78. * MindDataset, PKSampler([!514](https://gitee.com/mindspore/mindspore/pulls/514))
  79. * map([!506](https://gitee.com/mindspore/mindspore/pulls/506))
  80. * Conv([!226](https://gitee.com/mindspore/mindspore/pulls/226))
  81. * Adam([!253](https://gitee.com/mindspore/mindspore/pulls/253))
  82. * _set_fusion_strategy_by_idx, _set_fusion_strategy_by_size([!189](https://gitee.com/mindspore/mindspore/pulls/189))
  83. * CheckpointConfig([!122](https://gitee.com/mindspore/mindspore/pulls/122))
  84. * Constant([!54](https://gitee.com/mindspore/mindspore/pulls/54))
  85. * Executor and Performance Optimization
  86. * Support parallel execution of data prefetching and forward/backward computing.
  87. * Support parallel execution of gradient aggregation and forward/backward computing in distributed training scenarios.
  88. * Support operator fusion optimization.
  89. * Optimize compilation process and improve the performance.
  90. * Data processing, augmentation, and save format
  91. * Support multi-process execution of GeneratorDataset/PyFunc for higher performance.
  92. * Support variable batch size.
  93. * Support new Dataset operators, such as filter, skip, take, and TextLineDataset.
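
The semantics of `filter`, `skip`, and `take` can be sketched with Python iterators. This only illustrates what such pipeline operators usually do. The helper functions below are hypothetical and are not the MindSpore Dataset methods themselves.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def filter_rows(rows: Iterable[int], predicate: Callable[[int], bool]) -> Iterator[int]:
    """Keep only the rows for which predicate(row) is true."""
    return (row for row in rows if predicate(row))


def skip(rows: Iterable[int], count: int) -> Iterator[int]:
    """Drop the first `count` rows."""
    return islice(rows, count, None)


def take(rows: Iterable[int], count: int) -> Iterator[int]:
    """Keep at most the first `count` rows."""
    return islice(rows, count)


# Keep even rows, drop the first two of them, then keep the next three.
pipeline = take(skip(filter_rows(range(20), lambda row: row % 2 == 0), 2), 3)
result = list(pipeline)
```

Because every stage is lazy, no row is produced until the final `list` call consumes the pipeline.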
  94. ### Other Hardware Support
  95. * GPU platform
  96. * Use dynamic memory pool by default on GPU.
  97. * Support parallel execution of computation and communication.
  98. * Support continuous address allocation by memory pool.
  99. * CPU platform
  100. * Support for Windows 10.
  101. ## Bugfixes
  102. * Models
  103. * Fix mixed precision bug for VGG16 model ([!629](https://gitee.com/mindspore/mindspore/pulls/629)).
  104. * Python API
  105. * Fix ControlDepend operator bugs on CPU and GPU ([!396](https://gitee.com/mindspore/mindspore/pulls/396)).
  106. * Fix ArgMinWithValue operator bugs ([!338](https://gitee.com/mindspore/mindspore/pulls/338)).
  107. * Fix Dense operator bugs on PyNative mode ([!276](https://gitee.com/mindspore/mindspore/pulls/276)).
  108. * Fix MatMul operator bugs on PyNative mode ([!288](https://gitee.com/mindspore/mindspore/pulls/288)).
  109. * Executor
  110. * Fix operator selection bugs and make it general ([!300](https://gitee.com/mindspore/mindspore/pulls/300)).
  111. * Fix memory reuse bug for GetNext op ([!291](https://gitee.com/mindspore/mindspore/pulls/291)).
  112. * GPU platform
  113. * Fix memory allocation in multi-graph scenarios ([!444](https://gitee.com/mindspore/mindspore/pulls/444)).
  114. * Fix bias_add_grad under fp16 precision ([!598](https://gitee.com/mindspore/mindspore/pulls/598)).
  115. * Fix support for fp16 kernels on NVIDIA 1080Ti ([!571](https://gitee.com/mindspore/mindspore/pulls/571)).
  116. * Fix parsing of tuple type parameters ([!316](https://gitee.com/mindspore/mindspore/pulls/316)).
  117. * Data processing
  118. * Fix the TypeError "can't pickle mindspore._c_dataengine.DEPipeline objects"([!434](https://gitee.com/mindspore/mindspore/pulls/434)).
  119. * Add TFRecord file verification([!406](https://gitee.com/mindspore/mindspore/pulls/406)).
  120. ## Contributors
  121. Thanks go to these wonderful people:
  122. Alexey_Shevlyakov, Cathy, Chong, Hoai, Jonathan, Junhan, JunhanHu, Peilin, SanjayChan, StrawNoBerry, VectorSL, Wei, WeibiaoYu, Xiaoda, Yanjun, YuJianfeng, ZPaC, Zhang, ZhangQinghua, ZiruiWu, amongo, anthonyaje, anzhengqi, biffex, caifubi, candanzg, caojian05, casgj, cathwong, ch-l, chang, changzherui, chenfei, chengang, chenhaozhe, chenjianping, chentingting, chenzomi, chujinjin, dengwentao, dinghao, fanglei, fary86, flywind, gaojing, geekun, gengdongjie, ghzl, gong, gongchen, gukecai, guohongzilong, guozhijian, gziyan, h.farahat, hesham, huangdongrun, huanghui, jiangzhiwen, jinyaohui, jjfeing, jojobugfree, jonathan_yan, jonyguo, jzw, kingfo, kisnwang, laiyongqiang, leonwanghui, lianliguang, lichen, lichenever, limingqi107, liubuyu, liuxiao, liyong, liyong126, lizhenyu, lupengcheng, lvliang, maoweiyong, ms_yan, mxm, ougongchang, panfengfeng, panyifeng, pengyanjun, penn, qianlong, seatea, simson, suteng, thlinh, vlne-v1, wangchengke, wanghua, wangnan39, wangqiuliang, wenchunjiang, wenkai, wukesong, xiefangqi, xulei, yanghaitao, yanghaoran, yangjie159, yangzhenzhang, yankai10, yanzhenxiang2020, yao_yf, yoonlee666, zhangbuxue, zhangz0911gm, zhangzheng, zhaojichen, zhaoting, zhaozhenlong, zhongligeng, zhoufeng, zhousiyi, zjun, zyli2020, yuhuijun, limingqi107, lizhenyu, chenweifeng.
  123. Contributions of any kind are welcome!
  124. # Release 0.1.0-alpha
  125. ## Main Features
  126. ### Ascend 910 Training and Inference Framework
  127. * Recommended OS: Ubuntu 16.04 (or later) or EulerOS 2.5 or EulerOS 2.8
  128. * Python version: 3.7.5
  129. * Preset models
  130. * ResNet-50: residual structure-based convolutional neural network (CNN) for image classification, which is widely used.
  131. * AlexNet: classic CNN for image classification, achieving historical results in ImageNet LSVRC-2012.
  132. * LeNet: classic CNN for image classification, which was proposed by Yann LeCun.
  133. * VGG16: classic CNN for image classification, which was proposed by Oxford Visual Geometry Group.
  134. * YoloV3: real-time object detection network.
  135. * NEZHA: BERT-based Chinese pre-training network produced by Huawei Noah's Ark Laboratory.
  136. * Execution modes
  137. * Graph mode: provides graph optimization methods such as memory overcommitment, IR fusion, and buffer fusion to achieve optimal execution performance.
  138. * PyNative mode: single-step execution mode, facilitating process debugging.
  139. * Debugging capability and methods
  140. * Save CheckPoints and Summary data during training.
  141. * Support asynchronous printing.
  142. * Dump the computing data.
  143. * Support profiling analysis of the execution process performance.
  144. * Distributed execution
  145. * Support AllReduce, AllGather, and BroadCast collective communication.
  146. * AllReduce data parallel: Each device obtains different training data, which accelerates the overall training process.
  147. * Collective communication-based layerwise parallel: Models are divided and allocated to different devices to solve the problem of insufficient memory for large model processing and improve the training speed.
  148. * Automatic parallel mode: A better combination of data and model parallelism can be predicted from the cost model. This mode is recommended for ResNet series networks.
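
The AllReduce data-parallel item above can be illustrated with a single-process simulation. Each "device" computes gradients on its own shard of the data, and an all-reduce leaves every device with the same averaged gradient. `all_reduce_mean` is a hypothetical helper, and averaging (rather than summing) is an assumption made for this sketch. Real collective communication runs across devices, not inside one Python list.

```python
from typing import List


def all_reduce_mean(per_device_grads: List[List[float]]) -> List[List[float]]:
    """Simulate AllReduce: every device ends up with the element-wise mean gradient."""
    num_devices = len(per_device_grads)
    # Sum each gradient element across devices.
    summed = [sum(values) for values in zip(*per_device_grads)]
    mean = [value / num_devices for value in summed]
    # Every device receives its own copy of the reduced result.
    return [list(mean) for _ in range(num_devices)]


# Two simulated devices, each holding gradients from a different data shard.
local_grads = [[1.0, 2.0], [3.0, 4.0]]
synced_grads = all_reduce_mean(local_grads)
```

After the reduction all devices apply an identical update, which keeps their model replicas in sync.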
  149. * Automatic differentiation
  150. * Implement automatic differentiation based on source-to-source transformation.
  151. * Support distributed scenarios and automatic insertion of reverse communication operators.
  152. * Data processing, augmentation, and save format
  153. * Load common datasets such as ImageNet, MNIST, CIFAR-10, and CIFAR-100.
  154. * Support common data loading pipeline operations, such as shuffle, repeat, batch, map, and sampler.
  155. * Provide basic operator libraries to cover common CV scenarios.
  156. * Allow users to customize Python data augmentation operators through the Pyfunc mechanism.
  157. * Support the access of user-defined datasets through the GeneratorDataset mechanism.
  158. * Provide the MindSpore data format, data aggregation and storage, random access example, data partition, efficient parallel read, user-defined index, and dataset search.
  159. * Convert user datasets to the MindSpore data format.
  160. * After data processing and augmentation, provide training applications in feed and graph modes.
  161. * FP32/16 mixed precision computation, supporting automatic and manual configuration
  162. * Provide common operators such as nn, math, and array, which can be customized.
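
One reason FP32/16 mixed precision usually needs extra configuration is that small gradients underflow to zero in FP16. Loss scaling is a common remedy, and the sketch below shows the effect with Python's `struct` half-precision format. It is a generic illustration under assumptions, not MindSpore's mechanism: `to_fp16` and the scale of 1024 are chosen for this example only.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("e", struct.pack("e", value))[0]


tiny_grad = 1e-8      # below the smallest positive FP16 subnormal (about 6e-8)
loss_scale = 1024.0   # hypothetical scale factor

# Stored directly in FP16, the gradient is flushed to zero.
unscaled = to_fp16(tiny_grad)

# Scaling before the FP16 cast keeps it representable;
# dividing in full precision afterwards recovers its magnitude.
scaled = to_fp16(tiny_grad * loss_scale)
recovered = scaled / loss_scale
```

The recovered value is close to the original gradient, whereas the unscaled one is lost entirely.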
  163. ### Inference Deployment
  164. * Deploy models in MindSpore format on the Ascend 310 platform for inference.
  165. * Save models in ONNX format.
  166. * Support saving models in LITE format and running models based on the lightweight inference framework.
  167. * Recommended OS: Android 4.3 or later
  168. * Supported network type: LeNet
  169. * Provide the generalization operators generated by TVM and operators generated after specific networks are tuned.
  170. ### Other Hardware Support
  171. * GPU platform training
  172. * Recommended OS: Ubuntu 16.04
  173. * CUDA version: 9.2 or 10.1
  174. * CuDNN version: 7.6 or later
  175. * Python version: 3.7.5
  176. * NCCL version: 2.4.8-1
  177. * OpenMPI version: 3.1.5
  178. * Supported models: AlexNet, LeNet, and LSTM
  179. * Supported datasets: MNIST and CIFAR-10
  180. * Support data parallel.
  181. * CPU platform training
  182. * Recommended OS: Ubuntu 16.04
  183. * Python version: 3.7.5
  184. * Supported model: LeNet
  185. * Supported dataset: MNIST
  186. * Provide only the stand-alone operation version.
  187. ## Peripherals and Tools
  188. * [MindSpore Official Website](https://www.mindspore.cn/)
  189. * [MindInsight Visualization Debugging and Optimization](https://gitee.com/mindspore/mindinsight)
  190. * [MindArmour Model Security Hardening Package](https://gitee.com/mindspore/mindarmour)
  191. * [GraphEngine Computational Graph Engine](https://gitee.com/mindspore/graphengine)