He Wei
4f04757e16
Fix some typo issues
1. Sequeue --> Sequence
2. Interger --> Integer
4 years ago
i-robot
4eb816008e
!27066 set offload node
Merge pull request !27066 from kisnwang/enable-set-swap-node
4 years ago
yuchaojie
02dc87e4d9
DynamicRNNGrad supports the `input_size not multiple of 16` scenario
4 years ago
dayschan
be11003ec7
Mark the environment variable "MS_GRAPH_KERNEL_FLAGS" as deprecated
4 years ago
kswang
391a06aad1
set offload node
4 years ago
i-robot
ac7c73b770
!26976 convert attr to input for aicpu op-select
Merge pull request !26976 from yuchaojie/op_select
4 years ago
i-robot
c0048bb117
!26953 Add some bprop mindir files
Merge pull request !26953 from YuJianfeng/bprop_mindir
4 years ago
i-robot
f72bce0377
!27015 Use MindSpore communication framework as OpenMPI
Merge pull request !27015 from ZPaC/dir-of-distributed
4 years ago
ZPaC
ae3bae1571
Replace OpenMPI
4 years ago
linqingke
8b293f25f0
MindSpore aicpu ops support CpuKernel.
4 years ago
yuchaojie
a90c6e8df8
convert attr to input for aicpu op-select
4 years ago
yujianfeng
7c808ee792
Add some bprop mindir files
4 years ago
changzherui
4260b5e5d9
modify lossmonitor and print convert type
4 years ago
i-robot
e3e53b2f75
!26229 add loss landscape visualization function
Merge pull request !26229 from Songyuanwei/loss_landscape
4 years ago
ZPaC
e01e67b921
Adapt dlopen macro to windows.
4 years ago
i-robot
fa5ea7b3a6
!26370 DynamicRNNGrad supports the `hidden_size not multiple of 16` scenario
Merge pull request !26370 from yuchaojie/ir_fusion4
4 years ago
songyuanwei
8d212d4812
add loss landscape
4 years ago
i-robot
b472850a75
!26594 Replace std::unordered_map/set with robin-hood-hashing
Merge pull request !26594 from hewei/use_robin_hood
4 years ago
i-robot
519f14a909
!26006 slice recompute activation
Merge pull request !26006 from yao_yf/add_transformer_slice_activation_config
4 years ago
He Wei
41dcac9c49
Replace std::unordered_map/set with robin-hood-hashing
Robin-hood-hashing (https://github.com/martinus/robin-hood-hashing)
is considered faster than std::unordered_map/set,
so we use it to improve MindSpore performance.
1. The robin_hood header file is in `third_party/robin_hood/include`;
2. In `utils/hash_map.h` and `utils/hash_set.h`, we define:
- mindspore::HashMap as an alias of robin_hood::unordered_map;
- mindspore::HashSet as an alias of robin_hood::unordered_set;
3. Replace:
- `#include <unordered_map>` --> `#include "utils/hash_map.h"`;
- `#include <unordered_set>` --> `#include "utils/hash_set.h"`;
- `std::unordered_map` --> `mindspore::HashMap`;
- `std::unordered_set` --> `mindspore::HashSet`;
- `map.insert(std::pair(key, value))` --> `map.emplace(key, value)`;
- `[] (const std::pair<K, V> &p) {..} ` --> `[] (const auto &p) {..} `;
4. Fix issues found by switching to robin_hood:
- AnfNodeConfig hash and equality;
- Fix a bug in `Slice::operator==()`;
- Fix a bug in `CNode::HasPrimalAttr()`;
- Fix map.erase() usage bugs: `map.erase(iter++)` --> `iter = map.erase(iter)`;
- Fix some iterator invalidation problems;
5. Some std::unordered_map/set cannot be replaced by robin_hood:
- Those used as parameters of functions exposed to Python by pybind11;
- Those using a bad hash that causes robin_hood map overflow, such as AbstractBasePtrListHasher;
6. Update cpp unit tests;
7. Add build option '-F' to enable robin_hood, default on.
4 years ago
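The `map.erase(iter++)` --> `iter = map.erase(iter)` fix listed in the commit above can be sketched as follows. This is a minimal illustration, not code from the commit: `HashMap` here is a stand-in alias over `std::unordered_map` (the commit aliases it to `robin_hood::unordered_map`), and `EraseNegative` is a hypothetical helper. The likely reason for the fix is that an open-addressing table may move elements when one is erased, so only the iterator returned by `erase()` is safe to continue from.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Assumption: stand-in alias so the sketch builds without third-party headers.
// In the commit, mindspore::HashMap aliases robin_hood::unordered_map instead.
template <typename K, typename V>
using HashMap = std::unordered_map<K, V>;

// Hypothetical helper: removes every entry with a negative value and
// returns how many entries were removed.
inline std::size_t EraseNegative(HashMap<std::string, int> *map) {
  assert(map != nullptr);
  std::size_t removed = 0;
  for (auto iter = map->begin(); iter != map->end();) {
    if (iter->second < 0) {
      // Continue from the iterator returned by erase() rather than from a
      // post-incremented copy (`map->erase(iter++)`).
      iter = map->erase(iter);
      ++removed;
    } else {
      ++iter;
    }
  }
  return removed;
}
```

Calling `EraseNegative(&m)` on a map holding `{"a": 1, "b": -2, "c": -3, "d": 4}` leaves only the entries for `"a"` and `"d"`.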
yuchaojie
b760eba23a
DynamicRNNGrad supports the `hidden_size not multiple of 16` scenario
4 years ago
yao_yf
188d39da83
slice_activation_in_recompute
slice recompute activation
4 years ago
i-robot
9d6248194e
!26310 MindSpore supports loading custom aicpu kernels.
Merge pull request !26310 from linqingke/aicpu
4 years ago
i-robot
23da0717bc
!26187 Support CSRTensor
Merge pull request !26187 from 杨林枫/csr_frontend
4 years ago
i-robot
5233c73805
!25592 Reshape support shape is variable
Merge pull request !25592 from wangnan39/reshape_support_tensor
4 years ago
王南
1163cfe967
reshape support shape is tensor
4 years ago
linqingke
bef2923acf
MindSpore supports loading custom aicpu ops.
4 years ago
yanglf1121
72db8e4d3f
support sparse tensor frontend
4 years ago
i-robot
aac1291062
!26297 compiler support dump flag
Merge pull request !26297 from huanghui/cell-dump
4 years ago
i-robot
6bdd38399a
!25811 fault_recover_by_mirror_group
Merge pull request !25811 from yao_yf/fault_recover_by_mirror_group
4 years ago
huanghui
35cb09a536
compiler support dump flag
4 years ago
i-robot
ede648876e
!26180 fix node type error
Merge pull request !26180 from jjfeing/master
4 years ago
yao_yf
501b978d16
find data parallel common group in auto parallel
4 years ago
i-robot
5211733add
!25614 [GraphKernel] Enable parallel fusion in Ascend and enhance parallel feature.
Merge pull request !25614 from TronZhang/parallel_support_in_ascend
4 years ago
tronzhang
e2a0c0d613
support parallel for ascend
4 years ago
i-robot
6c587dc2d3
!25391 Support profiling the parallel strategy
Merge pull request !25391 from ougongchang/profiling_stategy
4 years ago
i-robot
b21a98fca3
!26070 [GraphKernel] move file_utils from ccsrc to core.
Merge pull request !26070 from chenlei_autodiff/move_file_utils
4 years ago
jjfeing
d4e2a21d26
fix node type error
4 years ago
looop5
58e27d87bc
add Custom, custom_op_info_register, CustomRegOp to __init__
4 years ago
ougongchang
9229f1c1ff
profiler supports collecting parallel strategy info
If the SetNodeOutputType function is forcibly split into multiple functions, readability decreases, so it is excluded from lizard scans
4 years ago
i-robot
816de6f0ee
!26055 convert attr to value node
Merge pull request !26055 from yanzhenxiang2020/aicpu_random_seed_to_input
4 years ago
chenlei_autodiff
13777375bd
[GraphKernel] move file_utils from ccsrc to core.
4 years ago
i-robot
14efcd5a1c
!26030 [GraphKernel] Add Compiling Macros in graph_kernel_flags.
Merge pull request !26030 from chenlei_autodiff/decouple_code
4 years ago
i-robot
8bf7e28fa6
!25410 add dump flag for fusion nodes
Merge pull request !25410 from yuchaojie/ir_fusion3
4 years ago
i-robot
9b00c2d941
!26047 [ME][Fallback] Modify fallback log level
Merge pull request !26047 from Margaret_wangrui/fallback_log
4 years ago
jjfeing
34b73e305d
convert attr to value node
4 years ago
Margaret_wangrui
ea95e2c7d2
modify fallback log level
4 years ago
yuchaojie
0c90aecae4
add dump flag for fusion nodes
4 years ago
chenlei_autodiff
6ac7471d5c
[GraphKernel] Add Compiling Macros in graph_kernel_flags.
4 years ago
huangxinjing
f354ab22a3
add pipeline shard interface
Add support for no pipeline accugradient
Add delay tag for fusion op
Optimize the visit order
add mirror for mini step control
Move the group to attributes
Add gradient_shard control for the mini step
Fix code style
Fix ut description
Add interface
4 years ago