LaiYongqiang
6099c54ca1
add ascend aicpu env ops
4 years ago
王泰格
cc8d3e6279
[assistant][ops][I40FKE] add operator Coalesce
4 years ago
i-robot
f2466fbff2
!29443 update ascend stream assign && add PROF log info
Merge pull request !29443 from lyqlola/master
4 years ago
liyiqi
adb33a15b5
update ascend stream assign && add PROF log info
4 years ago
i-robot
5bf10401db
!29445 add lu solve for gpu backend
Merge pull request !29445 from zhuzhongrui/pub_master4
4 years ago
z00512249
a64947f1de
add lu solve for gpu backend
4 years ago
r1chardf1d0
044e110f7a
custom op support julia
4 years ago
TronZhang
deac002bed
refactor kernel mod class and subclass
4 years ago
chenfei
fa7cb34c8a
set abstract of switch
4 years ago
i-robot
0155e9630a
!28935 fix_IsNoOpNode
Merge pull request !28935 from lingyunli63/fix_NoOpNode
4 years ago
lingyunli63
bae077f524
stop converting input of dynamic-shape Reshape to attr
4 years ago
baihuawei
d97fa33c49
session decouple with device target
4 years ago
yuchaojie
9a47c979e3
add copyattr in TransposedUpdateFusion
4 years ago
i-robot
f37e43e2b5
!28728 change onednn default omp thread pool to mindspore runtime thread pool
Merge pull request !28728 from fangzehua/mkl_tp
4 years ago
i-robot
7384cb2689
!28546 fused global norm for cpu adafactor
Merge pull request !28546 from kisnwang/add-cpu-adafactor
4 years ago
fangzehua
ac11ceeee4
change omp to ms threadpool
4 years ago
kswang
643b7b7ce4
fused global norm for cpu fused adafactor
4 years ago
i-robot
a0bb65d705
!28155 rename DynamicReshape to reshape, and support static-shape
Merge pull request !28155 from lingyunli63/rm_dynamic_reshape
4 years ago
lingyunli63
89e8b90a8d
rename dynamicreshape to reshape
4 years ago
ttudu
1595ea7d91
neighborexchangev2 add send empty depend
4 years ago
i-robot
1bd52f89dd
!28726 Fix bug of dynamic shape in MindRT
Merge pull request !28726 from caifubi/master-pynative-mindrt-dynamic-shape
4 years ago
caifubi
c0e47202b9
fix bug of dynamic shape
4 years ago
yuchaojie
b63a044a65
add ge format convert for ND_RNN_BIAS&FRACTAL_ZN_RNN and filter None input and non-task op in data dumper
4 years ago
i-robot
241d87a8a6
!27898 convert aicpu op attr to input
Merge pull request !27898 from yuchaojie/op_select
4 years ago
yuchaojie
a0a31fe651
convert aicpu op attr to input
4 years ago
yao_yf
b5d56af0ad
recompute genmask
4 years ago
i-robot
e470f532fc
!28113 support remove compile cache
Merge pull request !28113 from liubuyu/bug_fix
4 years ago
lby
99cf796eea
support remove kernel_meta
4 years ago
i-robot
749917a819
!28197 add cpu fused adafactor
Merge pull request !28197 from kisnwang/add-cpu-adafactor
4 years ago
kswang
af48f201a2
add cpu fused adafactor
4 years ago
i-robot
3f126a8035
!28077 support dynamic batch size for mindir
Merge pull request !28077 from fangzehua/dynamic_batch_mindir
4 years ago
fangzehua
7e8fcd9807
support dynamic batch size for mindir
4 years ago
i-robot
39b3fc2922
!27975 Change "is_internal_output" to "is_internal_output_nop_node"
Merge pull request !27975 from DeshiChen/1220_internaloutput
4 years ago
dayschan
ada1c80980
Change kAttrIsInternalOutput to kAttrIsInternalOutputNopNode
The output address of an internal output nop node is required by the next KernelGraph,
so we cannot replace it in graphkernel optimization.
4 years ago
yuchaojie
37c7ccd01a
add DropoutV3 and corresponding passes
4 years ago
韩峥嵘
efe0fce473
[feat][assistant][I40FG0] add new Ascend operator NonMaxSuppression
4 years ago
tanghuikang
75a9da3df4
Remove MS_CTX_ENABLE_MEM_SCHEDULER and adjust SyncParameter
4 years ago
dayschan
c6a9547c24
Add "is_internal_output" attribute for related nodes
The internal output nodes of KernelGraph are stored in a `map`
for memory reuse between KernelGraphs.
However, the nodes may be changed in graphkernel optimization.
The passes of graphkernel are based on `FuncGraph` instead of `KernelGraph`,
so we cannot use the `ReplaceInternalOutput` interface.
In this commit, we add an attribute for these nodes and skip them in graphkernel's passes.
4 years ago
i-robot
4eb816008e
!27066 set offload node
Merge pull request !27066 from kisnwang/enable-set-swap-node
4 years ago
yuchaojie
02dc87e4d9
DynamicRNNGrad support `input_size not multiple of 16` scene
4 years ago
kswang
391a06aad1
set offload node
4 years ago
i-robot
ac7c73b770
!26976 convert attr to input for aicpu op-select
Merge pull request !26976 from yuchaojie/op_select
4 years ago
linqingke
8b293f25f0
MindSpore aicpu ops support CpuKernel.
4 years ago
yuchaojie
a90c6e8df8
convert attr to input for aicpu op-select
4 years ago
i-robot
fa5ea7b3a6
!26370 DynamicRNNGrad support `hidden_size not multiple of 16` scene
Merge pull request !26370 from yuchaojie/ir_fusion4
4 years ago
i-robot
519f14a909
!26006 slice recompute activation
Merge pull request !26006 from yao_yf/add_transformer_slice_activation_config
4 years ago
yuchaojie
b760eba23a
DynamicRNNGrad support `hidden_size not multiple of 16` scene
4 years ago
yao_yf
188d39da83
slice_activation_in_recompute
slice recompute activation
4 years ago
i-robot
9d6248194e
!26310 MindSpore support load custom aicpu kernels.
Merge pull request !26310 from linqingke/aicpu
4 years ago
i-robot
5233c73805
!25592 Reshape support shape is variable
Merge pull request !25592 from wangnan39/reshape_support_tensor
4 years ago