looop5
1b36f454b8
close BatchMatmul and ReduceSum in graph kernel
5 years ago
lingyunli63
e6a5fc0739
consider controldepend edges in checkcircle
5 years ago
mindspore-ci-bot
286f5b05f7
!8493 【GraphKernel】Fuse composite ops separated by GetItem nodes
From: @dayschan
Reviewed-by: @ckey_dou
Signed-off-by: @ckey_dou
5 years ago
mindspore-ci-bot
9969c83f75
!8689 [GraphKernel] Split shape ops for more fusion opportunity.
From: @tronzhang
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
5 years ago
dayschan
8e6d92eac9
Fuse composite ops separated by GetItem nodes
5 years ago
tronzhang
9d7494f4df
split shape ops for more fusion opportunity.
5 years ago
dayschan
a8bb28437c
Temporarily disable all simplify patterns except SimplifyReduce, because some bizarre errors occur in arithmetic simplify.
5 years ago
lingyunli63
a51465c78b
add graphkerneloptimize pass
align fuse_ops_fusion
align composite_ops_fusion
unify ops table
Init new_code's kernel_info with orig_node's kernel_info in function NewCNodeWithInfo
enable run bert
add pass tensor_promotion
add macro for bias_add and bias_add_grad in expander pass
exclude unused attrs in primitive compare for GraphKernelCSE
exclude fusion_type in kernelinfo cmp for cse in graphkernel
check processor
remove graph kernel pass before select kernel
recover run_standalone_pretrain_ascend.sh
remove is_before_kernel_select
move add_atomic_clean from pass directory to graph_kernel directory
update fuse op list in Ascend back-end
5 years ago
zengzitao
28f1db74dd
expand maximum_grad minimum_grad dropout_grad op
5 years ago
mindspore-ci-bot
7b70c17fc0
!8449 【GraphKernel】Add Transpose into fusible list; Update akg submodule.
From: @dayschan
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
5 years ago
mindspore-ci-bot
b078954667
!8389 remove multiple circles
From: @lingyunli63
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
5 years ago
dayschan
195b1fe8d5
Add Transpose into fusible list.
5 years ago
lingyunli63
dc95c63c03
remove multiple circles
5 years ago
mindspore-ci-bot
f059859909
!8367 expand tanh_grad and reduce mean, add some graph kernel test case, fix kernel info bug
From: @zengzitao
Reviewed-by: @gaoxiong1
Signed-off-by:
5 years ago
zengzitao
db27783d54
expand tanh_grad and reduce_mean, fix bug and add test_case in ci
5 years ago
mindspore-ci-bot
1f9d034e53
!8276 Try to cache object to accelerate "AllocKernelDynamicRes" and "FreeKernelDynamicRes"
From: @tronzhang
Reviewed-by: @ckey_dou,@gaoxiong1
Signed-off-by: @gaoxiong1
5 years ago
tronzhang
1cf2482ba5
try to get address pointer from cache
5 years ago
mindspore-ci-bot
255c4b4b5a
!8317 Bugfix in GraphKernel after all integer ValueNodes are changed to int64
From: @dayschan
Reviewed-by: @gaoxiong1,@ckey_dou
Signed-off-by: @ckey_dou
5 years ago
dayschan
f8be2f972b
Bugfix in GraphKernel after all integer ValueNodes are changed to int64
5 years ago
lingyunli63
b3d76c6e3e
exclude unused attrs and fusion_type in cse cmp
5 years ago
mindspore-ci-bot
51edcd30e2
!8270 expand fused_adam and fused_adam_weight_decay and fix some bug
Merge pull request !8270 from ZengZitao/expand_fused_adam
5 years ago
Yi Huaijie
d7faa77b5e
support int64 shape
5 years ago
zengzitao
53043ae18f
support expand fused_adam and fused_adam_weight_decay op
5 years ago
dayschan
0f8f1cdda7
Eliminate redundant parameters while expanding basic ops.
add testcase for Gelu/GeluGrad
5 years ago
mindspore-ci-bot
371d0071c5
!8043 fix bugs, add patterns and modify handle methods about graph
Merge pull request !8043 from zhuxiaochen/1030_fixandadd_2.0
5 years ago
mindspore-ci-bot
02a942ef56
!8084 expand gelu and gelugrad op
Merge pull request !8084 from ZengZitao/expand_gelu
5 years ago
zengzitao
5cfa172720
expand gelu and gelugrad op
5 years ago
lingyunli63
49bfe4415f
refine remove_circle
5 years ago
mindspore-ci-bot
5c4940cdcc
!7892 Convert non-scalar tensor to parameter
Merge pull request !7892 from DeshiChen/1028_nonscalar_tensor_to_input
5 years ago
zhu_xiaochen
2122e691d5
fix bugs, add patterns and modify handle methods about graph
5 years ago
zengzitao
febdb1850c
expand bias_add and bias_add_grad op
5 years ago
dayschan
b6c2812a29
Convert non-scalar tensor to parameter
Add a pass `tensor_promotion`.
Fix a bug in CreateKernelInfoFromNewParameter, which reset the KernelInfo by mistake.
what's more:
Update akg
Fix bug in model_builder when reduce axis is an integer.
5 years ago
zengzitao
6665684a18
delete sub_graph_changed to avoid an error caused by resetkernelinfo
5 years ago
dayschan
2b074f6365
Add a simplification pattern to ArithSimplify. The (x*C1)*C2 => x*(C1*C2)
5 years ago
mindspore-ci-bot
8d39a8a4b2
!7529 complex arithmetic_simplify
Merge pull request !7529 from zhuxiaochen/1020_allsimplify_1.0
5 years ago
zhu_xiaochen
c739f14038
simplify transpose matmul reduce
5 years ago
lingyunli63
a500a57c72
add GraphkernelCSE
5 years ago
Geng_Fei
4de1a988d1
fix matcher bug in arithmetic simplify
5 years ago
Geng_Fei
1455372cf1
add new pass in graph kernel: arithmetic_simplify
5 years ago
tronzhang
c32bf5ac28
promote complex tensor as graph's input and re-correct getitem index for graph kernel fusion.
5 years ago
dayschan
7599686a72
GraphKernel supports multi-output kernels
5 years ago
lingyunli63
dd48f10c3d
add assign ops in composite_topi
5 years ago
dayschan
3c2da3197f
Fix review_bot and codedex problems
5 years ago
mindspore-ci-bot
f7691335eb
!6167 fused select and greater op to improve bert performance on GPU
Merge pull request !6167 from ZengZitao/fuse_greater_select_ms
5 years ago
zengzitao
a38d6139fa
fused select and greater op to improve bert performance on GPU
5 years ago
r1chardf1d0
88de0cffa9
open graph kernel expander opt for gpu
5 years ago
mindspore-ci-bot
7152fe04be
!5783 GraphKernel supports GPU
Merge pull request !5783 from DeshiChen/graph_kernel_1.0
5 years ago
dayschan
37a48f6aac
GraphKernel supports GPU
1. Update akg submodule
2. Refactor akg_kernel_build, akg_ascend_kernel_build, akg_gpu_kernel_build
3. Add akg_kernel_json_decoder to support converting kernel_json to AnfNode.
4. Add GraphKernel Cost Model. (mindspore/_extends/graph_kernel)
5. Add some GraphKernel passes to GpuSession, move these passes to backend/optimizer/graph_kernel.
6. Add global id for ir files.
7. Fix bug in ConstInputToAttr.
5 years ago