TFBunny
9eae68efaa
add gpu BCEWithLogitsLoss kernel
4 years ago
mindspore-ci-bot
3cffa4752e
!15117 fix codedex
From: @limingqi107
Reviewed-by: @cristoval,@wilfchen
Signed-off-by: @wilfchen
4 years ago
limingqi107
b3a5ccebc3
fix codedex
4 years ago
huangbingjian
b803887480
update adam_fusion and adam_weight_decay_fusion
4 years ago
wilfChen
8a7b568203
add relu fusion check
5 years ago
wilfChen
943b992458
trt converter
5 years ago
dingpeifei
3c9d8cb073
The input and output of batchnorm reverse operator increase pass in Ascend platform under PyNative mode
5 years ago
mindspore-ci-bot
8e8f3043f9
!12115 IR operators of GPU and CPU are unified as batchnorm
From: @ding_fei_fei
Reviewed-by:
Signed-off-by:
5 years ago
TFBunny
32e86f4166
hot fix for print
5 years ago
dingpeifei
87e41aaeee
IR operators of GPU and CPU are unified as batchnorm
5 years ago
TFBunny
4d35303265
support string in GPU print
5 years ago
He Wei
7d9a783993
[auto-monad] Support side-effects by auto-monad
The basic idea is to exploit data dependencies to control the execution order
of side-effect operations while keeping the semantics of ANF unchanged.
The ControlDepend primitive is removed and two primitives are added:
1. UpdateState:
```
a = Assign(para, value)
```
becomes:
```
a = Assign(para, value, u)
u = UpdateState(u, a)
```
2. Load:
```
x = Add(para, value)
```
becomes:
```
p = Load(para, u)
x = Add(p, value)
u = UpdateState(u, p)
```
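The rewrite above can be illustrated with a toy graph. This is a minimal sketch, not MindSpore code: the `Node` class, the `topo_order` helper, the initial state token `u0`, and the `#1`/`#2` suffixes on `UpdateState` are hypothetical names introduced only for illustration. It shows that once `Assign` and `Load` are threaded through the state `u`, a scheduler that follows data dependencies alone already runs the write before the read.

```python
class Node:
    """A toy graph node: an operator name plus the nodes it consumes."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs


def topo_order(*outputs):
    """Return op names in an order that respects data dependencies only."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for dep in node.inputs:
            visit(dep)
        order.append(node.op)

    for out in outputs:
        visit(out)
    return order


# Graph leaves. u0 stands for the initial state token (assumed name).
para = Node("para")
value = Node("value")
u0 = Node("U")

# a = Assign(para, value, u); u = UpdateState(u, a)
a = Node("Assign", para, value, u0)
u1 = Node("UpdateState#1", u0, a)

# p = Load(para, u); x = Add(p, value); u = UpdateState(u, p)
p = Node("Load", para, u1)
x = Node("Add", p, value)
u2 = Node("UpdateState#2", u1, p)

# No explicit control edge is needed: Load consumes u1, which consumes Assign.
order = topo_order(x, u2)
assert order.index("Assign") < order.index("Load") < order.index("Add")
```

Because `Load` takes `u1` as an input and `u1` depends on `Assign`, the ordering follows from ordinary data edges, which is the role ControlDepend used to play.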
5 years ago
l00591931
9ec100d069
Change TensorAdd to Add, from r1.1 to master
5 years ago
yuchaojie
1932d87a26
update some op's attr name
5 years ago
wilfChen
09e10e18bb
momentum weightdecay fusion
5 years ago
VectorSL
54a496edbc
fix momentum-cast fusion
5 years ago
wilfChen
c1d3bd2160
relu optimize
5 years ago
huanghui
e17dd84c0b
add trace manager around backend opt
5 years ago
mindspore-ci-bot
3f75f13556
!8648 PyNative Performance Optimization
From: @jojobugfree
Reviewed-by:
Signed-off-by:
5 years ago
caifubi
c7d6997819
pynative host device parallel
5 years ago
lizhenyu
094f0b2a07
bugfix:fused batch norm op's input channel nums should be a multiple of 4
5 years ago
wilfChen
2291b7f2e6
dynamic shape check
5 years ago
Yi Huaijie
d7faa77b5e
support int64 shape
5 years ago
mindspore-ci-bot
1014774ab6
!8036 BnAddReluGrad fusion check
Merge pull request !8036 from chenweifeng/BnAddReluGrad-check
5 years ago
wilfChen
d2f0d0db53
BnAddReluGrad check
5 years ago
wilfChen
3b7e01c698
fix momentum fusion pass
5 years ago
wilfChen
cbdd658e24
fix momentum fusion pass
5 years ago
mindspore-ci-bot
d479b91093
!7767 GPU update resnet50 readme and add cast type
Merge pull request !7767 from VectorSL/readme
5 years ago
VectorSL
5102482e3a
1. readme update resnet 2. cast add more type
5 years ago
wilfChen
e877f72bcf
modify controldepend mount node
5 years ago
VectorSL
509b25ef1e
gpu nhwc
5 years ago
VectorSL
bbcdd81d1b
fix reduce precision: deal tuplegetitem and param
5 years ago
VectorSL
ccab6f88d5
gpu add reduce precision:int64->int32
5 years ago
mindspore-ci-bot
21c5607fca
!6971 cudnn inplace optimizer
Merge pull request !6971 from chenweifeng/tensoradd_inplace
5 years ago
wilfChen
b420b6cda7
cudnn inplace optimizer
5 years ago
mindspore-ci-bot
57ecb40022
!6825 GPU add combine cast fusion
Merge pull request !6825 from VectorSL/combine-cast
5 years ago
VectorSL
f36c2721af
gpu add combine cast fusion
5 years ago
VectorSL
8dca80036a
gpu add combine mom fusion
5 years ago
VectorSL
48db7f8c4f
gpu change bncast
5 years ago
VectorSL
50dc89332c
fix bn cast
5 years ago
wilfChen
aacf7c2e34
codex warning
5 years ago
limingqi107
5058e844cd
gpu inceptionv3 optimize
5 years ago
lizhenyu
c3d6918649
add kernel select after optimize pass
5 years ago
mindspore-ci-bot
1944b8e53b
!5612 Resnet50 pattern Fusion
Merge pull request !5612 from chenweifeng/BatchNormAddReluGrad
5 years ago
limingqi107
7823555e7a
gpu add the pass of remove redundant transpose
5 years ago
wilfChen
5316061fa3
gpu resnet50 fusion
5 years ago
limingqi107
7ec2f6a550
clear graph output address in graph destructor
5 years ago
VectorSL
853987da79
fix getinputformat error when input is not a realnode
5 years ago
VectorSL
9b7df3d099
gpu optimize transpose
5 years ago
lizhenyu
839ec02542
Add FusedBatchEx support
5 years ago