Megvii Engine Team
a0e531180d
fix(src/comp_node): fix calling cuda driver api
GitOrigin-RevId: cc33af2ac4
3 years ago
Megvii Engine Team
ccea0e2386
fix(dnn/rdnn): add warmup before profile
GitOrigin-RevId: 7962525e90
3 years ago
Megvii Engine Team
8182af6eb6
fix(mgb): fix strategy of grad_op and opr_attr
GitOrigin-RevId: bb7ab8fa9d
3 years ago
Megvii Engine Team
e2f5156b69
refactor(megbrain): save fastrun result to algorithm cache
GitOrigin-RevId: 45301ebb4d
3 years ago
Megvii Engine Team
f902ba2433
docs(megbrain): add notes for fastrun
GitOrigin-RevId: b59f7f205d
3 years ago
Megvii Engine Team
7dc347697a
feat(dnn/cuda): add typecvt uint16
GitOrigin-RevId: d1368c414e
3 years ago
Megvii Engine Team
b92866d2c2
fix(build): fix build depends dirty file issue
GitOrigin-RevId: 435d8b5c50
3 years ago
Megvii Engine Team
27d4c4b36c
refactor(stats): use static inline variable declaration
GitOrigin-RevId: 7d86e5f257
3 years ago
Megvii Engine Team
787a22a9d6
perf(tensor): implement __new__ in cpp
GitOrigin-RevId: 4defd249c3
3 years ago
Megvii Engine Team
99df4a7996
fix(dtype): dtype scalar set_retain_dtype supports bool
GitOrigin-RevId: aafd378e1b
3 years ago
Megvii Engine Team
7bf5b0ee1e
test(imperative): check env values after each pytest
GitOrigin-RevId: 826788113a
3 years ago
Megvii Engine Team
b3f79966fd
fix(mgb): fix "TRT_ERROR: INVALID_ARGUMENT: Get binding data type failed."
GitOrigin-RevId: d9601cb15b
3 years ago
Megvii Engine Team
409c988163
fix(imperative): add matmul apply_on_varnode
GitOrigin-RevId: 2cf6bf237c
3 years ago
Megvii Engine Team
b9cbc10120
feat(lite): add pack model
GitOrigin-RevId: 1a150f2af3
3 years ago
Megvii Engine Team
7927e98fd6
perf(mge): speed up PixelShuffle
GitOrigin-RevId: 942e755745
3 years ago
Megvii Engine Team
1c2a323e78
feat(mge): add warning message when mismatched cuda sm is detected
GitOrigin-RevId: f78c79eb06
3 years ago
Megvii Engine Team
877bda4180
perf(mge): improve cross stream memory borrowing
GitOrigin-RevId: c68977c5dc
4 years ago
Megvii Engine Team
484e1f1173
fix(build): fix riscv64 gcc build with > O0
GitOrigin-RevId: 9ad3480492
3 years ago
Megvii Engine Team
14e9ad625d
fix(megdnn): emit define-but-not-referenced and extra-;-ignored warning on cuda9.0~cuda9.1
GitOrigin-RevId: f6db42e395
3 years ago
Megvii Engine Team
c2435d1561
perf(imperative): specialize adaptive pooling
GitOrigin-RevId: 01e1418458
3 years ago
Megvii Engine Team
c0b267fff6
refactor(cuda-stub): opt cuda-stub log
GitOrigin-RevId: 87dda08e1b
3 years ago
Megvii Engine Team
d9c4ef59fe
perf(imperative): using simple hash key in heuristic cache
GitOrigin-RevId: 6fddd612e7
3 years ago
Megvii Engine Team
3949d425fb
feat(core): always show MegEngine version and git commit id
GitOrigin-RevId: 4daa5be6d6
3 years ago
Megvii Engine Team
fd6f8e58b0
feat(mgb/dtype): add dtype qint1
GitOrigin-RevId: abe9fb68b1
3 years ago
Megvii Engine Team
5ebc9d50b7
fix(pylite): fix lite global layout transform and fast run conflict error
GitOrigin-RevId: 910c8da19f
3 years ago
Megvii Engine Team
2a900a69cb
perf(imperative): improve reduce op performance
GitOrigin-RevId: 26d982a7b8
3 years ago
Megvii Engine Team
273c0e8745
fix(autodiff): fix some bugs in relation to 2nd order grad
1. implement double backward for batchnorm
2. fix grad attach in nested grad manager
3. pad empty tensor for unsatisfied output_has_grad
4. support double backward for jit subgraph
5. support double backward for autodiff.Function
6. re-add debug flag MGE_LOG_OP_DISPATCH
GitOrigin-RevId: cd31ddc620
3 years ago
Megvii Engine Team
d56570d929
fix(megbrain): add rdnn to copybara
GitOrigin-RevId: 7d8bf77053
3 years ago
Megvii Engine Team
12a3ef8d01
refactor(fastrun): decouple fastrun from computing graph
GitOrigin-RevId: 27abd22295
3 years ago
Megvii Engine Team
2b80806f21
perf(imperative/src): improve dot performance
GitOrigin-RevId: 35b5bd164f
3 years ago
Megvii Engine Team
1709b3940b
perf(mge/functional): speed up Broadcast and Reshape
GitOrigin-RevId: a72f5460b6
3 years ago
Megvii Engine Team
3e206d899b
perf(mge/functional): speed up Split
GitOrigin-RevId: 43550a0706
3 years ago
Megvii Engine Team
8446626193
perf(imperative/src): improve elemwise
GitOrigin-RevId: 78aa487277
3 years ago
Megvii Engine Team
e400b7ffe5
perf(imperative): enable memory forwarding for imperative
GitOrigin-RevId: 7c1993979c
4 years ago
Megvii Engine Team
0cb60d646d
feat(imperative): add output_descs for apply_on_physical_tensor
GitOrigin-RevId: 5b036c2c5a
3 years ago
Megvii Engine Team
fea46ea9a4
perf(imperative): add opr cache for apply_on_physical_tensor
GitOrigin-RevId: fc5d5fb34d
4 years ago
Megvii Engine Team
ea4e6ab93a
fix(mgb/opr): fix shape cache of NvOF
GitOrigin-RevId: 456ba478e9
4 years ago
Megvii Engine Team
87de704a46
feat(gopt): fuse conv h_swish
GitOrigin-RevId: a3d12991fb
3 years ago
Megvii Engine Team
3726f5cc92
feat(gopt): merge consecutive relayout and dimshuffle to one relayout to optimize CD4 performance
GitOrigin-RevId: a058776be3
3 years ago
Megvii Engine Team
1fead9b6b0
feat(gopt): merge consecutive dimshuffle and relayout to one relayout to optimize CD4 performance
GitOrigin-RevId: 16f22baa80
3 years ago
Megvii Engine Team
26d1e4f7ed
feat(gopt): optimize cd4 pass rule for elemwise and typecvt to let cd4 start as soon as possible
GitOrigin-RevId: 6580dedca7
3 years ago
Megvii Engine Team
5f4501e0f3
fix(gopt): fix conv bias fuse 2 noline
GitOrigin-RevId: a6ab9f4e5e
3 years ago
Megvii Engine Team
7d2063e35a
perf(cuda): speed up conv backward data with small feature map and large filter size
GitOrigin-RevId: 85592bca6b
4 years ago
Megvii Engine Team
28d48f2f7a
fix(mgb/src): fix megbrain cmake not supporting android_nn
GitOrigin-RevId: 037c197912
4 years ago
Megvii Engine Team
187c1dc081
fix(jit): copy aux var when shallow copying JITExecutor
GitOrigin-RevId: 3b331e1c17
4 years ago
Megvii Engine Team
b6ce02a152
fix(subgraph): fall back to cg if jit unsupported
GitOrigin-RevId: 853a00a402
4 years ago
Megvii Engine Team
c55fda9a7c
fix(fastrun): don't kill profiling worker
GitOrigin-RevId: 99a0f11a5a
4 years ago
Megvii Engine Team
aa587446fc
feat(subgraph): support shape inference for CompiledOp
GitOrigin-RevId: a96b8f3446
4 years ago
Megvii Engine Team
bdb853ee6f
fix(mgb): fix extra device malloc when load MultipleDeviceTensorWithFormatHolder
GitOrigin-RevId: adf4a7f77a
4 years ago
Megvii Engine Team
e2b79ea00e
feat(mgb): reduce the number of trtruntimeopr create contexts
GitOrigin-RevId: 14e5d1769e
4 years ago