0f1afb09
feat(fallback): imp gi matmul AlgoF32GiMK4_4x8 algo, move AlgoF32GemvMK4 from arm_common to fallback by
2022-05-06 15:00:53 +0800
410dcb6c
feat(fallback): add more gi api for conv, and add gi API test by
2022-05-06 14:54:03 +0800
a0e53118
fix(src/comp_node): fix calling cuda driver api by
2022-04-21 15:44:55 +0800
ccea0e23
fix(dnn/rdnn): add warmup before profile by
2022-04-27 19:09:05 +0800
58df717e
fix(mge/autodiff): fix attaching tensor already in gradient path by
2022-04-29 14:39:17 +0800
05186e7b
fix(midout): fix elemwise crash after midout. Some dnn backend oprs use an agency opr; for example, the softmax cpu naive imp calls the elemwise opr. At the model dump stage we cannot get the dnn runtime logic, so we record elemwise mode info at the runtime stage. by
2022-05-06 19:57:39 +0800
9be8de60
fix(midout): formatting midout tools by
2022-05-06 16:11:51 +0800
8182af6e
fix(mgb): fix strategy of grad_op and opr_attr by
2022-04-25 11:20:35 +0800
70209667
fix(dnn/test): fix some bugs when force_deduce_layout is off by
2022-04-28 14:46:19 +0800
597a1e79
refactor(imperative): add interface to clear algorithm cache by
2022-04-26 21:06:35 +0800
e2f5156b
refactor(megbrain): save fastrun result to algorithm cache by
2022-04-13 14:40:29 +0800
f902ba24
docs(megbrain): add notes for fastrun by
2022-04-21 19:19:32 +0800
d968942f
perf(cuda): speedup direct large kernel conv by
2022-04-05 16:38:02 +0800
b4f9703f
fix(overflow): fix the overflow of the long_scalars in network_node and module_stats by
2022-05-08 21:05:17 +0800
b2cffdde
fix(lite): fix lite cpu default not working by
2022-04-26 14:12:54 +0800
dbce6526
fix(mge/functional): fix return dtype of comparison function by
2022-04-26 19:31:34 +0800
7dc34769
feat(dnn/cuda): add typecvt uint16 by
2022-04-24 22:18:11 +0800
b92866d2
fix(build): fix build depends dirty file issue by
2022-04-24 14:46:30 +0800
9d8983f3
Merge pull request #457 from kagome1007/updatenewlogo by
2022-05-06 14:49:11 +0800
e3f591ca
fix(mge): update logo by
2022-05-05 14:48:46 +0800
4b27e861
fix(ops): implement from_op_node for reshape by
2022-04-13 12:18:51 +0800
4fb3d886
perf(misc): use reinterpret_cast to convert valueshape by
2022-04-12 19:48:03 +0800
951ed476
fix(minigraph): supports varnode forwarding by
2022-04-12 19:45:18 +0800
27d4c4b3
refactor(stats): use static inline variable declaration by
2022-04-12 19:26:07 +0800
787a22a9
perf(tensor): implement __new__ in cpp by
2022-04-12 19:51:45 +0800
99df4a79
fix(dtype): dtype scalar set_retain_dtype supports bool by
2022-04-12 19:29:32 +0800
53d48ca3
fix(fastrun): persistent cache add prefix by
2022-04-12 19:30:28 +0800
7005b7d6
fix(mge): fix fastpath check by
2022-04-20 19:43:05 +0800
7bf5b0ee
test(imperative): check env values after each pytest by
2022-04-21 17:39:05 +0800
811579b0
fix(imperative): restrict value converts to symbolvar by
2022-04-20 11:21:58 +0800
b3f79966
fix(mgb): fix "TRT_ERROR: INVALID_ARGUMENT: Get binding data type failed." by
2022-04-14 16:41:59 +0800
99a85c40
fix(mge): fix advanced indexing grad by
2022-04-18 14:05:55 +0800
409c9881
fix(imperative): add matmul apply_on_varnode by
2022-04-14 10:29:47 +0800
d52ba79d
fix(lite): support set data by copy on device tensor by
2022-04-13 18:25:08 +0800
275f12c9
fix(mge): fix dimshuffle shape infer by
2022-04-12 13:32:12 +0800
e59b6e13
fix(imperative/src): fix empty_tensor bug of convbwd&rng by
2022-04-11 16:20:17 +0800
115c4592
fix(dnn/opencl): fix opencl elemwise tuning issue by
2022-04-01 14:55:05 +0800
275a492b
docs(mge/functional): update functional.nn.max_pool2d docstring by
2022-03-02 15:50:58 +0800
a22669c1
docs(mge/functional): update functional.nn.avg_pool2d docstring by
2022-03-02 16:24:20 +0800
b9cbc101
feat(lite): add pack model by
2022-04-12 17:21:23 +0800
7927e98f
perf(mge): speed up PixelShuffle by
2022-03-22 13:05:46 +0800
43e5f41c
docs(mgb/docs): add alternative interface for deprecated api by
2022-04-08 16:42:26 +0800
ba90a028
feat(mge): opt third_party prepare 2/2 by
2022-04-14 14:19:05 +0800
ff9e6121
fix(lite): fix lar LITE_WITH_CUDA and LITE_WITH_OPENCL invalid for bazel by
2022-04-08 19:42:40 +0800
73112558
feat(mge/dnn): support checknonfinite for fp16 by
2022-03-04 18:45:06 +0800
f7e10ea8
perf(imperative): improve matmul/batch_matmul by
2022-03-22 18:26:57 +0800
1c2a323e
feat(mge): add warning message when mismatched cuda sm is detected by
2022-04-06 17:15:35 +0800
877bda41
perf(mge): improve cross stream memory borrowing by
2022-02-14 14:06:31 +0800
ed7fa104
feat(fallback): move direct multi_thread_common helper to fallback by
2022-04-07 14:12:46 +0800
8871ad74
refactor(fallback): opt gi naive reinterpret by
2022-04-07 18:00:29 +0800
c2eec47b
fix(imperative/src): fix adaptive_pooling bug by
2022-04-09 01:08:50 +0800
ffbf8fad
feat(fallback): add general intrinsic to elemwise multitype by
2022-03-25 15:13:33 +0800
484e1f11
fix(build): fix riscv64 gcc build with > O0 by
2022-04-06 19:43:42 +0800
14e9ad62
fix(megdnn): emit define-but-not-referenced and extra-;-ignored warning on cuda9.0~cuda9.1 by
2022-04-06 20:56:31 +0800
4c0bff1d
refactor(megdnn): refactor TEGRA_X1/X2 macro by
2022-03-26 01:02:05 +0800
758549b9
feat(megengine): support tx2 by
2022-03-19 15:29:43 +0800
84ce94fb
docs(imperative): fix docs about vision related function by
2022-03-22 11:03:00 +0800
c2435d15
perf(imperative): specialize adaptive pooling by
2022-03-22 19:10:17 +0800
8fcbe825
docs(mge/functional): fix functional.eye docstring by
2022-04-01 17:47:09 +0800
6aa42e6e
Merge pull request #439 from Qsingle:fix_python36_pip_url by
2022-04-06 10:33:30 +0800
39d98d45
feat(fallback): add fallback typecvt with general intrinsic by
2022-03-09 18:31:34 +0800
d2278f02
perf(imperative): speed up conv_transpose3d by
2022-03-23 15:42:21 +0800
3a5347ed
perf(imperative): speed up pooling by
2022-03-23 14:36:48 +0800
c0b267ff
refactor(cuda-stub): opt cuda-stub log by
2022-03-31 17:51:40 +0800
0bc5a1c8
feat(mge): opt third_party prepare by
2022-03-30 15:58:19 +0800
3f11b421
docs(functional): replace megengine function testcode with doctest format by
2022-04-01 12:09:42 +0800
60163c5d
docs(functional): replace elem testcode with doctest format by
2022-02-28 19:59:26 +0800
fd5031f8
docs(functional): replace utils testcode with doctest format by
2022-02-28 17:10:19 +0800
aa632305
docs(functional): replace loss function testcode with doctest format by
2022-02-28 17:00:17 +0800
e8d0f9db
docs(example): replace dist.help testcode with doctest format by
2022-02-28 15:14:51 +0800
f5f9249a
docs(compose): update compose API docstring and example by
2022-02-28 13:57:07 +0800
7d3a6db0
docs(tensor): add more introduction about Tensor by
2022-02-25 17:01:15 +0800
30561d72
fix(windows): fix dll install on windows by
2022-04-01 14:34:51 +0800
d9c4ef59
perf(imperative): using simple hash key in heuristic cache by
2022-03-24 19:05:33 +0800
26ea33c6
perf(imperative): improve convbwd performance by
2022-03-31 22:07:18 +0800
3949d425
feat(core): always show MegEngine version and git commit id by
2022-03-29 19:23:12 +0800
5bf31163
(tag: v1.9.0)
fix(mge): fix infer output attrs fallible by
2022-03-30 12:43:42 +0800
94960ecf
fix(imperative): restrict using convert_inputs in py_apply by
2022-03-30 15:39:46 +0800
09dab387
feat(cuda): support int1 simplewq conv by
2022-03-11 20:02:59 +0800
f319d842
fix(scripts): update the pip url for python36 by
2022-02-23 11:48:39 +0800
90cedd3c
fix(imperative): restrict using convert_inputs in py_apply by
2022-03-30 15:39:46 +0800
7a63f1cd
docs(readme): opt cmake build md by
2022-03-29 13:21:44 +0800
2d72de8a
fix(mge): fix infer output attrs fallible by
2022-03-30 12:43:42 +0800
b6ad4572
feat(cuda): support int1 simplewq conv by
2022-03-11 20:02:59 +0800
331567af
fix(opencl/ci): misc opt and fix: by
2022-03-25 17:07:12 +0800
ff6a3bb8
fix(fallback): delete the repeated opcaller in fallback and arm_common by
2022-03-09 18:31:34 +0800
547945e8
feat(fallback): support general intrinsic in elemwise in fallback by
2022-03-08 15:47:32 +0800
a017bed3
fix(fallback): rename general intrinsic type and add more intrinsic by
2022-03-08 15:45:22 +0800
6554e262
chore(release): bump version by
2022-03-28 06:17:52 +0000
fd6f8e58
feat(mgb/dtype): add dtype qint1 by
2022-03-15 17:13:23 +0800
616352b0
fix(imperative): add dtype promote support for concat by
2022-03-25 13:25:29 +0800
95a30eb6
perf(imperative): speed up stackmanager guard by
2022-03-24 15:14:09 +0800
f7a5fe17
feat(mge/config): add a config option for memory forwarding by
2022-03-24 15:27:57 +0800
9ffc2c0a
fix(mge): fix host performance loss caused by dtr by
2022-03-24 15:04:21 +0800
69673f14
fix(imperative): remove convert_inputs from concat by
2022-03-24 14:02:59 +0800
4d5faa3f
fix(imperative): using DnnOprCaller to avoid early destruction of dnn_opr by
2022-03-22 15:23:57 +0800
da620ca1
perf(imperative): specialize batchnorm implementation by
2022-03-21 15:39:19 +0800
5ebc9d50
fix(pylite): fix lite global layout transform and fast run conflict error by
2022-03-07 19:47:06 +0800
49d92d9c
feat(lite): feat layout transform interface for lite model by
2022-02-28 18:18:48 +0800
2a900a69
perf(imperative): improve reduce op performance by
2022-03-01 17:31:22 +0800