8546c15d
feat(gi): make elemwise apply gi class type by
2022-06-14 18:14:48 +0800
74fb63db
feat(gi): make matrix_mul apply gi class type by
2022-06-14 18:13:40 +0800
45b26400
feat(gi): make resize apply gi class type by
2022-06-14 18:12:21 +0800
7d7cc3c8
feat(gi/riscv): add gi support with risc-v by
2022-06-14 18:09:59 +0800
a32b7277
fix(build): upgrade bazel riscv toolchains by
2022-06-14 18:09:08 +0800
588a645b
feat(dnn/opencl): add binary interface for opencl algo and kernel cache by
2022-06-10 18:42:39 +0800
1192a9a6
fix(imperative): fix adaptive pool2d by
2022-06-14 20:52:27 +0800
7b4b94fd
fix(imperative): fix the segmentation fault in reduce backward by
2022-06-13 16:25:17 +0800
24c5c19b
fix(imperative): make functional ops support negative axis by
2022-04-22 16:26:30 +0800
c76e80bc
fix(mge): fix mem leak of numpy method of DeviceTensorND by
2022-06-14 11:09:52 +0800
f96429c0
feat(imperative): support empty tensor in roi_align by
2022-05-05 13:36:40 +0800
2f829aaa
test(imperative): speed up integration test by
2022-05-13 16:09:41 +0800
a1ca390d
fix(lite): fix const shape error for lar fitting mode by
2022-06-08 15:46:41 +0800
2001c494
feat(lite): add input shape parse for load and run by
2022-05-26 20:02:32 +0800
64a8aaaf
fix(build): remove unused functions when cuda disabled by
2022-06-10 16:53:15 +0800
8f17b84a
fix(dnn): fix dnn run cd4 on cpu by
2022-06-07 14:22:56 +0800
81065cf0
build(mgb/cutlass): merge partial headers by
2022-04-30 11:55:36 +0800
d610c987
feat(lite): add disable configure by model info interface by
2022-06-07 17:08:48 +0800
07bdb3bf
feat(imperative): add swapaxes by
2022-06-06 17:02:59 +0800
a0862865
feat(mge/third_party): update cutlass version by
2022-06-13 11:56:51 +0800
972a8d54
feat(ci): add model compatibility check in ci by
2022-05-20 20:04:43 +0800
e6943017
fix(ops/jit): skip lookup include path when nvcc executable not found by
2022-06-08 19:11:42 +0800
02c1a0c3
fix(imperative/format): warn once when a parameter without format is attached by
2022-06-07 15:48:12 +0800
6ef1e12c
fix flops count bug for ConvTranspose2d by
2022-06-09 20:07:48 +0800
c2deef1a
feat(mge): add atlas710 support by
2022-05-31 20:50:55 +0800
df5ebd3d
fix(imperative/ops): fix the vmemory problem in 1.9 by
2022-06-08 16:13:19 +0800
4e66e0eb
feat(megdnn/softmax): add softmax operator in OpenCL by
2022-05-06 16:53:34 +0800
b36b5bd8
refactor(mgb): check input when profiling by
2022-06-01 11:10:43 +0800
6c9b3a58
refactor(dnn): remove algorithm cache queries by
2022-05-10 13:34:39 +0800
8563f514
fix(imperative): fix builtin reduce keepdim by
2022-06-06 11:51:23 +0800
96d90be1
feat(dnn): fallback support int4 relayout by
2022-06-01 19:40:01 +0800
eef0308b
feat(serialization): add no_change_graph and version param when dumping model by
2022-06-01 14:43:39 +0800
4ab5f970
fix(build): fix ci error by
2022-05-12 18:57:04 +0800
b9a69323
feat(imperative): channel default model format to fbs v2 by
2022-05-12 15:42:09 +0800
283a2f77
feat(serialization): support test for new serialization format by
2022-05-11 18:03:28 +0800
50faabf6
feat(serialization): support the registry for new serialization format by
2022-05-11 18:02:52 +0800
a694fb33
feat(serialization): implement the new serialization format by
2022-05-11 17:57:46 +0800
ca4a5da0
feat(serialization): add new serialization format define by
2022-05-11 17:54:48 +0800
ba32360a
feat(lite): add set opencl buffer kernel cache lite api by
2022-06-07 13:36:13 +0800
711b5bf5
fix(dnn/arm_common): fix some load beyond memory by
2022-05-17 16:49:44 +0800
3ebb8db0
feat(third_party/cutlass): update to version 2.8 by
2022-04-27 17:58:37 +0800
da91e650
refactor(ops/layer_norm): speed up the host speed of layer_norm by
2022-05-24 21:33:48 +0800
67cfce9f
fix(imperative/amp): add is_scalar check in elemwise and concat by
2022-05-31 14:37:55 +0800
d313f926
fix(imperative/amp): fix format transformation for symbol trans by
2022-05-28 16:56:43 +0800
261a5bce
feat(imperative/amp): add dimshuffle in set_format for nhwc by
2022-05-26 18:00:29 +0800
c9e56f49
feat(imperative/amp): add dimshuffle before creating nhwc tensor by
2022-05-26 15:36:27 +0800
d57a0712
feat(imperative/amp): add fallback for op not supported for nhwc tensor by
2022-05-19 22:30:09 +0800
38a9aa9f
feat(imperative/amp): add auto dimshuffle for elemwise and concat by
2022-05-16 15:14:59 +0800
cd263765
style(imperative/amp): reformat code by
2022-05-06 14:58:01 +0800
3892aa0b
fix(imperative/amp): fix bn params for nhwc amp by
2022-04-29 14:48:52 +0800
6f0b5820
chore(imperative/amp): adapt dev by
2022-04-22 14:57:21 +0800
ee984e86
fix(imperative/amp): fix distributed backward callback for nhwc amp by
2022-04-11 18:44:25 +0800
15c6da62
feat(imperative/amp): add nhwc support for adaptive pooling by
2022-04-13 18:09:58 +0800
c28a875f
fix(imperative/amp): adapt new transformation by
2022-03-29 17:11:31 +0800
fd41302c
feat(imperative/amp): add set_format by
2022-03-25 11:01:21 +0800
fc633ce4
fix(imperative/amp): fix custom grad in Subgraph by
2022-03-21 19:55:16 +0800
673b295d
feat(imperative/amp): remove conv_format and bn param_dim configs by
2022-03-11 12:44:23 +0800
7e9aa742
feat(imperative/amp): enable auto_convert_format by default by
2022-03-08 19:05:55 +0800
fc0f4546
fix(dnn/check_non_finite): adjust some details of CheckNonFinite by
2022-02-24 14:59:56 +0800
3bd40887
feat(mgb/opr): add NHWC support for AdaptivePooling by
2022-02-11 14:37:14 +0800
e393d1cf
feat(mge/amp): add convert_format module for NHWC training by
2022-01-25 18:00:41 +0800
533fb5bf
feat(imperative): support formatted tensor and add special op rules by
2022-01-07 16:48:40 +0800
4aa79c45
perf(mge): override grad of matmul by
2022-05-12 17:14:16 +0800
98b5ee78
feat(mge/dnn): add lamb optimizer by
2022-04-07 18:37:31 +0800
a926878c
feat(imperative): remove symbolvar of imperative by
2022-05-15 20:52:59 +0800
14813d13
fix(whl): fix whl broken: patchelf on big (> 4G) file will make elf section broken, as a workaround, do strip firstly, then do patchelf. by
2022-05-26 01:14:38 +0800
9e0583e1
feat(dnn/arm_common): add arm_common chanwise dot 11x11 by
2022-05-24 14:14:25 +0800
115bcbce
feat(lite): add fitting mode for load and run by
2022-05-24 17:13:20 +0800
02bfb8f8
feat(lite): add and fix some feature for load and run fitting mode by
2022-05-17 14:03:15 +0800
c62ddba2
feat(dnn/opencl): optimize heuristic rule by
2022-04-01 14:55:05 +0800
6e839407
Merge pull request #461 from tpoisonooo/patch-1 by
2022-06-06 20:50:10 +0800
91a45d7c
docs(README.md): add link by
2022-05-25 11:37:08 +0800
d404ed18
feat(ci): update cpuinfo by
2022-05-24 14:40:25 +0800
c2500cdb
chore(license): apply change caused by bot forward rebase by
2022-05-24 10:50:33 +0800
5f0e7ffb
feat(fallback): add FB_GI_F32_4x12 benchmark by
2022-05-19 18:46:23 +0800
f249d387
feat(fallback): imp gi matmul FB_GI_F32_4x12 algo by
2022-05-17 16:36:56 +0800
03f78547
feat(dnn/arm_common): add 9x9s1s2 dot chanwise kernel by
2022-05-10 16:57:13 +0800
80e1f38b
fix(gtest): fix ci error report stack-use-after-scope. How to reproduce the problem: 1: build with asan (revert this MR); 2: then taskset process to one cpu: taskset 01 ./megbrain_test --gtest_filter=TestAsyncQueue.SynchronizerWaiterStarving by
2022-05-23 19:08:12 +0800
c2e9860f
chore(license): remove all license in file header by
2022-02-25 18:44:59 +0800
38b49272
fix(opr): fix no update ptr in reduce operator when input change by
2022-05-16 13:36:45 +0800
4cce2480
fix(dnn/opencl): fix some bug for dnn opencl conv bias and relayout format by
2022-05-19 15:53:46 +0800
ca0e616f
refactor(lite): refactor load_and_run profiling message by
2022-05-13 14:25:51 +0800
1783b897
feat(profiler): integrate cupti backend by
2022-04-12 19:23:57 +0800
e98049d7
feat(fallback): move arm_common resize f32 algo to fallback gi by
2022-05-16 14:46:08 +0800
5b69af20
Merge pull request #460 from kagome1007/updatereadme by
2022-05-20 11:12:25 +0800
0bed6c0f
(tag: v1.9.1, release-1.9)
chore(release): bump version by
2022-05-19 05:44:33 +0000
9488cd1c
fix(lite): fix lite cpu default not work by
2022-04-26 14:12:54 +0800
518c7f37
fix(imperative/src): fix empty_tensor bug of rng by
2022-04-11 16:20:17 +0800
cca38c4e
fix(mge): fix fastpath check by
2022-04-20 19:43:05 +0800
b9e850a4
test(imperative): check env values after each pytest by
2022-04-21 17:39:05 +0800
d984be59
fix(imperative): restrict value converts to symbolvar by
2022-04-20 11:21:58 +0800
824af20b
fix(mge): update readme by
2022-05-18 16:25:49 +0800
6814cf1c
fix(lite): fix lite test error by
2022-05-12 17:56:27 +0800
7c8f1847
fix(dnn/x86): fix x86 pooling exec by
2022-05-12 22:55:30 +0800
91aaafd5
feat(fallback): move arm_common pooling f32 algo to fallback gi by
2022-05-11 14:37:41 +0800
bde2efa3
feat(lite/load_and_run): support put and get model redis cache by
2022-04-15 19:54:07 +0800
48526abb
fix(mgb): fix concat cd4 tensor check size invalid by
2022-04-28 19:54:10 +0800
c87d998e
feat(mgb): add interface to support opencl IO zero copy when inference by
2022-03-09 15:46:14 +0800
af6cdb20
feat(fallback): fix ci by
2022-05-07 18:50:29 +0800
e4cc85e5
feat(fallback): move arm_common f32 convbias to fallback gi by
2022-05-06 15:06:00 +0800