411253f8
feat(mge): implement warp_affine backward by
2022-08-11 18:46:43 +0800
496070cf
fix(mge/quant): call is_distributed func correctly by
2022-09-14 15:42:20 +0800
f0291883
fix(mgb): make error information of group conv input channel mismatch more readable by
2022-09-09 16:21:30 +0800
31218a18
Merge pull request #476 from MegEngine/HuaHua404-patch-4 by
2022-09-26 18:29:49 +0800
bebaab3a
(HuaHua404-patch-4)
docs(readme): update key features description by
2022-08-12 16:55:06 +0800
7cd9b168
Merge pull request #487 from kagome1007/master by
2022-09-26 18:11:15 +0800
e73f4c73
fix(mge): update readme by
2022-09-26 15:48:16 +0800
6a0d797a
feat(ci): update submodules by
2022-09-17 10:39:43 +0800
d9770792
feat(third_party): update cpuinfo by
2022-09-13 15:22:51 +0800
d27a4456
fix(mge): update format by
2022-09-13 16:10:55 +0800
f50db0b8
Merge pull request #464 from fanhqme2:patch-1 by
2022-09-14 16:43:29 +0800
d205e7be
docs(mge/functional): update copy example by
2022-09-11 11:08:02 +0800
6e48e593
docs(mge/functional): update copy example by
2022-09-09 10:32:26 +0800
e89ee631
docs(mge/functional): update copy example by
2022-09-08 17:50:18 +0800
7655d99a
Merge pull request #478 from thunderstudying:docstring-copy by
2022-09-14 16:33:30 +0800
f045d3fb
(dev-support-lite-fork-debug-mode)
Merge pull request #486 from lyq998/support-lite-fork-debug-mode by
2022-09-14 14:05:30 +0800
37d81eae
feat(lite): support base lite ipc fork debug by
2022-09-09 20:43:18 +0800
fb2329e9
feat(dnn): add nchw44 deconv by
2022-08-29 09:59:38 +0800
a5dea703
build(mge): add boost (part 1) by
2022-09-09 14:01:06 +0800
1dd27b39
docs(mge): update functional.nn.max_pool2d docstring by
2022-09-02 11:53:17 +0800
6e54a3bf
docs(mge): update MaxPool2d & functional.nn.max_pool2d docstring by
2022-08-30 15:49:29 +0800
923dc2d5
Merge pull request #441 from Seeker98:maxpool_doc by
2022-09-13 10:23:26 +0800
a8b146d7
Merge pull request #397 from kagome112:master by
2022-09-11 11:03:32 +0800
5655636c
chore(mge/metric): functional.topk_acc (add nn) use functional.metric by
2022-09-02 15:59:42 +0800
d5688c3d
chore(mge/functional): deprecate debug_param and prefer config by
2022-09-01 13:21:43 +0800
9bb2e395
chore(mge/tensorcache): rename the module making it not public by
2022-09-01 14:06:55 +0800
6e56d1bd
chore(mge/deprecated): add __deprecated__ attr for condition case by
2022-09-01 13:45:26 +0800
98460f58
chore(mge/misc): rename APIs that should not be public by
2022-09-01 17:30:01 +0800
217999b1
feat(arm): add winograd F43 NCHW44 algo and winograd F43 44 algo by
2022-08-15 19:06:44 +0800
f0f6f5fe
build(mge/whl): package libnvrtc-builtins.so.11.x by
2022-08-11 05:59:47 +0000
3f5e8e9c
docs(mge/functional): update copy example by
2022-08-28 14:27:05 +0800
fb9d73b2
fix(src/tensorrt): trt7 manage all workspace by
2022-09-01 20:51:45 +0800
dee02289
fix(serialization): when the dump fbsv2 model is used, the middle_tensor becomes optional by
2022-08-31 18:51:28 +0800
b2959589
docs(mge): update AvgPool2d example by
2022-08-30 11:40:52 +0800
ada54ba5
docs(mge): update AvgPool2d & functional.nn.avg_pool2d example by
2022-08-24 17:23:08 +0800
d8ea0168
docs(mge): update AvgPool2d & functional.nn.max_pool2d docstring by
2022-08-24 16:59:05 +0800
01353b2d
Merge pull request #442 from Seeker98:avgpool_doc by
2022-09-01 18:03:27 +0800
1529bce5
perf(opencl): add opencl weight transpose kernel by
2022-08-22 19:23:39 +0800
5f08b82f
(tag: v1.11.0, release-1.11)
fix(dnn/cuda): fix ptx mma algo compute bugs by
2022-08-17 16:11:49 +0800
d3e786ef
feat(imperative): load_network_and_run enable weight preprocess by
2022-08-24 13:31:39 +0800
c6ff878d
feat(mgb): add cu114 wheel by
2022-08-19 16:58:03 +0800
76cd4a52
feat(mgb): add cu114 wheel by
2022-08-19 16:58:03 +0800
25f97b76
feat(imperative): load_network_and_run enable weight preprocess by
2022-08-24 13:31:39 +0800
5ee00943
fix(dnn/cuda): fix ptx mma algo compute bugs by
2022-08-17 16:11:49 +0800
d1d8ddee
fix(lite): fix lar multithread options invalid by
2022-08-18 15:02:31 +0800
d4bf57d6
fix(lite): fix lar multithread options invalid by
2022-08-18 15:02:31 +0800
1404437a
fix(mgb): fix the compatibility issue of cuda stub with older version drivers by
2022-08-10 17:23:04 +0800
a6a2646c
feat(arm): add AlgoFP32Winograd F43, and add filter size into name of winograd-related algorithms by
2022-08-15 16:59:43 +0800
b8821edb
perf(dnn/aarch64): optimize aarch64 sigmoid with asm by
2022-08-12 14:03:41 +0800
2b99bfec
feat(arm): supports weight pre-processing for winograd benchmark tests by
2022-08-15 16:54:32 +0800
ab34bac4
fix(opencl/extern_c_opr): fix cl_mem UAF issue 2/2 by
2022-08-16 15:21:46 +0800
421bcfd3
style(mgb/tools): add format for tools, dnn and ci by
2022-05-10 17:36:27 +0800
99309fa3
feat(mge/functional): add param output_padding for deconv ops by
2022-08-01 15:18:41 +0800
116781ba
fix(mgb): fix megtee build errors by
2022-08-15 16:37:25 +0800
0467778f
chore(release): bump version by
2022-08-15 06:17:27 +0000
54b5db17
feat(x86/rvv): add AGENT_NCHW_NCHW44 algo by
2022-08-12 17:55:25 +0800
eaa18018
feat(x86/rvv): opt gi intrinsic helper for rvv, detail: https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1106 by
2022-08-10 00:27:34 +0800
399db31a
fix(dnn): fix build by
2022-08-11 12:14:40 +0800
b8c7557b
fix(mm): fix mm error when use sync by
2022-08-10 16:57:24 +0800
73ad06ba
fix(mgb): fix fbsv2 model format no dump tensor format by
2022-08-10 20:46:34 +0800
399200b3
perf(serialization): optimized the memory usage when load new format model by
2022-08-02 19:49:14 +0800
f31e52d5
feat(mgb): warpperspective support multi src input by
2022-08-01 17:10:11 +0800
669816e2
feat(dnn): warpperspective support multi src input by
2022-07-31 00:46:23 +0800
33b27be8
fix(mgb/build): fix trt8 build error by
2022-08-05 19:42:04 +0800
bccda5c4
fix(mgb/imperative): fix repeat bug in trace mode by
2022-08-04 15:50:20 +0800
fca6c76a
fix(lite): fix input invalid bug in lar for fitting mode by
2022-08-05 15:58:18 +0800
1b943807
fix(dnn): fix reduce sum/mean error when b is large by
2022-08-01 15:52:38 +0800
c7a99098
feat(cuda): add int4 ptx 256x64 mma kernel by
2022-07-21 14:04:11 +0800
cf3ca1e9
feat(cuda): add int4 ptx 128x256 mma kernel by
2022-07-21 14:03:51 +0800
1f8e930e
feat(cuda): add int4 ptx 128x128 mma kernel by
2022-07-21 14:03:19 +0800
1a2ed8c4
feat(cuda): add convbias ptx algo testcase by
2022-07-21 14:00:15 +0800
64551105
feat(cuda): add convbias ptx algo by
2022-07-21 13:59:58 +0800
8395a459
fix(dnn/fallback): fix naive shift multiple-definition error and optimize GiCvtFromInt32V4ToUint8 by
2022-08-09 17:40:35 +0800
cc218550
feat(lite): load_and_run support optimize for inference by
2022-08-04 13:56:59 +0800
9bbe5500
fix(opencl/extern_c_opr): fix cl_mem UAF issue when run model OpenCL + ExternCOprRunner, for example by
2022-08-05 15:31:53 +0800
23a3d133
fix(dnn/softmax): create reduce and elemwise opr when getting workspace size by
2022-08-06 10:38:19 +0800
2797fcfa
fix(mge/device): add missed API to __all__ scope by
2022-08-04 15:28:53 +0800
d7c546c9
fix(mge/interpreter): regenerates tensor when its dev value is needed by
2022-07-28 17:59:13 +0800
1f7bf1ad
fix(opr): fix the compatibility of elemwise multitype new mode by
2022-07-29 21:07:42 +0800
b3a7d149
feat(dnn/fallback): add some new gi api by
2022-08-01 17:13:50 +0800
198ee068
feat(mgb/trt): update tensorRT toolchain to 8 by
2022-07-06 14:26:08 +0800
626222c6
fix(test): fix test for brainpp docker env by
2022-08-03 13:50:31 +0800
fac67e7c
feat(gopt): support nchw44 global pooling with fuse_grain by
2022-07-22 12:07:12 +0800
8461c8d8
fix(lite): fix ldr use lite interface error when open both fast-run and nchw44 by
2022-08-01 18:40:59 +0800
43bd949a
fix(dnn): fix cudnn include by
2022-08-01 16:43:07 +0800
8abc3ab8
fix(imperative): fix convolution in rocm by
2022-07-27 15:10:33 +0800
3b1101b5
feat(ci): update image by
2022-07-29 14:43:25 +0800
32b31fd5
fix(mgb): change the check method of cuda sm code by
2022-07-24 15:31:41 +0800
5f863682
Revert "feat(dnn): add elemwise modes" by
2022-07-28 19:49:44 +0800
d2a1905a
Revert "feat(mgb): add cumprod opr" by
2022-07-28 19:49:24 +0800
49e14f87
feat(mgb): add cumprod opr by
2022-04-25 18:46:17 +0800
87aedc29
feat(dnn): add elemwise modes by
2022-06-14 12:38:17 +0800
25e89d68
feat(gi/rvv): remove winograd rvv do not use FIXLEN workaround by
2022-07-27 20:17:43 +0800
fe5b1834
fix(mgb/imperative): fix the problem of occasional failure during testing of redis by
2022-07-18 11:51:13 +0800
b3f46734
feat(megdnn/softmax): add softmax operator in fallback by
2022-06-01 10:23:14 +0800
6c78e684
fix(lite): fix lite memory leak by
2022-07-12 15:08:21 +0800
ff239c63
feat(lite): add unit test for lar by
2022-07-08 13:43:17 +0800
7bf1c38c
fix(mgb/imperative): fix imperative code gen by
2022-07-19 11:58:16 +0800
c49d3070
refactor(imperative/ops): extends DnnOprCaller with template by
2022-07-01 16:58:22 +0800
2d6476a4
feat(lite): add auto decide model inference format option by
2022-06-30 17:11:16 +0800