10a0349e
feat(lite): add assert log for set_data_by_share and set_data_by_copy; pylite network input is incorrect when the input NumPy array is not contiguous by
2022-07-19 17:31:55 +0800
f5597d9a
fix(mgb): make error information of input channel mismatch more readable by
2022-07-19 11:24:36 +0800
38bd5999
fix(mgb): make error information of invalid MatMul more readable by
2022-07-19 11:23:33 +0800
e0d505e6
fix(mgb/dnn): fix bug that some cutlass files compile very slowly on SM86 by
2022-07-12 16:53:12 +0800
cc31b9db
docs(mge/functional): fix vision docs by
2022-07-05 15:58:56 +0800
6f7649e9
docs(docstring): fix pad docstring by
2022-07-07 15:00:24 +0800
bf32d1e0
docs(dataloader): update dataloader docstring by
2022-07-12 15:58:16 +0800
4a32cc49
docs(mge/data): update Dataset class docstring by
2022-07-12 14:00:47 +0800
70fc5682
docs(mge/data): update MNIST dataset docstring by
2022-07-11 19:41:12 +0800
8fb062df
Merge pull request #468 from MegEngine/HuaHua404-patch-2 by
2022-08-01 14:58:53 +0800
e5ad3ea5
Merge pull request #469 from MegEngine/HuaHua404-patch-3 by
2022-08-01 14:54:10 +0800
b6f247d3
(tag: v1.10.0, release-1.10)
fix(build): disable cupti temporarily by
2022-07-11 19:07:58 +0800
58ba080d
feat(x86/rvv): make gi conv algo adapt to vv and vf model by
2022-07-14 18:37:02 +0800
bd50e457
feat(x86/rvv): make MATRIX_MUL_GI_F32_4x12 and FP32_GEMV_MK4_GI adapt to vv and vf model by
2022-07-14 18:34:18 +0800
5c3b4e95
feat(x86/rvv): opt AlgoFP32WinogradF63_4x4_NCHW44 by
2022-07-13 19:45:19 +0800
fa59a7b0
feat(x86/rvv): opt AlgoF32DirectNCHWNCHW44 and opt GiMaximumFloat32/GiMinimumFloat32 on x86 by
2022-07-13 18:43:55 +0800
0d82e9b7
feat(x86/rvv): opt FB_GI_F32_MK4_4x8 by
2022-07-12 17:23:05 +0800
3fbceb3a
fix(mgb/version): fix nvinfer.h not found by
2022-07-13 13:18:55 +0800
a54d9cb9
feat(x86/rvv): opt FB_GI_F32_MK4_PACK_4x12 algo by
2022-07-11 13:53:21 +0800
d60d028a
feat(mge/device): enable to get cuda/cudnn/tensorrt version by
2022-06-27 14:42:38 +0800
8fed114a
fix(build): disable cupti temporarily by
2022-07-11 19:07:58 +0800
19af2688
fix(mge/tools): fix module_stats for duplicated module by
2022-07-05 18:18:07 +0800
4cd4a38a
fix(mge/tools): fix network_visualize for op without out shapes by
2022-07-05 18:17:18 +0800
7badcb72
build(wheel): copy libcupti.so by
2022-07-09 04:51:49 +0000
83fb1622
build(wheel): copy libcupti.so by
2022-07-08 18:53:06 +0800
6dfd5a4c
fix(win7): workaround for hang when process exits on win7+32bit by
2022-07-04 20:08:58 +0800
ba1508e3
fix(lite): fix exception bug for load and run by
2022-07-01 12:19:09 +0800
603d0941
fix(lite): fix layout transform bug in lar for testcase model by
2022-07-05 19:00:19 +0800
8e48b17a
(HuaHua404-patch-3)
docs(README_CN): add key features description by
2022-07-06 14:27:19 +0800
0004291a
(HuaHua404-patch-2)
docs(Readme): add key features description by
2022-07-06 14:21:58 +0800
78f90469
(HuaHua404-patch-1)
Update README_CN.md by
2022-07-06 13:59:51 +0800
51397bfc
feat(mgb): supports value infer and empty input tensor in ElemwiseMultiType by
2022-06-24 13:31:33 +0800
247e2f59
feat(mgb/dnn): add modes that the output type is bool in elemwise by
2022-05-09 22:43:38 +0800
f3863810
fix(imperative): fix inplace operation of optim by
2022-07-01 11:36:01 +0800
9330929f
feat(lite): add c_opr_init_interface for lar by
2022-06-30 18:41:42 +0800
16ba05a8
fix(dnn): fix dnn nchwxx elemwise performance by
2022-07-01 15:08:52 +0800
9de8c122
feat(mge/third_party): add cudnn-frontend by
2022-07-05 14:23:37 +0800
2b85948c
feat(mge/third_party): update flatbuffers version by
2022-07-04 18:11:41 +0800
6c4c4ca6
Merge pull request #459 from Qsingle:fix_overflow_of_flops_calculate by
2022-07-01 18:33:31 +0800
7b17c118
refactor(dnn): make cudnn_frontend work by
2022-06-28 15:37:49 +0800
35e9cc98
feat(dnn/cuda): add cudnn frontend api by
2021-12-06 16:30:09 +0800
9be6d7b6
fix(lite): fix lite header by
2022-06-29 17:21:53 +0800
ab8f6398
fix(test): make test install by
2022-06-28 17:43:48 +0800
99cfefbf
fix(test): fix test copybara by
2022-06-27 19:26:56 +0800
0d7ace15
fix(mgb/dnn): support fp16 for resize nhwc by
2022-06-20 17:09:51 +0800
cfed86f9
feat(persistentcache): change file persistent cache to append mode by
2022-06-27 19:05:39 +0800
d19fc2c1
fix(imperative): add alloc TensorPtr in imperative by
2022-06-06 20:44:35 +0800
5d9af3ec
fix(scripts): fix error of building whl with py310 by
2022-06-27 10:41:50 +0800
d1b6c040
feat(mge/third_party): update MegRay version by
2022-06-30 10:20:54 +0800
ac51f780
feat(mge/distributed): add support for batch send recv op by
2022-06-01 21:10:13 +0800
013bb14f
build(tablegen): pregen opdef tablegen targets by
2022-05-13 15:38:02 +0800
f12b75c0
perf(dnn/fallback): optimize some corner case in reduce by
2022-06-22 19:59:46 +0800
35cf0422
fix(ci): relax async timeout by
2022-06-24 14:59:57 +0800
7f024072
perf(dnn): speed up pad kernel by
2022-05-30 12:35:52 +0800
2886245b
perf(imperative/src): improve pad host performance by
2022-05-26 20:22:03 +0800
c6a350b1
chore(release): bump version by
2022-06-28 02:29:02 +0000
b55942a9
feat(dnn/naive/norm,-dnn/cuda/norm,-dnn/test/norm): add norm dnn opr, fwd only by
2022-06-06 20:56:30 +0800
7a7af8d7
fix(scripts): fix error of building whl with py310 by
2022-06-27 10:41:50 +0800
9d397727
fix(traced_module): fix bug of renaming tensor by
2022-06-23 14:28:22 +0800
31f31cef
fix(lite): fix record invalid for load and run by
2022-06-23 15:08:52 +0800
5a355138
fix(mgb): fix profile skip condition by
2022-06-14 13:19:07 +0800
5bdc430e
fix(mgb/fastrun): fix megbrain fastrun memory overflow bug by
2022-05-24 10:44:23 +0800
d7ddd43f
feat(imperative): support python3.9 and fix some tests by
2022-06-15 15:04:13 +0800
a4f019e7
fix(mgb/atlas): fix issue with profiling on Atlas by
2022-06-15 11:28:34 +0800
5ba76637
fix(traced_module): fix trace_module function may raise error in finally scope by
2022-06-22 15:31:25 +0800
e6dcfbe8
fix(traced_module): fix traced module compatible issues by
2022-06-10 16:57:39 +0800
18f83a25
docs(mge/functional): fix F.svd docstring by
2022-05-10 13:53:14 +0800
0ba9326e
docs(api/lite): add doc for lite common_enum_c_h by
2022-06-22 16:23:52 +0800
92ded721
docs(api/lite): add global setting docstring for megenginelite by
2022-06-21 16:07:36 +0800
171c683d
docs(api/lite): add struct python api desc by
2022-06-20 20:03:44 +0800
3cd54dd6
docs(api/lite): add doc for lite tensor by
2022-05-24 18:50:18 +0800
a891f9b3
docs(api/lite): add megenginelite.network api doc by
2022-05-31 16:07:16 +0800
5ef1ac75
docs(api/lite): add lite network api doc by
2022-05-26 17:09:18 +0800
c47f48ef
docs(api/lite): add lite global.h and pylite utils.py doc by
2022-05-24 18:50:18 +0800
5821c0b6
docs(lite): initial part of lite::Tensor comments as template by
2022-04-19 18:55:58 +0800
26c2563b
feat(ci): update flatbuffers by
2022-06-24 12:16:26 +0800
4cdb7454
feat(rvv/fallback): make nchw44 happy on rvv by
2022-06-22 16:52:21 +0800
5e306b75
feat(x86): make conv1x1 and im2col available with x86-NCHW44 and add AlgoF32GiMK4Pack4x12 matrix_mul algo by
2022-06-21 19:26:11 +0800
481a6cbb
feat(x86): make nchw44 happy on x86 by
2022-06-21 19:23:02 +0800
5873d5f5
feat(gi): add more gi api by
2022-06-21 19:21:59 +0800
cfc41648
fix(mge): fix grad of maximum(x, x) by
2022-06-20 16:21:09 +0800
bbafe699
feat(dnn): add elemwise COND_LT_MOV by
2022-06-20 15:23:30 +0800
ed92b9c1
fix(cmake): fix compilation options for benchmark in cmake by
2022-06-10 16:00:19 +0800
a49d5bf9
fix(autodiff): fix inplace operation on autodiff.Function by
2022-06-20 16:46:03 +0800
7252825c
fix(functional): broadcast_to supports mutable target shape by
2022-06-20 16:45:12 +0800
2484cd27
fix(tensor): check args when construct tensor with existing tensor by
2022-06-20 16:44:43 +0800
e7587617
fix(lite): fix packed model compatibility by
2022-06-14 21:57:50 +0800
a0a5fcf1
feat(dnn): support tf32 by
2022-04-19 17:49:03 +0800
f0088335
feat(mgb): upgrade flatbuffer by
2022-06-15 15:22:25 +0800
657db8dc
chore(tools): remove dump_with_testcase_mge, user should use jit.dump instead by
2022-06-08 16:36:04 +0800
4d22e85b
feat(ci): add completeness compatibility check by
2022-06-10 10:54:39 +0800
f7b03959
perf(mgb/compile): improve compile time according to the file map of compile time by
2022-05-13 15:12:05 +0800
124f38c4
perf(mgb/compile): improve compile time for megbrain by
2022-05-12 17:17:47 +0800
d3247bee
fix(dtr): always write shape when tensor produced by
2022-06-13 14:03:02 +0800
0a266d7a
feat(riscv): speed up bazel build and fix rv64gc without rvv build by
2022-06-15 20:12:12 +0800
36ba1d6d
fix(riscv): fix ci fp16 build and move test GI_TEST_NAIVE by megdnn_gi_api_test by
2022-06-16 11:16:57 +0800
dcce4610
feat(cmake/riscv): make riscv happy by
2022-06-14 18:16:58 +0800
698dcef4
feat(gi/x86): fix _mm_slli_si128 build at clang by
2022-06-13 17:59:43 +0800
2d806f9c
feat(gi): make conv_bias apply gi class type by
2022-06-14 18:15:52 +0800
19d36fa0
feat(gi): make pooling apply gi class type by
2022-06-14 18:15:07 +0800