6ce212d2
refactor(mgb): refactor group conv by
2021-06-30 20:02:52 +0800
febd0b17
fix(ci): fail when git user name or email is empty by
2021-07-27 20:07:46 +0800
eb2dd018
build(fp16): fix fp16 build by
2021-07-23 18:28:22 +0800
f76a2cc2
feat(mge/opr): add silu and gelu by
2021-07-16 15:44:33 +0800
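A minimal sketch of the two new activations, assuming silu and gelu are exported from megengine.functional like the other activation functions:

```python
import megengine.functional as F
from megengine import Tensor

x = Tensor([-1.0, 0.0, 1.0])
y_silu = F.silu(x)  # SiLU / swish: x * sigmoid(x)
y_gelu = F.gelu(x)  # GELU: x * Phi(x), Phi = standard normal CDF
print(y_silu.numpy(), y_gelu.numpy())
```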
f2ac4c34
docs(distributed.functional.all_reduce_sum): google-style docstring and examples by
2021-07-23 17:15:50 +0800
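A hedged usage sketch for the documented collective; the n_gpus argument to dist.launcher is an assumption here (a bare @dist.launcher is also commonly used and takes all visible devices):

```python
import megengine as mge
import megengine.distributed as dist

@dist.launcher(n_gpus=2)  # spawns one worker process per GPU
def worker():
    x = mge.Tensor([float(dist.get_rank() + 1)])
    y = dist.functional.all_reduce_sum(x)  # sum x across all ranks
    print(dist.get_rank(), y.numpy())      # both ranks print [3.]

worker()
```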
186bacfb
fix(mge): recover bn freeze fastpath execution by
2021-07-22 14:10:03 +0800
5f558042
fix(imperative/ops): use tblgen to generate FastpathCopy by
2021-07-23 16:23:15 +0800
bfc4e7a9
docs(mge): fix amp docstring problems by
2021-07-21 16:47:04 +0800
0b764cf2
docs(mge/functional): add docs for megengine.functional.full_like by
2021-07-19 21:16:44 +0800
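A minimal example of the documented function; values and shapes are placeholders:

```python
import megengine.functional as F
from megengine import Tensor

x = Tensor([[1, 2], [3, 4]])
y = F.full_like(x, 7)  # same shape and dtype as x, filled with 7
print(y.numpy())       # [[7 7]
                       #  [7 7]]
```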
f1411590
refactor(mge): loosen the error bound of fastrun by
2021-07-20 17:06:10 +0800
1f043696
refactor(mge/distributed): using nccl as default in distributed training by
2021-07-19 16:26:52 +0800
b17a02d4
feat(mge/distributed): deprecate get_device_count_by_fork by
2021-07-19 14:54:45 +0800
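A sketch of the replacement call for the deprecated helper, assuming the direct query lives in megengine.device:

```python
from megengine.device import get_device_count

# Query the number of visible GPUs directly instead of calling the
# deprecated get_device_count_by_fork.
n_gpus = get_device_count("gpu")
print(n_gpus)
```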
f8b0f2cb
build(dnn/cutlass): fix build for cutlass by
2021-07-19 14:25:58 +0800
0fb4e9a9
fix(ci): set git user and email by
2021-06-24 14:20:40 +0800
6af4a32e
feat(mge/third_party): update MegRay version by
2021-07-19 20:53:10 +0800
093f7ae7
feat(mge/third_party): update cutlass version by
2021-07-19 17:12:41 +0800
c2daea3c
chore(release): bump version by
2021-07-18 16:39:40 +0800
207a3463
chore(mge): run get_device_count("gpu") in subprocess by
2021-01-19 18:27:50 +0800
869a0327
perf(mgb): disable FoldingConvBiasDimshufflePass in cuda10 for performance by
2021-07-13 13:13:03 +0800
0baf6b0d
Merge pull request #175 from tpoisonooo:fix-spell-error by
2021-07-23 18:14:07 +0800
239916a9
fix(mgb/gopt): fix testcase for enable nchw64 pass by
2021-06-16 16:32:36 +0800
2ab5c53f
feat(mgb/gopt): support nhwc conv in tensor reformat pass by
2021-06-09 14:54:29 +0800
009c90a2
feat(mgb/gopt): modify padding policy for 4bit conv bias oprs by
2021-06-08 14:30:19 +0800
4eda3388
feat(dnn/cuda): generate cutlass kimpls using cmake and bazel by
2021-06-21 17:53:19 +0800
8d248a6a
fix(dnn/cuda): fix testcase for fallback nchw qs8 conv by
2021-06-23 15:18:43 +0800
894a2407
feat(dnn/cuda): add relayout format kernel for nchw <-> nhwc by
2021-06-04 14:43:59 +0800
43c59204
refactor(dnn/cuda): refactor relayout format kernels by
2021-05-31 19:01:42 +0800
f41a8086
feat(dnn/cuda): add nhwc int4 conv support by
2021-06-11 14:48:54 +0800
5a14a892
refactor(dnn/cuda): refactor cutlass kernel generator for gemm and gemv by
2021-05-31 11:41:30 +0800
b33217d8
refactor(dnn/cuda): refactor cutlass kernel generator for deconv operation by
2021-05-27 19:32:24 +0800
4abf7bd3
refactor(dnn/cuda): refactor kernel generator for cutlass convolution kernels by
2021-05-27 13:35:25 +0800
b4687ce8
feat(dnn/cuda): add convolution with i8 input and u4 output by
2021-05-24 19:03:53 +0800
00083d13
fix(dnn/cuda): fix recursive algo search for fallback_nchw_qs8 by
2021-05-20 14:41:06 +0800
bba04f02
feat(mgb/gopt): add fusion support for conv, astype(s4) and reformat by
2021-05-18 20:46:10 +0800
66f70578
feat(dnn/cuda): add convolution with i8 input and i4 output by
2021-05-18 19:08:22 +0800
6d686ff2
feat(gopt/inference): allow Float32 output dtype in EnableNCHW64Pass by
2021-06-16 15:56:03 +0800
7d3df995
feat(gopt/inference): allow Float32 output dtype in EnableNCHW4Pass by
2021-06-11 18:38:56 +0800
633016a9
fix(dnn/cuda): fix AlgoFallbackNCHWQS8 to support Float32 dst by
2021-06-10 18:30:41 +0800
e6caa9ff
feat(opr): add bn backward for inference mode by
2021-06-01 12:36:09 +0800
c90fa087
test(mge): delete test_external.py by
2021-07-17 15:46:22 +0800
b2944559
fix(imperative/module): remove ``__getattribute__`` method in module by
2021-07-03 12:25:02 +0800
77ead937
fix(src/serialization): fix compatibility error of oss model by
2021-07-13 14:32:05 +0800
070c8117
fix(imperative): remove convert_inputs by
2021-07-07 10:15:13 +0800
f40df602
docs(mge): refactor docs to remove warnings by
2021-07-09 14:34:20 +0800
1040b778
fix(mge/functional): fix F.topk(kth_only=True) by
2021-06-28 19:42:16 +0800
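A minimal sketch of the fixed code path, assuming topk's default ascending order and that kth_only collapses the result to just the k-th value rather than the full (values, indices) pair:

```python
import megengine.functional as F
from megengine import Tensor

x = Tensor([5.0, 1.0, 3.0, 2.0, 4.0])
# kth_only=True returns only the k-th smallest value instead of the
# full top-k result.
kth = F.topk(x, k=3, kth_only=True)
```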
551cc701
docs(distributed.functional): add return type for all_reduce_max (jira #MGE-2706) by
2021-07-13 18:51:53 +0800
72ff7aec
feat(docs): add docs for megengine.functional.ones_like (jira #MGE-2702) by
2021-07-13 23:56:29 +0800
7c9569e4
fix(mge/random): fix random seed by
2021-07-09 14:33:09 +0800
07de1571
fix(mgb): remove static mem record from tee by
2021-07-06 15:43:35 +0800
d7b6bfd5
test(mge/fakequant): use fixed input for lsq test to temporarily avoid precision error by
2021-07-14 11:23:16 +0800
5cef74a7
feat(mge/amp): add GradScaler support by
2021-06-29 18:17:44 +0800
1bf18252
feat(mge/amp): add mixed precision autocast support by
2021-05-11 18:55:40 +0800
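A minimal sketch combining the two amp features above (autocast plus GradScaler), assuming the GradScaler.backward(gm, loss) convenience entry point; the one-parameter "model" w and the data x, t are placeholders:

```python
import megengine as mge
import megengine.functional as F
import megengine.optimizer as optim
from megengine.amp import GradScaler, autocast
from megengine.autodiff import GradManager

w = mge.Parameter([1.0])                  # toy one-parameter "model"
gm = GradManager().attach([w])
opt = optim.SGD([w], lr=0.01)
scaler = GradScaler()

x, t = mge.Tensor([2.0]), mge.Tensor([4.0])
with gm:
    with autocast():                      # run the forward in mixed precision
        loss = F.loss.square_loss(w * x, t)
    scaler.backward(gm, loss)             # scaled backward, then unscale grads
opt.step().clear_grad()
```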
f12355f7
fix(imperative/grad): fix hardcode dtype in subtensor_grad_rule by
2021-05-18 10:46:48 +0800
4e4497b9
refactor(mgb/dnn): x86 pooling rebase algochooser by
2021-07-12 17:01:04 +0800
a33c3b73
refactor(mgb/dnn): arm pooling rebase algochooser by
2021-07-11 22:52:32 +0800
8dea6b3c
build(dnn): compat for more windows env by
2021-07-08 14:30:14 +0800
56b94d89
feat(dtr): add sqrt sampling by
2021-07-13 03:01:41 +0800
8a73193c
feat(dtr): remove eviction threshold by
2021-07-12 18:16:53 +0800
69d1fd0f
refactor(opdef): split apply_on_physical_tensor into infer_output_mem_desc and execute by
2021-07-12 17:38:36 +0800
75eb04c5
feat(mge/experimental): add WeightScaler support by
2021-07-01 14:16:07 +0800
dedecf69
fix(imperative/utils): fix logical error of replace var by
2021-06-23 18:12:47 +0800
ea70d99b
fix(mge/convbias): make fallback convbias support nhwcd4 layout by
2021-06-29 13:46:17 +0800
497ef6c3
fix(mge/dist): fix gl oom error by
2021-07-13 12:51:02 +0800
43098fb8
feat(mge): add SlidingWindowTranspose opr by
2021-06-29 14:19:59 +0800
df79334c
feat(mge/distributed): add user_pop function to save device memory by
2021-07-07 20:35:06 +0800
1eaf32cd
fix(mgb): fix typo in message by
2021-07-05 10:32:14 +0800
7225b0f0
fix(mge/utils): use static infer manager to get value of network.varnode by
2021-06-10 17:29:19 +0800
ffe2bb2e
fix(mge): fix some errors caused by unknown shape when using symbolic trace or building graph by
2021-06-10 17:24:43 +0800
2d42455f
fix(mge/utils): fix toposort to get definition order by
2021-06-01 19:15:43 +0800
0c97b2a3
fix(module): remove assert during forward by
2021-07-08 19:56:38 +0800
42711308
fix(module/normalization): fix LayerNorm bug and support input of any shape by
2021-05-14 20:40:22 +0800
a95f6d4f
perf(trace): add fastpath for const value assert by
2021-07-11 01:45:34 +0800
2cd98232
fix(mgb/tensorrt): fix trt runtime, padding channel to a multiple of 4 when using kCHW4 IOFormat by
2021-07-07 18:56:40 +0800
b078dda9
feat(mge/random): add some random ops and remove random/distribution.py by
2021-06-07 16:44:20 +0800
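A hedged sketch of the expanded random module; the exact set of new distributions is not listed in the entry above, so normal and uniform are assumed as representatives:

```python
import megengine.random as rand

rand.seed(42)  # fix the seed for reproducibility (see the seed fix above)
x = rand.normal(mean=0.0, std=1.0, size=(2, 3))
y = rand.uniform(low=0.0, high=1.0, size=(2, 3))
```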
83e4c9d7
fix(opencl): enable opencl topk test when opencl version is beyond 2.0 by
2021-07-05 14:12:39 +0800
f30c0e06
feat(mgb/opr): add lsq opr by
2021-06-28 13:14:08 +0800
eb66681f
fix(mge/random): fix delete_rng_handle by
2021-07-08 20:30:21 +0800
d6db4fea
fix(mge/module): set no_cache=true when loading state dict by
2021-07-01 13:51:11 +0800
fea1bba2
fix(mge/tools): fix module stats error by
2021-07-08 15:32:38 +0800
6cd01d5a
feat(imperative/functional): let elemwise support empty IO & add some tests by
2021-06-25 16:36:55 +0800
dea52781
feat(mgb/opr): let PowC & TypeCvt support empty IO by
2021-06-25 16:34:58 +0800
2f68aeb9
feat(imperative/jit): let trace support empty IO by
2021-06-25 16:27:56 +0800
809d5056
feat(mge/distributed): enable pt shm allreduce by
2021-03-10 16:50:44 +0800
02455941
test(autograd): test jvp emulated by 2nd grad by
2021-07-07 14:16:09 +0800
8480302d
fix(autograd): make higher order grad experimental by
2021-07-06 13:16:43 +0800
72531f2b
test(autograd): add more tests for higher order grad by
2021-06-18 14:40:24 +0800
522e556b
feat(autodiff): support higher order grad by
2021-05-24 16:56:46 +0800
5198b783
fix(mge/functional): fix expand_dims for scalar by
2021-07-02 15:28:45 +0800
88898e63
fix(mgb): replace if_constexpr with runtime function to avoid potential bug by
2021-07-05 15:54:02 +0800
25932352
refactor(mgb/dnn): rocm pooling rebase algochooser by
2021-06-15 13:53:39 +0800
1cfdbc56
feat(dnn): add deterministic max pooling by
2021-05-19 17:49:45 +0800
20ab82d0
fix(tee): fix tee crash by
2021-07-05 19:00:11 +0800
933dd9a4
feat(mge/distributed): add cuda env check before forked thread by
2021-06-29 16:37:01 +0800
2a541961
fix(tee): fix tee link by
2021-07-01 23:09:30 +0800
a5060a2b
feat(mgb/opr): add check_has_inf kernel and opr by
2021-06-23 18:38:22 +0800
3597a6db
feat(dnn/arm): nchw_nchw44 conv support 1x1s1 by
2021-06-23 21:18:09 +0800
c64b1c94
feat(imperative/functional): add roll in functional by
2021-05-25 14:12:27 +0800
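A minimal example of the new functional op, assuming numpy.roll-style circular-shift semantics:

```python
import megengine.functional as F
from megengine import Tensor

x = Tensor([1, 2, 3, 4, 5])
y = F.roll(x, shift=2, axis=0)  # circular shift by 2 along axis 0
print(y.numpy())                # [4 5 1 2 3]
```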
d915c5a3
refactor(mgb): make convolution3D handle noncontiguous tensors by
2021-06-24 14:18:02 +0800
d04cd67f
refactor(mgb): make conv-backward-filter handle noncontiguous tensors by
2021-06-11 19:17:20 +0800
44376f70
refactor(mgb): make conv-backward-data handle noncontiguous tensors by
2021-06-11 18:24:29 +0800