8fddd808
fix(profiler): respect record_device option by
2021-07-01 14:32:42 +0800
59d59766
fix(profiler): collect records when thread exit by
2021-07-01 12:50:06 +0800
a841d8e7
refactor(profiler): refactor event processing by
2021-06-30 12:27:12 +0800
dff7719e
feat(mge/distributed): add preload host data with op by
2021-09-10 11:28:25 +0800
8b94f493
fix(dnn/cuda): fix elemwise and relayout int4 bug when last shape is 1 by
2021-09-10 11:23:53 +0800
694aa1bd
feat(dnn): add heuristic cache by
2021-08-20 19:55:51 +0800
ca4374b6
perf(subgraph): remove unnecessary sync by
2021-09-13 00:34:23 +0800
ad079009
fix(interpreter): avoid deadlock in GetValue by
2021-09-10 18:27:16 +0800
4688bbab
refactor(functional): hide functional.vision and replace with functional.nn by
2021-09-13 20:47:14 +0800
8796586b
refactor(functional): import all from metric in nn by
2021-09-13 20:03:32 +0800
fb15b301
docs(readme): fix md relative path by
2021-09-15 18:49:15 +0800
bc9cfc27
feat(mgb): add arm resize nchwxx and naive nearest interp by
2021-08-29 18:07:39 +0800
589b427e
Update README.md by
2021-09-15 18:44:24 +0800
d5cff81c
fix(mge): limit task queue size by
2021-08-05 18:13:50 +0800
4ef91136
docs(readme): update build md by
2021-09-10 17:59:05 +0800
bc9c47e7
fix(mgb): use link.exe when compile windows by
2021-09-14 14:03:43 +0800
668a8486
fix(cmake/windows): fix low probability build failed on Windows by
2021-09-14 15:26:08 +0800
26ac2b00
fix(cmake): opt cmake script: support list option and config ninja max jobs by
2021-09-14 14:49:34 +0800
af80d365
fix(cmake/windows): fix cmake install failed on Windows by
2021-09-14 11:18:59 +0800
8e4c3c53
feat(imperative/opr): fix extern c opr mace example md by
2021-09-14 18:52:25 +0800
27f2ecc4
feat(imperative/opr): add extern c opr and support dump by
2021-09-10 18:06:12 +0800
91264f37
fix(traced_module): fix __getattr__ of TracedModuleBuilder by
2021-09-13 21:47:32 +0800
526c82c8
feat(traced_module): delete value of node when it will not be used by any expr by
2021-09-13 21:15:39 +0800
db9ca196
refactor(traced_module): refactor some testcases and examples by
2021-09-14 10:17:35 +0800
6ce1483b
fix(utils/network): fix replace var in different networks by
2021-09-13 22:13:12 +0800
73d25779
fix(ops): check index layout for IndexingOneHot by
2021-09-07 21:42:22 +0800
10198650
feat(module): add tensors and named_tensors by
2021-09-07 21:16:52 +0800
bd817f3a
perf(syncbn): fallback to bn when sync is not required by
2021-09-07 21:13:24 +0800
ffcb4dac
style(mge): change the pad opr doc and format by
2021-09-13 10:59:35 +0800
43cb305e
feat(mge): add imperative pad by
2021-07-21 17:29:36 +0800
567586a0
feat(mgb): fastrun algo profile deduplication by
2021-09-09 18:22:54 +0800
8110bb21
fix(cmake/env): fix a bug about env set in cmake build mode by
2021-09-06 12:33:17 +0800
0a665ea4
feat(mge/device): enable to get cuda compute capability by
2021-08-11 17:18:00 +0800
722aecd4
feat(mgb): support fp16 nhwc backward by
2021-08-18 11:02:14 +0800
7a9f2ed9
perf(functional/split): add python binding for Split opr by
2021-08-20 19:16:51 +0800
ab309eb5
feat(mgb/opr): let Split support empty IO by
2021-08-20 19:14:49 +0800
a8292704
fix(mge/tracing): replace detach as fast path copy by
2021-07-16 17:05:00 +0800
8485eff1
docs(mge/functional): add docs for megengine.functional.sub by
2021-08-25 20:47:48 +0800
32ad9265
feat(mge): improve presentation of async errors by
2021-09-10 18:01:48 +0800
e6a8b025
fix(mge): ignore errors caused by earlier async errors by
2021-08-23 21:41:39 +0800
0708bc78
fix(dnn/cuda): disallow implicit dtype conversion in cublaslt matmul algos by
2021-08-31 16:52:06 +0800
3f01112a
build(require): update pytest-sphinx version requirement by
2021-09-09 16:48:42 +0800
ee790cb6
fix(imperative/python): fix warp_perspective doc and arange dtype by
2021-09-03 11:33:09 +0800
c5f8b583
docs(docstring): transfer to google style by
2021-09-06 15:18:51 +0800
76ce81e8
fix(mge): fix F.nn.dropout train and inference bugs by
2021-09-03 23:43:41 +0800
5431929e
feat(functional): let advance indexing support empty tensor and add more tests by
2021-08-18 17:21:01 +0800
703b783c
feat(mgb/opr): let Indexing(Set)MultiAxisVec support empty input by
2021-08-18 20:46:40 +0800
a430c912
feat(mgb/opr): let CondTake support empty input by
2021-08-18 17:17:58 +0800
432fdb7e
feat(mgb/opr): let SetSubtensor support empty IO by
2021-08-07 18:48:42 +0800
e954b8f9
feat(mgb/opr): let Subtensor support empty IO by
2021-07-30 18:28:15 +0800
1e83ab63
feat(dnn): add channelwise conv for fp16 nchw88 by
2021-08-15 00:37:08 +0800
28c066ee
fix(cmake/llvm): fix llvm build when llvm-12-dev is installed in the env. We do not currently depend on the LLVM_ENABLE_BINDINGS and LLVM_INCLUDE_DOCS modules, so disable them. If we later depend on these modules, third_party/llvm-project/ needs to be upgraded to 12 to be compatible with build envs that install llvm-12-dev by
2021-09-03 10:14:17 +0800
7b855dc6
fix(dnn/cuda): fix compilation for windows bazel by
2021-09-02 11:57:20 +0800
3abe0b24
fix(mgb): fix rocm pooling by
2021-08-26 18:26:50 +0800
f9722af3
fix(src/json): fix parsing error when inf is dumped as a Number by
2021-09-01 22:10:50 +0800
16678bb9
fix(dnn): fix_short_cutlass_name_gemm by
2021-08-30 14:00:44 +0800
8fb21b3b
(tag: v1.6.0-rc1)
chore(release): bump version by
2021-09-01 19:08:52 +0800
6f455ae9
feat(mge/third_party): update cutlass version by
2021-09-01 15:50:37 +0800
7a0d9626
opt(cpuinfo): move cpuinfo from pytorch to MegEngine by
2021-09-01 14:37:14 +0800
4c13bc7e
feat(dnn/cuda): add nhwc int8 deconv by
2021-08-17 13:13:35 +0800
11f022ff
feat(dnn/cuda): add nhwc int8 imma conv and conv fuse typecvt by
2021-08-11 19:01:26 +0800
03e80759
feat(mge/distributed): add hybrid parallel Opr by
2021-08-23 17:46:18 +0800
61a5df32
refactor(third-party/cpuinfo): opt cpuinfo repo by
2021-08-27 17:14:55 +0800
a4f0e581
fix(mgb/extern_c_opr): throw exception when extern c opr loader was created by
2021-08-24 18:51:57 +0800
e5217db9
feat(traced_module): move traced_module out of the experimental folder by
2021-08-27 15:22:40 +0800
6d1a4f20
feat(traced_module): support tracing submodules in list/dict by
2021-08-21 19:39:09 +0800
a3f9073c
feat(traced_module): update graph transform and add _module_name by
2021-08-10 16:30:04 +0800
b3d0affa
feat(traced_module): support trace custom qat module by
2021-08-12 17:36:54 +0800
15712807
feat(traced_module): add name to Node by
2021-08-06 17:55:16 +0800
e918f0aa
feat(traced_module): add treedef leaf node check and add some graph api by
2021-07-29 17:48:46 +0800
c7e730bc
feat(traced_module): add some functions of graph modification by
2021-07-26 00:02:04 +0800
f88bd3ae
refactor(traced_module): let TracedModule own argdef_graph_map by
2021-07-26 14:07:06 +0800
b1c46ba4
feat(traced_module): add some functions of graph modification by
2021-07-08 16:15:08 +0800
4bb25369
feat(traced_module): let CallFunction own graph by
2021-07-07 14:37:33 +0800
9a6a3793
feat(traced_module): add visit method by
2021-07-06 15:21:11 +0800
442b4f6c
test(traced_module): add some testcases for traced module by
2021-06-23 10:56:33 +0800
f2691566
feat(traced_module): add pytree by
2021-07-05 19:14:40 +0800
bee305be
feat(traced_module): add functional trace and CallMethod/Function expr by
2021-06-30 19:27:45 +0800
763c56f3
feat(imperative): add traced module by
2021-03-04 14:53:22 +0800
9279104b
feat(mge): add opdef serialization and apply_module_trace by
2021-08-20 19:11:48 +0800
aa204040
feat(lite): add lite static all in one by
2021-08-26 19:09:18 +0800
a0231a79
fix(dnn/cuda): fix algo matmul for conv bwd filter by
2021-08-25 15:59:21 +0800
f3ed59d3
feat(dnn/opencl): add heuristic rule for elemwise by
2021-08-25 14:26:22 +0800
29d24dbb
fix(mge/function): fix interpolate unsupported fp16 error by
2021-08-26 19:54:59 +0800
36df3850
test(mgb): remove the padding random test case by
2021-08-25 18:04:57 +0800
e21967bb
feat(mge): add env MGE_FASTRUN_CACHE_DIR by
2021-08-24 18:07:28 +0800
6a1ec8a8
feat(mge): add git commit-id into fastrun cache key by
2021-08-20 18:14:23 +0800
ae87876d
feat(mge): refactor weightscaler by
2021-08-17 11:08:31 +0800
5d9ac970
fix(mgb): fix fastrun compnode by
2021-08-16 16:53:16 +0800
56c1b626
refactor(dnn): move arch-dependant code to arch.h by
2021-08-19 17:44:04 +0800
67575d58
feat(mge/opr): add interpolate bilinear mode by
2021-08-13 12:20:26 +0800
0558b212
feat(mge/opr): add interpolate nearest mode by
2021-08-06 15:58:21 +0800
171d6915
fix(fp16): fix midout build issue when hit fp16 trace by
2021-08-24 16:54:43 +0800
127870a9
feat(dnn/opencl): add heuristic rule for batched matmul by
2021-07-20 10:31:22 +0800
d86ed426
fix(dtr): simulate the system stack to avoid stack overflow during recomputing by
2021-08-16 18:00:53 +0800
c25125e3
perf(dnn/cuda): sass int8 epilogue remove shared load by
2021-08-04 17:36:33 +0800
bc2b1690
ci(thirdparty): add third_party cache by
2021-07-29 18:17:47 +0800
6070f127
fix(mgb): fix getting static memory alloc info by
2021-06-07 10:19:28 +0800
e8a5932d
perf(mgb/gopt): optimize impl of reformat builders by
2021-07-07 16:24:11 +0800
58b8b145
refactor(mgb/gopt): add checker for reformat emitter by
2021-07-07 13:30:49 +0800