570 Commits (824af20bd895d4908c3f130eed597cb52ee69e14)

Author SHA1 Message Date
  Megvii Engine Team 7c8f184723 fix(dnn/x86): fix x86 pooling exec 3 years ago
  Megvii Engine Team 91aaafd587 feat(fallback): move arm_common pooling f32 algo to fallback gi 3 years ago
  Megvii Engine Team 48526abb79 fix(mgb): fix concat cd4 tensor check size invalid 3 years ago
  Megvii Engine Team af6cdb2004 feat(fallback): fix ci 3 years ago
  Megvii Engine Team e4cc85e52c feat(fallback): move arm_common f32 convbias to fallback gi 3 years ago
  Megvii Engine Team 0f1afb0935 feat(fallback): imp gi matmul AlgoF32GiMK4_4x8 algo, 3 years ago
  Megvii Engine Team 410dcb6c69 feat(fallback): add more gi api for conv, and add gi API test 3 years ago
  Megvii Engine Team 05186e7bd9 fix(midout): fix elemwise crash after midout 3 years ago
  Megvii Engine Team 70209667e8 fix(dnn/test): fix some bug when force_deduce_layout is off 3 years ago
  Megvii Engine Team e2f5156b69 refactor(megbrain): save fastrun result to algorithm cache 3 years ago
  Megvii Engine Team d968942fe3 perf(cuda): speedup direct large kernel conv 3 years ago
  Megvii Engine Team 7dc347697a feat(dnn/cuda): add typecvt uint16 3 years ago
  Megvii Engine Team 73112558d0 feat(mge/dnn): support checknonfinite for fp16 3 years ago
  Megvii Engine Team ed7fa10470 feat(fallback): move direct multi_thread_common helper to fallback 3 years ago
  Megvii Engine Team 8871ad74af refactor(fallback): opt gi naive reinterpret 3 years ago
  Megvii Engine Team ffbf8fad6c feat(fallback): add general intrinsic to elemwise multitype 3 years ago
  Megvii Engine Team 14e9ad625d fix(megdnn): emit define-but-not-referenced and extra-;-ignored warning on cuda9.0~cuda9.1 3 years ago
  Megvii Engine Team 4c0bff1dba refactor(megdnn): refactor TEGRA_X1/X2 macro 3 years ago
  Megvii Engine Team c2435d1561 perf(imperative): specialize adaptive pooling 3 years ago
  Megvii Engine Team 39d98d4525 feat(fallback): add fallback typecvt with general intrinsic 3 years ago
  Megvii Engine Team d2278f02d2 perf(imperative): speed up conv_transpose3d 3 years ago
  Megvii Engine Team 3a5347ed21 perf(imperative): speed up pooling 3 years ago
  Megvii Engine Team d9c4ef59fe perf(imperative): using simple hash key in heuristic cache 3 years ago
  Megvii Engine Team b6ad457269 feat(cuda): support int1 simplewq conv 3 years ago
  Megvii Engine Team 331567af5d fix(opencl/ci): misc opt and fix: 3 years ago
  Megvii Engine Team ff6a3bb819 fix(fallback): delete the repeat opcaller in fallback and arm_common 3 years ago
  Megvii Engine Team 547945e854 feat(fallback): support general intrinsic in elemwise in fallback 3 years ago
  Megvii Engine Team a017bed3aa fix(fallback): rename general intrinsic type and add more intrinsic 3 years ago
  Megvii Engine Team fd6f8e58b0 feat(mgb/dtype): add dtype qint1 3 years ago
  Megvii Engine Team 2b80806f21 perf(imperative/src): improve dot performance 3 years ago
  Megvii Engine Team 3c3fc6f33c refactor(imperative): move python code of elemwise/reduce/conv2d/bn to c++ 3 years ago
  Megvii Engine Team e400b7ffe5 perf(imperative): enable memory forwarding for imperative 4 years ago
  Megvii Engine Team 1ce78aa09b fix(imperative): destruct dnn handles at last 3 years ago
  Megvii Engine Team 3228fb75a5 fix(cuda): conv algo heuristic choose 3 years ago
  Megvii Engine Team 8c415f4ed7 feat(dnn): cuda nhwc nearest resize support not 1 or 3 channel 3 years ago
  Megvii Engine Team 6fb5a34360 build(flatbuffer/cx2): fix cx2 build and fix uclibc build flatbuffer 3 years ago
  Megvii Engine Team 87de704a46 feat(gopt): fuse conv h_swish 3 years ago
  Megvii Engine Team 3726f5cc92 feat(gopt): merge consecutive relayout and dimshuffle to one relayout to optimize CD4 performance 3 years ago
  Megvii Engine Team ac26bdcef5 fix(cuda): fix direct conv speed and memory problem 3 years ago
  Megvii Engine Team f7994683bd feat(cuda): add large kernel direct conv to heuristic algo chooser 3 years ago
  Megvii Engine Team 6dc0c0b9cc fix(dnn): fix the sync problem in some kernels 3 years ago
  Megvii Engine Team 04193e3bd1 feat(dnn): add nearest mode for remap and resize 3 years ago
  Megvii Engine Team 93c7e45188 feat(arm): delete the redundant implementation 3 years ago
  Megvii Engine Team e34a642b31 feat(fallback): reduce support general intrinsic 3 years ago
  Megvii Engine Team 10f23778a8 feat(fallback): add simd general intrinsic 3 years ago
  Megvii Engine Team 286051ede1 feat(dnn): differentiate sass kernel with cuda version 3 years ago
  Megvii Engine Team f78b60ec10 feat(bazel): make bazel gensass depend on cuda toolchain version automatically 4 years ago
  Megvii Engine Team f48227c07d feat(mgb): show more details for cuda driver api call 4 years ago
  Megvii Engine Team d8bb3ff5b4 fix(cuda): fix fp16 tensorcore gemm split k workspace 3 years ago
  Megvii Engine Team d7b0994a3e feat(cuda): add fp16 compute 16 kernel 3 years ago