226 Commits (3bf73ff16f27d11b1a5ff07762b6cff5d93b8a84)

Author SHA1 Message Date
  Megvii Engine Team 3bf73ff16f feat(dnn): add cuda preprocess fusion 5 years ago
  Megvii Engine Team 86cf7490ec feat(dnn/aarch64): add quantizeds4 matmul int4x4x16_k8x8x8 5 years ago
  Megvii Engine Team 142f31a875 perf(dnn/cuda): change conv_bias heu, prefer dnn chanwise impl, dislike dnn batch gemm conv1x1 5 years ago
  Megvii Engine Team f214e14695 refactor(mgb/cuda): use single implementation of get_device_prop from utils 5 years ago
  Megvii Engine Team 54e79dd1d9 perf(mgb/cuda): do not call cudaGetDeviceProperties to avoid io traffic 5 years ago
  Megvii Engine Team a1877ee0fa refactor(dnn): refactor algo interface, use algoinfo instead of global algorithm 5 years ago
  Megvii Engine Team 6f5d0febf1 perf(dnn/cuda): enhance performance for pooling forward 5 years ago
  Megvii Engine Team 6856ce9ce2 feat(dnn): support conv bias activation for nchw4 input tensor format and nchw output tensor format 5 years ago
  Megvii Engine Team 60c6d59fc9 feat(mbg/core): support bias preprocess in conv_bias 5 years ago
  Megvii Engine Team 1f75c7ade4 ci(midout): fix midout and reopen midout test 5 years ago
  Megvii Engine Team 1e71e0afe0 refactor(dnn): refactor deconv algo 5 years ago
  Megvii Engine Team 89ad33aeb3 feat(dnn/cuda): support weight preprocessing for cutlass algorithms 5 years ago
  Megvii Engine Team c03249c059 feat(dnn/opr): add megdnn fake quant opr 5 years ago
  Megvii Engine Team 739f927c4c feat(dnn/cuda): opt dp4a conv for small channel base on cutlass 5 years ago
  Megvii Engine Team 1f8e40753f fix(mkl): fix windows mkl LOG compute exception 5 years ago
  Megvii Engine Team 4aa277a203 refactor(dnn/cuda): misc 5 years ago
  Megvii Engine Team f7b2bdae1a refactor(dnn): refactor algorithm type interface 5 years ago
  Megvii Engine Team 18ec5341f2 refactor(dnn): remove unused costmodel in cuda 5 years ago
  Megvii Engine Team e39f938662 refactor(dnn): remove ProfileCache and matmul algo in x86 5 years ago
  Megvii Engine Team 89303cd829 feat(megdnn/rocm): add bn for rocm backend 5 years ago
  Megvii Engine Team aea829c9fa feat(megdnn/rocm): add average inclusive mode for pooling 5 years ago
  Megvii Engine Team 1217801133 perf(mge): add opdef for broadcast 5 years ago
  Megvii Engine Team 2a3f4d099a refactor(dnn/arm): refactor CPU heuristic algo selection 5 years ago
  Megvii Engine Team ba66e1d039 feat(dnn): add nchw_fp32 nchw44_qint8 cuda dct 5 years ago
  Megvii Engine Team 44b27f0d6e build(3516): fix some cpu flags build failed and fix 3516 ycm 5 years ago
  Megvii Engine Team 8764a6c8ff feat(dnn/cuda): add volta dp4a int8 sass kernel 5 years ago
  Megvii Engine Team 3635af6274 style(atlas): add comment for async d2d 5 years ago
  Megvii Engine Team d68d4d1d99 perf(atlas): use async d2d 5 years ago
  Megvii Engine Team 215f88f373 fix(dnn/argmxx): fix argmxx on inf 5 years ago
  Megvii Engine Team 92b12685db feat(dnn/aarch64): add aarch64 int8X8X16_mk4_k8x8x8 matmul, performance is better 5 years ago
  Megvii Engine Team 912d733ea9 fix(dnn): support bool for IndexingMultiAxisVec 5 years ago
  Megvii Engine Team edb32495c6 feat(dnn/opr): add megdnn adaptive pooling opr 5 years ago
  Megvii Engine Team b8ddca4c38 fix(atlas): add MGB_USE_ATLAS_ASYNC_API to enable async api 5 years ago
  Megvii Engine Team 95eb6ae380 feat(mgb/opr): let more ops support empty IO 5 years ago
  Megvii Engine Team 0307598a80 fix(dnn): keep consistent limit between deduce and compute 5 years ago
  Megvii Engine Team cc952b2b92 fix(rocm): fix rocm megdnntest sleep and a cut code 5 years ago
  Megvii Engine Team 3a03fa7a50 fix(dnn/cuda): disable pascal sass conv2d 5 years ago
  Megvii Engine Team a5fad7d07c feat(dnn): add compile for riscv64 5 years ago
  Megvii Engine Team 3e11d89415 fix(dnn/dump): add more info for dump CD4 5 years ago
  Megvii Engine Team 76fa71573b feat(dnn/cuda): add cutlass nchw4 convolution 5 years ago
  Megvii Engine Team 1f3f4abc38 fix(dnn): fix compile warnings 5 years ago
  Megvii Engine Team 5b6ebeb563 fix(mgb): append json file for dump and ready for midout open source 5 years ago
  Megvii Engine Team 16324e3076 feat(dnn/cuda): add remap backward 5 years ago
  Megvii Engine Team 343335932a fix(dnn/arm): fix read invalid data in arm kernel 5 years ago
  Megvii Engine Team 59dcd3b7f3 fix(mgb/build): do not install cutlass 5 years ago
  Megvii Engine Team 6e882c1a86 feat(whl/imperative): compat for build python whl imperative and legacy runtime 5 years ago
  Megvii Engine Team 7f857bd471 feat(mgb/rocm): add cmake for rocm and fix compile errors and bn 5 years ago
  Megvii Engine Team 199eefbd4c fix(dnn): generate mode files 5 years ago
  Megvii Engine Team 9510136223 fix(mgb/rocm): remove begin-internal of rocm 5 years ago
  Megvii Engine Team 0380811218 feat(dnn/arm_common): add nchw44 8x8x16 stride1 stride2 5 years ago