Megvii Engine Team
c2e9860feb
chore(license): remove all licenses in file headers
GitOrigin-RevId: a0e31247a6
3 years ago
Megvii Engine Team
70209667e8
fix(dnn/test): fix some bugs when force_deduce_layout is off
GitOrigin-RevId: d7ccc397df
3 years ago
Megvii Engine Team
7dc347697a
feat(dnn/cuda): add typecvt uint16
GitOrigin-RevId: d1368c414e
3 years ago
Megvii Engine Team
4c0bff1dba
refactor(megdnn): refactor TEGRA_X1/X2 macro
GitOrigin-RevId: 1aa78712c6
3 years ago
Megvii Engine Team
758549b936
feat(megengine): support tx2
GitOrigin-RevId: d1175a1f4a
3 years ago
Megvii Engine Team
b6ad457269
feat(cuda): support int1 simplewq conv
GitOrigin-RevId: 9c37c41bc7
3 years ago
Megvii Engine Team
fd6f8e58b0
feat(mgb/dtype): add dtype qint1
GitOrigin-RevId: abe9fb68b1
3 years ago
Megvii Engine Team
87de704a46
feat(gopt): fuse conv h_swish
GitOrigin-RevId: a3d12991fb
3 years ago
Megvii Engine Team
d7b0994a3e
feat(cuda): add fp16 compute 16 kernel
GitOrigin-RevId: e03435be02
3 years ago
Megvii Engine Team
8a2e92bd6c
refactor(cuda): depthwise large kernel
GitOrigin-RevId: dade8710b4
3 years ago
Megvii Engine Team
6b8a69d5b6
feat(cuda): float16 depthwise large kernel conv compute fp32
GitOrigin-RevId: 3050d48f26
3 years ago
Megvii Engine Team
bc385b5374
feat(cuda): support float16 depthwise large kernel conv
GitOrigin-RevId: fdc1b15fbc
4 years ago
Megvii Engine Team
7d2063e35a
perf(cuda): speedup conv backward data with small feature map and large filter size
GitOrigin-RevId: 85592bca6b
4 years ago
Megvii Engine Team
72403e8929
perf(cuda): speedup chanwise conv with small feature map and large filter size
GitOrigin-RevId: e65b2ce856
4 years ago
Megvii Engine Team
ab6d12caff
feat(mge): add conv padding mode
GitOrigin-RevId: 147ced856e
4 years ago
Megvii Engine Team
47fe766310
feat(dnn/cuda): add implicit bmm kernels for large kernel depthwise convolution backward filter opr
GitOrigin-RevId: 932e7689e8
3 years ago
Megvii Engine Team
6cefabe734
fix(dnn/cuda): fix ci
GitOrigin-RevId: 8267e5f9dd
4 years ago
Megvii Engine Team
888f4e46ae
feat(dnn/cuda): add implicit bmm large kernel dwconv2d dgrad kernels
GitOrigin-RevId: fcb7974d62
4 years ago
Megvii Engine Team
08d8635ff5
feat(dnn/cuda): add implicit bmm large kernel dwconv2d fprop impl
GitOrigin-RevId: feb09ebb58
4 years ago
Megvii Engine Team
95ac055538
feat(dnn,mgb,imperative): add diag opr implementation
GitOrigin-RevId: 43016ffa2b
4 years ago
Megvii Engine Team
cbbca5fb10
feat(mge): add softmax op using cudnn api
GitOrigin-RevId: 7734ebf8c4
4 years ago
Megvii Engine Team
82be0aaced
test(dnn): fix compute capability requirement for NCHWX test
GitOrigin-RevId: d2f8022be1
4 years ago
Megvii Engine Team
1999307015
feat(mgb/opr): add dropout kernel
GitOrigin-RevId: d248bd2005
4 years ago
Megvii Engine Team
a93741815b
feat(mgb/opr): add layernorm forward and backward kernel
GitOrigin-RevId: 0cd484e753
4 years ago
Megvii Engine Team
2696e4efaa
feat(dnn): add float16 for remap backward
GitOrigin-RevId: 0263030051
4 years ago
Megvii Engine Team
11d75fecb5
feat(dnn/check_non_finite): add batch check_non_finite
GitOrigin-RevId: e108133282
4 years ago
Megvii Engine Team
ba2f0c2e48
fix(dnn/cuda): fix cudnn_conv algo of conv_bias opr for fp16 add z cases
GitOrigin-RevId: b29b009de0
4 years ago
Megvii Engine Team
c85631aa77
feat(dnn): use ref ptr interface for all backends
GitOrigin-RevId: f65feae5cc
4 years ago
Megvii Engine Team
89186edc5d
fix(dnn): correct reduce/argmxx/fakequant calculation with nan
GitOrigin-RevId: 7e78bdae91
4 years ago
Megvii Engine Team
68cdabd288
feat(opr): indexing_multi_axis_vec support nd index
GitOrigin-RevId: 07b1248bdc
4 years ago
Megvii Engine Team
9b4cd92ba3
fix(mgb/dnn): fix cudnnConvBiasActivation crash on nchw32 int8 with oc > 256
GitOrigin-RevId: 20c0b90575
4 years ago
Megvii Engine Team
10af44abba
fix(dnn/cuda): fix cudnn conv impl for nchw4_nchw hybrid layout
the conv_bias algo *_IMPLICIT_GEMM in cudnn less than 8.0.0 is disabled due to the incorrect result for int8x4->f32 configs
GitOrigin-RevId: 7cc52d0a85
4 years ago
Megvii Engine Team
5885b137fa
feat(dnn/arm): support layout like NHWC channel like broadcast on arm
GitOrigin-RevId: fb4300004c
4 years ago
Megvii Engine Team
369c2ccc5a
style(all): reformat c++ code
GitOrigin-RevId: 3ffd1b211f
4 years ago
Megvii Engine Team
f5cb21ed3a
fix(mgb/opr): add non finite check
GitOrigin-RevId: a9fcd0a350
4 years ago
Megvii Engine Team
bde5cf3564
feat(dnn): add resize linear for arm
GitOrigin-RevId: 14ac5bda3f
4 years ago
Megvii Engine Team
3d3666b6e0
test(dnn/bn): add compatible configs for NHWC BN
GitOrigin-RevId: ac757ca307
4 years ago
Megvii Engine Team
3977b7aa0b
feat(mgb/shuffle): add shuffle opr
GitOrigin-RevId: 80490a6f84
4 years ago
Megvii Engine Team
17371e79b9
fix(dnn/reduce): fix incorrect reduce_mean o16c32 result for large tensors
GitOrigin-RevId: ebf03d814a
4 years ago
Megvii Engine Team
8b40f57738
feat(mgb/dnn): add conv1x1 algo for matrix mul
GitOrigin-RevId: 585b2c045a
4 years ago
Megvii Engine Team
d69b59035d
feat(dnn): add a get_all_algorithms_safe interface
GitOrigin-RevId: e3734e4531
4 years ago
Megvii Engine Team
8b94f49328
fix(dnn/cuda): fix elemwise and relayout int4 bug when last shape is 1
GitOrigin-RevId: e7d64c4987
4 years ago
Megvii Engine Team
722aecd437
feat(mgb): support fp16 nhwc backward
GitOrigin-RevId: 954ac6405a
4 years ago
Megvii Engine Team
0708bc780c
fix(dnn/cuda): disallow implicit dtype conversion in cublaslt matmul algos
disable tensor op matmul kernels when input and output tensors are in f32 data type to avoid potential accuracy loss
GitOrigin-RevId: 36859cba5a
4 years ago
Megvii Engine Team
4c13bc7e1b
feat(dnn/cuda): add nhwc int8 deconv
GitOrigin-RevId: ad361a0f81
4 years ago
Megvii Engine Team
11f022ff7c
feat(dnn/cuda): add nhwc int8 imma conv and conv fuse typecvt
GitOrigin-RevId: 229e1eb4be
4 years ago
Megvii Engine Team
67575d582c
feat(mge/opr): add interpolate bilinear mode
GitOrigin-RevId: f7023a3fd3
4 years ago
Megvii Engine Team
0558b2123d
feat(mge/opr): add interpolate nearest mode
GitOrigin-RevId: d384b87f50
4 years ago
Megvii Engine Team
c25125e3d2
perf(dnn/cuda): sass int8 epilogue remove shared load
GitOrigin-RevId: 2b49f5069b
4 years ago
Megvii Engine Team
ff0e6be7b9
fix(dnn/cuda): fix cutlass tensorop kernels
do not compile cutlass tensorop kernels when using a cuda version less than 10.2
GitOrigin-RevId: d4c37d5f41
4 years ago