nihui
c4a007406d
windows clang ci ( #5469 )
* windows clang ci
* clang msvc use x86intrin.h for xop
* test arm64 compiler features
2 years ago
nihui
08b7d99a75
rnn/lstm/gru dynamic quantization ( #5435 )
2 years ago
Tabbleman
b8fefb977d
clear warning: unused variable while building on x86-wsl platform ( #5444 )
2 years ago
nihui
6cdd7110be
fix instruction extension dispatch ( #5427 )
2 years ago
nihui
9ce7930413
x86 optimization for convolution tiled gemm ( #5426 )
2 years ago
nihui
db035d602d
update ncnnoptimize layers, lightmode=false keeps original weight ( #5414 )
2 years ago
nihui
2f65729873
fix riscv v build with old cpp standard, fix #5366 ( #5391 )
2 years ago
nihui
167501f0c6
fix softmax arm fp16s sum error, fix #5340 ( #5393 )
2 years ago
nihui
84256b1494
pnnx enhance functionize ( #5387 )
* pnnx fix some undefined dtype
* fix ncnn convdw1d dynamic weight loading
2 years ago
hokamilkv
74fda386f3
Update convolution_im2col_gemm_int8.h ( #5365 )
remove _sum0=_sum0
2 years ago
nihui
d38bdbdb84
fix debug build on some compiler, fix #5295 ( #5326 )
2 years ago
Xinyu Yang
7ac42680cf
RVV: Refine riscv gemm fp32 ( #5303 )
* replace storexxx with vsseg2e32_v_f32m1
* refine transpose
---------
Co-authored-by: Xinyu302 <Xinyu302@users.noreply.github.com>
2 years ago
Sophon
294e786d36
convolution_x86: Fix typo in logging ( #5310 )
Signed-off-by: Xilin Wu <wuxilin123@gmail.com>
2 years ago
nihui
0942efab2e
x86 avx512 optimization for mish ( #5309 )
2 years ago
nihui
05b4dcb06c
report vulkan cm 8x8x16 config, enable fp16a cm ( #5298 )
2 years ago
nihui
656b082284
fix cast armv7 sigbus when loading fp16 model ( #5292 )
* fix sigbus error when loading fp16 model on armv7
* apply for bf16
2 years ago
nihui
ba42369c68
workaround l2 norm produce -inf value with subnormals ( #5272 )
2 years ago
nihui
556b79ce4d
create layer decoupled ( #5258 )
* create layer decoupled
* no more virtual public
* allow build test with shared library
* decouple cpu vulkan
* drop old scripts
2 years ago
Molly Sophia
92d49e1f59
requantize: Use activation_ss in fused_activation.h ( #5245 )
Which fixes int8 requantization on risc-v
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2 years ago
nihui
5a8ce63af4
optimize resize bilinear and compress font data ( #5200 )
2 years ago
nihui
eea3fc9b41
optimize vulkan global pooling ( #5191 )
Co-authored-by: nihui <nihui@users.noreply.github.com>
Co-authored-by: michaelcai <michaelcai@tencent.com>
2 years ago
nihui
dba87f8cad
fix build with msvc arm64 asimdhp ( #5176 )
2 years ago
nihui
deae9e61da
disable rtti and exceptions for msvc ( #5167 )
* disable rtti and exceptions for msvc
* warnings--
* erff
* arch sse2 for 32bit build
* enable rtti for cross compiling
2 years ago
nihui
058aa0ad37
enable arm neon intrinsics for msvc build ( #5151 )
2 years ago
AlOa
9f26eeb5a7
Prelu layer uses sse instruction _mm_load_ps but data can be misaligned so it must use _mm_loadu_ps ( #5149 )
2 years ago
nihui
4136de3b8d
arm optimization for convolution int8 packed unified elempack ( #5147 )
2 years ago
ningjiang233
b2f12fdd67
delete useless sentences ( #5139 )
2 years ago
nihui
4494aadd74
deconvolution dynamic weight ( #5119 )
2 years ago
nihui
6c6c40edb3
fix deconvolution x86 unaligned bias load ( #5112 )
2 years ago
nihui
14e14a9ae8
slice with indices ( #5103 )
2 years ago
nihui
9dda7e385a
fix gridsample x86 warnings ( #5096 )
2 years ago
nihui
7afdbfa680
simplify vulkan conv1d ( #5095 )
2 years ago
nihui
54ab8051e3
fix warnings ( #5094 )
2 years ago
邓实诚
a1e3ebf8e5
implement simplemath ( #4905 )
* complete abs, fmod and sin function in simplemath.h
* remove some unused variables in simplemath.cpp
* modify test-coverage.yml and add some functions to simplemath.cpp
* modify erf.cpp which included math.h
* include platform.h for NCNN_SIMPLEMATH definition
* move utility constants and functions in simplemath.h to simplemath.cpp
* guard simplemath functions with extern "C"
* add NCNN_EXPORT macro in simplemath.h
* include platform.h and guard all declarations with NCNN_SIMPLEMATH
* clean unused code in test_unaryop.cpp
* guard #include <vector> with NCNN_SIMPLEMATH in benchncnn.cpp
* add 'static' to guard functions that are not declared in header file
* modify sin and cos with better implementation
---------
Co-authored-by: HonestDeng <HonestDeng@users.noreply.github.com>
2 years ago
nihui
80b3b9c6f0
arm optimization for convolution int8 winograd unified elempack ( #5087 )
* enable out elempack 8 for winograd and sgemm
2 years ago
Yoh
3f437d3f3d
Grid sample op ( #4373 )
* pnnx support grid_sample op
* complete the permute and gridsample operator fusion
* split calculation into two stages and support permute fusion
2 years ago
FhqTreap
dc25128195
Vulkan conv1d ( #5060 )
2 years ago
Xinyu302
b82d395753
Add riscv float32 gemm ( #4903 )
Co-authored-by: Xinyu302 <Xinyu302@users.noreply.github.com>
2 years ago
nihui
7b02425246
x86 optimization for convolution int8 winograd unified elempack ( #5054 )
2 years ago
張小凡
b4f8fa6d38
Fixed _mm256_set_m128 is only available on gcc8+. issue#5072 ( #5075 )
2 years ago
daquexian
75ad1cc749
support tag in memorydata layer ( #5061 )
Signed-off-by: daquexian <daquexian566@gmail.com>
2 years ago
nihui
26a70c9b05
fix build with vanilla c906 toolchain ( #5048 )
2 years ago
nihui
78aca88d67
elu 4d and selu 4d ( #5047 )
2 years ago
Beq Jal
019176c6b2
selu and shufflechannel on x86 ( #5017 )
2 years ago
nihui
fdf2c482dc
fuse adaptive pool dynamic output size, implement ncnn adaptive pooling dynamic outsize ( #5043 )
2 years ago
Amir Ramezani
7e5fa3ade3
shrink operator ( #5022 )
2 years ago
FhqTreap
a12a14f3a6
Gelu afp fix ( #5039 )
2 years ago
nihui
c8662cce5e
arm optimization for convolution int8 gemm unified elempack ( #5016 )
2 years ago
nihui
4da33b195e
prevent some old gcc using high registers as kernel values ( #5036 )
2 years ago
Amir Ramezani
0ea587b8c7
celu activation vulkan and onnx conversion ( #5018 )
2 years ago