nihuini
7a8f68aca6
move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works
7 years ago
nihuini
c6e075cef7
fuse deconv/innerproduct relu arm
7 years ago
nihui
be81ecf1f6
fix build on msvc
7 years ago
nihuini
528fe8e9e3
gpu convolution/deconvolution/innerproduct fuse activation
7 years ago
nihuini
3f85cafc08
fuse relu leakyrelu clip into convolution/deconvolution/innerproduct
7 years ago
nihuini
7984ffcb4d
ncnnoptimize tool
7 years ago
nihuini
b81e1f3906
get rid of the old workaround :)
7 years ago
791136190
e2e8e1b9d7
mxnet2ncnn tool support symbol.softmax op ( #938 )
* [CHG] When using HybridBlock to call F.softmax, the softmax op name is "softmax" (mxnet_version: 10301)
* [CHG] Remove type mismatch error reported by static code analysis tool
7 years ago
nihuini
5d86014d9c
add missing barrier for transfer dst, fix softmax pack4, fix #932
7 years ago
nihuini
4729ea3505
bottom blob memory never alias, reuse blob memory more elegantly relying on refcount
7 years ago
nihui
274392eb80
convolution padding same on gpu
7 years ago
gfjiangly
9ffe2b84e9
Optimization of the softmax layer ( #914 )
* Optimize the loop structure to improve the speed of the softmax layer & Reduce memory consumption
* use 4 spaces instead of tabs
7 years ago
nihui
8724440c59
bind wait barrier count member to memory, fix #932
7 years ago
nihuini
a0561e345a
fix lrn shader, fix #928
7 years ago
nihuini
3d06c40d10
fix build with vulkan header version 65, fix #907
7 years ago
nihui
f92dcca3b3
compiled spirv nearly always claims uniform buffer 8bit / 16bit access capability
7 years ago
nihui
c180e87502
add compile shader module function, create pipeline from custom shader spv data
7 years ago
nihuini
de7071452d
try posix_memalign first
7 years ago
nihui
9643916281
fix fp16s fp16a deconvolution shader
7 years ago
nihui
c5ab0c86e4
dynamic padding and crop offset size
7 years ago
nihui
c70b1368c6
interp share coeffs and xofs
7 years ago
nihuini
31db9797df
interp bicubic shader, initialize mat member with zero
7 years ago
BUG1989
93a34a897d
add int8 winograd F(4,3) with neon assembly optimization ( #891 )
* add the implementation of int8 winograd F(4,3)
* add int8 winograd F(4,3) naive c to arm64-v8a platform
* optimize int8 winograd F(4,3) with neon
* merge dequant op into int8 winograd F(4,3)
* enable int8 wino F(4,3) case with all size
7 years ago
nihuini
9e9ae2322c
use platform aligned malloc
7 years ago
nihui
1634675c96
fix dst write out of range, fix #886
7 years ago
liurs1990
1554438515
fix NaN issue in InstanceNorm (var can sometimes become negative due to limited floating-point accuracy) ( #874 )
7 years ago
nihuini
dfffb29bb5
resize bicubic
7 years ago
nihuini
c778265658
reuse hresize result properly when enlarging, fix #863
7 years ago
nihuini
a4b74d27b0
move copy cut border function to operator
7 years ago
nihuini
5a905c7cb9
implement substract_mean_normalize with bias and scale op
7 years ago
nihuini
c25c190703
move resize bilinear function to operator
7 years ago
BUG1989
780c7d9a72
merge de/requantize op, optimize some int8 conv layer on arm64-v8a ( #867 )
* optimize the conv sgemm int8 on arm64-v8a platform
* optimize int8 arm64-v8a with the sadalp instruction
* merge requantize op into latest conv layer
* merge requantize op into conv-int8 op
* update the mobilenet.param in the benchmark
* Update README.md: update Kirin970 and RK3399
* try to fix the travis build error
7 years ago
nihui
7ab968e6e1
fix gpu crop, convert crop offset with axis
7 years ago
nihuini
f90a9898e2
fix priorbox pipeline creation error on adreno
7 years ago
nihui
58ed8e437f
require GL_EXT_shader_16bit_storage only for fp16_storage, explicit type cast
7 years ago
nihui
162c46647d
do not create fp16 shader module on unsupported platform
7 years ago
nihui
d753fe2589
upload fp16 weight, enable fp16 storage and arithmetic
7 years ago
ShuangLiu1992
c6a2d0417a
add missing header for pipeline.cpp and fix compile error for emscripten ( #861 )
7 years ago
nihui
058bd65c88
fix fp16 shader creation
7 years ago
nihuini
4e3df863d5
fix enable feature pointer
7 years ago
nihuini
46dc21c8b1
fp16 shader
7 years ago
Gemfield
add8c73922
Fix the return value of load_param and load_model ( #855 )
7 years ago
nihuini
37573aeeb5
remove unused record download
7 years ago
nihuini
05bf09ba70
rename fp16_storage to support_fp16_storage
7 years ago
nihuini
43737b378f
wrapper function for converting between fp32 and fp16
7 years ago
nihuini
2b8ff843e9
cast layer and shader for fp32 fp16 conversion
7 years ago
Gemfield
573c2bcd93
Fix crash issue during load_model ( #848 )
* Fix crash issue during load_model
* Fix crash issue during load_model 2nd part
7 years ago
nihuini
a3a2548aa2
initial fp16s fp16a shader build system
7 years ago
nihuini
332722af63
fix fp16a int8a exchange oops
7 years ago
nihuini
e59dc6fafe
proper usage of instance extension VK_KHR_get_physical_device_properties2, check fp16 and int8 feature
7 years ago