nihui
9826f3dbf8
shader include vulkan activation, workaround for moltenvk tanh half4 issue ( #3711 )
4 years ago
zhiliu6
814f89ef1a
Fuse HardSwish activation into Convolution and InnerProduct ( #3233 )
* add general fused activation
* add NCNN_FORCE_INLINE option
4 years ago
nihui
cf3cf83cd3
unified image shader storage type ( #2231 )
* drop bug_layout_binding_id_alias flag
5 years ago
nihui
164273de61
online pipeline cache ( #1792 )
* online pipeline cache wip
* device-wide pipeline cache
* enable model-wide pipeline cache
* drop pre-created shader modules
* always use pipeline cache
* use implicit model-wide pipeline cache, code format
* code clean
5 years ago
zhiliu6
3bfabf1d6a
Add fused convolution and mish layer support. ( #1761 )
6 years ago
nihuini
049a6be998
code style clean
6 years ago
nihui
62da1228e1
adreno image shader + fp16 + fp16a ( #1714 )
* wip
* wip
* fix
* image and imageview can not be destroyed until command execution ends
* fast copy path for tightly packed data
* wip
* texture load works
* 1d 3d image
* record clone image, multiple commands share one image reference
* upload download image
* layer forward accept vkimagemat
* vkimagemat graph works
* staging vkimagemat for passing dynamic parameters, macro for fp32+image shader, padding image shader
* vkimagemat elemsize
* convolution test pass
* conv1x1s1 image shader
* fast staging image allocator from host memory, pooling image shader
* convolutiondepthwise image shader
* innerproduct image shader
* packing image shader
* crop deconvolution image shader
* resolve spirv binding types
* image fp16 and fp16a, cast image shader
* eltwise image shader
* wip
* absval image shader
* deconvolutiondepthwise image shader
* concat image shader, squeezenet works
* noop split image shader
* uniform precision hint
* layer support_image_storage
* wip
* vulkan device utility operator
* command is storage and packing option aware
* fallback to cpu on image allocation failed, mobilenetssd works
* flatten image shader, enable more test
* ci test
* check imgfp32 imgfp16 imgfp16a features
* fix ci test
* fix ci test
* upgrade swiftshader
* wip
* opt aggressive
* imgfp16p
* opt none
* convolution winograd image shader
* fix flush range, fast copy path for continous buffer
* minor fix
* fix innerproduct
* wip ...
* wip
* cast fix
* packing test
* wip
* image fp16p is fp16p
* wip
* silence
* more line info
* code clean
* softmax image shader
6 years ago
nihui
0f7e7bca02
shader shape specialization constant and basic local group size partition ( #1523 )
* use Mat class for Shape description
* shape specialization constant in compute shader
* wip
* wip
* test forward_inplace, add binaryop unaryop sigmoid test
* fix arm unaryop test
* fix arm binaryop test
* make shape hint optional, cast int8 to fp32, add cast test
* wip
* follow the good and old local size setting for conv1x1
* the optimal local size rewrite
* fix build on msvc
* add permute shader for all packing layout, add permute test
* concat and slice patial shape constant, slice test
* fix slice test
* interp test
* add lrn test, test packing layout implicitly
* add eltwise test
* add normalize test
* add instancenorm test
* reorg shape constant
* simple local group size partition
* add shape constant param
6 years ago
nihui
33b16811ce
reimplement sfp afp conversion macro as function style buffer load store, drop lds shader for the moment
6 years ago
nihui
5042d14d7d
define sfpvec8 afpvec8 macro, use modern glsl extension for fp16 arithmetic, fix padding aarch64 build
6 years ago
nihui
25b9736f82
shader fp16 packed
7 years ago
nihuini
4b50a97e31
implement vulkan winograd23
7 years ago
nihui
3e003ffd98
fuse sigmoid
7 years ago
nihuini
7a8f68aca6
move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works
7 years ago