nihui
fca04980f3
enhance padding test ( #2580 )
* workaround nvidia driver crash
* workaround radv buffer_ld1 zero bug
* fix offset elempack
5 years ago
nihui
e68f15d2f0
padding vulkan vec and image, more padding test
5 years ago
nihui
cf3cf83cd3
unified image shader storage type ( #2231 )
* drop bug_layout_binding_id_alias flag
5 years ago
nihui
164273de61
online pipeline cache ( #1792 )
* online pipeline cache wip
* device-wide pipeline cache
* enable model-wide pipeline cache
* drop pre-created shader modules
* always use pipeline cache
* use implicit model-wide pipeline cache, code format
* code clean
6 years ago
nihui
dfee9a75ea
workaround the shape specialization constant not respected properly in padding reflect mode on nvidia gpu
6 years ago
Tijmen Verhulsdonck
da09e5e7f1
Adding channel padding support for blazeface model. ( #1826 )
* Add channel padding and blazeface model support.
* remove python binding
* remove std::min usage
* fix reference blob usage
* Increased padding test coverage
* implement requested changes
6 years ago
nihui
62da1228e1
adreno image shader + fp16 + fp16a ( #1714 )
* wip
* wip
* fix
* image and imageview can not be destroyed until command execution ends
* fast copy path for tightly packed data
* wip
* texture load works
* 1d 3d image
* record clone image, multiple commands share one image reference
* upload download image
* layer forward accept vkimagemat
* vkimagemat graph works
* staging vkimagemat for passing dynamic parameters, macro for fp32+image shader, padding image shader
* vkimagemat elemsize
* convolution test pass
* conv1x1s1 image shader
* fast staging image allocator from host memory, pooling image shader
* convolutiondepthwise image shader
* innerproduct image shader
* packing image shader
* crop deconvolution image shader
* resolve spirv binding types
* image fp16 and fp16a, cast image shader
* eltwise image shader
* wip
* absval image shader
* deconvolutiondepthwise image shader
* concat image shader, squeezenet works
* noop split image shader
* uniform precision hint
* layer support_image_storage
* wip
* vulkan device utility operator
* command is storage and packing option aware
* fall back to cpu when image allocation fails, mobilenetssd works
* flatten image shader, enable more test
* ci test
* check imgfp32 imgfp16 imgfp16a features
* fix ci test
* fix ci test
* upgrade swiftshader
* wip
* opt aggressive
* imgfp16p
* opt none
* convolution winograd image shader
* fix flush range, fast copy path for continuous buffer
* minor fix
* fix innerproduct
* wip ...
* wip
* cast fix
* packing test
* wip
* image fp16p is fp16p
* wip
* silence
* more line info
* code clean
* softmax image shader
6 years ago
nihui
0f7e7bca02
shader shape specialization constant and basic local group size partition ( #1523 )
* use Mat class for Shape description
* shape specialization constant in compute shader
* wip
* wip
* test forward_inplace, add binaryop unaryop sigmoid test
* fix arm unaryop test
* fix arm binaryop test
* make shape hint optional, cast int8 to fp32, add cast test
* wip
* follow the good and old local size setting for conv1x1
* the optimal local size rewrite
* fix build on msvc
* add permute shader for all packing layout, add permute test
* concat and slice partial shape constant, slice test
* fix slice test
* interp test
* add lrn test, test packing layout implicitly
* add eltwise test
* add normalize test
* add instancenorm test
* reorg shape constant
* simple local group size partition
* add shape constant param
6 years ago
nihui
33b16811ce
reimplement sfp afp conversion macro as function style buffer load store, drop lds shader for the moment
6 years ago
nihui
5042d14d7d
define sfpvec8 afpvec8 macro, use modern glsl extension for fp16 arithmetic, fix padding aarch64 build
6 years ago
nihuini
a50bcf10aa
per channel pad
6 years ago
nihui
a4a162e36d
workaround for validation layer complaint "Cannot form constants of 8- or 16-bit types", due to specialization constants conversion
6 years ago
nihui
c2bc0d1b88
padding vulkan reflect mode
6 years ago
nihuini
7a8f68aca6
move vulkan code to subdir, new layer interface create_pipeline and destroy_pipeline for post-loading works
7 years ago