nihuini
054ec09195
adreno device blacklist
6 years ago
nihuini
765003a615
fix build with old vulkan sdk
6 years ago
nihuini
6788384595
query gpu heap budget api
6 years ago
nihui
17c445480f
runtime spir-v compilation with libglslang ( #1779 )
6 years ago
nihuini
b71f22d074
report adreno info, benchncnn enable image storage on adreno
6 years ago
nihuini
c94d1b39ad
force disable image storage on macos and ios, fix #1738
6 years ago
SunTY
705dd36a31
simplestl is an alternative std vector string implementation ( #1762 )
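* remove the dependency on stl
* fix header file name and push_back
* remove managed construction
* seems like needless fiddling
* data returns a pointer instead of a reference to a pointer
* fix a mistake in resize
* stdint
* add c_str
* rename files to lowercase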
* NCNN_SIMPLESTL option
* simplestl default to OFF
* Update linux-x64-cpu-gcc.yml
* Update linux-x64-cpu-gcc.yml
* Update linux-x64-cpu-clang.yml
* drop functional header
* arm32 arm64 simplestl ci
* fix a memory leak, remove compiler warnings
* fix bug with the default value in resize
Co-authored-by: nihuini <nihuini@tencent.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>
6 years ago
Naiyang Lin
ceef2470a5
Add logger.h ( #1753 )
6 years ago
nihuini
6682cd1638
image fp16pa, mark some bugihfa todo
6 years ago
nihuini
cefe8d38c3
dynamic image storage support from shape hint
6 years ago
nihuini
1e4a0752b4
fix interp ci test
6 years ago
nihui
9a9a618229
image storage is mandatory, fewer options make life easier
6 years ago
nihui
e8688b042f
fuse packing cast storage, binaryop image shader, dummy buffer and image, device-wide utility packing converter operators, fix multi-blob layer test
6 years ago
nihui
62da1228e1
adreno image shader + fp16 + fp16a ( #1714 )
* wip
* wip
* fix
* image and imageview can not be destroyed until command execution ends
* fast copy path for tightly packed data
* wip
* texture load works
* 1d 3d image
* record clone image, multiple commands share one image reference
* upload download image
* layer forward accept vkimagemat
* vkimagemat graph works
* staging vkimagemat for passing dynamic parameters, macro for fp32+image shader, padding image shader
* vkimagemat elemsize
* convolution test pass
* conv1x1s1 image shader
* fast staging image allocator from host memory, pooling image shader
* convolutiondepthwise image shader
* innerproduct image shader
* packing image shader
* crop deconvolution image shader
* resolve spirv binding types
* image fp16 and fp16a, cast image shader
* eltwise image shader
* wip
* absval image shader
* deconvolutiondepthwise image shader
* concat image shader, squeezenet works
* noop split image shader
* uniform precision hint
* layer support_image_storage
* wip
* vulkan device utility operator
* command is storage and packing option aware
* fall back to cpu when image allocation fails, mobilenetssd works
* flatten image shader, enable more test
* ci test
* check imgfp32 imgfp16 imgfp16a features
* fix ci test
* fix ci test
* upgrade swiftshader
* wip
* opt aggressive
* imgfp16p
* opt none
* convolution winograd image shader
* fix flush range, fast copy path for continuous buffer
* minor fix
* fix innerproduct
* wip ...
* wip
* cast fix
* packing test
* wip
* image fp16p is fp16p
* wip
* silence
* more line info
* code clean
* softmax image shader
6 years ago
nihuini
5580da4525
bump engine version
6 years ago
nihui
7365bb80a2
vkmat and command api breaks ( #1689 )
* vkmat and command api breaks
* always use compute queue for compute buffer transfer
* no barrier for readonly weight buffer
* record clone, drop queue_owner
* bring back layer forward
* fix validation errors
* lifecycle inside command makes life easier
* update doc
* record_import_android_hardware_buffer
6 years ago
nihuini
1ea9de3bdf
create shader pipeline by type index, resolve binding count and push constant count from spirv. since we don't create compound shader module for macos and ios compatibility, it is enough to use fixed main as the shader entry point
6 years ago
nihui
f972bf49d1
enable bugihfa on rk3288 and rk3399
6 years ago
nihui
7c97142524
old qcom adreno driver seems to have the same bug as mali does
6 years ago
nihui
bbaa4dcce2
compile fp16pa, optimize shader for size, enable implicit fp16 arithmetic for qcom855 and qcom855plus
6 years ago
nihuini
b361b24832
do not enforce coherent memory type, queue transfer after uploading model weight
6 years ago
nihui
038666e049
the initial auto test ( #1464 )
* cpu test
* wip
* ci run test
* travis ci for arm64
* arm64 ctest
* copy vulkan loader
* wip
* run
* Update ccpp.yml
* gpu test
* swiftshader
* cache macos swiftshader
* try MoltenVK
* try vulkaninfo
* give swiftshader another try
* disable failed macos gpu test
* more conv test, fix conv3x3s1 gpu test fail
* fix deconvolution test
* dilation test
* cmake option to build tests
* ncnn_add_layer_test macro
* host barrier before upload and after download, handle packing layout option
* test packing layout
* wip
* wip
* merge deconvolution packing and non-packing code
* merge convolution packing and non-packing code
* pass top_blob_count param
* fix build
* take care of non-coherent mappable memory
6 years ago
nihuini
a477aee0ba
print graphics queue info, const++
6 years ago
nihui
8a87f0267a
workaround local workgroup size specialization constant bug for old arm mali vulkan driver, fix #1424
6 years ago
nihui
a867d96822
dynamic memory type querying, respect memory requirement memory type bits
6 years ago
nihui
7e68c5e1e9
enable ycbcr conversion feature, get graphics queue
6 years ago
nihui
cb41b00e6e
setup VK_KHR_bind_memory2 functions
6 years ago
nihui
b29e8b0e09
check and enable more vulkan extensions
6 years ago
nihuini
21b5508c96
shared locked vkallocator cannot prevent concurrent access during actual gpu inference, use a separate vkallocator for each queue
7 years ago
nihuini
e9ffdb5bdd
16bit storage on arm mali is buggy
7 years ago
nihuini
040a8d2427
set vulkan device by gpu index
7 years ago
nihuini
5fdffbcaac
destroy_gpu_instance is not threadsafe anyway, fix deadlock on exit
7 years ago
nihuini
838c5df839
option api changes
7 years ago
nihuini
7f7bbf12e5
new api for getting the default gpu device
7 years ago
nihuini
cd7559c639
more fix for fp16p, still disabled by default
7 years ago
nihui
25b9736f82
shader fp16 packed
7 years ago
nihuini
4b50a97e31
implement vulkan winograd23
7 years ago
nihuini
37e150162a
do not retrieve timestamp availability bits
7 years ago
nihuini
8e2fb2e710
expose timestamp_period and timestamp_valid_bits
7 years ago
nihuini
c9a9486307
merge command submit and wait, expose queue_count, concurrent queue submission shall work
7 years ago
nihuini
3d06c40d10
fix build with vulkan header version 65, fix #907
7 years ago
nihui
f92dcca3b3
compiled spirv nearly always claims uniform buffer 8bit / 16bit access capability
7 years ago
nihui
c180e87502
add compile shader module function, create pipeline from custom shader spv data
7 years ago
nihuini
31db9797df
interp bicubic shader, initialize mat member with zero
7 years ago
nihui
162c46647d
do not create fp16 shader module on unsupported platform
7 years ago
nihui
058bd65c88
fix fp16 shader creation
7 years ago
nihuini
4e3df863d5
fix enable feature pointer
7 years ago
nihuini
05bf09ba70
rename fp16_storage to support_fp16_storage
7 years ago
nihuini
332722af63
fix fp16a int8a exchange oops
7 years ago
nihuini
e59dc6fafe
proper usage of instance extension VK_KHR_get_physical_device_properties2, check fp16 and int8 feature
7 years ago