nihui
d395000edc
flexible coopmat mnk and unified elempack for vulkan convolution 1x1s1d1 ( #6154 )
* helper function for selecting the optimal coopmat mnk size
11 months ago
nihui
171b9d1bba
use spdx license header, copyright Tencent ( #6152 )
1 year ago
nihui
075d07ede2
compute-only vulkan ( #6131 )
1 year ago
nihui
9f832c19c1
vulkan int8 packing quantize dequantize requantize ( #3731 )
* add int8 definitions
* packing vulkan int8/int32, quantize vulkan
* vulkan dequantize
* requantize vulkan
1 year ago
nihui
626d9d0910
vulkan packing code clean, drop image storage type, unified fp16p fp16s packing ( #6128 )
1 year ago
nihui
211e238639
drop layer forward vkimagemat ( #6124 )
vkimagemat was originally used as a mat storage in the hope of improving performance on old adreno gpus, but in fact it is slower than the cpu in most cases and is no longer suitable for the latest adreno architecture and large shapes
1 year ago
nihui
cc40332804
discover VK_KHR_vulkan_memory_model ( #6121 )
1 year ago
nihui
8998a13d06
discover VK_EXT_shader_float8 ( #6120 )
1 year ago
nihui
12f57fb3d1
discover VK_NV_cooperative_matrix2 ( #6118 )
1 year ago
nihui
510b461e9a
discover VK_NV_cooperative_vector ( #6117 )
1 year ago
nihui
9cdc02bb7a
unified vulkan khr/nv cooperative matrix shader ( #6116 )
1 year ago
nihui
6510fe6125
discover VK_KHR_shader_float_controls2 ( #6068 )
1 year ago
nihui
ca045ac579
gpu info query cooperative matrix properties ( #6067 )
1 year ago
nihui
7f899f2e94
update glslang, discover VK_KHR_shader_integer_dot_product and VK_KHR_shader_bfloat16 ( #6066 )
1 year ago
nihui
8dbcfee5ec
option owns vulkan device index ( #5973 )
1 year ago
nihui
b284dbd0f4
discover VK_KHR_shader_non_semantic_info, checked convolution imagestore ( #5955 )
1 year ago
nihui
eed257df1f
ci update llvmpipe ( #5954 )
* check image fp16
1 year ago
nihui
bf13c30210
define device feature macros for glslang, discover VK_EXT_shader_atomic_float and VK_EXT_shader_atomic_float2 ( #5949 )
1 year ago
nihui
8211930a6f
discover VK_KHR_shader_subgroup_rotate ( #5948 )
1 year ago
nihui
1b6485fa17
discover VK_KHR_zero_initialize_workgroup_memory ( #5947 )
1 year ago
nihui
40f7b4e527
discover all subgroup features and VK_KHR_shader_subgroup_extended_types ( #5946 )
1 year ago
nihui
0b9925cfef
integrate VK_EXT_subgroup_size_control features and properties ( #5940 )
1 year ago
Upliner Mikhalych
cbd17cd062
Fix #5741 don't crash when vkCreateDevice fails ( #5742 )
1 year ago
nihui
bd1f39ed82
blacklist mesa vulkan cooperative matrix feature ( #5739 )
ref https://gitlab.freedesktop.org/mesa/mesa/-/issues/10847
1 year ago
張小凡
3b048d1923
destroy_gpu_instance() function wait for all devices to be idle before destroy ( #4763 )
* destroy_gpu_instance() will internally ensure that all vulkan devices are idle before proceeding with destruction.
2 years ago
nihui
05b4dcb06c
report vulkan cm 8x8x16 config, enable fp16a cm ( #5298 )
2 years ago
nihui
5329d32e74
check vulkan fp16 uniform support and implement lfp conversion without fp16u ( #5287 )
2 years ago
nihui
b4f26237cb
in-house vulkan loader ( #5130 )
* vulkan-driver-loader.md
* static vulkan on apple
2 years ago
nihui
c45c01c7c1
enable VK_KHR_cooperative_matrix ( #4823 )
* enable VK_KHR_cooperative_matrix
* add khr cm shader
* update glslang
* print matrix info
2 years ago
nihui
15cf81c40d
workaround multiheadattention vulkan nan issue on nvidia gpu ( #4682 )
* fix vulkan validation error, prefer VK_KHR_buffer_device_address over VK_EXT_buffer_device_address
* enable validation extension features
3 years ago
nihui
a2106f840f
setup more extension entrypoint ( #4636 )
3 years ago
張小凡
d87e895a1f
Add get_gpu_instance() function and organize the instance class code ( #4630 )
3 years ago
張小凡
772b13a1d1
Add support checks for three extension capabilities ( #4626 )
* Add some extension capability for vma
3 years ago
ws
643285a08c
fix macos vulkan instance create failed when vulkan sdk version >= 1.… ( #4472 )
* enable VK_KHR_portability_subset extension if the device supports it
Co-authored-by: w1ndseeker <w1ndseeker@users.noreply.github.com>
3 years ago
nihui
559e5b23f9
vulkan tensorcore optimization ( #3628 )
* query and enable cooperative matrix
* fix build with old vulkan sdk
* implement cooperative matrix optimization
* add nvidia-t4 coverage
* adjust test option for more coverage
4 years ago
nihui
9fd4d371ae
bridge image for adreno image upload and download ( #2658 )
* add bridge image for adreno image storage upload and download
* enable sbn1, print bugbilz flag
* blacklist old adreno
* let user choose use_image_storage option even when bug_storage_buffer_no_l1
5 years ago
nihui
54c0a13b9f
build shared library ( #2525 )
* build shared lib and enable lto
* reserved for layer and option
* allocator pimpl
* datareader pimpl
* paramdict pimpl, disable copy assign for allocator and datareader
* modelbin pimpl
* net extractor pimpl
* gpu pimpl
* disable copy assign vulkandevice, code format
* command pimpl, dummy image readonly
* pipeline pipelinecache pimpl, export platform class
* code format, export simple family
* update ci
* disable lto on android armv7, merge webassembly ci
* link libgcc, fix macos dylib version
* pipeline pimpl, gpu info pimpl
* destroy gpu info after vulkan device
* ignore msvc stl class warning
* fix ncnn_paramdict_get_float return type
* fix vktransfer upload fp16 without flatten, add command test
5 years ago
nihui
1f44e5c6a3
enable ios arm64e ( #2475 )
* enable ios arm64e
* fix build with old vulkan sdk
* link vulkan loader on macos, fix ios moltenvk library path
* there is no moltenvk arm64e library atm, link moltenvk directly for macos-arm64
5 years ago
nihui
2b0b2fa388
enable more vulkan extensions, set subgroup size per vendor
5 years ago
nihui
cf3cf83cd3
unified image shader storage type ( #2231 )
* drop bug_layout_binding_id_alias flag
5 years ago
nihui
b9296c259d
bring up vulkan 1.1 ( #2191 )
* query subgroup features
* compile spirv 1.3
* drop offline spirv build
* do not build tests for android and ios, as they are never tested anyway
* code style
5 years ago
youzainn
1c5af3d83c
add device_name field for class GpuInfo ( #2122 )
5 years ago
nihui
9f5b660483
compile spirv
5 years ago
nihuini
bf279dcf17
workaround corrupted pipeline cache on old qcom adreno
6 years ago
nihui
193e08e834
lazy initialize utility operator, fix #1923
6 years ago
nihui
164273de61
online pipeline cache ( #1792 )
* online pipeline cache wip
* device-wide pipeline cache
* enable model-wide pipeline cache
* drop pre-created shader modules
* always use pipeline cache
* use implicit model-wide pipeline cache, code format
* code clean
6 years ago
nihui
3ef995ed1e
format code style and setup restyled.io ( #1840 )
6 years ago
nihuini
aeba24b371
enable implicit fp16a on arm mali variants, add bug tag for layout binding id alias
6 years ago
nihuini
6788384595
query gpu heap budget api
6 years ago
nihui
17c445480f
runtime spir-v compilation with libglslang ( #1779 )
6 years ago