nihui
80499bd64a
enable VK_LAYER_KHRONOS_validation layer in modern vulkan sdk
5 years ago
nihuini
9b949d65b3
fuse onnx lstm, codeformat exclude pybind11, fix #2562
5 years ago
nihui
54c0a13b9f
build shared library ( #2525 )
* build shared lib and enable lto
* reserved for layer and option
* allocator pimpl
* datareader pimpl
* paramdict pimpl, disable copy assign for allocator and datareader
* modelbin pimpl
* net extractor pimpl
* gpu pimpl
* disable copy assign vulkandevice, code format
* command pimpl, dummy image readonly
* pipeline pipelinecache pimpl, export platform class
* code format, export simple family
* update ci
* disable lto on android armv7, merge webassembly ci
* link libgcc, fix macos dylib version
* pipeline pimpl, gpu info pimpl
* destroy gpu info after vulkan device
* ignore msvc stl class warning
* fix ncnn_paramdict_get_float return type
* fix vktransfer upload fp16 without flatten, add command test
5 years ago
nihuini
5650b77054
fix gpu extension conditions
5 years ago
nihui
1f44e5c6a3
enable ios arm64e ( #2475 )
* enable ios arm64e
* fix build with old vulkan sdk
* link vulkan loader on macos, fix ios moltenvk library path
* there is no moltenvk arm64e library atm, link moltenvk directly for macos-arm64
5 years ago
nihui
2b0b2fa388
enable more vulkan extensions, set subgroup size per vendor
5 years ago
nihui
cf3cf83cd3
unified image shader storage type ( #2231 )
* drop bug_layout_binding_id_alias flag
5 years ago
nihui
9be3f074a9
ci ndk-r16b ( #2104 )
* reset
* fix build with old vulkan header
5 years ago
nihui
b9296c259d
bring up vulkan 1.1 ( #2191 )
* query subgroup features
* compile spirv 1.3
* drop offline spirv build
* do not build tests for android and ios, as they are never tested anyway
* code style
5 years ago
nihui
4463c3b455
disable image shader on adreno until a better workaround is figured out
5 years ago
youzainn
1c5af3d83c
add device_name field for class GpuInfo ( #2122 )
5 years ago
nihuini
a334513b5e
fp16a option fix
5 years ago
nihuini
9047741129
always disable fp16/int8 arithmetic for gpu uop
5 years ago
nihui
9f5b660483
compile spirv
5 years ago
Leo
5afd318b86
Support removing libstdc++ dependency ( #2030 )
* [build] add toolchain file w/o stdcxx dependency
* [build] link m and gcc lib explicitly
* [ncnn] complete simple stl impl
* [ncnn] adapt for ncnn simplestl
* [test] adapt for ncnn simplestl
* [ncnn] fix missing algorithm and list when simplestl disabled
* [ncnn] fix guard for operator new and delete
* [style] fix the code style
* [build] fix build failed on darwin and emscripten
* [ci] do not import cxx to avoid operator conflict
* [ncnn] add temporary partial_sort impl using bubble sort
heap sort should be used for better perf.
* [ncnn] add std greater and less function
* [ncnn] fix placement new operator overload
* [ncnn] add operator delete with size info
* [build] disable exception, rtti, example and tools when simplestl on
* [build] add toolchain for arm simplestl
* [build] add toolchain for aarch64 simplestl
* [ncnn] move initializer to constructor
* [ncnn] use detailed type instead of auto
* [ncnn] use plain lib name in target_link_libraries
5 years ago
nihui
1322ae40cb
update engine version
5 years ago
nihuini
bf279dcf17
workaround corrupted pipeline cache on old qcom adreno
5 years ago
nihui
11cffce114
armv8.2 infrastructure ( #1856 )
* runtime cpu dispatch
* force thread one
* disable openmp for coverage
* simplify test layer
* print NCNN_TARGET_ARCH
* less ci build variants
* weight fp16 storage option
* test convdw int8
* apple a12 a13
* ncnn_add_layer ncnn_add_shader cmake macro
5 years ago
nihui
193e08e834
lazy initialize utility operator, fix #1923
5 years ago
nihui
27e099961c
fix double gpu instance destruction
5 years ago
nihui
164273de61
online pipeline cache ( #1792 )
* online pipeline cache wip
* device-wide pipeline cache
* enable model-wide pipeline cache
* drop pre-created shader modules
* always use pipeline cache
* use implicit model-wide pipeline cache, code format
* code clean
5 years ago
nihuini
d2bf77cd88
create new allocator when pre-allocated allocators exhausted, fix #1862
6 years ago
nihuini
c38d304369
the implicit gpu instance makes life easier :)
6 years ago
nihuini
187a3e672d
implicit gpu instance destruction, fix #1849
6 years ago
nihuini
9bb06e46cf
implicit gpu instance creation, fix #1849
6 years ago
nihuini
fd7d87e098
allow linking with external glslang
6 years ago
nihui
3ef995ed1e
format code style and setup restyled.io ( #1840 )
6 years ago
nihuini
554890cda8
fp16p and fp16s cannot be both enabled in shader source
6 years ago
nihuini
1a3a99d7c9
old qcom driver cannot handle binding id alias
6 years ago
nihuini
f87f21779f
resolve cast from type properly, no more fp16p to/from fp16s conversion
6 years ago
nihuini
bb56b5439f
fix vkmat download on integrated gpu, workaround priorbox fp16s with online spirv, fix #1700 fix #1805
6 years ago
nihui
8fec0038ba
fix ci test
6 years ago
nihuini
aeba24b371
enable implicit fp16a on arm mali variants, add bug tag for layout binding id alias
6 years ago
nihuini
054ec09195
adreno device blacklist
6 years ago
nihuini
765003a615
fix build with old vulkan sdk
6 years ago
nihuini
6788384595
query gpu heap budget api
6 years ago
nihui
17c445480f
runtime spir-v compilation with libglslang ( #1779 )
6 years ago
nihuini
b71f22d074
report adreno info, benchncnn enable image storage on adreno
6 years ago
nihuini
c94d1b39ad
force disable image storage on macos and ios, fix #1738
6 years ago
SunTY
705dd36a31
simplestl is an alternative std vector string implementation ( #1762 )
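* remove the dependency on stl
* fix header file name and push_back
* remove constructor delegation
* this seems like needless fiddling
* data returns a pointer instead of a reference to a pointer
* fix a mistake in resize
* stdint
* add c_str
* rename files to lowercase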
* NCNN_SIMPLESTL option
* simplestl default to OFF
* Update linux-x64-cpu-gcc.yml
* Update linux-x64-cpu-gcc.yml
* Update linux-x64-cpu-clang.yml
* drop functional header
* arm32 arm64 simplestl ci
* fix a memory leak, remove compiler warnings
* fix bug with the default value in resize
Co-authored-by: nihuini <nihuini@tencent.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>
6 years ago
Naiyang Lin
ceef2470a5
Add logger.h ( #1753 )
6 years ago
nihuini
6682cd1638
image fp16pa, mark some bugihfa todo
6 years ago
nihuini
cefe8d38c3
dynamic image storage support from shape hint
6 years ago
nihuini
1e4a0752b4
fix interp ci test
6 years ago
nihui
9a9a618229
image storage is mandatory, less options makes life easier
6 years ago
nihui
e8688b042f
fuse packing cast storage, binaryop image shader, dummy buffer and image, device-wide utility packing converter operators, fix multi-blob layer test
6 years ago
nihui
62da1228e1
adreno image shader + fp16 + fp16a ( #1714 )
* wip
* wip
* fix
* image and imageview can not be destroyed until command execution ends
* fast copy path for tightly packed data
* wip
* texture load works
* 1d 3d image
* record clone image, multiple commands share one image reference
* upload download image
* layer forward accept vkimagemat
* vkimagemat graph works
* staging vkimagemat for passing dynamic parameters, macro for fp32+image shader, padding image shader
* vkimagemat elemsize
* convolution test pass
* conv1x1s1 image shader
* fast staging image allocator from host memory, pooling image shader
* convolutiondepthwise image shader
* innerproduct image shader
* packing image shader
* crop deconvolution image shader
* resolve spirv binding types
* image fp16 and fp16a, cast image shader
* eltwise image shader
* wip
* absval image shader
* deconvolutiondepthwise image shader
* concat image shader, squeezenet works
* noop split image shader
* uniform precision hint
* layer support_image_storage
* wip
* vulkan device utility operator
* command is storage and packing option aware
* fallback to cpu on image allocation failed, mobilenetssd works
* flatten image shader, enable more test
* ci test
* check imgfp32 imgfp16 imgfp16a features
* fix ci test
* fix ci test
* upgrade swiftshader
* wip
* opt aggressive
* imgfp16p
* opt none
* convolution winograd image shader
* fix flush range, fast copy path for continuous buffer
* minor fix
* fix innerproduct
* wip ...
* wip
* cast fix
* packing test
* wip
* image fp16p is fp16p
* wip
* silence
* more line info
* code clean
* softmax image shader
6 years ago
nihuini
5580da4525
bump engine version
6 years ago
nihui
7365bb80a2
vkmat and command api breaks ( #1689 )
* vkmat and command api breaks
* always use compute queue for compute buffer transfer
* no barrier for readonly weight buffer
* record clone, drop queue_owner
* bring back layer forward
* fix validation errors
* lifecycle inside command makes life easier
* update doc
* record_import_android_hardware_buffer
6 years ago
nihuini
1ea9de3bdf
create shader pipeline by type index, resolve binding count and push constant count from spirv. since we don't create compound shader module for macos and ios compatibility, it is enough to use fixed main as the shader entry point
6 years ago