nihui
7365bb80a2
vkmat and command api breaks ( #1689 )
* vkmat and command api breaks
* always use compute queue for compute buffer transfer
* no barrier for readonly weight buffer
* record clone, drop queue_owner
* bring back layer forward
* fix validation errors
* lifecycle inside command makes life easier
* update doc
* record_import_android_hardware_buffer
6 years ago
nihui
043a8f1ac1
add yolact example, fix #1679
6 years ago
nihuini
1469bc8b19
reclaim immediately after submitting, so we don't block queue access during the long fence wait, fix #1682
6 years ago
nihuini
0ae11b6e4a
deepcopy layer
6 years ago
nihuini
f5e52e1bae
codecov is much better, drop coveralls.io
6 years ago
nihuini
a05b97a430
dropout prelu scale gpu test
6 years ago
nihuini
17bf5247cc
visualize Mat
6 years ago
nihuini
9f3af60b3a
dropout prelu scale test
6 years ago
nihui
163e2c0655
Travis ci armv7 ( #1680 )
* try checkout v2 to resolve some ci issue
6 years ago
nihuini
85d5e5d3e4
fix innerproduct vulkan pack8 and arm neon, disable packing_layout for int8 test
6 years ago
nihuini
32a9a489bc
fix flatten vulkan fp16p image pack1to4 and pack1to8
6 years ago
nihuini
6077066b02
binaryop broadcasting special type 3 4 for lhs
6 years ago
nihuini
8e415348ee
to_pixels family arm neon optimize
6 years ago
nihui
e344e3cde6
drop VkMat range reference for potential alignment issue
6 years ago
nihuini
69b7683d0e
adapt api changes
6 years ago
nihuini
1ea9de3bdf
create shader pipeline by type index, resolve binding count and push constant count from spirv. since we don't create compound shader modules (for macos and ios compatibility), a fixed main is enough as the shader entry point
6 years ago
nihuini
ee118e7d70
reconstruct import android hardwarebuffer api, wip
6 years ago
kalcohol
8fecf1989e
add vs src group ( #1659 )
* add vs src group
* check condition first
6 years ago
nihui
f22e5d4a6d
fix build
6 years ago
nihui
3cd7a30172
shufflechannel bf16s
6 years ago
nihui
90e6be457b
conv1x1s1 bf16s neon kernel
6 years ago
nihuini
4d6cf47db8
pixel type BGRA
6 years ago
nihuini
3cbdd077e9
sometimes binaryop weight is missing in graph nodes, fix #1640
6 years ago
nihui
7a89ce6223
slice bf16s
6 years ago
nihui
f3b39a3b4f
improve onnx hardsigmoid hardswish fusion, fix #1650 , ignore mips ci failure
6 years ago
nihui
e23d5038ab
clip sigmoid tanh bf16s
6 years ago
nihuini
9ce0ad78ff
hardswish bf16s
6 years ago
nihuini
3b243cc7d5
hardsigmoid bf16s
6 years ago
nihuini
867ff7ae97
binaryop bf16s
6 years ago
nihuini
89ef1f0d66
enable bitcode build
6 years ago
nihuini
253e505765
improve compatibility with onnx slice opset version 10+
6 years ago
nihuini
d2f7fc5a76
fix dwconv5x5s1 pack4 bf16s on aarch64
6 years ago
nihui
efaa1a4af1
dwconv5x5s1 pack4 bf16s neon kernel
6 years ago
新无止竞博客
5ea683f202
Fix ncnn compile error with MinGW on Windows ( #1645 )
6 years ago
nihui
ec40b4dbd7
test bf16s ( #1644 )
* wip
* wip
* wip
* fix avx2 test
6 years ago
Michael Grad
1dea8774b9
fix aarch64 build ( #1633 )
Co-authored-by: mgrad <mgrad@meraki.com>
6 years ago
nihuini
5255d2c328
dwconv5x5s2 pack4 bf16s neon kernel
6 years ago
yehao
f6642ac631
Update FAQ-ncnn-produce-wrong-result.md ( #1641 )
fix spelling mistakes
6 years ago
nihui
d023137426
test fp16 packed and shader pack8 option ( #1636 )
* wip
* fix slice pack8 test
* fix flatten pack8 test
* fix binaryop pack8 test
* fix interp pack8 test
* rewrite cast test for different blob type and packing
6 years ago
nihuini
9b09cc16b5
fix compatibility with onnx clip opset11, fix #1609
6 years ago
nihuini
1984cad0e1
conv5x5s2 bf16s neon kernel
6 years ago
nihuini
ee41ef4a37
include <limits.h> for INT_MAX, fix #1631
6 years ago
nihuini
c009928628
sizeof returns bytes, not bits
6 years ago
nihuini
4c6bf24205
explicit cpu thread affinity
6 years ago
Xu Yang
dbd9cbab4a
fix layer innerproduct when build with requant option on ( #1624 )
6 years ago
kalcohol
548cba8b21
add build script for building android lib on windows via ndk ( #1622 )
* add build script for building android lib on windows via ndk
* rename
6 years ago
Leo
63be294d81
Add mips layers ( #1496 )
* Add mips softmax layer
* Fix bias value error in bias_layer
* Fix max and min value error in clip_layer
* Remove unused elempack variable
* Add sigmoid mips layer
* Add mips tanh math func
* Add mips tanh layer
* Add msa_fill_w_f32 for load float type data
* Remove conv layer header
6 years ago
nihuini
17577775ae
conv5x5s1 bf16s neon kernel
6 years ago
nihui
8a84077429
ncnnoptimize lstm
6 years ago
nihui
f972bf49d1
enable bugihfa on rk3288 and rk3399
6 years ago