nihui
|
d85775fbcd
|
fix softmax axis order on 3-dim, fix caffe reshape conversion, regenerate ssd param
|
7 years ago |
nihui
|
979ed57487
|
packing param for identity packing when padding disabled, auto packing conversion between cpu and gpu blob
|
7 years ago |
nihui
|
b49cb56ad9
|
constify vulkan device handle, use default local vulkan device if not specified
|
7 years ago |
nihui
|
5e07749a4a
|
do not emit upload transfer on unified memory
|
7 years ago |
nihui
|
9ebac3fe9e
|
dedicated reference counter for staging data
|
7 years ago |
nihui
|
68afd1fa17
|
reset fence
|
7 years ago |
nihui
|
81ee56b209
|
copy buffer has offset alignment limit, re-implement concat as compute pipeline
|
7 years ago |
nihuini
|
83efa73cf6
|
automatically fall back to cpu forward if layer does not support vulkan
|
7 years ago |
nihuini
|
bdd305638d
|
command reset
|
7 years ago |
nihuini
|
10a088397e
|
concat interleave image row
|
7 years ago |
nihuini
|
1ace8068e3
|
zero detected is not error
|
7 years ago |
nihuini
|
14efdd8e00
|
reorg shader
|
7 years ago |
nihui
|
b62e9c4b1e
|
shufflechannel shader
|
7 years ago |
nihuini
|
bb04055e80
|
permute shader
|
7 years ago |
nihui
|
24f423b0c6
|
fix build on msvc
|
7 years ago |
nihui
|
cc4376d8e6
|
do not upload unnecessary pack1 weight, reduce gpu memory usage
|
7 years ago |
nihui
|
0ad0c07526
|
drop duplicated weight data in convolution-fc, use the more light-weight pipelines
|
7 years ago |
nihuini
|
43c4b57201
|
group deconvolution packing family
|
7 years ago |
nihuini
|
8547864b6f
|
group convolution packing family
|
7 years ago |
nihuini
|
675fcc72a5
|
interp vulkan
|
7 years ago |
nihuini
|
37413ea95c
|
implement depthwise deconvolution vulkan, fix top blob state
|
7 years ago |
nihuini
|
468516879f
|
implement deconvolution vulkan family support
|
7 years ago |
nihuini
|
e213605cd4
|
reduce memory usage of weight packing
|
7 years ago |
nihuini
|
7312887671
|
transfer command hold data context
|
7 years ago |
nihuini
|
4a57f88c3c
|
vkcompute auto begin end, use proper alignment for vktransfer staging buffer offset
|
7 years ago |
nihuini
|
39f2c71d5b
|
fix name conflict on ios
|
7 years ago |
nihui
|
f4e12101c0
|
fix convolution typed innerproduct pack4
|
7 years ago |
nihui
|
0acdbebf3b
|
merge refcount into buffer memory cookie
|
7 years ago |
nihui
|
960ffa1a50
|
optimize workgroup size for convolution depthwise and innerproduct pack4
|
7 years ago |
nihui
|
a15b389d86
|
fix innerproduct pack1to4 pack4to1 weight upload
|
7 years ago |
Emmanuel Benazera
|
a8fd79e1bc
|
fixed cell initialization in LSTM layer
|
7 years ago |
nihui
|
62543f9b1e
|
flatten pack1to4
|
7 years ago |
nihui
|
9480dcbc36
|
fix innerproduct out packing
|
7 years ago |
nihui
|
f9dc551081
|
add innerproduct pack1to4 pack4to1 glue code
|
7 years ago |
nihui
|
3f91d6b529
|
add innerproduct pack1to4 pack4to1 shader
|
7 years ago |
nihui
|
cd7f120250
|
lrn norm across channel pack4, rename member name with pipeline prefix
|
7 years ago |
nihui
|
7ee3216fff
|
add convolution pack1to4 pack4to1
|
7 years ago |
nihui
|
9d2b345eab
|
lrn region within channel pack4
|
7 years ago |
nihui
|
ad68e1e0e6
|
enable googlenet alexnet vulkan benchmark, fix build on msvc
|
7 years ago |
nihui
|
559183904b
|
fix random crash on dedicated allocation
|
7 years ago |
nihui
|
f9ea621305
|
pooling full padding
|
7 years ago |
nihui
|
ee59f14900
|
add lrn shader
|
7 years ago |
nihui
|
1792fe79ec
|
drop deprecated softmax shader, destroy softmax pipeline
|
7 years ago |
nihui
|
9e2b327c17
|
packing shader for 3-dim blob
|
7 years ago |
nihuini
|
9a805b045e
|
innerproduct receive flattened blob
|
7 years ago |
nihui
|
c60773bde4
|
add transfer-transfer barrier, concat pack4
|
7 years ago |
nihui
|
303996af4c
|
auto flatten before innerproduct
|
7 years ago |
nihuini
|
ba723706bb
|
add flatten pack4
|
7 years ago |
nihui
|
f0b4933eac
|
massive simd optimize in compute shader (#772)
* init vec4 shader
* more vec4 shader ...
* convolutiondepthwise is depthwise
* pooling pack4, fix global pooling
* dropout pack4, relu pack4
* softmax pack4
* more shader vec4 ..
* fix staging remap, remove layer pipeline member, add destroy_pipeline interface, add pack4 glue code
* eltwise pack4 glue code
* add binary pack4, unary pack4
* add binaryop unaryop pack4 glue code
|
7 years ago |
nihui
|
8e5674363b
|
element packing (#770)
* mat packing
* packing layer
* packing works
* convert_packing function
|
7 years ago |