nihui
|
d0bcda70ca
|
arm neon optimize for winograd input output transform, about 4%~22% faster
|
8 years ago |
nihui
|
c656c572ad
|
use float type for input transform
|
8 years ago |
nihuini
|
0219f507b7
|
convert ConvTranspose and InstanceNormalization
|
8 years ago |
nihuini
|
9b6445cd05
|
convert reshape and transpose
|
8 years ago |
nihuini
|
4f79b89635
|
comment for reshape flag
|
8 years ago |
nihuini
|
aee6552360
|
memcpy is fast :)
|
8 years ago |
nihuini
|
db4b04da53
|
fix slice with implicit slice point
|
8 years ago |
nihuini
|
bc905288a7
|
set default CMAKE_BUILD_TYPE before project command, since the latter can set CMAKE_BUILD_TYPE itself
|
8 years ago |
wangyakun
|
c4cbacf8f5
|
Add support for Reshape in mxnet2ncnn
|
8 years ago |
Yantao Xie
|
97d41f8e7d
|
Make debug easy (#341)
* Make CMAKE_BUILD_TYPE not overwrite the existing setting, if there is one.
* Disable the optimization setting during debug.
|
8 years ago |
BUG1989
|
67732a83d8
|
Add the benchmark of RK3288 (#340)
* Update README.md
Update benchmark of RK3288
|
8 years ago |
BUG1989
|
2ecad8ab32
|
Update README.md
update Qualcomm MSM8996 Snapdragon 820 benchmark
|
8 years ago |
nihuini
|
593703b71f
|
sleep 10 seconds for cooling down SOC, more warmups
|
8 years ago |
nihui
|
e2d35702ce
|
Update README.md
|
8 years ago |
nihui
|
0f390ad731
|
Update README.md
|
8 years ago |
nihui
|
89ef9d779d
|
Update README.md
|
8 years ago |
nihui
|
addfcbafe0
|
build onnx2ncnn by default
|
8 years ago |
nihui
|
b27cd92dfa
|
paste oops
|
8 years ago |
nihui
|
2cc6abde6a
|
convert ImageScaler LeakyRelu, tiny-yolov2 works
|
8 years ago |
nihui
|
2abeb038a0
|
unroll outch for conv3x3s2, about 30% faster :)
|
8 years ago |
nihui
|
97025668a9
|
unroll conv1x1s1 outch 6 inch 4 on armv7, about 2%~18% faster
|
8 years ago |
nihui
|
0a666c8cb9
|
Update README.md
|
8 years ago |
nihuini
|
dd0ae756de
|
batchnorm and scale on vector and image, fix #331
|
8 years ago |
AlanNewImage
|
e77eef3f3d
|
Update priorbox.cpp (#330)
|
8 years ago |
daquexian
|
3a5d7cfcce
|
Fix mobilenet v2's stride (#327)
|
8 years ago |
Yantao Xie
|
cd3617b11d
|
Set ArgMax's one_blob_only as true. (#325)
|
8 years ago |
Joe
|
9748c00b44
|
add image-level feature support (#320)
* add image-level feature support
* move special case out
* tab to space
|
8 years ago |
nihui
|
0e41c37250
|
convert convolution pad_w pad_h
|
8 years ago |
Tiancai Ye
|
ea95c7a7fc
|
fix a fix~~~ (#323)
|
8 years ago |
Tiancai Ye
|
3977d32eb9
|
Fix Windows build failures (#321)
* fix windows build error
* remove wrong commit
|
8 years ago |
Yantao Xie
|
73340578c8
|
Remove the destructor definition from the lstm layer. (#319)
|
8 years ago |
Howave
|
5e7332e507
|
align memory start address (#318)
* make memory start address 4bytes aligned
* align memory start address for MSVC
|
8 years ago |
nihuini
|
fac262658c
|
use android prebuild release folder, build arm64 jni library
|
8 years ago |
nihuini
|
4019505642
|
use clang, fix build, fix #292
|
8 years ago |
nihuini
|
e4c1ddbc45
|
rewrite inner loop in assembly, since gcc is sometimes foolish qaq, fix #312
|
8 years ago |
nihuini
|
aac70893f8
|
fix build on gcc
|
8 years ago |
nihuini
|
394bca8dbb
|
Merge branch 'master' of https://github.com/Tencent/ncnn
|
8 years ago |
nihuini
|
9ac305e160
|
create 3-dim sub blob for group convolution, fix #315
|
8 years ago |
Howave
|
415bfbdfa7
|
added arm layer compilation for arm-linux system (#316)
|
8 years ago |
nihuini
|
318d3abe66
|
bind register explicitly, fix #306, fix #310, fix #312
|
8 years ago |
Yantao Xie
|
2e9da1b95b
|
Add the epsilon parameter to the BatchNorm layer. (fix #303) (#311)
* Add the epsilon parameter to the BatchNorm layer. (fix #303)
* Move the eps into the sqrt.
|
8 years ago |
nihuini
|
231a52e469
|
fix build on aarch64 with gcc, fix #309
|
8 years ago |
BUG1989
|
4ebab2725d
|
Update benchmark ReadMe (#308)
|
8 years ago |
BUG1989
|
af7019d3fc
|
fix compile error (#305)
|
8 years ago |
nihui
|
d7e31987fa
|
Update README.md
|
8 years ago |
nihui
|
004632d9e5
|
Update README.md
|
8 years ago |
nihui
|
ce9eeaba9a
|
Update README.md
|
8 years ago |
nihui
|
875a188d10
|
pre interleave kernel memory for winograd4, about 3%~20% speed gains
|
8 years ago |
nihui
|
d2c01019aa
|
fix convert depthwise deconvolution, fix #300
|
8 years ago |
dong
|
6ea09ebf2c
|
Use aarch64 assembly to replace arm intrinsics
|
8 years ago |