nihui/ncnn - ncnn - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
nihui	6019f47f08	ci loongarch64 lsx (#4344)	3 years ago
陸言	0f38cb2cd8	Add TH1520 (4*C910V) toolchain support. (#4267)	3 years ago
Xavier Hsinyuan	e7eadca6c1	RVV: use new interface for segment load/store & change word_type to size_t&add clang ci (part #4100) (#4118) * RVV: use size_t for vl * RVV: replace vsseg.v tuple type by using regex ----- search: vsseg([1-9])e(8\|16\|32)_v_(f\|i\|u)\2m(1\|2\|4\|8)x\1$([ -~]+), vcreate_\3\2m\4x\1\(([ -~]+)$, vl\); substitute by: vsseg$1e$2_v_$3$2m$4($5, $6, vl); * RVV: replace vssseg.v tuple types by using regex --- search: vssseg([1-9])e(8\|16\|32)_v_f\2m1x\1$([ -~]+), vcreate_f\2m1x\1\(([ -~]+)$, vl\); substitute by: vssseg$1e$2_v_f$2m1($3, $4, vl); * RVV: replace vlseg.v tuple types in load/store * RVV: replace vloxseg2ei32.v tuple types * RVV: add a wrapper for old compilers * RVV: add segment load/store wrapper in pakcing * RVV: fix cmake test * RVV: make clang happy by dropping VLAs in sgemm * RVV: add clang cmake toolchain configure * RVV: add clang ci, riscv64-unknown-linux-gnu Co-authored-by: thelastlin <thelastlin@users.noreply.github.com> Co-authored-by: nihui <shuizhuyuanluo@126.com>	3 years ago
nihui	b4ba207c18	more strict compiler rvv checks, drop rvv-071 support (#4094)	4 years ago
nihui	e64245c44a	ci x86 no sse, do not force sse2 for x86 32bit toolchain (#4043)	4 years ago
nihui	c5bb0e52ed	add ingenic-x2000 toolchain file	4 years ago
陸言	cae8d0f1d7	Add Loongson 2F toolchain support (refer to AOSC) (#3992)	4 years ago
nihui	1fd7138d2f	armv7 vfpv4 infrastructure (#3929) * armv7 vfpv4 infrastructure * optional fp16 format ieee * arm neon assembly optimization for cast fp16/bf16	4 years ago
nihui	d373407bcb	add c906 c910 v240 toolchain	4 years ago
nihui	f79073c182	update how-to-build doc for raspberrypi and d1	4 years ago
Xavier Hsinyuan	29b6a32ac0	RVV: follow intrinsic doc, replace vfredsum_* with vfredusum_* (#3790) * RVV: follow intrinsic doc, vfredusum -> vfredsum * C906: change toolchains for vfredusum * RVV: test compiler for vfredusum_vs_*	4 years ago
jasonZhang	2b0064c3db	fix armv7 platform build tests error (#3710)	4 years ago
nihui	5b7268d95f	loongarch64 ci (#3455)	4 years ago
nihui	c0a94cd9ca	fix armv7 without neon (#3514)	4 years ago
nihui	3a83704c38	binary4d, unary4d (#3443)	4 years ago
Zhuo Zhang	880e2805fe	add c906 v223 toolchain (#3449)	4 years ago
nihui	64ba12be23	add c906 v2.2.2 toolchain	4 years ago
Martin Han	7633257ac5	Add support to Ingenic X2000 & T40 CPU (#3343) * Support Ingenic X2000 & T40 T40 claimed to have MIPS32R2 with MSA, but tested not working on my unit, so set to mips32r2 without MSA. * Add Ingenic T40XP benchmark result	4 years ago
tsuibin	1699aaaca6	added support for loongarch64 (#3094) Co-authored-by: tsuibin <tsuibin@loongson.cn>	5 years ago
nihui	1f49fc4b67	mips msa optimization for padding packing flatten innerproduct convolution convolutiondepthwise general cases	5 years ago
nihui	b8e03ced3c	allow examples building with simpleocv	5 years ago
nihuini	26ea87cc25	do not define android macro for jetson, fix #2939	5 years ago
nihui	655fb3b5eb	fix rv64gcv test coverage build	5 years ago
nihui	11958424c2	runtime riscv v and zfh dispatch, riscv v optimization for cast	5 years ago
nihui	48d5dbf33f	ci riscv rvv coverage (#2886)	5 years ago
Zhang Xianyi	a1ece94f51	Use RVV spec 0.7.1 for C906. (#2868)	5 years ago
nihui	45bf3cd779	add runtime riscv v detection function, the initial c906 riscv linux toolchain	5 years ago
DC Technology	11d1de7858	fix iOS minimum version,when used Vukan 1.2 (#2621) * fix iOS minimum version,when used Vukan 1.2 * Update IOS mininum version to 9.0	5 years ago
sunnycase	124d2c3d85	Support V831 (#2478)	5 years ago
PENGUINLIONG	8f8f2de4d0	SSE2 optimization pack (#2123) * SSE2: BatchNorm * Fixed batch norm in AVX configuration * Optimized register size switch * Attempt to pass CI * Attempt to pass CI * Bias op * Element wise ops * Support packing on x86 by default * Fixed macro range in bias * Use aligned read for packed data * Update testutil.h * Update pooling_x86.cpp * Support wasn SIMD * Fix emscripten compiler flags * fix build * more ci fix * concat x86 pack4 * flatten x86 pack4 * more x86 pack4 * ci pass * fix * enable sse2 mathfun * enable --experimental-wasm-simd Co-authored-by: nihui <shuizhuyuanluo@126.com> Co-authored-by: nihuini <nihuini@tencent.com>	5 years ago
Zhuo Zhang	4a1c463cff	fix: fix gnueabihf cmake toolchain neon flags (#2272) Test platform: ZYNQ 7020, there is vfpv3 but isn't vfpv4	5 years ago
nihui	de039b3546	ci x86 32bit (#2116)	5 years ago
nihui	926743ddae	cache cflags and cxxflags for hisi toolchain files	5 years ago
nihuini	1cdb66f614	c++03 ci	5 years ago
nihui	7ace8a933d	powerpc64, fix #2054 (#2058), workaround gcc altivec bug	5 years ago
Leo	5afd318b86	Support remove libstdc++ denpendency (#2030) * [build] add toolchain file w/o stdcxx dependency * [build] link m and gcc lib explicitly * [ncnn] complete simple stl impl * [ncnn] adapt for ncnn simplestl * [test] adapt for ncnn simplestl * [ncnn] fix missing algorithm and list when simplestl disabled * [ncnn] fix guard for operator new and delete * [style] fix the code style * [build] fix build failed on darwin and emscripten * [ci] do not import cxx to avoid operator conflict * [ncnn] add temporary partial_sort impl using bubble sort heap sort should be used for better perf. * [ncnn] add std greater and less function * [ncnn] fix placement new operator overload * [ncnn] add operator delete with size info * [build] disable exception, rtti, example and tools when simplestl on * [build] add toolchain for arm simplestl * [build] add toolchain for aarch64 simplestl * [ncnn] move initializer to constructor * [ncnn] use deteiled type instead of auto * [ncnn] use plain lib name in target_link_libraries	5 years ago
nihui	e0627ac855	Ci riscv64 linux (#1951)	6 years ago
nihui	aee9f8a637	Ci mips mips64 (#1948)	6 years ago
nihui	11cffce114	armv8.2 infrastructure (#1856) * runtime cpu dispatch * force thread one * disable openmp for coverage * simplify test layer * print NCNN_TARGET_ARCH * less ci build variants * weight fp16 storage option * test convdw int8 * apple a12 a13 * ncnn_add_layer ncnn_add_shader cmake macro	6 years ago
nihui	3ff40b0679	Ci rv32imc (#1940)	6 years ago
nihui	fe6bc1ed4d	Ci rv64gcv and rv64gc (#1936)	6 years ago
Tijmen Verhulsdonck	73aa99e83c	LSTM arm/x86 + fp16 innerproduct arm (#1881) * added fp16 weight storage version * Small changes * Fixed fp16 weight storage layers * fix innerproduct * fix loop error * Fix windows build. Disable fp 16 conversion when detecting int8 weights. Implement requested changes. * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * Update option.cpp Set fp16 storage based on vulkan being used or not. * added ability for storing state in lstm layer * added avx lstm * added arm lstm * fix innerproduct activation location and add 4 parallel channel version * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * revert arm file * commit before switch * implement requested changes * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * More x86 optimized implementations of common layers. Added LSTM layers for arm and x86 + a ctest to verify the layer accuracy Added fp16 innerproduct for arm * fix non avx build * Add fp16 arm compiler and cpu checks. Remove statefullness from LSTM implementation. * Fix build check for fp16 arm * Bypass lstm_fp16 if not supported * Build order was incorrect * fix std::min missing in windows build * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle * attempting to fix gnu build by enabling: -mfp16-format=ieee to fix the missing __fp16 type * remove double "fix" * Specify ieee fp16 format * implement requested changes * fix arm non-fp16 build * fix arm lstm * Restyled/pull 1881 (#15) * Restyled by clang-format * Restyled by astyle * Restyled by clang-format * Restyled by astyle Co-authored-by: Restyled.io <commits@restyled.io> * Check blob size on arm lstm * fix styling Co-authored-by: Restyled.io <commits@restyled.io>	6 years ago
nihuini	99d34135d7	build both ios native and bitcode library, fix #1742	6 years ago
nihui	163e2c0655	Travis ci armv7 (#1680) * try checkout v2 to resolve some ci issue	6 years ago
nihuini	89ef1f0d66	enable bitcode build	6 years ago
Leo	fc96039cef	Add some mips layers (#1442) * Add mips architecture optimization support * Add mipsisa32r6 toolchain files * Add AbsVal_mips layer * Remove unused constructor override * Keep the same programming style with upstream * Add Bias_mips layer * Add Clip_mips layer * Update copyright header * Add mips-mti-gnu compile to travis * Add codescape mips toolchain config * Update the toolchain comments * Fix conflicts in travis config file * Add travis status in README.md	6 years ago
tpoisonooo	1ca4387c9c	Auto choose conv implementation (#1085) * add relative README_CN.md; * obtain time cost with op->forward().	7 years ago
Christopher	6cfd09b429	add toolchains hisiv600toolchain for hi3559V100 (#1090)	7 years ago
kalcohol	a6aab42f95	add himix200 toolchain for Hi3516CV500, Hi3516DV300, Hi3519AV100. (#989)	7 years ago
nihuini	e46a3e428a	cmake warning--	7 years ago

55 Commits (057b5bb515d551fa64decdb7350422c19feba447)