nihui
9b91fe5153
implement flip layer and pnnx torch.flip conversion ( #6233 )
Co-authored-by: 佰阅 <43716063+Baiyuetribe@users.noreply.github.com>
9 months ago
Copilot
4644540ea4
Add Windows XP support merging PRs #6176 and #6177 ( #6204 )
Co-authored-by: Sugar-Baby <87747602+Sugar-Baby@users.noreply.github.com>
Co-authored-by: AtomAlpaca <66774326+AtomAlpaca@users.noreply.github.com>
10 months ago
GIBEREZ
44982d0d23
Update the GLSL documentation after the image functions were deprecated ( #6173 )
10 months ago
chri321
4c72f52954
docs: update Chinese glsl-extension documentation ( #6162 )
- synchronize the latest English content to the Chinese documentation
- correct spelling errors in the English version of glsl-extension
- Fix spelling 'enable_validation_layer' in src/gpu.cpp
10 months ago
nihui
bd0b111775
vulkan tight fp16p pack1 ( #6127 )
11 months ago
nihui
80da741307
glsl define ncnn_glsl_version macro ( #6003 )
1 year ago
nihui
ef0b0e631c
interp output size expression ( #5994 )
1 year ago
nihui
39c055d7f2
crop axes starts ends expression ( #5976 )
* skip dynamic tensor index
* handle clone oom
1 year ago
nihui
84970eed4d
vulkan validation layer enables NCNN_LOGE in shader source ( #5963 )
* NCNN_LOGE in glsl
* Update glsl-extension.md
1 year ago
nihui
6396a732ef
reshape shape expression, drop reshape permute, test reshape oom ( #5918 )
1 year ago
nihui
1e3fcb9dda
paramdict value string type, natural array representation ( #5915 )
1 year ago
nihui
0734b657d9
spectrogram and inverse spectrogram ( #5779 )
* only supports hann, hamming and all-one window
* inverse spectrogram does not support length parameter
* spectrogram always returns torch.view_as_real(out) as ncnn does not support complex typed mat yet
* inverse spectrogram always accepts torch.view_as_complex(in) as ncnn does not support complex typed mat yet
1 year ago
nihui
66b54cbea2
multiheadattention int8 quantization ( #5733 )
* x86 vulkan fallback
* comment about bf16s
1 year ago
nihui
1c7af00499
gemm int8 quantization ( #5706 )
* quantize gemm
* write gemm quantize scales
* update doc
* less openmp args
* x86 riscv fallback
* skip gemm vulkan int8
* fix noint8 test, fix arm bf16 test
* enable vfpv4 on neon build only
* fix gemm vulkan without C
* fp16 pack8 output
* enable elempack=8 only for asimdhp+
* tiled gemm int8 test
* opt arm64 tiles, fix asimdhp dispatch
1 year ago
nihui
5df5413c81
embed int8 quantization and add embed test ( #5667 )
1 year ago
nihui
fdf0df3079
RMSNorm ( #5630 )
1 year ago
nihui
4c3debae2d
multiheadattention scale param ( #5526 )
* update swiftshader
* skip vs2017 swiftshader
1 year ago
nihui
db035d602d
update ncnnoptimize layers, lightmode=false keeps original weight ( #5414 )
2 years ago
luqiang-guo
8ddc85f4dd
fix doc dual issue ( #5342 )
2 years ago
hugo-syn
f35eb4b3b8
chore: Fix multiple typos ( #5301 )
Signed-off-by: hugo-syn <hugo.vincent@synacktiv.com>
2 years ago
nihui
5329d32e74
check vulkan fp16 uniform support and implement lfp conversion without fp16u ( #5287 )
2 years ago
nihui
c222208cc9
feat mask for disable threading, make some extractor setter no-op, update doc ( #5270 )
2 years ago
張小凡
2ecaf37a3e
Fix finding the GPU driver DLL path on Windows ( #5141 )
2 years ago
nihui
b4f26237cb
in-house vulkan loader ( #5130 )
* vulkan-driver-loader.md
* static vulkan on apple
2 years ago
nihui
4494aadd74
deconvolution dynamic weight ( #5119 )
2 years ago
nihui
14e14a9ae8
slice with indices ( #5103 )
2 years ago
Yoh
3f437d3f3d
Grid sample op ( #4373 )
* pnnx support grid_sample op
* complete the permute and gridsample operator fusion
* split calculation into two stages and support permute fusion
2 years ago
Amir Ramezani
7e5fa3ade3
shrink operator ( #5022 )
2 years ago
Beq Jal
bcfec1da33
Celu layer and export to ncnn ( #5019 )
2 years ago
Beq Jal
c851231832
add diag layer and its converter ( #4935 )
2 years ago
nihui
4abadd2ffb
binaryop implicit broadcast B with 1 dimension rank for outer axis ( #4930 )
2 years ago
張小凡
1e0d70af8c
Add translated document: glsl-extension.zh.md ( #4818 )
2 years ago
nihui
43aba6badb
Update glsl-extension.md
2 years ago
nihui
172b748c74
add ncnn glsl extension doc ( #4817 )
2 years ago
nihui
9022b7162a
implement all explicit binaryop broadcast types ( #4809 )
* simplify binaryop
* less gpu test
* update binaryop broadcast doc
* do not test atan2 zero
2 years ago
nihui
c28c8c04a1
multiheadattention attn mask ( #4668 )
3 years ago
nihui
b640574b88
rough vulkan gemm and multiheadattention ( #4618 )
3 years ago
nihui
afc9310c62
update new operators for modelwriter ( #4540 )
3 years ago
nihui
fc6ce4a641
copyto operator ( #4522 )
3 years ago
nihui
242e775d21
pnnx convert torch log10, pow 2 as square ( #4518 )
3 years ago
nihui
246e71c526
implement atan2 ( #4516 )
3 years ago
Fangjun Kuang
92e75105c9
Support torch.cumsum ( #4505 )
3 years ago
nihui
ab4cfbf5b0
enrich ncnn binary broadcast rules ( #4513 )
3 years ago
nihui
fed99fd35b
gemm output transpose, prepack c ( #4479 )
* mha is now permute and reshape free
* gemm user defined tile mnk param
3 years ago
WuJinxuan
10e9d91576
Add x86 MultiHeadAttention ( #4443 )
* fix doc, sync x86 gemm fix
Co-authored-by: EdVince <EdVince@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
3 years ago
nihui
fd1ac3c7a0
x86 optimization for gemm unified elempack ( #4387 )
3 years ago
nihui
eceac35a7f
implement MultiheadAttention kdim vdim ( #4347 )
3 years ago
Lry89757
6a47f8d15c
gridsample op support ( #4288 )
Co-authored-by: LRY89757 <LRY89757@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
Co-authored-by: nihui <shuizhuyuanluo@126.com>
3 years ago
Fangjun Kuang
5281d51535
implement GLU and pnnx conversion ( #4283 )
3 years ago
nihui
77eda4c19f
implement lstm proj_size ( #4263 )
3 years ago