nihuini
84faed0b6d
implement flip layer and pnnx torch.flip conversion
10 months ago
nihuini
0bdd5a3f3a
revert doc
10 months ago
nihuini
5b13f451f5
merge
10 months ago
Copilot
4644540ea4
Add Windows XP support merging PRs #6176 and #6177 ( #6204 )
Co-authored-by: Sugar-Baby <87747602+Sugar-Baby@users.noreply.github.com>
Co-authored-by: AtomAlpaca <66774326+AtomAlpaca@users.noreply.github.com>
10 months ago
GIBEREZ
44982d0d23
Update the GLSL documentation after deprecation of the image functions ( #6173 )
11 months ago
chri321
4c72f52954
docs: update Chinese glsl-extension documentation ( #6162 )
- synchronize the latest English content to the Chinese documentation
- correct spelling errors in the English version of glsl-extension
- Fix spelling 'enable_validation_layer' in src/gpu.cpp
11 months ago
nihui
bd0b111775
vulkan tight fp16p pack1 ( #6127 )
11 months ago
nihui
80da741307
glsl define ncnn_glsl_version macro ( #6003 )
1 year ago
nihui
ef0b0e631c
interp output size expression ( #5994 )
1 year ago
nihui
39c055d7f2
crop axes starts ends expression ( #5976 )
* skip dynamic tensor index
* handle clone oom
1 year ago
nihui
84970eed4d
vulkan validation layer enables NCNN_LOGE in shader source ( #5963 )
* NCNN_LOGE in glsl
* Update glsl-extension.md
1 year ago
nihui
6396a732ef
reshape shape expression, drop reshape permute, test reshape oom ( #5918 )
1 year ago
nihui
1e3fcb9dda
paramdict value string type, natural array representation ( #5915 )
1 year ago
佰阅
f87ead3d88
init flip
1 year ago
nihui
0734b657d9
spectrogram and inverse spectrogram ( #5779 )
* only supports hann, hamming and all-one window
* inverse spectrogram does not support length parameter
* spectrogram always returns torch.view_as_real(out) as ncnn does not support complex typed mat yet
* inverse spectrogram always accepts torch.view_as_complex(in) as ncnn does not support complex typed mat yet
1 year ago
nihui
66b54cbea2
multiheadattention int8 quantization ( #5733 )
* x86 vulkan fallback
* comment about bf16s
1 year ago
nihui
1c7af00499
gemm int8 quantization ( #5706 )
* quantize gemm
* write gemm quantize scales
* update doc
* less openmp args
* x86 riscv fallback
* skip gemm vulkan int8
* fix noint8 test, fix arm bf16 test
* enable vfpv4 on neon build only
* fix gemm vulkan without C
* fp16 pack8 output
* enable elempack=8 only for asimdhp+
* tiled gemm int8 test
* opt arm64 tiles, fix asimdhp dispatch
1 year ago
nihui
5df5413c81
embed int8 quantization and add embed test ( #5667 )
1 year ago
nihui
fdf0df3079
RMSNorm ( #5630 )
1 year ago
nihui
4c3debae2d
multiheadattention scale param ( #5526 )
* update swiftshader
* skip vs2017 swiftshader
1 year ago
nihui
db035d602d
update ncnnoptimize layers, lightmode=false keeps original weight ( #5414 )
2 years ago
luqiang-guo
8ddc85f4dd
fix doc dual issue ( #5342 )
2 years ago
hugo-syn
f35eb4b3b8
chore: Fix multiple typos ( #5301 )
Signed-off-by: hugo-syn <hugo.vincent@synacktiv.com>
2 years ago
nihui
5329d32e74
check vulkan fp16 uniform support and implement lfp conversion without fp16u ( #5287 )
2 years ago
nihui
c222208cc9
feat mask for disable threading, make some extractor setter no-op, update doc ( #5270 )
2 years ago
張小凡
2ecaf37a3e
Fix find GPU driver dll path in windows ( #5141 )
2 years ago
nihui
b4f26237cb
in-house vulkan loader ( #5130 )
* vulkan-driver-loader.md
* static vulkan on apple
2 years ago
nihui
4494aadd74
deconvolution dynamic weight ( #5119 )
2 years ago
nihui
14e14a9ae8
slice with indices ( #5103 )
2 years ago
Yoh
3f437d3f3d
Grid sample op ( #4373 )
* pnnx support grid_sample op
* complete the permute and gridsample operator fusion
* split calculation into two stages and support permute fusion
2 years ago
Amir Ramezani
7e5fa3ade3
shrink operator ( #5022 )
2 years ago
Beq Jal
bcfec1da33
Celu layer and export to ncnn ( #5019 )
2 years ago
Beq Jal
c851231832
add diag layer and its converter ( #4935 )
2 years ago
nihui
4abadd2ffb
binaryop implicit broadcast B with 1 dimension rank for outer axis ( #4930 )
2 years ago
張小凡
1e0d70af8c
Add translated document: glsl-extension.zh.md ( #4818 )
2 years ago
nihui
43aba6badb
Update glsl-extension.md
3 years ago
nihui
172b748c74
add ncnn glsl extension doc ( #4817 )
3 years ago
nihui
9022b7162a
implement all explicit binaryop broadcast types ( #4809 )
* simplify binaryop
* less gpu test
* update binaryop broadcast doc
* do not test atan2 zero
3 years ago
nihui
c28c8c04a1
multiheadattention attn mask ( #4668 )
3 years ago
nihui
b640574b88
rough vulkan gemm and multiheadattention ( #4618 )
3 years ago
nihui
afc9310c62
update new operators for modelwriter ( #4540 )
3 years ago
nihui
fc6ce4a641
copyto operator ( #4522 )
3 years ago
nihui
242e775d21
pnnx convert torch log10, pow 2 as square ( #4518 )
3 years ago
nihui
246e71c526
implement atan2 ( #4516 )
3 years ago
Fangjun Kuang
92e75105c9
Support torch.cumsum ( #4505 )
3 years ago
nihui
ab4cfbf5b0
enrich ncnn binary broadcast rules ( #4513 )
3 years ago
nihui
fed99fd35b
gemm output transpose, prepack c ( #4479 )
* mha is now permute and reshape free
* gemm user defined tile mnk param
3 years ago
WuJinxuan
10e9d91576
Add x86 MultiHeadAttention ( #4443 )
* fix doc, sync x86 gemm fix
Co-authored-by: EdVince <EdVince@users.noreply.github.com>
Co-authored-by: nihuini <nihuini@tencent.com>
3 years ago
nihui
fd1ac3c7a0
x86 optimization for gemm unified elempack ( #4387 )
3 years ago
nihui
eceac35a7f
implement MultiheadAttention kdim vdim ( #4347 )
3 years ago