Rohit Goswami 2fe1f31161 MAINT: Start working on kernels and driver L2 1 year ago
KERNEL Add kernel definitions for CSUM and ZSUM 1 year ago
KERNEL.ATOM Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.BARCELONA Bugfix for ztrmv 10 years ago
KERNEL.BOBCAT Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel. 11 years ago
KERNEL.BULLDOZER Add trivially optimized dsdot based on sdot 8 years ago
KERNEL.COOPERLAKE Make AVX512 BFLOAT16 kernels conditional on compiler capability 2 years ago
KERNEL.CORE2 Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.DUNNINGTON Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.EXCAVATOR Add trivially optimized dsdot based on sdot 8 years ago
KERNEL.HASWELL Add sscal.c + microkernels for Haswell, Zen, Skylake and newer. 3 years ago
KERNEL.NANO Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.NEHALEM Add trivially optimized dsdot based on sdot 8 years ago
KERNEL.OPTERON Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.OPTERON_SSE3 Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel. 11 years ago
KERNEL.PENRYN Remove all trailing whitespace except lapack-netlib 11 years ago
KERNEL.PILEDRIVER Add trivially optimized dsdot based on sdot 8 years ago
KERNEL.PRESCOTT fallback to zgemm_kernel_4x2_sse.S 11 years ago
KERNEL.SANDYBRIDGE Add trivially optimized dsdot based on sdot 8 years ago
KERNEL.SAPPHIRERAPIDS Make AVX512 BFLOAT16 kernels conditional on compiler capability 2 years ago
KERNEL.SKYLAKEX Add kernel definitions for CSUM and ZSUM 1 year ago
KERNEL.STEAMROLLER Add trivially optimized dsdot based on sdot 8 years ago
KERNEL.ZEN Add sscal.c + microkernels for Haswell, Zen, Skylake and newer. 3 years ago
KERNEL.generic Add ?sum definitions for generic kernel 6 years ago
Makefile Import GotoBLAS2 1.13 BSD version codes. 15 years ago
amax.S use emms instead, add WIN guards 5 years ago
amax_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
amax_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
amax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
asum.S use emms instead, add WIN guards 5 years ago
asum_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
asum_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
asum_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
axpy_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
bf16_common_macros.h x86_64: BFLOAT16: fix build warning 4 years ago
bf16to.c Add bfloat16 based dot and conversion with single/double 5 years ago
builtin_stinit.S Remove all trailing whitespace except lapack-netlib 11 years ago
cabs.S Remove all trailing whitespace except lapack-netlib 11 years ago
casum.c Fix casum fallback kernel. 2 years ago
casum_microk_skylakex-2.c Use _mm_set1_epi{32,64x} to init mask in x86-64 [cz]asum 2 years ago
caxpy.c initial support for Sapphire Rapids platform 4 years ago
caxpy_microk_bulldozer-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
caxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
caxpy_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
caxpy_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
cdot.c initial support for Sapphire Rapids platform 4 years ago
cdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
cdot_microk_haswell-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
cdot_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
cdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
cgemm3m_kernel_8x4_haswell.c Update cgemm3m_kernel_8x4_haswell.c 6 years ago
cgemm_kernel_4x2_bulldozer.S change line endings from CRLF to LF 3 years ago
cgemm_kernel_4x2_piledriver.S change line endings from CRLF to LF 3 years ago
cgemm_kernel_4x8_sandy.S Update organization info. 11 years ago
cgemm_kernel_8x2_haswell.S modification for clang compiler 11 years ago
cgemm_kernel_8x2_haswell.c Update cgemm_kernel_8x2_haswell.c 6 years ago
cgemm_kernel_8x2_sandy.S change line endings from CRLF to LF 3 years ago
cgemm_kernel_8x2_skylakex.c AVX512 CGEMM & ZGEMM kernels 6 years ago
cgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
cgemv_n_4.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
cgemv_n_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 8 years ago
cgemv_n_microk_haswell-4.c Tag %1 and %2 as both input and output 8 years ago
cgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
cgemv_t_4.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
cgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 8 years ago
cgemv_t_microk_haswell-4.c Tag %1 and %2 as both input and output 8 years ago
copy.S Remove all trailing whitespace except lapack-netlib 11 years ago
copy_sse.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
copy_sse2.S Convert aligned moves to unaligned 5 years ago
cscal.c handle corner cases involving NAN and/or INF 1 year ago
cscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
cscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
cscal_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
cscal_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
csum.c Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) 1 year ago
csum_microk_skylakex-2.c Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) 1 year ago
ctrsm_kernel_LN_bulldozer.c added optimized trsm_kernels 10 years ago
ctrsm_kernel_LT_bulldozer.c added optimized trsm_kernels 10 years ago
ctrsm_kernel_RN_bulldozer.c added optimized trsm_kernels 10 years ago
ctrsm_kernel_RT_bulldozer.c added optimized trsm_kernels 10 years ago
dasum.c Use SkylakeX ?ASUM microkernel for Cooperlake/Sapphirerapids as well 2 years ago
dasum_microk_haswell-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dasum_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
daxpy.c initial support for Sapphire Rapids platform 4 years ago
daxpy_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
daxpy_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
daxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
daxpy_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
daxpy_microk_piledriver-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
daxpy_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
daxpy_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
daxpy_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
dcopy_bulldozer.S added dcopy_bulldozer.S 12 years ago
ddot.c fix improper function prototypes (empty parentheses) 2 years ago
ddot_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
ddot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
ddot_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
ddot_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
ddot_microk_piledriver-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
ddot_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
ddot_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
ddot_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dgemm_beta_skylakex.c Fix thinko in skylake beta handling 7 years ago
dgemm_kernel_4x4_haswell.S change line endings from CRLF to LF 3 years ago
dgemm_kernel_4x8_haswell.S change line endings from CRLF to LF 3 years ago
dgemm_kernel_4x8_sandy.S Change file comments to work around clang 3.9 assembler bug 9 years ago
dgemm_kernel_4x8_skylakex.c Use p2align instead of align for OSX compatibility 7 years ago
dgemm_kernel_4x8_skylakex_2.c change line endings from CRLF to LF 3 years ago
dgemm_kernel_6x4_piledriver.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_kernel_8x2_bulldozer.S change line endings from CRLF to LF 3 years ago
dgemm_kernel_8x2_piledriver.S change line endings from CRLF to LF 3 years ago
dgemm_kernel_8x8_skylakex.c Update dgemm_kernel_8x8_skylakex.c 6 years ago
dgemm_kernel_16x2_haswell.S change line endings from CRLF to LF 3 years ago
dgemm_kernel_16x2_skylakex.S Use AVX512 also for DGEMM 7 years ago
dgemm_kernel_16x2_skylakex.c GEMM: skylake: improve the performance when m is small 4 years ago
dgemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_4.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_8.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_8_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_ncopy_8_skylakex.c Fix warnings 3 years ago
dgemm_small_kernel_nn_skylakex.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dgemm_small_kernel_nt_skylakex.c Small Matrix: use proper inline asm input constraint for AVX512 mask 3 years ago
dgemm_small_kernel_permit_skylakex.c Small Matrix: skylakex: add DGEMM_SMALL_M_PERMIT and tune for TN kernel 4 years ago
dgemm_small_kernel_tn_skylakex.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dgemm_small_kernel_tt_skylakex.c Small Matrix: skylakex: fix build error in old compiler 4 years ago
dgemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_4.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_8.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_8_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemm_tcopy_8_skylakex.c Add optimized *copy versions for skylakex 7 years ago
dgemm_tcopy_16_skylakex.c Fix build with -Werror=return-type 5 years ago
dgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_n_4.c initial support for Sapphire Rapids platform 4 years ago
dgemv_n_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_n_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_n_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dgemv_n_microk_nehalem-4.c Replace .align with .p2align in the Nehalem microkernels 8 years ago
dgemv_n_microk_piledriver-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dgemv_n_microk_skylakex-4.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_t_4.c initial support for Sapphire Rapids platform 4 years ago
dgemv_t_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_t_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dgemv_t_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
dger.c optimized dger kernel for sandybridge 10 years ago
dger_microk_sandy-2.c Fix declaration of input arguments in the Sandybridge GER microkernels (#1967) 7 years ago
dot.S use emms instead, add WIN guards 5 years ago
dot_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
dot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
dot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
drot.c fix improper function prototypes (empty parentheses) 2 years ago
drot_microk_haswell-2.c replace spurious avx512 requirement with fma check 4 years ago
drot_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dscal.c x86: Fixed numpy CI failure when the target is ZEN. 1 year ago
dscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
dscal_microk_haswell-2.c dscal: use ymm registers in Haswell microkernel 3 years ago
dscal_microk_sandy-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
dscal_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dsymv_L.c initial support for Sapphire Rapids platform 4 years ago
dsymv_L_microk_bulldozer-2.c Fix declaration of arguments in inline assembly 7 years ago
dsymv_L_microk_haswell-2.c Fix declaration of arguments in inline assembly 7 years ago
dsymv_L_microk_nehalem-2.c Fix declaration of arguments in inline assembly 7 years ago
dsymv_L_microk_sandy-2.c Fix declaration of arguments in inline assembly 7 years ago
dsymv_L_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dsymv_U.c initial support for Sapphire Rapids platform 4 years ago
dsymv_U_microk_bulldozer-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
dsymv_U_microk_haswell-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
dsymv_U_microk_nehalem-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
dsymv_U_microk_sandy-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
dtobf16_microk_cooperlake.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
dtrmm_kernel_4x8_haswell.c Replace vpermpd with vpermilpd in the Haswell DTRMM kernel 6 years ago
dtrsm_kernel_LN_bulldozer.c Remove unused variables from Haswell dtrmm and Bulldozer dtrsm 8 years ago
dtrsm_kernel_LT_8x2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dtrsm_kernel_RN_8x2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
dtrsm_kernel_RN_haswell.c Replace most vpermpd calls in the Haswell DTRSM_RN kernel 6 years ago
dtrsm_kernel_RT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 7 years ago
gemm_beta.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_4x8_nano.S Fix crash in sgemm SSE/nano kernel on x86_64 7 years ago
gemm_kernel_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_kernel_8x4_sse.S Fix crash in sgemm SSE/nano kernel on x86_64 7 years ago
gemm_kernel_8x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_ncopy_2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_ncopy_4.S Add forgotten conditional uses of PREFETCH 1 year ago
gemm_ncopy_4_opteron.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_tcopy_2_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
gemm_tcopy_4.S Add forgotten conditional uses of PREFETCH 1 year ago
gemm_tcopy_4_opteron.S Remove all trailing whitespace except lapack-netlib 11 years ago
iamax.S use emms instead, add WIN guards 5 years ago
iamax_sse.S Silence a redefinition warning 5 years ago
iamax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
izamax.S use emms instead, add WIN guards 5 years ago
izamax_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
izamax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
lsame.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
mcount.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
meson.build MAINT: Start working on kernels and driver L2 1 year ago
nrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
nrm2_sse.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
omatcopy_rt.c Fix warnings 3 years ago
qconjg.S use emms instead, add WIN guards 5 years ago
qdot.S use emms instead, add WIN guards 5 years ago
qgemm_kernel_2x2.S use emms instead, add WIN guards 5 years ago
qgemv_n.S use emms instead, add WIN guards 5 years ago
qgemv_t.S use emms instead, add WIN guards 5 years ago
qtrsm_kernel_LN_2x2.S use emms instead, add WIN guards 5 years ago
qtrsm_kernel_LT_2x2.S use emms instead, add WIN guards 5 years ago
qtrsm_kernel_RT_2x2.S use emms instead, add WIN guards 5 years ago
rot.S Remove all trailing whitespace except lapack-netlib 11 years ago
rot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
rot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
sasum.c Use SkylakeX ?ASUM microkernel for Cooperlake/Sapphirerapids as well 2 years ago
sasum_microk_haswell-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sasum_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
saxpy.c initial support for Sapphire Rapids platform 4 years ago
saxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
saxpy_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
saxpy_microk_piledriver-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
saxpy_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
saxpy_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sbdot.c initial support for Sapphire Rapids platform 4 years ago
sbdot_microk_cooperlake.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sbgemm_block_microk_cooperlake.c x86_64: BFLOAT16: fix build warning 4 years ago
sbgemm_kernel_16x4_cooperlake.c Prevent compiler attempts to use k0 as mask register 4 years ago
sbgemm_kernel_16x16_spr.c sbgemm: spr: kernel handle alpha != 1.0 4 years ago
sbgemm_kernel_16x16_spr_tmpl.c Fix spr sbgemm error 2 years ago
sbgemm_microk_cooperlake_template.c really fix definition of SHUFFLE_MAGIC_NO 4 years ago
sbgemm_ncopy_4_cooperlake.c sbgemm: cooperlake: kernel works for NN 4 years ago
sbgemm_ncopy_16_cooperlake.c Fix non-portable u_int64_t 4 years ago
sbgemm_oncopy_16_spr.c sbgemm: spr: oncopy: use tile load/store instead 4 years ago
sbgemm_otcopy_16_spr.c sbgemm: spr: implement otcopy_16 4 years ago
sbgemm_small_kernel_nn_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 4 years ago
sbgemm_small_kernel_nt_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 4 years ago
sbgemm_small_kernel_permit_cooperlake.c sbgemm: cooperlake: tuning for small matrix 4 years ago
sbgemm_small_kernel_permit_spr.c sbgemm: spr: disable small matrix path by default 4 years ago
sbgemm_small_kernel_template_cooperlake.c sbgemm: cooperlake: make sure hot buffer aligned to 64 4 years ago
sbgemm_small_kernel_tn_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 4 years ago
sbgemm_small_kernel_tt_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 4 years ago
sbgemm_tcopy_4_cooperlake.c sbgemm: cooperlake: add n24 kernel for tcopy_4 4 years ago
sbgemm_tcopy_16_cooperlake.c sbgemm: cooperlake: implement tcopy_4 4 years ago
sbgemv_n.c initial support for Sapphire Rapids platform 4 years ago
sbgemv_n_microk_cooperlake.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sbgemv_n_microk_cooperlake_template.c x86_64: BFLOAT16: fix build warning 4 years ago
sbgemv_t.c initial support for Sapphire Rapids platform 4 years ago
sbgemv_t_microk_cooperlake.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sbgemv_t_microk_cooperlake_template.c x86_64: BFLOAT16: fix build warning 4 years ago
scal.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
scal_atom.S fix NAN handling 1 year ago
scal_sse.S make NAN handling depend on dummy2 parameter 1 year ago
scal_sse2.S make NAN handling depend on dummy2 parameter 1 year ago
sdot.c initial support for Sapphire Rapids platform 4 years ago
sdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
sdot_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sdot_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
sdot_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sdot_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
sgemm_beta_skylakex.c sbgemm: cooperlake: add dummy source files 4 years ago
sgemm_direct_performant.c [WIP] Refactor the driver code for direct SGEMM (#2782) 5 years ago
sgemm_direct_skylakex.c initial support for Sapphire Rapids platform 4 years ago
sgemm_kernel_8x4_bulldozer.S Remove all trailing whitespace except lapack-netlib 11 years ago
sgemm_kernel_8x4_haswell.c Update sgemm_kernel_8x4_haswell.c 6 years ago
sgemm_kernel_8x4_haswell_2.c Strip UTF8 byte order marker from source 5 years ago
sgemm_kernel_8x8_sandy.S Update organization info. 11 years ago
sgemm_kernel_16x2_bulldozer.S change line endings from CRLF to LF 3 years ago
sgemm_kernel_16x2_piledriver.S change line endings from CRLF to LF 3 years ago
sgemm_kernel_16x4_haswell.S change line endings from CRLF to LF 3 years ago
sgemm_kernel_16x4_sandy.S change line endings from CRLF to LF 3 years ago
sgemm_kernel_16x4_skylakex.S Use AVX512 also for DGEMM 7 years ago
sgemm_kernel_16x4_skylakex.c make skylakex sgemm code more friendly for readers 6 years ago
sgemm_kernel_16x4_skylakex_2.c AVX512 STRMM kernel 6 years ago
sgemm_kernel_16x4_skylakex_3.c Use "old" compute(24) function with clang due to register limitations 4 years ago
sgemm_ncopy_4_skylakex.c Use sgemm_ncopy_4_skylakex.c also for Haswell 7 years ago
sgemm_small_kernel_nn_skylakex.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sgemm_small_kernel_nt_skylakex.c Small Matrix: use proper inline asm input constraint for AVX512 mask 3 years ago
sgemm_small_kernel_permit_skylakex.c Small Matrix: skylakex: add sgemm tt kernel 4 years ago
sgemm_small_kernel_tn_skylakex.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sgemm_small_kernel_tt_skylakex.c Small Matrix: skylakex: fix build error in old compiler 4 years ago
sgemm_tcopy_16_skylakex.c Add a C+intrinsics version of the SGEMM/skylakex kernel 7 years ago
sgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
sgemv_n.c removed obsolete gemv kernel files 11 years ago
sgemv_n_4.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
sgemv_n_microk_bulldozer-4.c Fix inline assembly constraints 7 years ago
sgemv_n_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sgemv_n_microk_nehalem-4.c Fix inline assembly constraints 7 years ago
sgemv_n_microk_sandy-4.c Fix inline assembly constraints 7 years ago
sgemv_n_microk_skylakex-8.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
sgemv_t.c removed obsolete gemv kernel files 11 years ago
sgemv_t_4.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
sgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 8 years ago
sgemv_t_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
sgemv_t_microk_nehalem-4.c Replace .align with .p2align in the Nehalem microkernels 8 years ago
sgemv_t_microk_sandy-4.c Use .p2align instead of .align for compatibility on Sandybridge as well 8 years ago
sgemv_t_microk_skylakex.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sgemv_t_microk_skylakex_template.c sgemv: skylakex: fix build warning 4 years ago
sger.c added optimized sger kernel for sandybridge 10 years ago
sger_microk_sandy-2.c Fix declaration of input arguments in the Sandybridge GER microkernels (#1967) 7 years ago
srot.c fix improper function prototypes (empty parentheses) 2 years ago
srot_microk_haswell-2.c Remove spurious AVX512 requirement and add AVX2/FMA3 guard 4 years ago
srot_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
sscal.c x86: Fixed numpy CI failure when the target is ZEN. 1 year ago
sscal_microk_haswell-2.c Fix typo in clobber list, should be xmm14 instead of ymm14. 3 years ago
sscal_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
ssymv_L.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
ssymv_L_microk_bulldozer-2.c Fix declaration of arguments in inline assembly 7 years ago
ssymv_L_microk_haswell-2.c Fix declaration of arguments in inline assembly 7 years ago
ssymv_L_microk_nehalem-2.c Fix declaration of arguments in inline assembly 7 years ago
ssymv_L_microk_sandy-2.c Fix declaration of arguments in inline assembly 7 years ago
ssymv_U.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
ssymv_U_microk_bulldozer-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
ssymv_U_microk_haswell-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
ssymv_U_microk_nehalem-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
ssymv_U_microk_sandy-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 7 years ago
staticbuffer.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
stobf16_microk_cooperlake.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
strsm_kernel_8x4_haswell_LN.c Strip UTF8 byte order marker from source 5 years ago
strsm_kernel_8x4_haswell_LT.c AVX2 STRSM kernel 5 years ago
strsm_kernel_8x4_haswell_L_common.h Strip UTF8 byte order marker from source 5 years ago
strsm_kernel_8x4_haswell_RN.c change line endings from CRLF to LF 3 years ago
strsm_kernel_8x4_haswell_RT.c change line endings from CRLF to LF 3 years ago
strsm_kernel_8x4_haswell_R_common.h change line endings from CRLF to LF 3 years ago
strsm_kernel_LN_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 7 years ago
strsm_kernel_LT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 7 years ago
strsm_kernel_RN_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 7 years ago
strsm_kernel_RT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 7 years ago
sum.S use emms instead, add WIN guards 5 years ago
swap.S Remove all trailing whitespace except lapack-netlib 11 years ago
swap_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
swap_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
symv_L_sse.S initial support for Sapphire Rapids platform 4 years ago
symv_L_sse2.S initial support for Sapphire Rapids platform 4 years ago
symv_U_sse.S initial support for Sapphire Rapids platform 4 years ago
symv_U_sse2.S initial support for Sapphire Rapids platform 4 years ago
tobf16.c Avoid exceeding the configured thread count in x86_64 TOBF16 (#4748) 1 year ago
trsm_kernel_LN_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LN_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_LT_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
trsm_kernel_RT_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
xdot.S use emms instead, add WIN guards 5 years ago
xgemm3m_kernel_2x2.S use emms instead, add WIN guards 5 years ago
xgemm_kernel_1x1.S use emms instead, add WIN guards 5 years ago
xgemv_n.S use emms instead, add WIN guards 5 years ago
xgemv_t.S use emms instead, add WIN guards 5 years ago
xtrsm_kernel_LT_1x1.S use emms instead, add WIN guards 5 years ago
zamax.S use emms instead, add WIN guards 5 years ago
zamax_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zamax_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zamax_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zasum.S use emms instead, add WIN guards 5 years ago
zasum.c Use SkylakeX ?ASUM microkernel for Cooperlake/Sapphirerapids as well 2 years ago
zasum_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zasum_microk_skylakex-2.c Use _mm_set1_epi{32,64x} to init mask in x86-64 [cz]asum 2 years ago
zasum_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zasum_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy.c initial support for Sapphire Rapids platform 4 years ago
zaxpy_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy_microk_bulldozer-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 5 years ago
zaxpy_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zaxpy_sse2.S use shortcut only when both incx and incy are zero 2 years ago
zcopy.S Remove all trailing whitespace except lapack-netlib 11 years ago
zcopy_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zcopy_sse2.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
zdot.S use emms instead, add WIN guards 5 years ago
zdot.c fix improper function prototypes (empty parentheses) 2 years ago
zdot_atom.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
zdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
zdot_microk_haswell-2.c Replace vpermpd with vpermilpd 6 years ago
zdot_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
zdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 7 years ago
zdot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zdot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x2_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_haswell.c Update zgemm3m_kernel_4x4_haswell.c 6 years ago
zgemm3m_kernel_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm3m_kernel_8x4_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_beta.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x1_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_bulldozer.S change line endings from CRLF to LF 3 years ago
zgemm_kernel_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_piledriver.S change line endings from CRLF to LF 3 years ago
zgemm_kernel_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_barcelona.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_haswell.S change line endings from CRLF to LF 3 years ago
zgemm_kernel_4x2_haswell.c Update zgemm_kernel_4x2_haswell.c 6 years ago
zgemm_kernel_4x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x2_skylakex.c AVX512 CGEMM & ZGEMM kernels 6 years ago
zgemm_kernel_4x2_sse.S Add forgotten conditional uses of PREFETCH 1 year ago
zgemm_kernel_4x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_kernel_4x4_sandy.S Update organization info. 11 years ago
zgemm_ncopy_1.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_tcopy_1.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n_4.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
zgemv_n_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n_dup.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_n_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 8 years ago
zgemv_n_microk_haswell-4.c Tag %1 and %2 as both input and output 8 years ago
zgemv_n_microk_sandy-4.c Use .p2align instead of .align for compatibility on Sandybridge as well 8 years ago
zgemv_t.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_t_4.c Disable gcc's tree-vectorizer pass on all operating systems 2 years ago
zgemv_t_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_t_dup.S Remove all trailing whitespace except lapack-netlib 11 years ago
zgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 8 years ago
zgemv_t_microk_haswell-4.c Tag %1 and %2 as both input and output 8 years ago
znrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
znrm2_sse.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2 years ago
zrot.S Remove all trailing whitespace except lapack-netlib 11 years ago
zrot_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zrot_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
zscal.S use emms instead, add WIN guards 5 years ago
zscal.c Update zscal.c 1 year ago
zscal_atom.S fix NAN handling 1 year ago
zscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
zscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
zscal_microk_skylakex-2.c Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 2 years ago
zscal_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 7 years ago
zscal_sse.S fix alpha=NAN case 1 year ago
zscal_sse2.S Fix handling of NAN and INF arguments 2 years ago
zsum.S use emms instead, add WIN guards 5 years ago
zsum.c Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) 1 year ago
zsum_microk_skylakex-2.c Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) 1 year ago
zsum_sse.S Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) 1 year ago
zsum_sse2.S Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) 1 year ago
zswap.S Remove all trailing whitespace except lapack-netlib 11 years ago
zswap_sse.S Remove all trailing whitespace except lapack-netlib 11 years ago
zswap_sse2.S Import GotoBLAS2 1.13 BSD version codes. 15 years ago
zsymv_L_sse.S initial support for Sapphire Rapids platform 4 years ago
zsymv_L_sse2.S Add forgotten conditional uses of PREFETCH 1 year ago
zsymv_U_sse.S initial support for Sapphire Rapids platform 4 years ago
zsymv_U_sse2.S Add forgotten conditional uses of PREFETCH 1 year ago
ztrsm_kernel_LN_2x1_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LN_4x2_sse.S Add forgotten conditional uses of PREFETCH 1 year ago
ztrsm_kernel_LN_bulldozer.c added optimized trsm_kernels 10 years ago
ztrsm_kernel_LT_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x1_atom.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_LT_4x2_sse.S Add forgotten conditional uses of PREFETCH 1 year ago
ztrsm_kernel_LT_bulldozer.c added optimized trsm_kernels 10 years ago
ztrsm_kernel_RN_bulldozer.c added optimized trsm_kernels 10 years ago
ztrsm_kernel_RT_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_core2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 11 years ago
ztrsm_kernel_RT_4x2_sse.S Add forgotten conditional uses of PREFETCH 1 year ago
ztrsm_kernel_RT_bulldozer.c added optimized trsm_kernels 10 years ago