Martin Kroeker
2e8ff4a781
Merge pull request #3266 from martin-frbg/powerparam
Remove spurious casts from PPC parameters and fix compilation for older targets
5 years ago
Martin Kroeker
dbba381dc3
Merge pull request #3260 from intelmy/sgemv_t_opt
Optimized sgemv_t for small N based on AVX512
5 years ago
Martin Kroeker
f61991d439
Merge pull request #3264 from RajalakshmiSR/sbgemmp10
POWER10: Fixes for sbgemm kernel
5 years ago
Martin Kroeker
efdbdd8f82
Add prefetch values for power3
5 years ago
Martin Kroeker
3906ef3b0f
Add prefetch values for power3
5 years ago
Martin Kroeker
8adf0971d8
Add prefetch values for power3
5 years ago
Martin Kroeker
08e2e60762
Add prefetch values for power3
5 years ago
Martin Kroeker
fb9e678235
Fix caxpy/zaxpy for big-endian
5 years ago
Martin Kroeker
dc4fcb48df
Fix inverted conditional for caxpy/zaxpy
5 years ago
Martin Kroeker
7a48247761
fix c/zrot and sgemv for POWER5
5 years ago
Martin Kroeker
7dfc45e840
Remove casts for PPC/POWER and complete parameters for POWER3/4
5 years ago
Arthur Williams
7fb6e576c2
Removed use of non-portable '-p' arg to install
Not all versions of install support the '-p' flag, and it isn't worth failing
the build if the installed files' timestamps get updated.
5 years ago
Rajalakshmi Srinivasaraghavan
cbb70438df
POWER10: Fixes for sbgemm kernel
While testing the bfloat16 sbgemm kernel, there were some failures
for odd-valued inputs due to array accesses beyond the boundary.
5 years ago
Ma, Yu
706a08d4a0
Optimized sgemv_t for small N based on AVX512
5 years ago
Zhang Xianyi
9f3d903817
Merge pull request #3259 from zhaofengli/riscv64-fixes
riscv64 fixes
5 years ago
Zhaofeng Li
590be3fae3
riscv64: Add Makefile
5 years ago
Zhaofeng Li
3521cd48cb
RISCV64_GENERIC: Use generic kernel for DSDOT for better precision
The implementation in `riscv64/dot.c` fails the `test_dsdot` test, and
the generic kernel seems to have better precision. Tested on SiFive
FU740 (HiFive Unmatched) and QEMU.
Also see #1469.
5 years ago
Zhaofeng Li
1e0192a5cc
riscv64/imin: Fix wrong comparison
Same as #1990.
5 years ago
Martin Kroeker
fe9aff17fe
Merge pull request #3258 from martin-frbg/hbaction
revert "try to work around gcc update problems" in Homebrew workflow
5 years ago
Martin Kroeker
8c25b440a0
revert "try to work around gcc update problems"
...as homebrew has dropped at least gcc8 now
5 years ago
Martin Kroeker
f84197c1a7
Add shortcuts for (small) cases that do not need expensive buffer allocation
5 years ago
Martin Kroeker
734bd265a8
revert symv changes for now
5 years ago
王滋涵 Zephyr Wang
a62cfc3ccf
Fix typo in common.h
5 years ago
Martin Kroeker
1217eb910d
Fix copy-paste errors in variables used
5 years ago
Martin Kroeker
d6d7a6685d
Add shortcuts for (small) cases that do not need expensive buffer allocation
5 years ago
Martin Kroeker
f0e7345fb8
Add shortcut for small-size gemv_n with increments of one
5 years ago
Martin Kroeker
42f048cf6c
Merge pull request #3249 from MikaelUrankar/develop
Fix typo
5 years ago
MikaelUrankar
4fbc0777f4
Fix typo
5 years ago
Martin Kroeker
d7472606d5
Merge pull request #3244 from martin-frbg/issue3237
Add fast path for small xSYR with INCX==1
5 years ago
Martin Kroeker
03297ff9f0
Add fast path for small xSYR with INCX==1
5 years ago
Martin Kroeker
2d8d0af0ea
Merge pull request #3243 from martin-frbg/lapack564
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)
5 years ago
Martin Kroeker
5f677e782e
Merge pull request #3196 from guowangy/skylakex-gemm-batch-k
GEMM: skylake: improve the performance when m is small
5 years ago
Martin Kroeker
04c60cee5d
Merge pull request #3242 from martin-frbg/issue3239
Handle inadvertent use of DYNAMIC_ARCH=0
5 years ago
Martin Kroeker
3a53207cc9
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)
5 years ago
Martin Kroeker
0e73d20629
Handle inadvertent use of DYNAMIC_ARCH=0
5 years ago
Martin Kroeker
02087a62e7
Merge pull request #3205 from intelmy/sgemv_n_opt
Optimize sgemv_n for small n
5 years ago
Martin Kroeker
03b4d79a7e
Merge pull request #3238 from martin-frbg/lapack555
Correct function name in error message from SLASQ2 (LAPACK PR555)
5 years ago
Martin Kroeker
5c729c6dce
Correct function name in error message from SLASQ2 (Reference-LAPACK PR 555)
5 years ago
Martin Kroeker
e1911b2e60
Merge pull request #3236 from martin-frbg/issue3234
Add -lm for FreeBSD on ARM/ARM64
5 years ago
Martin Kroeker
8f33da4f94
Merge pull request #3235 from dnoan/develop
Update Makefile.arm64
5 years ago
Martin Kroeker
26ccf643a3
Add -lm for FreeBSD on ARM/ARM64
5 years ago
Noan
32264ba496
Update Makefile.arm64
Added -march and -mtune flags for EMAG processors when GCC 9 or later
5 years ago
Martin Kroeker
4ecf631f95
Merge pull request #3228 from martin-frbg/issue3226
filter out -mavx flag on Sandybridge zgemm/ztrmm kernels
5 years ago
Martin Kroeker
5af510081d
Merge pull request #3233 from martin-frbg/issue3230
Add autodetection for Intel Ice Lake SP
5 years ago
Martin Kroeker
164551d5a2
Merge pull request #3232 from martin-frbg/lapack553
Reduce stack size requirements in the LAPACK LIN tests (LAPACK PR 553)
5 years ago
Martin Kroeker
310b76aad7
Merge pull request #3231 from martin-frbg/issue3227
Support compilation with pre-C99 versions of MSVC
5 years ago
Martin Kroeker
c4da892ba0
Only filter out -mavx on Sandybridge ZGEMM/ZTRMM kernels
5 years ago
Martin Kroeker
cbfd3c87e1
Recognize Intel Ice Lake SP as Cooper Lake
5 years ago
Martin Kroeker
26e87ac517
Support Intel Ice Lake SP as Cooper Lake
5 years ago
Martin Kroeker
15b9d6b4a7
Delete zchkaa.f
5 years ago