| @@ -1,4 +1,77 @@ | |||||
| OpenBLAS ChangeLog | OpenBLAS ChangeLog | ||||
| ==================================================================== | |||||
| Version 0.3.10 | |||||
| 14-Jun-2020 | |||||
| common: | |||||
| * Improved thread locking behaviour in blas_server and parallel getrf | |||||
| * Imported bugfix 394 from LAPACK (spurious reference to "XERBL" | |||||
| due to overlong lines) | |||||
| * Imported bugfix 403 from LAPACK (compile option "recursive" required | |||||
| for correctness with Intel and PGI) | |||||
| * Imported bugfix 408 from LAPACK (wrong scaling in ZHEEQUB) | |||||
| * Imported bugfix 411 from LAPACK (infinite loop in LARGV/LARTG/LARTGP) | |||||
| * Fixed mismatches between BUFFERSIZE and GEMM_UNROLL parameters that | |||||
| could lead to crashes at large matrix sizes | |||||
| * Restored internal soname in dynamic libraries on FreeBSD and Dragonfly | |||||
| * Added API (openblas_setaffinity) to set the thread affinity on Linux | |||||
| * Added initial infrastructure for half-precision floating point | |||||
| (bfloat16) support with a generic implementation of SHGEMM | |||||
| * Added CMAKE build system support for building the cblas_Xgemm3m | |||||
| functions | |||||
| * Fixed CMAKE support for building in a path with embedded spaces | |||||
| * Fixed CMAKE (non)handling of NO_EXPRECISION and MAX_STACK_ALLOC | |||||
| * Fixed GCC version detection in the Makefiles | |||||
| * Allowed overriding the names of AR, AS and LD in Makefile builds | |||||
| POWER: | |||||
| * Fixed big-endian POWER8 ELFv2 builds on FreeBSD | |||||
| * Fixed GCC version checks and DYNAMIC_ARCH builds on POWER9 | |||||
| * Fixed CMAKE build support for POWER9 | |||||
| * Fixed a potential race condition in the thread buffer allocation | |||||
| * Worked around LAPACK test failures on PPC G4 | |||||
| MIPS: | |||||
| * Fixed a potential race condition in the thread buffer allocation | |||||
| * Added support for MIPS 24K/24KE family based on P5600 kernels | |||||
| MIPS64: | |||||
| * Fixed a potential race condition in the thread buffer allocation | |||||
| * Added TARGET=GENERIC | |||||
| ARMV7: | |||||
| * Fixed a race condition in the thread buffer allocation | |||||
| ARMV8: | |||||
| * Fixed a race condition in the thread buffer allocation | |||||
| * Fixed zero initialisation in the assembly for SGEMM and DGEMM BETA | |||||
| * Improved performance of the ThunderX2 DAXPY kernel | |||||
| * Added an optimized SGEMM kernel for Cortex A53 | |||||
| * Fixed Makefile support for INTERFACE64 (8-byte integer) | |||||
| x86_64: | |||||
| * Fixed a syntax error in the CMAKE setup for SkylakeX | |||||
| * Improved performance of STRSM on Haswell, SkylakeX and Ryzen | |||||
| * Improved SGEMM performance for workloads with ldc a | |||||
| multiple of 1024 | |||||
| * Improved DGEMM performance on SkylakeX | |||||
| * Fixed unwanted AVX512-dependency of SGEMM in DYNAMIC_ARCH | |||||
| builds created on SkylakeX | |||||
| * Removed data alignment requirement in the SSE2 copy kernels | |||||
| that could cause spurious crashes | |||||
| * Added a workaround for an optimizer bug in AppleClang 11.0.3 | |||||
| * Fixed LAPACK test failures due to wrong options for Intel Fortran | |||||
| * Fixed compilation and LAPACK test results with recent Flang | |||||
| and AMD AOCC | |||||
| * Fixed DYNAMIC_ARCH builds with CMAKE on OS X | |||||
| * Fixed missing exports of cblas_i?amin, cblas_i?min, cblas_i?max, | |||||
| cblas_?sum, cblas_?gemm3m in the shared library on OS X | |||||
| * Fixed reporting of cpu name in DYNAMIC_ARCH builds (would sometimes | |||||
| show the name of an older generation chip supported by the same kernels) | |||||
| IBM Z: | |||||
| * Improved performance of SGEMM/STRMM and DGEMM/DTRMM on Z14 | |||||
| ==================================================================== | ==================================================================== | ||||
| Version 0.3.9 | Version 0.3.9 | ||||
| 1-Mar-2020 | 1-Mar-2020 | ||||