Merge pull request #421 from wernsaar/develop

optimized sgemm- and cgemm-kernel for haswell
tags/v0.2.11^2
Zhang Xianyi 11 years ago
parent
commit
21f7768b26
3 changed files with 8683 additions and 2391 deletions
  1. kernel/x86_64/cgemm_kernel_8x2_haswell.S (+4927, -2285)
  2. kernel/x86_64/sgemm_kernel_16x4_haswell.S (+3754, -105)
  3. param.h (+2, -1)

kernel/x86_64/cgemm_kernel_8x2_haswell.S (+4927, -2285)
File diff suppressed because it is too large

kernel/x86_64/sgemm_kernel_16x4_haswell.S (+3754, -105)
File diff suppressed because it is too large

param.h (+2, -1)

@@ -1237,10 +1237,11 @@ USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 #define CGEMM_DEFAULT_P 384
 #define ZGEMM_DEFAULT_P 256
 
-#define SGEMM_DEFAULT_Q 384
 #ifdef WINDOWS_ABI
+#define SGEMM_DEFAULT_Q 320
 #define DGEMM_DEFAULT_Q 128
 #else
+#define SGEMM_DEFAULT_Q 384
 #define DGEMM_DEFAULT_Q 256
 #endif
 #define CGEMM_DEFAULT_Q 192
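
The param.h hunk moves SGEMM_DEFAULT_Q inside the existing WINDOWS_ABI conditional, so the single-precision default becomes 320 under the Windows ABI and stays 384 elsewhere. A minimal sketch of how the conditional resolves, assuming a stand-alone copy of only these four defines; the accessor functions sgemm_default_q and dgemm_default_q are hypothetical names used for illustration and are not part of the project:

```c
/* Stand-alone copy of the conditional from the param.h hunk.
 * Assumption: WINDOWS_ABI is supplied by the build (e.g. -DWINDOWS_ABI);
 * nothing here defines it. */
#ifdef WINDOWS_ABI
#define SGEMM_DEFAULT_Q 320
#define DGEMM_DEFAULT_Q 128
#else
#define SGEMM_DEFAULT_Q 384
#define DGEMM_DEFAULT_Q 256
#endif

/* Hypothetical accessors, for illustration only: they expose whichever
 * value the preprocessor selected at compile time. */
int sgemm_default_q(void) { return SGEMM_DEFAULT_Q; }
int dgemm_default_q(void) { return DGEMM_DEFAULT_Q; }
```

Compiling this fragment with -DWINDOWS_ABI should select the 320/128 pair; without the flag it selects 384/256. Keeping both precisions in the same conditional means a single ABI switch controls both defaults.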

