DGEMM on KNL

Overview

This repository contains DGEMM code tuned for KNL. Its purpose is to demonstrate how to tune a general GEMM algorithm for specific hardware (here, Intel KNL). For that reason, the implementation does not handle inputs whose dimensions are not multiples of the kernel size.
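To make the size restriction concrete, here is a minimal sketch of a dimension check placed in front of a naive row-major reference multiply. The tile sizes `MR = 31` and `NR = 8` are hypothetical values, chosen only because they divide the benchmark sizes listed in the table in this README; the names `dims_supported`, `dgemm_ref`, and `dgemm_checked` are illustrative and are not taken from the repository.

```c
/* Hypothetical micro-kernel tile sizes (assumption, not read from the repo). */
enum { MR = 31, NR = 8 };

/* Returns 1 when an m x n output can be tiled exactly by MR x NR micro-tiles. */
static int dims_supported(int m, int n) {
    return (m % MR == 0) && (n % NR == 0);
}

/* Naive row-major reference: C += A * B, with A (m x k), B (k x n), C (m x n). */
static void dgemm_ref(int m, int n, int k,
                      const double *A, const double *B, double *C) {
    for (int i = 0; i < m; ++i) {
        for (int p = 0; p < k; ++p) {
            const double a = A[i * k + p];
            for (int j = 0; j < n; ++j) {
                C[i * n + j] += a * B[p * n + j];
            }
        }
    }
}

/* Rejects unsupported sizes instead of handling edge tiles.
   Returns 0 on success and -1 when the dimensions are not supported. */
static int dgemm_checked(int m, int n, int k,
                         const double *A, const double *B, double *C) {
    if (!dims_supported(m, n)) {
        return -1;
    }
    dgemm_ref(m, n, k, A, B, C);
    return 0;
}
```

A tuned kernel would replace `dgemm_ref`; the point of the sketch is only that sizes which are not tile multiples are rejected up front rather than handled by edge-case code.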

The implemented DGEMM is row-major and mostly follows [1]. However, we did not achieve the near-MKL performance reported in [1]. The authors of [1] do not describe how they implement packing, and we think part of our performance gap is related to how our packing is done.
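For readers unfamiliar with packing, the following is a minimal scalar sketch of one common layout for packing a block of row-major A into panels of `MR` rows. It is not the packing routine from this repository or from [1]; `MR`, `pack_a_rowmajor`, and the panel layout are assumptions made for illustration, and `mc` is assumed to be a multiple of `MR`.

```c
/* Hypothetical panel height (assumption, kept small for readability). */
enum { MR = 2 };

/* Packs an mc x kc block of row-major A (leading dimension lda) into
   panels of MR rows. Inside a panel, the MR values that share the same
   k index are stored contiguously, so a micro-kernel can load them with
   unit stride. Note that each group of MR values is gathered from MR
   different rows of A, which is a strided read in row-major order. */
static void pack_a_rowmajor(int mc, int kc, const double *A, int lda,
                            double *packed) {
    for (int i0 = 0; i0 < mc; i0 += MR) {       /* one MR-row panel */
        for (int p = 0; p < kc; ++p) {          /* walk along k */
            for (int i = 0; i < MR; ++i) {      /* gather MR rows */
                *packed++ = A[(i0 + i) * lda + p];
            }
        }
    }
}
```

The strided gather in the innermost loop is the kind of access pattern whose cost depends heavily on how packing is implemented.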

Note also that Algorithm 1 in [1] corresponds to Figure 8 in [2], which is designed for column-major order. As a result, our row-major packing cannot use a SIMD transpose. (In column-major order, if n_r is a multiple of 8, packing B can be done with a SIMD transpose.)
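The column-major case can be sketched in scalar C to show why a transpose appears. With B stored column-major and `NR = 8`, packing B into panels of `NR` columns amounts to transposing 8 x 8 tiles, and each such tile is what a SIMD transpose would handle in registers. The code below is an illustration under those assumptions, not code from the repository; `NR`, `TILE`, and `pack_b_colmajor` are hypothetical names, and `kc` and `nc` are assumed to be multiples of 8.

```c
/* Hypothetical sizes: panel width NR and tile height TILE (assumptions). */
enum { NR = 8, TILE = 8 };

/* B is kc x nc in column-major order: element (p, j) is B[j * ldb + p].
   The packed buffer holds one NR-column panel after another; inside a
   panel, the NR values that share the same k index p are contiguous. */
static void pack_b_colmajor(int kc, int nc, const double *B, int ldb,
                            double *packed) {
    for (int j0 = 0; j0 < nc; j0 += NR) {            /* one NR-column panel */
        double *panel = packed + j0 * kc;            /* panel holds kc * NR values */
        for (int p0 = 0; p0 < kc; p0 += TILE) {      /* one TILE x NR tile */
            /* This double loop is an 8 x 8 transpose: reads walk down
               columns of B, writes walk along rows of the panel. */
            for (int p = p0; p < p0 + TILE; ++p) {
                for (int j = 0; j < NR; ++j) {
                    panel[p * NR + j] = B[(j0 + j) * ldb + p];
                }
            }
        }
    }
}
```

In a vectorized version, each tile would be loaded as eight 8-element columns, transposed in registers, and stored as eight contiguous rows.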

Size	Ours (% of peak)	MKL (% of peak)	Ours as % of MKL
m (2418) x k (3264) x n (2400)	56.18	73.86	76.06%
m (2418) x k (3264) x n (2400)	56.10	73.92	75.90%
m (4836) x k (4896) x n (4840)	56.99	75.12	75.87%
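The last column of the table is the ratio of the two percent-of-peak figures. The sketch below reproduces that arithmetic and also gives the usual 2·m·n·k operation count for a GEMM of the listed sizes. The function names are illustrative, and no peak FLOP rate is assumed because the README does not state one.

```c
/* Floating-point operations in C += A * B for an (m x k) by (k x n)
   product: one multiply and one add per inner-product term. */
static double gemm_flops(double m, double n, double k) {
    return 2.0 * m * n * k;
}

/* Ratio of two percent-of-peak figures, expressed as a percentage. */
static double percent_of_mkl(double ours_pct_peak, double mkl_pct_peak) {
    return 100.0 * ours_pct_peak / mkl_pct_peak;
}
```

For the first row, `percent_of_mkl(56.18, 73.86)` gives approximately 76.06, matching the table.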

Notice to UC Berkeley CS267 students

You should NOT use this code (in part or in whole) for your GEMM assignment. Doing so is considered a violation of the UC Berkeley Student Honor Code and would be reported to the Center for Student Conduct.

Build the code

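module load cmake
# Pick ONE compiler environment.
# Option A: use the GNU compiler
module swap PrgEnv-intel PrgEnv-gnu
# Option B: use the Intel compiler (only needed if PrgEnv-gnu is currently loaded)
# module swap PrgEnv-gnu PrgEnv-intel
mkdir build && cd build
# Generate the Makefile from CMakeLists.txt before running make
cmake ..
make -j 10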

Submit job

# submit job
sbatch job-blas
sbatch job-blocked

# check queue
sqs

# Interactive session
salloc -N 1 -C knl -q interactive -t 01:00:00

Reference

[1] An implementation of matrix–matrix multiplication on the Intel KNL processor with AVX-512
[2] Anatomy of High-Performance Matrix Multiplication
[3] UC Berkeley CS267 Homework1
