lec05‐roofline
CSWater edited this page Mar 12, 2024
·
6 revisions
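The goal of this lab is to plot the roofline model for a given CPU. Only x86 is supported for now; in principle, the test kernels can be extended to other platforms.
The code is adapted from cpufp.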
./build.sh
./pe_bench --thread_pool=[xx-xx]
The following command runs the test on 16 cores in total, CPU core0 through core15.
./pe_bench --thread_pool=[0-15]
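If the machine has multiple NUMA nodes, bind the test to a single NUMA node to eliminate NUMA effects on the measured memory bandwidth, as shown below: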
numactl -m 0 ./pe_bench --thread_pool=[0-15]
Below is sample plotting code. It was run with Python 3.12.0; it is not guaranteed to work on every Python version.
import numpy as np
import matplotlib.pyplot as plt
# arithmetic intensity
ai = np.arange(0, 25, 0.1)
# for avx512
avx512_dp = 1277.8
avx512_bw = 83.9
perf_roofline_avx512 = np.array([avx512_bw * x if avx512_bw * x < avx512_dp else avx512_dp for x in ai ])
#for avx
# avx_dp = 319.3
# avx_bw = 84.9
# perf_roofline_avx = [avx_bw * x if avx_bw * x < avx_dp else avx_dp for x in ai ]
#for sse, int5 ....
#you can add yourself
#your application
ai_app = 10
perf_app = 200
plt.plot(ai, perf_roofline_avx512, label='avx512')
# plt.plot(ai, perf_roofline_avx, label='avx')
plt.plot(ai_app, perf_app, 'r*')
plt.text(ai_app, perf_app+20, 'your application', fontsize=12, ha='center', va='bottom', color='red')
#x label, y label
plt.xlabel('Arithmetic Intensity')
plt.ylabel('Gflop/s')
#fig title
plt.title('Roofline model')
#legend
plt.legend(loc='lower right')
plt.ylim([0,1400])
#plt.savefig("test.png")
# plt.legend()
plt.show()
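The roofline drawn above is the minimum of the compute peak and bandwidth times arithmetic intensity. The following is a minimal sketch of that calculation without plotting, useful for checking where an application sits relative to the roof. The function names (`attainable_gflops`, `ridge_point`, `bound_type`) are hypothetical helpers, not part of pe_bench, and the units (GB/s for bandwidth, FLOP/byte for arithmetic intensity) are assumptions, since the script above does not state them.

```python
# Minimal roofline helpers (illustrative sketch, not part of pe_bench).
# Assumed units: peak in Gflop/s, bandwidth in GB/s, intensity in FLOP/byte.

def attainable_gflops(ai, peak_gflops, bw_gbs):
    """Roofline value at arithmetic intensity `ai`: min(peak, bandwidth * ai)."""
    return min(peak_gflops, bw_gbs * ai)

def ridge_point(peak_gflops, bw_gbs):
    """Arithmetic intensity at which the bandwidth slope meets the compute peak."""
    return peak_gflops / bw_gbs

def bound_type(ai, peak_gflops, bw_gbs):
    """Classify a kernel by which side of the ridge point it falls on."""
    if ai < ridge_point(peak_gflops, bw_gbs):
        return "memory-bound"
    return "compute-bound"

if __name__ == "__main__":
    # Same AVX-512 numbers and application point as in the plotting script.
    avx512_dp, avx512_bw = 1277.8, 83.9
    ai_app, perf_app = 10, 200

    roof = attainable_gflops(ai_app, avx512_dp, avx512_bw)
    print("ridge point:", ridge_point(avx512_dp, avx512_bw))
    print("roof at application intensity:", roof)
    print("application is", bound_type(ai_app, avx512_dp, avx512_bw))
    print("fraction of roof reached:", perf_app / roof)
```

With the numbers from the script, the application point lies to the left of the ridge point, so under this model its ceiling is set by memory bandwidth rather than by the AVX-512 compute peak.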
Reference: likwid