WSL2 only recognizes L3 cache from 1 chiplet #8500

Closed
1 of 2 tasks
Thernn88 opened this issue Jun 11, 2022 · 9 comments

Comments

@Thernn88

Thernn88 commented Jun 11, 2022

Version

Microsoft Windows [Version 10.0.22000.708]

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.10.102.1

Distro Version

20.04

Other Software

WSL version: 0.60.0.0
WSLg version: 1.0.34

Repro Steps

Use a machine with a chiplet-based processor.

Start WSL2.

Run lscpu.

Check the L3 cache line. The reported L3 size matches only one chiplet.

Expected Behavior

For the 3990X, even allowing for the existing processor group bug in which WSL2 sees only 32 of 64 cores, lscpu should still report 128 MiB of the 256 MiB L3.

The 5950X should return 64 MiB of L3.

The 1950X should return 32 MiB of L3.

Actual Behavior

3990X
L3 cache: 16 MiB reported of 256 MiB (or of 128 MiB after accounting for the processor group bug).

5950X
L3 cache: 32 MiB reported of 64 MiB.

Strangely, the 1950X returns the correct amount of L3 cache.

1950X
L3 cache: 32 MiB reported of 32 MiB.

This bug is present on 2 of the 3 chiplet-based CPUs tested.
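A minimal sketch of the arithmetic behind the expected values above. The per-model layouts (physical cores, number of L3 instances, MiB per instance) are assumptions taken from AMD's published specifications rather than from this report, and `LAYOUT` and `expected_l3_mib` are hypothetical names. It also assumes the visible cores fill whole L3 domains, which a hypervisor is not obliged to do.

```python
import math

# (physical cores, L3 instances, MiB per L3 instance).
# Assumed from AMD's published specifications, not taken from the issue.
LAYOUT = {
    "3990X": (64, 16, 16),
    "5950X": (16, 2, 32),
    "1950X": (16, 4, 8),
}


def expected_l3_mib(model, visible_cores=None):
    """Return the L3 total (MiB) expected for the cores the guest can see."""
    cores, instances, mib_each = LAYOUT[model]
    if visible_cores is None:
        visible_cores = cores
    cores_per_instance = cores // instances
    # Assumption: visible cores are packed into whole L3 domains.
    visible_instances = math.ceil(visible_cores / cores_per_instance)
    return min(visible_instances, instances) * mib_each


if __name__ == "__main__":
    print(expected_l3_mib("3990X"))                    # full package
    print(expected_l3_mib("3990X", visible_cores=32))  # processor group limit
    print(expected_l3_mib("5950X"))
    print(expected_l3_mib("1950X"))
```

Under these assumptions, the 16 MiB and 32 MiB figures that lscpu reports correspond to a single L3 instance on the 3990X and the 5950X respectively.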

Diagnostic Logs

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD Ryzen Threadripper 3990X 64-Core Processor
Stepping: 0
CPU MHz: 2894.557
BogoMIPS: 5789.11
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 1 MiB
L1i cache: 1 MiB
L2 cache: 16 MiB
L3 cache: 16 MiB
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full AMD retpoline, IBPB conditional, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip rdpid

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Vendor ID: AuthenticAMD
CPU family: 25
Model: 33
Model name: AMD Ryzen 9 5950X 16-Core Processor
Stepping: 0
CPU MHz: 3400.134
BogoMIPS: 6800.26
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 512 KiB
L1i cache: 512 KiB
L2 cache: 8 MiB
L3 cache: 32 MiB
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full AMD retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm

@Thernn88
Author

Additional notes: the 1950X machine runs Windows 10. It's possible that the NUMA "fix" in Windows 11 is the cause of the cache problems.

If someone could chime in with lscpu output from Windows 10 on these or related processors, that would be appreciated.

@elsaco

elsaco commented Jun 11, 2022

@Thernn88 use a live distro, or boot into Linux, and see if lscpu reports the same info as inside WSL. Post the results please. Thx!

@Thernn88
Author

Thernn88 commented Jun 11, 2022

I used a VM; I assume that is OK.

ubuntu@ubuntu:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD Ryzen Threadripper 3990X 64-Core Processor
Stepping: 0
CPU MHz: 2894.576
BogoMIPS: 5789.15
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32 KiB
L1i cache: 32 KiB
L2 cache: 512 KiB
L3 cache: 256 MiB

@elsaco

elsaco commented Jun 13, 2022

So, KVM hypervisor gets it right! I suspect Hyper-V is the issue. WSL rides on top of Hyper-V.

@Thernn88
Author

Yes, I found it interesting that the correct amount of L3 for the 1950X was reported in the Windows 10 WSL2 instance.

Only the 5950X and 3990X reported lower L3, and both are on Windows 11.

Possibly a Windows 11-only Hyper-V bug?

If others chime in with lscpu output from these or similar models on Windows 10 that reports the correct L3 (or 128 MiB for the 3990X), that would isolate it as a Windows 11 bug. I have no idea where to report those...

@Thernn88
Author

Thernn88 commented Aug 25, 2022

This is still an issue on the versions listed below, and it persists after updating the distro to 22.04. WSL2 only sees 16 MiB of L3 cache instead of the expected 128 MiB (256 MiB if the more-than-64-CPUs bug is fixed).

Windows: 22621.382
WSL version: 0.66.2.0
Kernel version: 5.15.57.1
WSLg version: 1.0.42

Caches (sum of all):
L1d: 1 MiB (32 instances)
L1i: 1 MiB (32 instances)
L2: 16 MiB (32 instances)
L3: 16 MiB (1 instance)

This is ignoring the already existing bug where WSL only sees up to 64 cores.
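
The summary above can be broken down per instance, which makes the mismatch easier to see: 32 L2 instances but a single L3 instance means lscpu believes all visible cores share one 16 MiB L3. A rough sketch, assuming the `Caches (sum of all)` line format shown above; `per_instance_kib` is a hypothetical helper.

```python
import re

UNITS_KIB = {"KiB": 1, "MiB": 1024, "GiB": 1024 * 1024}
# Matches lines such as "L3: 16 MiB (1 instance)".
LINE = re.compile(
    r"^\s*(L\d[di]?):\s+([\d.]+)\s+(KiB|MiB|GiB)\s+\((\d+) instances?\)"
)


def per_instance_kib(summary):
    """Return {cache name: (instances, KiB per instance)} from lscpu's summary."""
    result = {}
    for line in summary.splitlines():
        match = LINE.match(line)
        if match:
            name, amount, unit, count = match.groups()
            total_kib = float(amount) * UNITS_KIB[unit]
            result[name] = (int(count), total_kib / int(count))
    return result


SAMPLE = """\
L1d: 1 MiB (32 instances)
L1i: 1 MiB (32 instances)
L2: 16 MiB (32 instances)
L3: 16 MiB (1 instance)
"""

if __name__ == "__main__":
    for name, (count, kib) in per_instance_kib(SAMPLE).items():
        print(f"{name}: {count} x {kib:g} KiB")
```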

@Thernn88
Author

Thernn88 commented Sep 27, 2022

After further investigation, it turns out the L3 cache is actually fully detected.

I ran the following on both machines:

3990x
$ getconf -a | grep CACHE
LEVEL1_ICACHE_SIZE 32768
LEVEL1_ICACHE_ASSOC
LEVEL1_ICACHE_LINESIZE 64
LEVEL1_DCACHE_SIZE 32768
LEVEL1_DCACHE_ASSOC 8
LEVEL1_DCACHE_LINESIZE 64
LEVEL2_CACHE_SIZE 524288
LEVEL2_CACHE_ASSOC 8
LEVEL2_CACHE_LINESIZE 64
LEVEL3_CACHE_SIZE 268435456
LEVEL3_CACHE_ASSOC 0
LEVEL3_CACHE_LINESIZE 64
LEVEL4_CACHE_SIZE
LEVEL4_CACHE_ASSOC
LEVEL4_CACHE_LINESIZE

268435456 bytes = 256 MiB of L3

5950x
$ getconf -a | grep CACHE
LEVEL1_ICACHE_SIZE 32768
LEVEL1_ICACHE_ASSOC 8
LEVEL1_ICACHE_LINESIZE 64
LEVEL1_DCACHE_SIZE 32768
LEVEL1_DCACHE_ASSOC 8
LEVEL1_DCACHE_LINESIZE 64
LEVEL2_CACHE_SIZE 524288
LEVEL2_CACHE_ASSOC 8
LEVEL2_CACHE_LINESIZE 64
LEVEL3_CACHE_SIZE 67108864
LEVEL3_CACHE_ASSOC 0
LEVEL3_CACHE_LINESIZE 64
LEVEL4_CACHE_SIZE 0
LEVEL4_CACHE_ASSOC 0
LEVEL4_CACHE_LINESIZE 0

67108864 bytes = 64 MiB of L3
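
The byte counts above can be converted mechanically. A small sketch that parses `getconf -a | grep CACHE` output and converts sizes to MiB; `parse_getconf_caches` and `to_mib` are hypothetical helper names, and variables printed without a value are stored as `None`.

```python
def parse_getconf_caches(text):
    """Map each getconf cache variable to an int, or None if no value is printed."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or "CACHE" not in parts[0]:
            continue  # skips blank lines and the "$ getconf ..." prompt line
        values[parts[0]] = int(parts[1]) if len(parts) > 1 else None
    return values


def to_mib(size_bytes):
    """Convert a byte count to MiB (1 MiB = 2**20 bytes)."""
    return size_bytes / 2**20


# Excerpt of the 3990X output from this thread.
SAMPLE = """\
$ getconf -a | grep CACHE
LEVEL2_CACHE_SIZE 524288
LEVEL3_CACHE_SIZE 268435456
LEVEL3_CACHE_ASSOC 0
LEVEL4_CACHE_SIZE
"""

if __name__ == "__main__":
    caches = parse_getconf_caches(SAMPLE)
    print(to_mib(caches["LEVEL3_CACHE_SIZE"]), "MiB of L3")
```

Both figures are exact powers of two, so the conversion has no rounding: 2**28 bytes is 256 MiB and 2**26 bytes is 64 MiB.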

After some research, I found that lscpu has long-standing issues with exotic CPU configurations and sometimes returns incorrect cache sizes.

It beats me why lscpu returned the correct values under KVM but falls on its face under Hyper-V. So there still seems to be a bug, although it is most likely cosmetic.

The limitation to 64 cores, however, still appears valid.

@primenumber

I have a 7950X3D. The problem is more severe there because this CPU has a different cache size on each CCX.
In my environment, both CCXs are reported to have 96 MiB of L3 cache, but in reality CCX0 has 96 MiB of cache and CCX1 has 32 MiB of cache.
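
To check how the guest kernel sees each L3 instance, one option is to read the cache topology that Linux exposes under `/sys/devices/system/cpu`. The sketch below groups L3 entries by `shared_cpu_list` so each cache is counted once and differing sizes stay visible. The helper names are hypothetical, and it assumes the `size` files use a `K` or `M` suffix; what it prints inside WSL2 depends on what Hyper-V exposes to the guest.

```python
from pathlib import Path


def parse_size_kib(text):
    """Convert a sysfs cache size such as '16384K' or '32M' to KiB."""
    text = text.strip()
    units = {"K": 1, "M": 1024}
    if not text or text[-1] not in units:
        raise ValueError(f"unrecognized cache size: {text!r}")
    return int(text[:-1]) * units[text[-1]]


def l3_instances(root="/sys/devices/system/cpu"):
    """Return {shared_cpu_list: size in KiB} for each distinct L3 cache."""
    found = {}
    for index in Path(root).glob("cpu[0-9]*/cache/index*"):
        try:
            if (index / "level").read_text().strip() != "3":
                continue
            shared = (index / "shared_cpu_list").read_text().strip()
            size = parse_size_kib((index / "size").read_text())
        except (OSError, ValueError):
            continue  # skip entries that are missing or unreadable
        # CPUs sharing one L3 report the same list, so the key deduplicates.
        found[shared] = size
    return found


if __name__ == "__main__":
    caches = l3_instances()
    for cpus, kib in sorted(caches.items()):
        print(f"CPUs {cpus}: {kib // 1024} MiB")
    print(f"Total L3: {sum(caches.values()) // 1024} MiB")
```

On a 7950X3D running Linux natively, one would expect two entries of different sizes; identical sizes for both entries would match the behavior described above.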

Contributor

This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.

Thank you!
