Include support for Windows on Arm on BUILD.bazel along with proper Volterra detection (pytorch#220)

This MR adds support for building with Bazel on cpu `arm64_windows`. I also tried this on my Volterra Windows Dev Kit and noticed that the processor name string it reports seems to differ from the one the current source code defines. I don't know whether that is because my hardware is slightly different.

I ran the tests, with the following results:

```
[==========] Running 132 tests from 28 test suites.
[----------] Global test environment set-up.
[----------] 1 test from PROCESSORS_COUNT
[ RUN      ] PROCESSORS_COUNT.non_zero
[       OK ] PROCESSORS_COUNT.non_zero (0 ms)
[----------] 1 test from PROCESSORS_COUNT (0 ms total)

[----------] 1 test from PROCESSORS
[ RUN      ] PROCESSORS.non_null
[       OK ] PROCESSORS.non_null (0 ms)
[----------] 1 test from PROCESSORS (0 ms total)

[----------] 13 tests from PROCESSOR
[ RUN      ] PROCESSOR.non_null
[       OK ] PROCESSOR.non_null (0 ms)
[ RUN      ] PROCESSOR.valid_smt_id
[       OK ] PROCESSOR.valid_smt_id (0 ms)
[ RUN      ] PROCESSOR.valid_core
[       OK ] PROCESSOR.valid_core (0 ms)
[ RUN      ] PROCESSOR.consistent_core
[       OK ] PROCESSOR.consistent_core (0 ms)
[ RUN      ] PROCESSOR.valid_cluster
[       OK ] PROCESSOR.valid_cluster (0 ms)
[ RUN      ] PROCESSOR.consistent_cluster
[       OK ] PROCESSOR.consistent_cluster (0 ms)
[ RUN      ] PROCESSOR.valid_package
[       OK ] PROCESSOR.valid_package (0 ms)
[ RUN      ] PROCESSOR.consistent_package
[       OK ] PROCESSOR.consistent_package (0 ms)
[ RUN      ] PROCESSOR.consistent_l1i
[       OK ] PROCESSOR.consistent_l1i (0 ms)
[ RUN      ] PROCESSOR.consistent_l1d
[       OK ] PROCESSOR.consistent_l1d (0 ms)
[ RUN      ] PROCESSOR.consistent_l2
[       OK ] PROCESSOR.consistent_l2 (0 ms)
[ RUN      ] PROCESSOR.consistent_l3
[       OK ] PROCESSOR.consistent_l3 (0 ms)
[ RUN      ] PROCESSOR.consistent_l4
[       OK ] PROCESSOR.consistent_l4 (0 ms)
[----------] 13 tests from PROCESSOR (7 ms total)

[----------] 1 test from CORES_COUNT
[ RUN      ] CORES_COUNT.within_bounds
[       OK ] CORES_COUNT.within_bounds (0 ms)
[----------] 1 test from CORES_COUNT (0 ms total)

[----------] 1 test from CORES
[ RUN      ] CORES.non_null
[       OK ] CORES.non_null (0 ms)
[----------] 1 test from CORES (0 ms total)

[----------] 10 tests from CORE
[ RUN      ] CORE.non_null
[       OK ] CORE.non_null (0 ms)
[ RUN      ] CORE.non_zero_processors
[       OK ] CORE.non_zero_processors (0 ms)
[ RUN      ] CORE.consistent_processors
[       OK ] CORE.consistent_processors (0 ms)
[ RUN      ] CORE.valid_core_id
[       OK ] CORE.valid_core_id (0 ms)
[ RUN      ] CORE.valid_cluster
[       OK ] CORE.valid_cluster (0 ms)
[ RUN      ] CORE.consistent_cluster
[       OK ] CORE.consistent_cluster (0 ms)
[ RUN      ] CORE.valid_package
[       OK ] CORE.valid_package (0 ms)
[ RUN      ] CORE.consistent_package
[       OK ] CORE.consistent_package (0 ms)
[ RUN      ] CORE.known_vendor
[       OK ] CORE.known_vendor (0 ms)
[ RUN      ] CORE.known_uarch
[       OK ] CORE.known_uarch (0 ms)
[----------] 10 tests from CORE (5 ms total)

[----------] 1 test from CLUSTERS_COUNT
[ RUN      ] CLUSTERS_COUNT.within_bounds
[       OK ] CLUSTERS_COUNT.within_bounds (0 ms)
[----------] 1 test from CLUSTERS_COUNT (0 ms total)

[----------] 1 test from CLUSTERS
[ RUN      ] CLUSTERS.non_null
[       OK ] CLUSTERS.non_null (0 ms)
[----------] 1 test from CLUSTERS (0 ms total)

[----------] 14 tests from CLUSTER
[ RUN      ] CLUSTER.non_null
[       OK ] CLUSTER.non_null (0 ms)
[ RUN      ] CLUSTER.non_zero_processors
[       OK ] CLUSTER.non_zero_processors (0 ms)
[ RUN      ] CLUSTER.valid_processors
[       OK ] CLUSTER.valid_processors (0 ms)
[ RUN      ] CLUSTER.consistent_processors
[       OK ] CLUSTER.consistent_processors (0 ms)
[ RUN      ] CLUSTER.non_zero_cores
[       OK ] CLUSTER.non_zero_cores (0 ms)
[ RUN      ] CLUSTER.valid_cores
[       OK ] CLUSTER.valid_cores (0 ms)
[ RUN      ] CLUSTER.consistent_cores
[       OK ] CLUSTER.consistent_cores (0 ms)
[ RUN      ] CLUSTER.valid_cluster_id
[       OK ] CLUSTER.valid_cluster_id (0 ms)
[ RUN      ] CLUSTER.valid_package
[       OK ] CLUSTER.valid_package (0 ms)
[ RUN      ] CLUSTER.consistent_package
[       OK ] CLUSTER.consistent_package (0 ms)
[ RUN      ] CLUSTER.consistent_vendor
[       OK ] CLUSTER.consistent_vendor (0 ms)
[ RUN      ] CLUSTER.consistent_uarch
[       OK ] CLUSTER.consistent_uarch (0 ms)
[ RUN      ] CLUSTER.consistent_midr
[       OK ] CLUSTER.consistent_midr (0 ms)
[ RUN      ] CLUSTER.consistent_frequency
[       OK ] CLUSTER.consistent_frequency (0 ms)
[----------] 14 tests from CLUSTER (7 ms total)

[----------] 1 test from PACKAGES_COUNT
[ RUN      ] PACKAGES_COUNT.within_bounds
[       OK ] PACKAGES_COUNT.within_bounds (0 ms)
[----------] 1 test from PACKAGES_COUNT (0 ms total)

[----------] 1 test from PACKAGES
[ RUN      ] PACKAGES.non_null
[       OK ] PACKAGES.non_null (0 ms)
[----------] 1 test from PACKAGES (0 ms total)

[----------] 10 tests from PACKAGE
[ RUN      ] PACKAGE.non_null
[       OK ] PACKAGE.non_null (0 ms)
[ RUN      ] PACKAGE.non_zero_processors
[       OK ] PACKAGE.non_zero_processors (0 ms)
[ RUN      ] PACKAGE.valid_processors
[       OK ] PACKAGE.valid_processors (0 ms)
[ RUN      ] PACKAGE.consistent_processors
[       OK ] PACKAGE.consistent_processors (0 ms)
[ RUN      ] PACKAGE.non_zero_cores
[       OK ] PACKAGE.non_zero_cores (0 ms)
[ RUN      ] PACKAGE.valid_cores
[       OK ] PACKAGE.valid_cores (0 ms)
[ RUN      ] PACKAGE.consistent_cores
[       OK ] PACKAGE.consistent_cores (0 ms)
[ RUN      ] PACKAGE.non_zero_clusters
[       OK ] PACKAGE.non_zero_clusters (0 ms)
[ RUN      ] PACKAGE.valid_clusters
[       OK ] PACKAGE.valid_clusters (0 ms)
[ RUN      ] PACKAGE.consistent_cluster
[       OK ] PACKAGE.consistent_cluster (0 ms)
[----------] 10 tests from PACKAGE (5 ms total)

[----------] 1 test from UARCHS_COUNT
[ RUN      ] UARCHS_COUNT.within_bounds
[       OK ] UARCHS_COUNT.within_bounds (0 ms)
[----------] 1 test from UARCHS_COUNT (0 ms total)

[----------] 1 test from UARCHS
[ RUN      ] UARCHS.non_null
[       OK ] UARCHS.non_null (0 ms)
[----------] 1 test from UARCHS (0 ms total)

[----------] 5 tests from UARCH
[ RUN      ] UARCH.non_null
[       OK ] UARCH.non_null (0 ms)
[ RUN      ] UARCH.non_zero_processors
[       OK ] UARCH.non_zero_processors (0 ms)
[ RUN      ] UARCH.valid_processors
[       OK ] UARCH.valid_processors (0 ms)
[ RUN      ] UARCH.non_zero_cores
[       OK ] UARCH.non_zero_cores (0 ms)
[ RUN      ] UARCH.valid_cores
[       OK ] UARCH.valid_cores (0 ms)
[----------] 5 tests from UARCH (2 ms total)

[----------] 1 test from L1I_CACHES_COUNT
[ RUN      ] L1I_CACHES_COUNT.within_bounds
[       OK ] L1I_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L1I_CACHES_COUNT (0 ms total)

[----------] 1 test from L1I_CACHES
[ RUN      ] L1I_CACHES.non_null
[       OK ] L1I_CACHES.non_null (0 ms)
[----------] 1 test from L1I_CACHES (0 ms total)

[----------] 13 tests from L1I_CACHE
[ RUN      ] L1I_CACHE.non_null
[       OK ] L1I_CACHE.non_null (0 ms)
[ RUN      ] L1I_CACHE.non_zero_size
[       OK ] L1I_CACHE.non_zero_size (0 ms)
[ RUN      ] L1I_CACHE.valid_size
[       OK ] L1I_CACHE.valid_size (0 ms)
[ RUN      ] L1I_CACHE.non_zero_associativity
[       OK ] L1I_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L1I_CACHE.non_zero_partitions
[       OK ] L1I_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L1I_CACHE.non_zero_line_size
[       OK ] L1I_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L1I_CACHE.power_of_2_line_size
[       OK ] L1I_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L1I_CACHE.reasonable_line_size
[       OK ] L1I_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L1I_CACHE.valid_flags
[       OK ] L1I_CACHE.valid_flags (0 ms)
[ RUN      ] L1I_CACHE.non_inclusive
[       OK ] L1I_CACHE.non_inclusive (0 ms)
[ RUN      ] L1I_CACHE.non_zero_processors
[       OK ] L1I_CACHE.non_zero_processors (0 ms)
[ RUN      ] L1I_CACHE.valid_processors
[       OK ] L1I_CACHE.valid_processors (0 ms)
[ RUN      ] L1I_CACHE.consistent_processors
[       OK ] L1I_CACHE.consistent_processors (0 ms)
[----------] 13 tests from L1I_CACHE (7 ms total)

[----------] 1 test from L1D_CACHES_COUNT
[ RUN      ] L1D_CACHES_COUNT.within_bounds
[       OK ] L1D_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L1D_CACHES_COUNT (0 ms total)

[----------] 1 test from L1D_CACHES
[ RUN      ] L1D_CACHES.non_null
[       OK ] L1D_CACHES.non_null (0 ms)
[----------] 1 test from L1D_CACHES (0 ms total)

[----------] 13 tests from L1D_CACHE
[ RUN      ] L1D_CACHE.non_null
[       OK ] L1D_CACHE.non_null (0 ms)
[ RUN      ] L1D_CACHE.non_zero_size
[       OK ] L1D_CACHE.non_zero_size (0 ms)
[ RUN      ] L1D_CACHE.valid_size
[       OK ] L1D_CACHE.valid_size (0 ms)
[ RUN      ] L1D_CACHE.non_zero_associativity
[       OK ] L1D_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L1D_CACHE.non_zero_partitions
[       OK ] L1D_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L1D_CACHE.non_zero_line_size
[       OK ] L1D_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L1D_CACHE.power_of_2_line_size
[       OK ] L1D_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L1D_CACHE.reasonable_line_size
[       OK ] L1D_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L1D_CACHE.valid_flags
[       OK ] L1D_CACHE.valid_flags (0 ms)
[ RUN      ] L1D_CACHE.non_inclusive
[       OK ] L1D_CACHE.non_inclusive (0 ms)
[ RUN      ] L1D_CACHE.non_zero_processors
[       OK ] L1D_CACHE.non_zero_processors (0 ms)
[ RUN      ] L1D_CACHE.valid_processors
[       OK ] L1D_CACHE.valid_processors (0 ms)
[ RUN      ] L1D_CACHE.consistent_processors
[       OK ] L1D_CACHE.consistent_processors (0 ms)
[----------] 13 tests from L1D_CACHE (7 ms total)

[----------] 1 test from L2_CACHES_COUNT
[ RUN      ] L2_CACHES_COUNT.within_bounds
[       OK ] L2_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L2_CACHES_COUNT (0 ms total)

[----------] 1 test from L2_CACHES
[ RUN      ] L2_CACHES.non_null
[       OK ] L2_CACHES.non_null (0 ms)
[----------] 1 test from L2_CACHES (0 ms total)

[----------] 12 tests from L2_CACHE
[ RUN      ] L2_CACHE.non_null
[       OK ] L2_CACHE.non_null (0 ms)
[ RUN      ] L2_CACHE.non_zero_size
[       OK ] L2_CACHE.non_zero_size (0 ms)
[ RUN      ] L2_CACHE.valid_size
[       OK ] L2_CACHE.valid_size (0 ms)
[ RUN      ] L2_CACHE.non_zero_associativity
[       OK ] L2_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L2_CACHE.non_zero_partitions
[       OK ] L2_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L2_CACHE.non_zero_line_size
[       OK ] L2_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L2_CACHE.power_of_2_line_size
[       OK ] L2_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L2_CACHE.reasonable_line_size
[       OK ] L2_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L2_CACHE.valid_flags
[       OK ] L2_CACHE.valid_flags (0 ms)
[ RUN      ] L2_CACHE.non_zero_processors
[       OK ] L2_CACHE.non_zero_processors (0 ms)
[ RUN      ] L2_CACHE.valid_processors
[       OK ] L2_CACHE.valid_processors (0 ms)
[ RUN      ] L2_CACHE.consistent_processors
[       OK ] L2_CACHE.consistent_processors (0 ms)
[----------] 12 tests from L2_CACHE (6 ms total)

[----------] 1 test from L3_CACHES_COUNT
[ RUN      ] L3_CACHES_COUNT.within_bounds
[       OK ] L3_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L3_CACHES_COUNT (0 ms total)

[----------] 12 tests from L3_CACHE
[ RUN      ] L3_CACHE.non_null
[       OK ] L3_CACHE.non_null (0 ms)
[ RUN      ] L3_CACHE.non_zero_size
[       OK ] L3_CACHE.non_zero_size (0 ms)
[ RUN      ] L3_CACHE.valid_size
[       OK ] L3_CACHE.valid_size (0 ms)
[ RUN      ] L3_CACHE.non_zero_associativity
[       OK ] L3_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L3_CACHE.non_zero_partitions
[       OK ] L3_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L3_CACHE.non_zero_line_size
[       OK ] L3_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L3_CACHE.power_of_2_line_size
[       OK ] L3_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L3_CACHE.reasonable_line_size
[       OK ] L3_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L3_CACHE.valid_flags
[       OK ] L3_CACHE.valid_flags (0 ms)
[ RUN      ] L3_CACHE.non_zero_processors
[       OK ] L3_CACHE.non_zero_processors (0 ms)
[ RUN      ] L3_CACHE.valid_processors
[       OK ] L3_CACHE.valid_processors (0 ms)
[ RUN      ] L3_CACHE.consistent_processors
[       OK ] L3_CACHE.consistent_processors (0 ms)
[----------] 12 tests from L3_CACHE (6 ms total)

[----------] 1 test from L4_CACHES_COUNT
[ RUN      ] L4_CACHES_COUNT.within_bounds
[       OK ] L4_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L4_CACHES_COUNT (0 ms total)

[----------] 12 tests from L4_CACHE
[ RUN      ] L4_CACHE.non_null
[       OK ] L4_CACHE.non_null (0 ms)
[ RUN      ] L4_CACHE.non_zero_size
[       OK ] L4_CACHE.non_zero_size (0 ms)
[ RUN      ] L4_CACHE.valid_size
[       OK ] L4_CACHE.valid_size (0 ms)
[ RUN      ] L4_CACHE.non_zero_associativity
[       OK ] L4_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L4_CACHE.non_zero_partitions
[       OK ] L4_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L4_CACHE.non_zero_line_size
[       OK ] L4_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L4_CACHE.power_of_2_line_size
[       OK ] L4_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L4_CACHE.reasonable_line_size
[       OK ] L4_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L4_CACHE.valid_flags
[       OK ] L4_CACHE.valid_flags (0 ms)
[ RUN      ] L4_CACHE.non_zero_processors
[       OK ] L4_CACHE.non_zero_processors (0 ms)
[ RUN      ] L4_CACHE.valid_processors
[       OK ] L4_CACHE.valid_processors (0 ms)
[ RUN      ] L4_CACHE.consistent_processors
[       OK ] L4_CACHE.consistent_processors (0 ms)
[----------] 12 tests from L4_CACHE (6 ms total)

[----------] Global test environment tear-down
[==========] 132 tests from 28 test suites ran. (93 ms total)
[  PASSED  ] 132 tests.
```

`cpu-info.exe` returns:

```
Packages:
        0: Snapdragon (TM) 8cx Gen 3
Microarchitectures:
        4x Cortex-A78
        4x Cortex-X1
Cores:
        0: 1 processor (0), ARM Cortex-A78
        1: 1 processor (1), ARM Cortex-A78
        2: 1 processor (2), ARM Cortex-A78
        3: 1 processor (3), ARM Cortex-A78
        4: 1 processor (4), ARM Cortex-X1
        5: 1 processor (5), ARM Cortex-X1
        6: 1 processor (6), ARM Cortex-X1
        7: 1 processor (7), ARM Cortex-X1
Logical processors:
        0
        1
        2
        3
        4
        5
        6
        7
```

and `isa-info.exe` returns:

```
Instruction sets:
        ARM v8.1 atomics: yes
        ARM v8.1 SQRDMLxH: yes
        ARM v8.2 FP16 arithmetics: yes
        ARM v8.2 FHM: no
        ARM v8.2 BF16: no
        ARM v8.2 Int8 dot product: yes
        ARM v8.2 Int8 matrix multiplication: no
        ARM v8.3 JS conversion: no
        ARM v8.3 complex: no
SIMD extensions:
        ARM SVE: no
        ARM SVE 2: no
Cryptography extensions:
        AES: yes
        SHA1: yes
        SHA2: yes
        PMULL: yes
        CRC32: yes
```
everton1984 authored Mar 17, 2024
1 parent fb08ae0 commit 6543fec
Showing 2 changed files with 21 additions and 0 deletions.
12 changes: 12 additions & 0 deletions BUILD.bazel
@@ -99,6 +99,11 @@ WINDOWS_X86_SRCS = [
"src/x86/windows/init.c",
]

WINDOWS_ARM_SRCS = [
"src/arm/windows/init-by-logical-sys-info.c",
"src/arm/windows/init.c",
]

MACH_X86_SRCS = [
"src/x86/mach/init.c",
]
@@ -128,6 +133,7 @@ cc_library(
":macos_x86_64_legacy": COMMON_SRCS + X86_SRCS + MACH_SRCS + MACH_X86_SRCS,
":macos_arm64": COMMON_SRCS + MACH_SRCS + MACH_ARM_SRCS,
":windows_x86_64": COMMON_SRCS + X86_SRCS + WINDOWS_X86_SRCS,
":windows_arm64": COMMON_SRCS + ARM_SRCS + WINDOWS_ARM_SRCS,
":android_armv7": COMMON_SRCS + ARM_SRCS + LINUX_SRCS + LINUX_ARM32_SRCS + ANDROID_ARM_SRCS,
":android_arm64": COMMON_SRCS + ARM_SRCS + LINUX_SRCS + LINUX_ARM64_SRCS + ANDROID_ARM_SRCS,
":android_x86": COMMON_SRCS + X86_SRCS + LINUX_SRCS + LINUX_X86_SRCS,
@@ -149,6 +155,7 @@ cc_library(
}),
copts = select({
":windows_x86_64": [],
":windows_arm64": [],
"//conditions:default": C99OPTS,
}) + [
"-Iexternal/cpuinfo/include",
@@ -281,6 +288,11 @@ config_setting(
values = {"cpu": "x64_windows"},
)

config_setting(
name = "windows_arm64",
values = {"cpu": "arm64_windows"},
)

config_setting(
name = "android_armv7",
values = {
9 changes: 9 additions & 0 deletions src/arm/windows/init.c
@@ -43,6 +43,15 @@ static struct woa_chip_info woa_chips[] = {
2420000000,
},
{cpuinfo_vendor_arm, cpuinfo_uarch_cortex_a76, 3150000000}}},
/* Snapdragon (TM) 8cx Gen 3 @ 3.0 GHz */
{L"Snapdragon (TM) 8cx Gen 3",
woa_chip_name_microsoft_sq_3,
{{
cpuinfo_vendor_arm,
cpuinfo_uarch_cortex_a78,
2420000000,
},
{cpuinfo_vendor_arm, cpuinfo_uarch_cortex_x1, 3000000000}}},
/* Microsoft Windows Dev Kit 2023 */
{L"Snapdragon Compute Platform",
woa_chip_name_microsoft_sq_3,
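
For readers unfamiliar with the table being extended above, here is a minimal, self-contained sketch of how a name-keyed chip table can be searched. It is not the code in `src/arm/windows/init.c`: the `sketch_*` types and the `sketch_find_chip` function are hypothetical simplifications, and exact matching with `wcscmp` is an assumption about the comparison. Only the name string, the two microarchitectures, and the two frequencies are taken from the new entry in this commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical stand-in for the cpuinfo_uarch values used in the diff. */
enum sketch_uarch {
	sketch_uarch_unknown,
	sketch_uarch_cortex_a78,
	sketch_uarch_cortex_x1,
};

/* Hypothetical per-core-type description: microarchitecture and frequency. */
struct sketch_core_info {
	enum sketch_uarch uarch;
	uint64_t frequency_hz;
};

/* Hypothetical, simplified table row keyed by the reported name string. */
struct sketch_chip_info {
	const wchar_t* name;
	struct sketch_core_info cores[2]; /* [0] efficiency, [1] performance */
};

/* Values mirror the "Snapdragon (TM) 8cx Gen 3" entry added in this commit. */
static const struct sketch_chip_info sketch_chips[] = {
	{L"Snapdragon (TM) 8cx Gen 3",
	 {{sketch_uarch_cortex_a78, UINT64_C(2420000000)},
	  {sketch_uarch_cortex_x1, UINT64_C(3000000000)}}},
};

/*
 * Return the table row whose name equals reported_name, or NULL if there
 * is no such row. Exact comparison is an assumption made for this sketch.
 */
static const struct sketch_chip_info* sketch_find_chip(const wchar_t* reported_name) {
	if (reported_name == NULL) {
		return NULL;
	}
	const size_t count = sizeof(sketch_chips) / sizeof(sketch_chips[0]);
	for (size_t i = 0; i < count; i++) {
		if (wcscmp(sketch_chips[i].name, reported_name) == 0) {
			return &sketch_chips[i];
		}
	}
	return NULL;
}
```

With a table like this, a device that reports a name string not present in any row gets no match, which is consistent with the behavior described in the commit message and is why a row for the string seen on the Volterra device is added.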
