Specify sanitizer parameters in CMake #76

Open
wants to merge 31 commits into
base: main

wusatosi
Member

@wusatosi wusatosi commented Nov 14, 2024

This PR adopts @bretbrownjr 's suggestion in #44 (review).

You can use variables like CMAKE_CXX_FLAGS_DEBUG_INIT to tune what a Debug or Release build entails, for what it's worth. It's maybe a little nicer to use those variables instead of CMAKE_CXX_FLAGS. But not a huge deal.

https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS_CONFIG_INIT.html

This PR introduces toolchain files for supported platforms:

  • gcc
  • clang
  • msvc

Updated CI and preset to use the new toolchain files.

Race with #82
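
As a rough illustration of the `_INIT` approach quoted above, a GNU toolchain file might look like the sketch below. The file name and the exact flag lists are assumptions made for illustration and are not the contents of this PR; `BEMAN_BUILDSYS_SANITIZER` is the variable that the presets in this PR pass in.

```cmake
# cmake/gnu-toolchain.cmake (hypothetical sketch, not the file from this PR)
set(CMAKE_C_COMPILER gcc)
set(CMAKE_CXX_COMPILER g++)

# Pick the flag string for the requested sanitizer group.
# An unset or unknown value simply results in no sanitizer flags.
if(BEMAN_BUILDSYS_SANITIZER STREQUAL "ASan")
    set(_sanitizer_flags "-fsanitize=address -fsanitize=undefined")
elseif(BEMAN_BUILDSYS_SANITIZER STREQUAL "TSan")
    set(_sanitizer_flags "-fsanitize=thread")
else()
    set(_sanitizer_flags "")
endif()

# The *_INIT variables only seed the cache entries on the first configure.
# CMake's compiler modules append their own defaults (e.g. -g for Debug) to them.
# string(APPEND) keeps the flags space-separated; list(APPEND) would join them with ';'.
string(APPEND CMAKE_C_FLAGS_DEBUG_INIT " ${_sanitizer_flags}")
string(APPEND CMAKE_CXX_FLAGS_DEBUG_INIT " ${_sanitizer_flags}")
```

A configure call for this sketch could look like `cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=cmake/gnu-toolchain.cmake -DCMAKE_BUILD_TYPE=Debug -DBEMAN_BUILDSYS_SANITIZER=ASan`.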

cmake/toolchain.cmake (outdated, resolved)
Member

@steve-downey steve-downey left a comment


"We need to verify CMAKE_CXX_COMPILER_ID for g++ on macos is AppleClang."

Confirm what the compiler identification is for the default "false" g++ on Darwin, i.e. the g++ that is really a front end for Apple's clang.

Marking "Request changes" so this doesn't get landed prematurely.

cmake/toolchain.cmake (outdated, resolved)
@ClausKlein

This comment was marked as resolved.

@wusatosi

This comment was marked as resolved.

@steve-downey
Member

So, yes:
The CXX compiler identification is AppleClang 16.0.0.16000026

@steve-downey
Member

Not sure whether picking a toolchain specifically for Darwin or making the generic toolchain smarter would work better, but either should work?

@ClausKlein

Wait, that can't work!

iMac:exemplar clausklein$ cmake --preset gcc-debug  --trace-expand --trace-source=toolchain.cmake
Put cmake in trace mode, but with variables expanded.
Put cmake in trace mode, but output only lines of a specified file. Multiple options are allowed.
Preset CMake variables:

  BEMAN_BUILDSYS_SANITIZER="ASan"
  CMAKE_BUILD_TYPE="Debug"
  CMAKE_CXX_COMPILER="g++"
  CMAKE_CXX_STANDARD="20"
  CMAKE_TOOLCHAIN_FILE="cmake/toolchain.cmake"

/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(3):  set(CMAKE_C_FLAGS_RELEASE_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(4):  set(CMAKE_CXX_FLAGS_RELEASE_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(6):  set(CMAKE_C_FLAGS_RELWITHDEBINFO_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(7):  set(CMAKE_CXX_FLAGS_RELWITHDEBINFO_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(16):  if(DEFINED BEMAN_BUILDSYS_SANITIZER )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(17):  if(BEMAN_BUILDSYS_SANITIZER STREQUAL ASan )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(19):  set(SANITIZER_FLAGS -fsanitize=address -fsanitize=pointer-compare -fsanitize=pointer-subtract -fsanitize=undefined )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(24):  if(NOT CMAKE_CXX_COMPILER_ID )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(25):  message(WARNING toolchain is used before CMAKE_CXX_COMPILER_ID was set! )
CMake Warning at cmake/toolchain.cmake:25 (message):
  toolchain is used before CMAKE_CXX_COMPILER_ID was set!
Call Stack (most recent call first):
  build/gcc-debug/CMakeFiles/3.31.0-dirty/CMakeSystem.cmake:6 (include)
  CMakeLists.txt:5 (project)


/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(27):  if(APPLE )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(28):  message(STATUS Using GCC on macOS; excluding -fsanitize=leak )
-- Using GCC on macOS; excluding -fsanitize=leak
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(44):  list(APPEND CMAKE_C_FLAGS_DEBUG_INIT -fsanitize=address -fsanitize=pointer-compare -fsanitize=pointer-subtract -fsanitize=undefined )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(45):  list(APPEND CMAKE_CXX_FLAGS_DEBUG_INIT -fsanitize=address -fsanitize=pointer-compare -fsanitize=pointer-subtract -fsanitize=undefined )
-- The CXX compiler identification is AppleClang 16.0.0.16000026
-- Detecting CXX compiler ABI info
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(3):  set(CMAKE_C_FLAGS_RELEASE_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(4):  set(CMAKE_CXX_FLAGS_RELEASE_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(6):  set(CMAKE_C_FLAGS_RELWITHDEBINFO_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(7):  set(CMAKE_CXX_FLAGS_RELWITHDEBINFO_INIT -O3 )
/Users/clausklein/Workspace/cpp/beman-project/exemplar/cmake/toolchain.cmake(16):  if(DEFINED BEMAN_BUILDSYS_SANITIZER )
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
Examples to be built: identity_direct_usage;identity_as_default_projection
-- Configuring done (2.4s)

@wusatosi
Member Author

Not sure whether picking a toolchain specifically for Darwin or making the generic toolchain smarter would work better, but either should work?

I am leaning toward having separate toolchain files, with a central toolchain dispatch logic based on compiler and platform.

But then I realized there's not that much variance in building across platforms and compilers, at least in exemplar, to warrant separate files.

@ClausKlein

But then I realized there's not that much variance in building across platforms and compilers, at least in exemplar, to warrant separate files.

That is why I often use project_options;
see e.g.: https://github.com/aminya/project_options/blob/main/src/Sanitizers.cmake

@wusatosi
Member Author

That is why I often use project_options; see e.g.: https://github.com/aminya/project_options/blob/main/src/Sanitizers.cmake

That project looks fantastic!

I can bring this up in the weekly sync and see if we want to use it.
Clang-tidy support, coverage, doxygen (and potentially vcpkg) are improvements we would love to have; it would be fantastic if we could get all of these with this one dependency.

@wusatosi
Member Author

Ah, I think the toolchain file is executed before project() has identified the compiler, so CMAKE_CXX_COMPILER_ID may be unset?
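
A minimal way to check this ordering is a throwaway probe project like the sketch below. The project name is hypothetical; the only thing it relies on is that `CMAKE_CXX_COMPILER_ID` is populated by compiler detection, which runs as part of `project()`.

```cmake
# CMakeLists.txt of a throwaway probe project (hypothetical)
cmake_minimum_required(VERSION 3.20)

# Before project(): no compiler has been identified yet,
# so the variable is expected to expand to an empty string here.
message(STATUS "before project(): CMAKE_CXX_COMPILER_ID='${CMAKE_CXX_COMPILER_ID}'")

project(compiler_id_probe LANGUAGES CXX)

# After project(): compiler detection has run and the ID is available.
message(STATUS "after project():  CMAKE_CXX_COMPILER_ID='${CMAKE_CXX_COMPILER_ID}'")
```

Configuring it with `CXX=g++ cmake -S . -B build` on macOS should show the same identification as the trace above. Since a toolchain file is read before that detection, any branch on the compiler ID inside it sees the empty value on the first pass.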

@wusatosi wusatosi force-pushed the toolchain branch 2 times, most recently from 5f20499 to 117fcd9 Compare November 15, 2024 04:30
@wusatosi wusatosi force-pushed the toolchain branch 2 times, most recently from a3f18b3 to b000f40 Compare November 15, 2024 04:49
@wusatosi
Member Author

wusatosi commented Dec 5, 2024

@camio I just realized this is still needed for CI, I can move it under the CI script.

Basically, we would ideally want a CI script with the following matrix:

compiler: [GNU, clang, vs code, MSVC]
version: [17,20,23]
config: [default, sanitizer-set-thread, sanitizer-set-Address]

With this CMake script, we can let GitHub do its mix and match without manually entering all the permutations. But if we do it from a purely command-line interface, the CI script would be non-extendable.

I see other projects generate the test matrix with a Python script as an alternative; I don't think we are at that step yet.

@wusatosi wusatosi reopened this Dec 5, 2024
@camio
Contributor

camio commented Dec 5, 2024

MacOS support can be achieved by adding new presets that are specific to the AppleClang compiler which include the appropriate sanitizer flags.

The gcc-14 and AppleClang are currently not so important on macOS.

I agree that gcc-14 isn't important, but AppleClang, being used by 99% of developers on MacOS, is critically important for a preset and CI target.

But clang-19 must be tested on APPLE! This is the only available compiler on unix with usable CXX_MODULE support.

Nothing here prevents people from testing clang-19 builds on MacOS, but being a very niche use case, I don't think it belongs in either CI or our presets file.

@camio
Contributor

camio commented Dec 5, 2024

@camio I just realized this is still needed for CI, I can move it under the CI script.

Basically, we would ideally want a CI script with the following matrix:

compiler: [GNU, clang, vs code, MSVC]

What do you mean here? vs code isn't a compiler.

config: [default, sanitizer-set-thread, sanitizer-set-Address]

With this CMake script, we can let GitHub do its mix and match without manually entering all the permutations. But if we do it from a purely command-line interface, the CI script would be non-extendable.

Since each compiler has its own set of flags and options, attempting to abstract the idea of a "configuration" that applies to some compilers, but not others, is going to be difficult to maintain. If we did want such a thing, I still argue it doesn't belong in our CMakeLists.txt files as the decider of which C++ flags to use is the invoker of these scripts, be that a preset, toolchain, or command-line invocation.

I don't think I fully understand what problem you're attempting to solve. What, at the end of the day, would you like our CI jobs to be?

@wusatosi
Member Author

wusatosi commented Dec 5, 2024

Sorry about my unclear communication, I am very sleep deprived from finals.

What do you mean here? vs code isn't a compiler.

Ah, I mean xcode here. Let me try to explain myself again.

This tool is a start at providing a sanitizer compatibility layer for all compilers that we support in CI and the presets. It's a mapping of (compiler, sanitizer group) => compiler options. It's just that apple-clang is the only one that someone here has tested.

Basically, all the compilers have different support for sanitizers, but these sanitizers could be generally grouped into two groups (tsan and asan). It would be clean to declare on GitHub Actions that I want to generate the permutations of [compiler set: [clang...], C++ version set: [17, 20, ...], sanitizer set (cmake args): [default, tsan-set, asan-set]] (e.g. combination: [clang, 17, asan-set]), and have a subsequent script that determines what the ASan set means for each compiler (e.g. clang's asan set expands to -fsanitize=address -fsanitize=undefined ..., MSVC's asan set expands to /fsanitize=address /Zi).

Otherwise we will have to include all the combinations individually, which would lead to hell like:

matrix:
  cfg:
    - { id: ubuntu-gcc-werror, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: "-DCMAKE_CXX_FLAGS='-Werror=all -Werror=extra'"}
    - { id: ubuntu-gcc-aubsan, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: "-DCMAKE_CXX_FLAGS=-fsanitize=address -fsanitize=undefined"}
    - { id: ubuntu-gcc-tsan, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: "-DCMAKE_CXX_FLAGS=-fsanitize=thread"}
    - { id: ubuntu-gcc-static, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: ""}
    - { id: ubuntu-gcc-dynamic, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: "-DBUILD_SHARED_LIBS=on"}
    - { id: ubuntu-clang-static, platform: ubuntu, cc: clang, cpp: clang++, cmake_args: ""}
    - { id: ubuntu-clang-dynamic, platform: ubuntu, cc: clang, cpp: clang++, cmake_args: "-DBUILD_SHARED_LIBS=on"}
    - { id: ubuntu-gcc-static-cxx20, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: "-DCMAKE_CXX_STANDARD=20 -DCMAKE_CXX_STANDARD_REQUIRED=on"}
    - { id: ubuntu-gcc-static-cxx23, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: "-DCMAKE_CXX_STANDARD=23 -DCMAKE_CXX_STANDARD_REQUIRED=on"}
    - { id: ubuntu-gcc-static-cxx26, platform: ubuntu, cc: gcc, cpp: g++, cmake_args: "-DCMAKE_CXX_STANDARD=26 -DCMAKE_CXX_STANDARD_REQUIRED=on"}
    - { id: ubuntu-clang-static-cxx20, platform: ubuntu, cc: clang, cpp: clang++, cmake_args: "-DCMAKE_CXX_STANDARD=20 -DCMAKE_CXX_STANDARD_REQUIRED=on"}
    - { id: ubuntu-clang-static-cxx23, platform: ubuntu, cc: clang, cpp: clang++, cmake_args: "-DCMAKE_CXX_STANDARD=23 -DCMAKE_CXX_STANDARD_REQUIRED=on"}
    - { id: ubuntu-clang-static-cxx26, platform: ubuntu, cc: clang, cpp: clang++, cmake_args: "-DCMAKE_CXX_STANDARD=26 -DCMAKE_CXX_STANDARD_REQUIRED=on"}

And this is before there is any "different compilers need different commands" scenario.

You can already see the pain starting in the MSVC CI PR. I have to include MSVC builds as extras (not part of the permutation); as a result, it only runs ASan on C++17 (while other compilers run ASan on [17..26]). Otherwise the matrix would be:

      matrix:
        platform:
          - description: "ubuntu gcc"
            cpp: g++
            c: gcc
            os: ubuntu-latest
          - description: "ubuntu clang"
            cpp: clang++
            c: clang
            os: ubuntu-latest
        cpp_version: [17, 20, 23, 26]
        cmake_args:
          - description: "Default"
            args: ""
          - description: "TSan"
            args: "-DCMAKE_CXX_FLAGS=-fsanitize=thread"
          - description: "ASan"
            args: "-DCMAKE_CXX_FLAGS='-fsanitize=address -fsanitize=undefined'"
        include:
          - platform:
              description: "windows MSVC"
              cpp: cl
              c: cl
              os: windows-latest
            cpp_version: 17
            cmake_args:
              description: "ASan"
              # Debug information needed to avoid cl: C5072
              # https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-c5072?view=msvc-170
              args: "-DCMAKE_CXX_FLAGS='/fsanitize=address /Zi'"
          - platform:
              description: "windows MSVC"
              cpp: cl
              c: cl
              os: windows-latest
            cpp_version: 20
            cmake_args:
              description: "ASan"
              # Debug information needed to avoid cl: C5072
              # https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-c5072?view=msvc-170
              args: "-DCMAKE_CXX_FLAGS='/fsanitize=address /Zi'"
          - platform:
              description: "windows MSVC"
              cpp: cl
              c: cl
              os: windows-latest
            cpp_version: 23
            cmake_args:
              description: "ASan"
              # Debug information needed to avoid cl: C5072
              # https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-c5072?view=msvc-170
              args: "-DCMAKE_CXX_FLAGS='/fsanitize=address /Zi'"
          - platform:
              description: "windows MSVC"
              cpp: cl
              c: cl
              os: windows-latest
            cpp_version: 26
            cmake_args:
              description: "ASan"
              # Debug information needed to avoid cl: C5072
              # https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-c5072?view=msvc-170
              args: "-DCMAKE_CXX_FLAGS='/fsanitize=address /Zi'"

That is what it takes for MSVC ASan to cover C++17..26: 64 lines of highly repetitive CI code to generate this matrix, of which 43 lines are added simply because MSVC uses a different options interface and has very restricted ASan support. This is not readable and thus highly error-prone. The same would be needed for Apple clang once we add Apple clang tests to CI, because the set of sanitizers it supports is different. These combinations would get worse if we want to add C++14 support as well.

If we have a utility like the one this PR provides, the matrix would be simplified to:

      matrix:
        platform:
          - description: "ubuntu gcc"
            cpp: g++
            c: gcc
            os: ubuntu-latest
          - description: "ubuntu clang"
            cpp: clang++
            c: clang
            os: ubuntu-latest
          - description: "windows MSVC"
            cpp: cl
            c: cl
            os: windows-latest
        cpp_version: [17, 20, 23, 26]
        cmake_args:
          - description: "Default"
            args: ""
          - description: "TSan"
            args: "-DBEMAN_BUILDSYS_SANITIZER=TSan"
          - description: "ASan"
            args: "-DBEMAN_BUILDSYS_SANITIZER=ASan"
        exclude:
          # MSVC CL does not support TSan
          - platform:
              cpp: cl
            cmake_args:
              description: "TSan"

Where the only addition needed is:

          - description: "windows MSVC"
            cpp: cl
            c: cl
            os: windows-latest

and

        exclude:
          # MSVC CL does not support TSan
          - platform:
              cpp: cl
            cmake_args:
              description: "TSan"

This will make MSVC not as much of an exception and create less noise.

If we create a Python script or a CI step that runs a bash script to generate what TSan/ASan means, then we are essentially rewriting this PR in another language. But writing this as a CMake utility has an advantage:

This problem applies not only to CI (though perhaps especially to CI), but also to the presets. When adding a new compiler/toolchain, we have to specify what ASan is, as we want xxx-debug to match the ASan sanitizer set on CI. The problem there is not that we need to write an extra line of code; it's that we might end up with two definitions of what ASan is, one in CI and one in the presets.

If someone updates CI or the presets without updating the other (maybe because of the hundreds of lines of code in CI), we have a potential "it works on my machine but not on CI" issue, or vice versa. This would potentially be really hard to debug.

I see the need for a simple CMake infrastructure, and I think this utility doesn't need to be included in the main CMake script. But I don't think the tradeoff of "having a minimal CMake script and offloading everything to CI" is reasonable in this specific instance. We have to deal with this complexity if we want to run sanitizers in CI. If we want to use a declarative way to generate the matrix, we will have to write a specific script to deal with sanitizer support, and a CMake script like the one included in this PR would be the ideal way to implement it.
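
To make the "(compiler, sanitizer group) => compiler options" mapping concrete, here is a minimal sketch of what such a helper could look like. `beman_sanitizer_flags()` and the file name are illustrative assumptions, not the code in this PR; the flag strings are the ones mentioned in this comment.

```cmake
# sanitizer_flags.cmake (hypothetical helper; run with: cmake -P sanitizer_flags.cmake)
cmake_minimum_required(VERSION 3.20)

# Maps (compiler id, sanitizer group) to a space-separated flag string.
# Unknown compiler ids fall through with an empty flag string.
function(beman_sanitizer_flags compiler_id group out_var)
    set(flags "")
    if(group STREQUAL "ASan")
        if(compiler_id STREQUAL "MSVC")
            # /Zi supplies the debug information that avoids warning C5072.
            set(flags "/fsanitize=address /Zi")
        elseif(compiler_id MATCHES "^(GNU|Clang|AppleClang)$")
            set(flags "-fsanitize=address -fsanitize=undefined")
        endif()
    elseif(group STREQUAL "TSan")
        if(compiler_id STREQUAL "MSVC")
            message(FATAL_ERROR "MSVC does not support TSan")
        elseif(compiler_id MATCHES "^(GNU|Clang|AppleClang)$")
            set(flags "-fsanitize=thread")
        endif()
    elseif(NOT group STREQUAL "OFF")
        message(FATAL_ERROR "Unknown sanitizer group: ${group}")
    endif()
    set(${out_var} "${flags}" PARENT_SCOPE)
endfunction()

# Example lookups for two of the matrix combinations.
beman_sanitizer_flags("Clang" "ASan" clang_asan)
beman_sanitizer_flags("MSVC" "ASan" msvc_asan)
message(STATUS "Clang/ASan -> ${clang_asan}")
message(STATUS "MSVC/ASan  -> ${msvc_asan}")
```

With something along these lines, both the CI matrix and the presets only need to name the group, and the per-compiler expansion lives in one place.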

@camio
Contributor

camio commented Dec 6, 2024

@wusatosi, thanks for this explanation. I think I understand what you're going for now.

What do you mean here? vs code isn't a compiler.

Ah, I mean xcode here.

XCode is also an IDE. I think when you say XCode, you're trying to refer to the native MacOS compiler, which is called AppleClang.

sanitizers could be generally grouped into two groups (tsan and asan)

This grouping is not well-defined. You mention that ASan is enabled on GCC with the -fsanitize=address -fsanitize=undefined flags, but that is enabling two distinct sanitizers: address sanitizer, and undefined behavior sanitizer.

What if the grouping has a different shape? GCC's thread sanitizer is unique in that it may not be used with other sanitizers. Say we create a flag configuration named RUNTIME_INSTRUMENTATION that is defined to be the recommended intersection of compatible runtime instrumentation flags for a given compiler. We could have another flag configuration called ENFORCED_WARNINGS which would be the recommended intersection of warning and error-on-warning flags for a given compiler. For TSan-like instrumentation we could also create a MULTITHREAD_RUNTIME_INSTRUMENTATION, but I don't think we have a use case for that just yet.

The next question is where to put these flag groupings. They don't belong in our CMakeLists.txt because the cmake invoker is responsible for providing compiler flags. They also don't belong in a flattened CI matrix for the good reasons you pointed out.

What if we create per-compiler toolchains that make use of a "BEMAN_FLAG_SET" variable? The toolchain for AppleClang could look something like this:

set(CMAKE_CXX_COMPILER /usr/bin/clang++)
set(CMAKE_C_COMPILER /usr/bin/clang)
if( "RUNTIME_INSTRUMENTATION" IN_LIST BEMAN_FLAG_SET )
    list(APPEND CMAKE_CXX_FLAGS "-fsanitize=address" "-fsanitize=pointer-compare" "-fsanitize=pointer-subtract" "-fsanitize=leak" "-fsanitize=undefined")
    # TODO: Error out if "MULTITHREAD_RUNTIME_INSTRUMENTATION" is also within BEMAN_FLAG_SET
endif()
if( "ENFORCED_WARNINGS" IN_LIST BEMAN_FLAG_SET )
    list(APPEND CMAKE_CXX_FLAGS "-Wall" "-Wextra" "-Werror")
endif()
# ...

A similar file is created for each of our supported compilers. CMake can then be invoked with options like -DCMAKE_TOOLCHAIN_FILE=/path/to/BemanAppleClangToolchain.cmake -DBEMAN_FLAG_SET=RUNTIME_INSTRUMENTATION.

I believe this would enable us to have a simple CI matrix. These toolchains can potentially live in a separate repository that is shared by the CI specifications for all libraries. (Aside: If our CI configurations check out a particular commit id of the toolchain repository, it would facilitate a gradual migration to toolchain improvements).

A drawback of this approach is that our presets will repeat some of the information in our CI toolchain files. This is an acceptable tradeoff IMO to minimize the complexity of Beman build files.
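
A slightly more defensive variant of the toolchain above might look like the following sketch. It is only one possible shape under stated assumptions: it seeds `CMAKE_CXX_FLAGS_INIT` with `string(APPEND)` instead of appending to `CMAKE_CXX_FLAGS` as a list (a CMake list would be joined with `;`), it fills in the TODO, and it leaves out `-fsanitize=leak` because the macOS trace earlier in this thread excludes it. The file name is the hypothetical one used above.

```cmake
# BemanAppleClangToolchain.cmake (hypothetical sketch)
set(CMAKE_C_COMPILER /usr/bin/clang)
set(CMAKE_CXX_COMPILER /usr/bin/clang++)

# CMake re-reads the toolchain file inside its try_compile() checks;
# forward the variable so those checks can see the same flag set.
list(APPEND CMAKE_TRY_COMPILE_PLATFORM_VARIABLES BEMAN_FLAG_SET)

# The two instrumentation groups cannot be combined.
if("RUNTIME_INSTRUMENTATION" IN_LIST BEMAN_FLAG_SET
   AND "MULTITHREAD_RUNTIME_INSTRUMENTATION" IN_LIST BEMAN_FLAG_SET)
    message(FATAL_ERROR
        "RUNTIME_INSTRUMENTATION and MULTITHREAD_RUNTIME_INSTRUMENTATION are mutually exclusive")
endif()

if("RUNTIME_INSTRUMENTATION" IN_LIST BEMAN_FLAG_SET)
    # Space-separated string, so the flags reach the compiler as separate arguments.
    string(APPEND CMAKE_CXX_FLAGS_INIT " -fsanitize=address -fsanitize=undefined")
endif()
if("ENFORCED_WARNINGS" IN_LIST BEMAN_FLAG_SET)
    string(APPEND CMAKE_CXX_FLAGS_INIT " -Wall -Wextra -Werror")
endif()
```

Several groups can then be requested at once, e.g. `-DBEMAN_FLAG_SET="RUNTIME_INSTRUMENTATION;ENFORCED_WARNINGS"`.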

@steve-downey
Member

Clang's memory sanitizer also doesn't play well with other sanitizers. It also has the unfortunate property of needing to be globally applied, that is, all the libraries that touch memory need to be built with msan, just like tsan, and for essentially the same reason.

@wusatosi
Member Author

wusatosi commented Dec 6, 2024

sanitizers could be generally grouped into two groups (tsan and asan)

This grouping is not well-defined. You mention that ASan is enabled on GCC with the -fsanitize=address -fsanitize=undefined flags, but that is enabling two distinct sanitizers: address sanitizer, and undefined behavior sanitizer.

By this grouping I mean to have "the minimum number of distinct sanitizer sets that allows us to cover all the sanitizers". This was the original design goal for sanitizers on CI. Since the thread sanitizer usually conflicts with other sanitizers, this grouping comes down to TSan (thread sanitizer) and ASan (basically all other sanitizers, with the main interest being the address sanitizer; it also works as a handy abbreviation for "all other sanitizers").

The documentation of these sets is included in this PR.

# There's three possible values:
# TSan: Thread sanitizer
# ASan: All sanitizer (majorly Address sanitizer) that doesn't conflict with TSan
# OFF: No sanitizer

What if the grouping has a different shape? GCC's thread sanitizer is unique in that it may not be used with other sanitizers. Say we create a flag configuration named RUNTIME_INSTRUMENTATION that is defined to be the recommended intersection of compatible runtime instrumentation flags for a given compiler. We could have another flag configuration called ENFORCED_WARNINGS which would be the recommended intersection of warning and error-on-warning flags for a given compiler. For TSan-like instrumentation we could also create a MULTITHREAD_RUNTIME_INSTRUMENTATION, but I don't think we have a use case for that just yet.

I want to point to the documentation I included in the PR again: the intention of the sanitizer groups is only to accommodate the fact that we cannot enable all sanitizers at once. This is not an abstraction layer for general instrumentation-based tooling. This is to optimize the common case where we want all the compiler sanitizers we can have, not to perfectly define "threading instrumentation"; IMO that would be too much clutter for downstream.

Let's say one day the address sanitizer doesn't work with the undefined behavior sanitizer anymore; at that point (and my intention is only at that point) we can include an extra set.

In a sense, maybe I should just call the sanitizer groups "group a" and "group b" instead of "ASan" and "TSan" to avoid confusion.

The next question is where to put these flag groupings. They don't belong in our CMakeLists.txt because the cmake invoker is responsible for providing compiler flags. They also don't belong in a flattened CI matrix for the good reasons you pointed out.

What if we create per-compiler toolchains that make use of a "BEMAN_FLAG_SET" variable? The toolchain for AppleClang could look something like this:

set(CMAKE_CXX_COMPILER /usr/bin/clang++)
set(CMAKE_C_COMPILER /usr/bin/clang)
if( "RUNTIME_INSTRUMENTATION" IN_LIST BEMAN_FLAG_SET )
    list(APPEND CMAKE_CXX_FLAGS "-fsanitize=address" "-fsanitize=pointer-compare" "-fsanitize=pointer-subtract" "-fsanitize=leak" "-fsanitize=undefined")
    # TODO: Error out if "MULTITHREAD_RUNTIME_INSTRUMENTATION" is also within BEMAN_FLAG_SET
endif()
if( "ENFORCED_WARNINGS" IN_LIST BEMAN_FLAG_SET )
    list(APPEND CMAKE_CXX_FLAGS "-Wall" "-Wextra" "-Werror")
endif()
# ...

A similar file is created for each of our supported compilers. CMake can then be invoked with options like -DCMAKE_TOOLCHAIN_FILE=/path/to/BemanAppleClangToolchain.cmake -DBEMAN_FLAG_SET=RUNTIME_INSTRUMENTATION.

I believe this would enable us to have a simple CI matrix. These toolchains can potentially live in a separate repository that is shared by the CI specifications for all libraries. (Aside: If our CI configurations check out a particular commit id of the toolchain repository, it would facilitate a gradual migration to toolchain improvements).

A drawback of this approach is that our presets will repeat some of the information in our CI toolchain files. This is an acceptable tradeoff IMO to minimize the complexity of Beman build files.

Per-compiler toolchains were the original intention behind this PR (or what was originally planned to come after this).
Every compiler gets its own toolchain file that specifies sanitizer set A and sanitizer set B.
But I think this goes against the minimal CMake philosophy we are going with here, especially when the code snippet above will likely be the only code needed for each toolchain; it kinda screams "I can be simplified".

In the simple case, all the options work generically across all the compilers; I don't see enough variation across compilers to warrant creating separate toolchain files in exemplar.

I want to again point out that the intended audience for this utility is only CI and the presets. It is not intended to be public facing. I think it is obvious that any specialization needed downstream (aside from turning off some sanitizer) will require the contributor to create special infrastructure for their respective repo. But I believe the common case for projects is that the default sanitizers are good enough, without extra per-compiler configuration that deviates from the common -fsanitize=xxx.

exemplar/CMakeLists.txt

Lines 32 to 33 in 55c966c

# BEMAN_BUILDSYS_SANITIZER is not a general use option
# It is used by preset and CI system.

@steve-downey
Member

An important reason for keeping flags out of the core CI is to be buildable by package managers, which need to be in charge of the compilation and compilers being used in order to do their jobs. Different compilers with the same flags are sometimes as incompatible as the same compiler with different flags. If the package manager can supply its toolchain files and rely on us not messing with the request, it can mostly work out of the box.

@camio
Contributor

camio commented Dec 11, 2024

Per-compiler toolchains were the original intention behind this PR (or what was originally planned to come after this). Every compiler gets its own toolchain file that specifies sanitizer set A and sanitizer set B. But I think this goes against the minimal CMake philosophy we are going with here, especially when the code snippet above will likely be the only code needed for each toolchain; it kinda screams "I can be simplified".

This PR proposes to add complexity to the top-level CMakeLists.txt and platform-specific flag selection. What I am proposing involves no changes to the top-level CMakeLists.txt file and no platform-specific flag selection. All complexity is moved to CI and the toolchains it uses. I disagree that my proposal goes against the minimal CMake approach.

@ClausKlein

ClausKlein commented Dec 11, 2024

This PR proposes to add complexity to the top-level CMakeLists.txt and platform-specific flag selection. What I am proposing involves no changes to the top-level CMakeLists.txt file and no platform-specific flag selection. All complexity is moved to CI and the toolchains it uses. I disagree that my proposal goes against the minimal CMake approach.

You may use a simpler solution like this: bemanproject/optional26#85 (comment)

@camio
Contributor

camio commented Dec 11, 2024

This PR proposes to add complexity to the top-level CMakeLists.txt and platform-specific flag selection. What I am proposing involves no changes to the top-level CMakeLists.txt file and no platform-specific flag selection. All complexity is moved to CI and the toolchains it uses. I disagree that my proposal goes against the minimal CMake approach.

You may use a simpler solution like this: bemanproject/optional26#85 (comment)

If that snippet you pointed to was included in the toolchain files and not included in the top-level CMakeLists.txt file, my concerns would be addressed.

@ClausKlein

ClausKlein commented Dec 11, 2024

If that snippet you pointed to was included in the toolchain files and not included in the top-level CMakeLists.txt file, my concerns would be addressed.

The toolchain files are not really needed.
They set CXX_FLAGS and LDFLAGS directly, which is not recommended!

The compiler may be set via the environment: CXX=clang++-19 cmake ....

The sanitizer options may be set in the cmake configure presets as needed.


Or you may use this kind of CMake workflow presets:
https://github.com/ClausKlein/inplace_vector/pull/3/files#diff-d52971495d70bd9ea6d015a0e0f0cccbadd92693c723babb24e443a7d9f03255

@wusatosi
Member Author

This PR proposes to add complexity to the top-level CMakeLists.txt and platform-specific flag selection. What I am proposing involves no changes to the top-level CMakeLists.txt file and no platform-specific flag selection. All complexity is moved to CI and the toolchains it uses. I disagree that my proposal goes against the minimal CMake approach.

Okay I can try to go back to implement toolchain file

@wusatosi
Member Author

@camio I implemented this using toolchain files, is this more what you are looking for?

@wusatosi wusatosi requested a review from camio December 12, 2024 00:25
cmake/gnu-toolchain.cmake (resolved)
set(CMAKE_C_COMPILER gcc)
set(CMAKE_CXX_COMPILER g++)

if(BEMAN_BUILDSYS_SANITIZER STREQUAL "ASan")
Member


Modeling sanitizers as a build type works better, either that or an entirely distinct toolchain, so Thread and Memory can be uniformly applied to all packages. If everything in an address space isn't using msan or tsan, the reports they provide are broken, so you have to rebuild and relink the whole world consistently.
The UB and address sanitizers don't suffer from the same problems.

So something like (not tested!):

set(CMAKE_CXX_FLAGS_ASAN
    "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -fsanitize=address,undefined,leak"
    CACHE STRING
    "C++ ASAN Flags"
    FORCE
)

Also at -O0 there's often no undefined behavior emitted for the sanitizer to see, for the same reasons that debug builds seg fault less often.

Member


(optional26 doesn't use the _INIT variables because it's copied from ancient sources before that rule was clarified. Above should be using the *_INIT vars)

Member Author


Thanks for the write-up.

Separate ASAN build targets seem a bit like overkill for exemplar's use case.

The design goal here isn't to have a full-fledged instrumentation-based analysis build system, but just a shorthand for "enable all flags for sanitizers".

Given that exemplar has no dependencies, and that the current recommendation for dependency management is to build with the dependency's source code instead of including the dependent library at link time, I don't think there's value in the added complication here.

Member


Remember, though, exemplar doesn't do anything. Its entire purpose is to serve as a starting point and reference point for further work. Everything we've done is entirely overkill for providing ... checks notes ... std::identity.

Recommending building as part of the depender's source tree is a huge overstatement. We're making that possible, but it's still a terrible idea and does not scale to large systems. Getting to the point where we play well with package systems with public visibility is still on the todo list. (I haven't made it work with my internal one, but I know exactly how to.)

Member Author


Ah, I see where you are coming from. I didn't think about exemplar as a dependency of other libraries and was only commenting on its own use of dependencies and its role as a standalone development library / CI test target.

I think you are right: there should be an ASAN target that produces an ASAN-enabled library so someone could link us as a dependency. I get what you are talking about. But I think this is more of a package/export issue, outside the scope of this PR for now and, to be honest, outside of my skill tree for now.

Again, the main motivation here is just to simplify the CI workflow.

Honestly, I am tentatively waiting for someone to implement package export, do a quick write-up, then evaluate it and yoink it over (just like code coverage).

Could we delegate this suggestion to another PR? Let me know if I should add something/ structure this tool chain in anticipation of this feature.

@steve-downey
Member

Figuring out better ergonomics for handling sanitizers (and fuzzers, coverage, and the rest of the laundry list) can be ongoing work. Getting sanitizers in CI is an immediate improvement.

I would base the sanitizers on the release or relwithdebinfo profile in CI, as debug tends to not exercise any of the runtime problems that the sanitizers detect.
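
As a sketch of that suggestion, a toolchain file could seed the RelWithDebInfo flags instead of the Debug ones. This is only an illustration under the assumption that the `BEMAN_BUILDSYS_SANITIZER` variable from this PR is kept; the flag list is an example, not a recommendation for every compiler.

```cmake
# Fragment for a GCC/Clang toolchain file (hypothetical sketch)
if(BEMAN_BUILDSYS_SANITIZER STREQUAL "ASan")
    # Seed RelWithDebInfo rather than Debug, so the sanitizers run against
    # optimized code while debug information is still available for reports.
    string(APPEND CMAKE_C_FLAGS_RELWITHDEBINFO_INIT
        " -fsanitize=address -fsanitize=undefined")
    string(APPEND CMAKE_CXX_FLAGS_RELWITHDEBINFO_INIT
        " -fsanitize=address -fsanitize=undefined")
endif()
```

CI would then configure with `-DCMAKE_BUILD_TYPE=RelWithDebInfo -DBEMAN_BUILDSYS_SANITIZER=ASan` in addition to `-DCMAKE_TOOLCHAIN_FILE=...`.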

4 participants