AddressSanitizer #187

Merged
merged 21 commits into from
Jan 4, 2024

Conversation

Hyxogen
Contributor

@Hyxogen Hyxogen commented Nov 30, 2023

This PR implements (basic) AddressSanitizer functionality for cheerp-wasm.

For now I'll leave this as a draft: I'm still planning some minor changes, need to set up a few other things for the CI, and would also like to get some feedback.

Getting started

Docker

If you want to use it right away, you can use the following Docker image: hyxogen/cheerp-asan:latest. I'll keep this image updated with the latest changes from my development branch.

Cheerp will be located in /opt/cheerp, and you'll also have the following aliases set:

alias cheerp="/opt/cheerp/bin/clang"
alias cheerp++="/opt/cheerp/bin/clang++"

Building it yourself

If you want to build it on your own system, build the entire cheerp toolchain as usual, with compiler-rt as the last step. Please note that you'll need the latest versions of all the cheerp repositories. If you haven't been keeping up with them, you'll have to rebuild them all, or you will run into a lot of errors.

Building compiler-rt
To be able to use asan, you'll have to compile and install the asan runtime library.

Build instructions:

cd compiler-rt
cmake -DCMAKE_INSTALL_PREFIX="$CHEERP_DEST" -B build -C CheerpCmakeConf.cmake \
	-DCMAKE_TOOLCHAIN_FILE="$CHEERP_DEST/share/cmake/Modules/CheerpWasmToolchain.cmake" .
make -C build install # NOTE: Parallel builds do NOT work

If for some reason the above command does not work, you're probably using an old version of clang (<14.0). You should
still be able to set things up with the following (assuming that you've already built cheerp, and that the cheerp build
directory is named build and sits at the root of the repo):

cmake -DCMAKE_INSTALL_PREFIX="$CHEERP_DEST" -B build -C CheerpCmakeConf.cmake \
	-DLLVM_DIR="$PWD/../build/lib/cmake/llvm" \
	-DCMAKE_TOOLCHAIN_FILE="$CHEERP_DEST/share/cmake/Modules/CheerpWasmToolchain.cmake" .

Usage

Once installed, AddressSanitizer should work (pretty much) as it does on native. An example:

# cat test.c
int main()
{
  int *p = 0;
  return *p;
}
# "$CHEERP_DEST/bin/clang" -fsanitize=address -cheerp-pretty-code test.c
# node a.out
=================================================================
==1==ERROR: AddressSanitizer: unknown-crash on address 0x00000000 at pc 0x000192b6 bp 0x00000000 sp 0x00100ffc
READ of size 4 at 0x00000000 thread T0
    #0 0x192b6 in main (main+0x192b6)
    #1 0x1f2c3 in _start (main+0x1f2c3)
    #2 0x80000145 in <unknown function> main:325

Address 0x00000000 is a wild pointer inside of access range of size 0x00000004.
SUMMARY: AddressSanitizer: unknown-crash (main+0x192b6) in main
Shadow bytes around the buggy address:
=>0x00000000:[fe]fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  0x00000080: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  0x00000100: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  0x00000180: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  0x00000200: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  0x00000280: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1==ABORTING
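
To make the shadow byte legend more concrete, here is a minimal sketch in C of how a shadow byte is commonly interpreted. It follows the general AddressSanitizer design rather than the cheerp port specifically. The names shadow_index and access_is_valid are made up for illustration, the shadow offset that a real runtime adds is left out because the value used on cheerp-wasm isn't stated here, and the check assumes an access of at least one byte that does not cross an 8-byte boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One shadow byte covers 8 application bytes (see the legend above). */
#define SHADOW_GRANULARITY 8u

/* Index of the shadow byte describing `addr`, relative to the start of the
   shadow region. A real runtime adds a platform-specific offset on top;
   that offset is deliberately omitted in this sketch. */
size_t shadow_index(uintptr_t addr)
{
    return (size_t)(addr / SHADOW_GRANULARITY);
}

/* Decide whether an access of `size` bytes at `addr` is allowed, given the
   shadow byte for that address:
     0        -> all 8 bytes are addressable
     1..7     -> only the first k bytes are addressable
     negative -> poisoned (0xfa, 0xfe, ... read as a signed byte)
   Assumes size >= 1 and that the access stays within one 8-byte granule. */
bool access_is_valid(int8_t shadow_value, uintptr_t addr, size_t size)
{
    if (shadow_value == 0)
        return true;
    if (shadow_value < 0)
        return false;
    size_t last_byte = (size_t)(addr % SHADOW_GRANULARITY) + size - 1;
    return last_byte < (size_t)shadow_value;
}
```

In the report above, the 4-byte read at address 0 is covered by a shadow byte of fe (ASan internal). Read as a signed byte that is -2, so access_is_valid(-2, 0, 4) is false, which is consistent with the access being reported.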

Two step compilation

If you compile and link in separate steps, you'll have to pass the -fsanitize=address flag to both steps.

Notes

LeakSanitizer

For LeakSanitizer to trigger, you'll have to manually call exit, as code compiled with cheerp doesn't call the atexit
callbacks when returning from main.

LeakSanitizer is not perfect.
It simply scans memory for pointers that still point to allocated memory. Because of this, it might miss leaks on
cheerp-wasm that it would find on native.
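
A minimal sketch of the exit pattern described above, in C. The function names run and run_then_exit are hypothetical and only serve the illustration. The idea is to end the program through exit() instead of returning from main, so that the atexit callbacks (and with them the leak check) get a chance to run.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical workload: allocates a buffer and, when `leak` is nonzero,
   deliberately does not free it, so LeakSanitizer has something to report.
   Returns 0 on success and 1 on failure. */
int run(int leak)
{
    char *buf = malloc(32);
    if (buf == NULL)
        return 1;
    strcpy(buf, "hello");
    size_t len = strlen(buf);
    if (!leak)
        free(buf);
    return len == 5 ? 0 : 1;
}

/* Entry-point pattern: finish via exit() rather than `return`, because
   returning from main does not run the atexit callbacks under cheerp. */
void run_then_exit(int leak)
{
    exit(run(leak));
}
```

Calling run_then_exit(1) from main is the intended use. Given the caveat above about how LeakSanitizer scans memory, whether this particular leak is actually reported on cheerp-wasm is not guaranteed.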

Function "x" does not detect memory errors

ASan can only detect memory errors in instrumented code or intercepted functions. If you're using a library, you'll have
to compile it with ASan as well for it to be able to pick up these errors.

As for intercepted functions, ASan currently only intercepts a handful of libc functions, so you may be using one
that isn't intercepted.
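
To illustrate the difference, here is a small sketch in C with two hypothetical helpers that write n bytes into an allocation of alloc bytes. With n larger than alloc, both contain the same overflow, but it is reached in different ways. The sketch assumes memset is among the intercepted functions, which this PR suggests but which you should verify for your build.

```c
#include <stdlib.h>
#include <string.h>

/* Writes `n` bytes with a hand-written loop. When this file is compiled
   with -fsanitize=address, every store is instrumented, so an overflow
   (n > alloc) is checked directly. Returns the last byte written, or -1. */
int fill_with_loop(size_t alloc, size_t n)
{
    unsigned char *buf = malloc(alloc);
    if (buf == NULL || n == 0) {
        free(buf);
        return -1;
    }
    for (size_t i = 0; i < n; i++)
        buf[i] = 0x2a;
    int last = buf[n - 1];
    free(buf);
    return last;
}

/* Writes `n` bytes through libc. The stores happen inside memset, so an
   overflow is only caught if memset is intercepted or libc itself was
   built with ASan. Returns the last byte written, or -1. */
int fill_with_memset(size_t alloc, size_t n)
{
    unsigned char *buf = malloc(alloc);
    if (buf == NULL || n == 0) {
        free(buf);
        return -1;
    }
    memset(buf, 0x2a, n);
    int last = buf[n - 1];
    free(buf);
    return last;
}
```

With valid sizes, for example fill_with_loop(8, 8), both helpers behave identically. The difference only shows up for a call such as fill_with_memset(8, 16), where detection depends on the interceptor.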

More reading

If you want to find out how AddressSanitizer works in depth:
https://github.com/google/sanitizers/wiki/AddressSanitizer

EDIT: Implemented asan flags and exceptions support

@Hyxogen Hyxogen requested a review from yuri91 November 30, 2023 13:46
This change is made in preparation for another change that also adds
stack information.

ASan and LSan need to know where the stack top and bottom are.

libasan makes heavy use of unsigned integers instead of pointers, but
in the IR calls functions with pointers. Previously cheerp did not
accept these conversions in replaceCallOfBitCastWithBitCastOfCall. Made
it so that on wasm/asmjs these conversions do take place.

The `simplifyCalls` routine in the GDA does not only "simplify calls"
but also replaces calls of bitcasts with bitcasts of the call. While
working on supporting exceptions with ASan, there was some generated
code where an invoke happened on a bitcast of a function. Because the
replacement was not applied to the invoke, it later resulted in the
(wrong) assumption that this call was an indirect call of an
unsupported type, causing the call to be replaced by an unreachable
instruction.

Made it so that the `simplifyCalls` routine runs on CallBase.

This allows asan to intercept calls to these functions.

This allows libasan to intercept __cxa_throw_wasm while still calling
the internal implementation.

This intrinsic returns the offset from the stack pointer to the most
recent alloca instruction on the caller's stack, which is 0 on cheerp.

This change is required for ASan, which uses this intrinsic to determine
where to poison when doing dynamic allocas.
ASan, by default, uses a big integer to poison multiple shadow bytes at
the same time. However, this doesn't take into account the alignment
requirements of the target platform, such as AsmJS, which requires
natural alignment.

This change adds the option -fsanitize-address-aligned-poisoning, which
will force ASan to poison the shadow byte by byte.
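
The difference between the two poisoning strategies can be sketched in C as follows. This is an illustration of the idea only, not the code that the instrumentation pass emits, and the function names poison_wide and poison_bytewise are hypothetical. The wide variant stands in for a single 8-byte store and uses memcpy so that the sketch itself stays well-defined for any address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Default strategy (conceptually): set 8 shadow bytes with one 64-bit
   store. If `shadow` is not 8-byte aligned, a real wide store would
   violate natural alignment; memcpy is used here only to keep the
   sketch portable. */
void poison_wide(uint8_t *shadow, uint8_t value)
{
    uint64_t pattern = 0x0101010101010101ULL * value; /* value in every byte */
    memcpy(shadow, &pattern, sizeof pattern);
}

/* Aligned strategy: set the shadow one byte at a time. Single-byte stores
   are always naturally aligned, at the cost of more store instructions. */
void poison_bytewise(uint8_t *shadow, uint8_t value, size_t count)
{
    for (size_t i = 0; i < count; i++)
        shadow[i] = value;
}
```

Both functions leave the shadow in the same state. For example, poisoning 8 bytes starting at an odd offset with the stack left redzone value f1 gives identical results either way; only the byte-wise version avoids the unaligned wide store.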

Cheerp doesn't accept globals without a name.

libasan and libwasm both define their own memset and memcpy, so
whichever library is specified first is the one that gets used. This can
lead to false negatives with ASan when libwasm is linked first.
@Hyxogen Hyxogen force-pushed the feat-asan branch 2 times, most recently from 51da83e to 77ffe74 Compare January 3, 2024 10:05
@Hyxogen
Contributor Author

Hyxogen commented Jan 4, 2024

I'm sure that something in the CI will not work, but this PR should be ready.

@Hyxogen Hyxogen marked this pull request as ready for review January 4, 2024 07:36
@yuri91 yuri91 merged commit ad8c279 into master Jan 4, 2024