Leftover processes after successful bun run #14

Open
NathanReb opened this issue Jul 17, 2019 · 3 comments · May be fixed by #16

@NathanReb
Contributor

I've been toying around with bun this afternoon in order to add a small tutorial to https://github.com/NathanReb/ocaml-afl-examples and a section to my upcoming blog article about AFL fuzzing and OCaml, and something weird happened.

I found out that, for some reason, when I fuzz one of my binaries, bun seems to leave some processes running.

I defined dune aliases to invoke bun with the right set of parameters. When I build those aliases everything seems to work just fine:

$ dune build @awesome-list/bun-fuzz --no-buffer
Done: 19/21 (jobs: 1)13:23.57:Fuzzers launched: [1 (pid=23963); 2 (pid=23964); 3 (pid=23965);
                            4 (pid=23966); 5 (pid=23967); 6 (pid=23968);
                            7 (pid=23969); 8 (pid=23970)].
13:23.58:Fuzzer 6 (pid=23968) finished
Crashes found! Take a look; copy/paste to save for reproduction:
echo J3JhVVlMcCA= | base64 -d > crash_0.$(date -u +%s)
13:23.58:[ERROR]All fuzzers finished, but some crashes were found!
         bun alias awesome-list/fuzz/bun-fuzz (exit 125)
(cd _build/default/awesome-list/fuzz && /home/nathan/.opam/4.07.1+afl/bin/bun --input=input --output=output -- ./fuzz.exe)

but it turns out it leaves some processes running:
Screenshot from 2019-07-17 15-58-36

I found out about it because, even though the command line for those seems to indicate otherwise, afl-fuzz considers them to be afl-fuzz processes. A subsequent bun invocation will fail because afl-fuzz believes the given core is being used by another afl-fuzz process for some reason.
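For context, here is a rough sketch of the kind of check that could produce this behaviour. As far as I recall, afl-fuzz 2.52b decides which cores are taken by scanning /proc/<pid>/status for userspace processes whose Cpus_allowed_list names exactly one core, without looking at the command line. Treat that as an assumption to verify against the AFL source. Leftover children presumably keep the affinity they inherited, which would explain why they are counted. The function names `pinned_core` and `list_pinned` are made up for this illustration.

```shell
#!/bin/sh
# Sketch only: approximates the check described above; not AFL's actual code.

# Print the core number if the given /proc/<pid>/status-style file describes
# a userspace process pinned to exactly one core; print nothing otherwise.
pinned_core() {
    awk '
        # Entries without VmSize are most likely kernel threads.
        /^VmSize:/ { has_vmsize = 1 }
        /^Cpus_allowed_list:/ {
            # A value such as "0-7" or "0,2" means more than one core.
            if (has_vmsize && $2 ~ /^[0-9]+$/) print $2
            exit
        }
    ' "$1"
}

# List every process that currently appears to be pinned to a single core.
list_pinned() {
    for status in /proc/[0-9]*/status; do
        [ -r "$status" ] || continue
        # The process may exit between the check and the read; skip it then.
        core=$(pinned_core "$status" 2>/dev/null) || continue
        if [ -n "$core" ]; then
            pid=${status#/proc/}
            echo "pid=${pid%/status} core=$core"
        fi
    done
}
```

Running `list_pinned` after a bun run should show whether the leftover fuzz.exe processes are still bound to a single core each.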

I tried running the bun command myself, wondering if dune was interfering somehow, but the result is the same.

I'll try to debug this further, but I'm not yet super familiar with how bun works internally, so I was hoping you might have an idea of what's going wrong here.

You can take a look at NathanReb/ocaml-afl-examples#1 if you want to know a bit more about the dune aliases specification or the various binaries being fuzzed but in the following example it's just a crowbar binary with a single really basic test and a fairly regular bun invocation.

Please let me know if you need any further detail!

@NathanReb
Contributor Author

NathanReb commented Jul 17, 2019

Just tried running one of those and it just seems to exit normally:

$ ./fuzz.exe findings/1/.cur_input 
Awesome_list.sort: ....
Awesome_list.sort: PASS

@edwintorok
Contributor

edwintorok commented Sep 10, 2020

I'm seeing something similar: the afl-fuzz processes get killed by SIGKILL, but their children, the crowbar-fuzzed processes, are all still alive (well, actually they seem to be SIGSTOPed due to code in ocaml-afl-persistent).
This is with afl 2.52b on Fedora 32, and ocaml 4.11.1+afl.

It is unclear how to kill these, since their process group id is different from that of the afl parents too (well, I can use pkill -9 <myfuzzedprogram.exe>, but it is unclear how to do this reliably in a programmatic way).

Changing proc#terminate to proc#kill 15 doesn't help either; the SIGSTOPed children stay.
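As a stopgap, the pkill approach can be narrowed a little by only targeting processes that are both stopped and named like the fuzzed binary. The following is a minimal sketch under that assumption; `stopped_pids` and `kill_stopped` are hypothetical helper names, and it is still name-based, so it shares pkill's basic weakness of possibly hitting unrelated processes with the same name.

```shell
#!/bin/sh
# Sketch only: name-based cleanup of stopped processes.

# Read "pid stat comm" lines on stdin and print the pids whose state starts
# with T (stopped by a signal) and whose command name equals $1.
stopped_pids() {
    awk -v name="$1" '$2 ~ /^T/ && $3 == name { print $1 }'
}

# Send SIGKILL to every stopped process with the given command name.
# SIGKILL is delivered even to stopped processes, unlike SIGTERM, which
# stays pending until the process is continued.
kill_stopped() {
    # Separate -o options with empty headers suppress the header line.
    ps -e -o pid= -o stat= -o comm= | stopped_pids "$1" | while read -r pid; do
        kill -KILL "$pid" 2>/dev/null || true
    done
}
```

For example, `kill_stopped fuzz.exe`. Note that on Linux the comm field is truncated to 15 characters, and a command name containing spaces would not match, so long or unusual binary names need extra care.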

@edwintorok
Contributor

https://github.com/jwilk/python-afl/blob/c1b0c535b0c0c8a9486110ebf3b27cfe90a66774/tests/test_fuzz.py#L140-L161 says this is an issue in afl itself. The workaround in python-afl is quite ugly though. I'm sending a PR that uses cgroups to fix it.

edwintorok added a commit to edwintorok/ocaml-bun that referenced this issue Sep 16, 2020
In permanent mode AFL leaves some processes behind:
ocurrent#14 (comment)

This is partly due to bun using proc#terminate (SIGKILL),
but even with SIGTERM there are race conditions in AFL
and some processes stay behind:
https://groups.google.com/d/topic/afl-users/E37s4YDti7o

This can be observed by just doing a 'dune runtest' in bun itself,
and then `ps -ef|grep short.exe` on a multicore machine.

The permanent mode processes stop themselves with SIGSTOP and wait for
their afl parent to unblock them, but their afl parent exits, so they
stay around forever.
Which wouldn't be a problem, except for afl-gotcpu, which detects that
these processes are around and refuses to start more fuzzing jobs.
The processes' parent pid, gid and sid are unrelated to bun or the afl
fuzzer processes (afl/forkserver does a setsid call),
so we have no easy way of finding these processes.

Use cgroups when available to kill child processes reliably: this allows
us to easily find all the pids of (grand)children.

Signed-off-by: Edwin Török <[email protected]>
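To illustrate the idea in the commit message above: cgroup membership is inherited across fork and is not changed by setsid, so every (grand)child of a process placed in a dedicated cgroup remains listed in that cgroup's cgroup.procs file. The sketch below is not the code from the PR; the cgroup directory is passed in as an argument and is assumed to have been created, and populated with the launcher process, before the fuzzers were started. `cgroup_pids` and `kill_cgroup` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: kill everything listed in a cgroup directory.

# Print the pids listed in the cgroup.procs file of cgroup directory $1.
cgroup_pids() {
    cat "$1/cgroup.procs"
}

# Send SIGKILL to every member of the cgroup, retrying a few times because
# members may fork between reading the list and sending the signal.
# Returns 0 once the cgroup is empty, non-zero if members remain.
kill_cgroup() {
    for _attempt in 1 2 3 4 5; do
        pids=$(cgroup_pids "$1")
        if [ -z "$pids" ]; then
            return 0
        fi
        # Word splitting of $pids is intentional: one pid per line.
        kill -KILL $pids 2>/dev/null || true
        sleep 1
    done
    [ -z "$(cgroup_pids "$1")" ]
}
```

A call might look like `kill_cgroup /sys/fs/cgroup/bun-fuzz`, where that path is purely an example; the actual location depends on the cgroup version and on how the system delegates cgroups to unprivileged users.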
edwintorok added a commit to edwintorok/ocaml-bun that referenced this issue Sep 16, 2020
In permanent mode AFL leaves some processes behind:
ocurrent#14 (comment)

This is partly due to bun using proc#terminate (SIGKILL),
but even with SIGTERM there are race conditions in AFL
and some processes stay behind:
https://groups.google.com/d/topic/afl-users/E37s4YDti7o

This can be observed by just doing a 'dune runtest' in bun itself,
and then `ps -ef|grep short.exe` on a multicore machine.

The permanent mode processes stop themselves with SIGSTOP and wait for
their afl parent to unblock them, but their afl parent exits, so they
stay around forever.
Which wouldn't be a problem, except for afl-gotcpu, which detects that
these processes are around and refuses to start more fuzzing jobs.
The processes' parent pid, gid and sid are unrelated to bun or the afl
fuzzer processes (afl/forkserver does a setsid call),
so we have no easy way of finding these processes.

Use cgroups when available to kill child processes reliably: this allows
us to easily find all the pids of (grand)children.

Fixes ocurrent#14
Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml-bun that referenced this issue Sep 16, 2020
@edwintorok edwintorok linked a pull request Sep 16, 2020 that will close this issue
edwintorok added a commit to edwintorok/ocaml-bun that referenced this issue Sep 16, 2020
In permanent mode AFL leaves some processes behind:
ocurrent#14 (comment)

This is partly due to bun using proc#terminate (SIGKILL),
but even with SIGTERM there are race conditions and some processes stay
behind.

This can be observed by just doing a 'dune runtest' in bun itself,
and then `ps -ef|grep short.exe` on a multicore machine.

The permanent mode processes stop themselves with SIGSTOP and wait for
their afl parent to unblock them, but their afl parent exits, so they
stay around forever.
Which wouldn't be a problem, except for afl-gotcpu, which detects that
these processes are around and refuses to start more fuzzing jobs.
The processes' parent pid, gid and sid are unrelated to bun or the afl
fuzzer processes (afl/forkserver does a setsid call),
so we have no easy way of finding these processes.

Use cgroups when available to kill child processes reliably: this allows
us to easily find all the pids of (grand)children.

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml-bun that referenced this issue Sep 17, 2020