Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic support for assert() #6529

Merged
merged 3 commits into from
Feb 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions polytracker/custom_abi/dfsan_abilist.txt
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,8 @@ fun:llrintl=functional

# Functions that produce an output that does not depend on the input (shadow is
# zeroed automatically).
fun:__assert_fail=discard
fun:__assert_fail=uninstrumented
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "uninstrumented" result in? Does this mean the function is stubbed and polytracker does not interact with it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pass treats every function in the uninstrumented category in the ABI list file as conforming to the native ABI. Unless the ABI list contains additional categories for those functions, a call to one of those functions will produce a warning message, as the labelling behavior of the function is unknown.

Source: https://clang.llvm.org/docs/DataFlowSanitizer.html#abi-list

Does this answer your question?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I didn't realize this is/was the same as the regular DFSan ABI list, my bad.

fun:__assert_fail=custom
fun:__ctype_b_loc=discard
fun:__cxa_atexit=discard
fun:__errno_location=discard
Expand Down Expand Up @@ -547,7 +548,6 @@ fun:__asinl_finite=uninstrumented
fun:__asprintf=uninstrumented
fun:__asprintf_chk=uninstrumented
fun:__assert=uninstrumented
fun:__assert_fail=uninstrumented
fun:__assert_perror_fail=uninstrumented
fun:__atan2_finite=uninstrumented
fun:__atan2f_finite=uninstrumented
Expand Down
4 changes: 2 additions & 2 deletions polytracker/custom_abi/polytracker_abilist.txt
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,8 @@ fun:llrintl=functional

# Functions that produce an output that does not depend on the input (shadow is
# zeroed automatically).
fun:__assert_fail=discard
fun:__assert_fail=uninstrumented
fun:__assert_fail=custom
fun:__ctype_b_loc=discard
fun:__cxa_atexit=discard
fun:__errno_location=discard
Expand Down Expand Up @@ -548,7 +549,6 @@ fun:__asinl_finite=uninstrumented
fun:__asprintf=uninstrumented
fun:__asprintf_chk=uninstrumented
fun:__assert=uninstrumented
fun:__assert_fail=uninstrumented
fun:__assert_perror_fail=uninstrumented
fun:__atan2_finite=uninstrumented
fun:__atan2f_finite=uninstrumented
Expand Down
4 changes: 2 additions & 2 deletions polytracker/include/taintdag/outputfile.h
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ template <Section... Sections> class OutputFile {

// The FileHeader is constructed at offset zero of the mapped memory (file).
// It contains the metadata required to parse a TDAG file. Initially each
// section is assumed to be allocation_size in size. On destruction of
// section is assumed to be 0 bytes in size. On destruction of
// the OutputFile instance the size field of each section is updated to
// reflect the actual, used size.
struct FileHeader {
Expand All @@ -97,7 +97,7 @@ template <Section... Sections> class OutputFile {
(SectionMeta{.tag = Sections::tag,
.align = Sections::align_of,
.offset = 0,
.size = Sections::allocation_size})...};
.size = 0})...};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does setting the size to 0 for a section mean the section technically won't exist when output?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so? I guess this was done to have correct information in the case when we have a section that's optional, like Events if function tracing is turned off. I think there should be a header entry for the section, but it should tell you there's no trace events in it and it's size is 0? Am I right @hbrodin ?

Copy link
Contributor

@surovic surovic Jan 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we should update the comment above this code to reflect this change :P

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just the initial value. The final value will be set in the destructor of the OutputFile. That behaviour hasn't changed, it is just that we don't assume the size to be max from the beginning, but rather to be empty.

};

OutputFile(std::filesystem::path const &filename)
Expand Down
18 changes: 18 additions & 0 deletions polytracker/src/taint_sources/taint_sources.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,9 @@ EARLY_CONSTRUCT_EXTERN_GETTER(taintdag::PolyTracker, polytracker_tdag);
using util::Length;
using util::Offset;

// To allow abort to somewhat gracefully finalize polytracker logging
extern void polytracker_end();

// To create some label functions
// Following the libc custom functions from custom.cc
namespace {
Expand Down Expand Up @@ -391,6 +394,21 @@ EXT_C_FUNC void __dfsw_exit(int ret_code, dfsan_label ret_code_label) {
exit(ret_code);
}

// Need this for some reason. Not sure why. Should already be in assert.h.
extern "C" void __assert_fail(const char *, const char *, unsigned,
const char *);
// Capture calls to abort and explicitly invoke polytracker_end to do a best
// effort at updating the tdag with correct sizes.
EXT_C_FUNC void __dfsw___assert_fail(const char *msg, const char *file,
unsigned line, const char *pretty_func,
dfsan_label msg_label,
dfsan_label file_label,
dfsan_label line_label,
dfsan_label pretty_func_label) {
polytracker_end();
__assert_fail(msg, file, line, pretty_func);
}

// Socket functions
namespace {
static std::optional<std::string> connect_name(int socket) {
Expand Down
12 changes: 12 additions & 0 deletions tests/test_assert.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
#include <cassert>
#include <unistd.h>

int main(int argc, char *argv[]) {

char data[2];
read(0, data, sizeof(data));

// This will terminate the program unexpectedly (tdag sizes might not be updated).
assert(data[0] == data[1]);

}
32 changes: 32 additions & 0 deletions tests/test_assert.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
from pathlib import Path
import subprocess
import pytest
from polytracker import taint_dag, PolyTrackerTrace
from typing import cast


@pytest.mark.program_trace("test_assert.cpp")
def test_assert(instrumented_binary: Path, trace_file: Path):
stdin_data = "ab"

subprocess.run(
[str(instrumented_binary)],
input=stdin_data.encode("utf-8"),
env={"POLYDB": str(trace_file), "POLYTRACKER_STDIN_SOURCE": str(1)},
)
program_trace = PolyTrackerTrace.load(trace_file)
assert isinstance(program_trace, taint_dag.TDProgramTrace)

tdfile = program_trace.tdfile
assert tdfile.label_count == 4

t1 = cast(taint_dag.TDSourceNode, tdfile.decode_node(1))
assert isinstance(t1, taint_dag.TDSourceNode)

t2 = cast(taint_dag.TDSourceNode, tdfile.decode_node(2))
assert isinstance(t2, taint_dag.TDSourceNode)

t3 = cast(taint_dag.TDSourceNode, tdfile.decode_node(3))
assert isinstance(t3, taint_dag.TDRangeNode)
assert t3.first == 1
assert t3.last == 2