Multi-level data-race reduction architecture to extract distinct races #13
From @SumedhArani: I've been going through the source code of TSan's reporting routine, and what @dongahn suggests can certainly be done; it adds the benefit of performing an in situ reduction of data races instead of parsing the reports after they are output. Based on my current understanding of the TSan reporting routine and the data races I've been exposed to, my plan would be to store the immediate return stack, i.e. the point where the race is taking place, in a data structure with two attributes (read location and write location).
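A minimal sketch of this bookkeeping, assuming the read and write program counters are enough to identify a race; the RacePoint and seen_races names are hypothetical, not part of TSan:

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical record for one reported race: the code locations (PCs)
// of the two conflicting accesses, as proposed above.
struct RacePoint {
  uintptr_t read_pc;   // location of the read access
  uintptr_t write_pc;  // location of the write access
  bool operator<(const RacePoint &o) const {
    return std::tie(read_pc, write_pc) < std::tie(o.read_pc, o.write_pc);
  }
};

// Races already reported; a new report would be printed only if its
// (read_pc, write_pc) pair has not been seen before.
static std::set<RacePoint> seen_races;

static bool IsNewRace(const RacePoint &rp) {
  return seen_races.insert(rp).second;  // true only on first insertion
}
```
|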
From @dongahn: This sounds right. I think a key is to make this a callback architecture. One can have a default function, as you described, which does a reduction over the code location. But some people may want a bit finer granularity, differentiating code locations that came from different call paths... A callback design should allow multilevel reduction, I think. |
From @SumedhArani: @dongahn Could you please elaborate? I didn't understand clearly.
Also, what sort of multilevel reduction are you thinking of?
The callback design you suggest is definitely advantageous in the situation you describe, but I could not understand which issues you actually had in mind. |
From @jprotze: I already have a prototype that fetches the output and redirects it into MUST's aggregation engine. It's not perfect; I think we can introduce additional filters to specify which similarities should be aggregated.
I also submitted a patch for review to be able to redirect the output at runtime; it has not been accepted yet:
https://reviews.llvm.org/D15154
|
Thinking about this a bit more, I am not sure the callback is the best mechanism. What I was thinking is to have a way for people like us to register a function that can perform aggregation and reduction at various levels. Using the same hook, we would have a default aggregator, but this shouldn't prevent others from registering their own instead. What does the function prototype of the TSan report routine look like? If the data on races are passed into this function as a function argument, perhaps we can simply make the report function a "weak symbol" so that one can override it with their own function defined as a strong symbol. I can imagine, by default, we have three different aggregators.
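A minimal sketch of the weak-symbol idea, assuming the race data reach a single report entry point; the ArcherReportRace name is hypothetical and only illustrates the override mechanism:

```cpp
// In the runtime: the default reporter is declared weak, so a strong
// definition elsewhere takes precedence at link time.
extern "C" __attribute__((weak))
void ArcherReportRace(const void *race_data) {
  // Default behavior: print every race instance as-is.
}

// In a separate translation unit supplied by the user, a strong
// definition overrides the weak default and can aggregate instead:
//
//   extern "C" void ArcherReportRace(const void *race_data) {
//     /* custom aggregation/reduction logic */
//   }
```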
|
@jprotze: oh nice! It sounds like we already have a baseline to extend :-) |
The function that you register with SetPrintfAndReportCallback(void (*callback)(const char *)); is called multiple times per race report, just like you would call printf in your code. The structure of the reports is pretty easy to parse. Each single report has a ==== bar at the beginning and end. Thus the callback can collect the strings and trigger postprocessing once the end of a report is detected:
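A minimal sketch of such a collecting callback, assuming each report is delimited by bars of '=' characters; CollectReportCallback and ProcessFullReport are placeholder names, not TSan APIs:

```cpp
#include <string>

static std::string report_buffer;   // accumulates one report's output
static int delimiter_count = 0;     // how many ==== bars seen so far

// Hypothetical postprocessing hook: parse, deduplicate, forward, ...
static void ProcessFullReport(const std::string &report) {}

// Registered via SetPrintfAndReportCallback(CollectReportCallback);
// called once per output chunk of a report.
static void CollectReportCallback(const char *chunk) {
  report_buffer += chunk;
  if (std::string(chunk).find("====") != std::string::npos &&
      ++delimiter_count == 2) {      // both opening and closing bar seen
    ProcessFullReport(report_buffer);
    report_buffer.clear();
    delimiter_count = 0;
  }
}
```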
|
After following the thread of posts, I have a few questions that remain unclear to me even after a quick Google search.
Can you help me out here: what are the various levels you are referring to? By levels, do you mean aggregation and reduction at different stages?
The report is passed to the reporting routine as a ReportDesc, where ReportDesc is a class with the following layout:
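An abbreviated sketch of ReportDesc as it appears in compiler-rt's tsan_report.h; the exact fields vary between LLVM versions, and types such as Vector, ReportStack, and ReportMop are TSan-internal:

```cpp
// Abbreviated; see compiler-rt/lib/tsan/rtl/tsan_report.h for the real thing.
class ReportDesc {
 public:
  ReportType typ;                   // kind of report (data race, deadlock, ...)
  Vector<ReportStack *> stacks;     // backtraces attached to the report
  Vector<ReportMop *> mops;         // the conflicting memory operations
  Vector<ReportLocation *> locs;    // descriptions of the memory locations
  Vector<ReportMutex *> mutexes;    // mutexes involved, if any
  Vector<ReportThread *> threads;   // threads involved in the race
  // ...
};
```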
What I could figure out as a way to extract unique races is this: TSan can report what is essentially the same race, with the same read and write locations, as different races simply because it was executed by different threads (my context here being the sample code in Readme.md). We could store this memory location, and then when PrintReport is called for the next race, cross-check it against the already-found data races; if the read and write happen to be at the same location, we don't output it. As an alternative, we can store these repetitive data races in, say, a key-value store. So @dongahn, when you suggested using callbacks, I thought of having PrintReport call back into my own function (a modified PrintLocation, since that is the part of the code where I do the manipulation) instead of calling the original PrintLocation, which doesn't extract unique races. I hope I'm understanding the problem correctly. @jprotze Is it that I need to call SetPrintfAndReportCallback instead of Printf for the postprocessing? |
With MUST I referred to our MPI correctness-checking tool: https://www.itc.rwth-aachen.de/MUST The code snippet above: |
@SumedhArani: by multilevel, I mean an ability to compare two race instances and treat them as "equivalent" or "distinct" using different criteria. One special case would be to reduce instances that have exactly the same backtrace but different accessing threads into one equivalence class (your case). But that doesn't have to be the only reduction criterion. In fact, in many cases different users will want to set the "equivalence" criteria differently. One may want to treat two race instances whose call paths differ everywhere except at the leaf point of execution (e.g., line number) as equivalent and merge them together. Another may want all but the first few root functions to be the same for two instances to become equivalent; this will be common when the outputs come from two different test programs. The idea would be to have a way for users to register whatever merge logic they want, while providing our own merge functions for those users who can live with our predesigned criteria. I think what might be helpful is for you to do a quick, simple prototype with one merge function doing the reduction you proposed. Then we can take a look at it and see how to extend it toward what we want.
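To make the idea concrete, a minimal sketch of a pluggable equivalence criterion; RaceRecord, EquivalenceFn, and RegisterEquivalence are hypothetical names, not existing TSan or Archer APIs:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical summary of one race instance: the backtrace of the
// racing access, innermost (leaf) frame first.
struct RaceRecord {
  std::vector<uintptr_t> backtrace;
};

// A pluggable criterion decides whether two instances are "the same" race.
using EquivalenceFn = bool (*)(const RaceRecord &, const RaceRecord &);

// Criterion 1: same leaf point of execution, call paths may differ.
static bool SameLeafFrame(const RaceRecord &a, const RaceRecord &b) {
  return !a.backtrace.empty() && !b.backtrace.empty() &&
         a.backtrace.front() == b.backtrace.front();
}

// Criterion 2: identical full backtrace, regardless of accessing thread.
static bool SameFullBacktrace(const RaceRecord &a, const RaceRecord &b) {
  return a.backtrace == b.backtrace;
}

// Hypothetical registration hook: a default criterion is preinstalled,
// and users may swap in their own merge logic.
static EquivalenceFn g_criterion = SameFullBacktrace;
static void RegisterEquivalence(EquivalenceFn fn) { g_criterion = fn; }
```
|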
Especially if you look at the execution of OpenMP explicit tasks, you will find that tasks that execute the same lexical code, and might even be created in the same context, have completely different backtraces. The backtraces can also differ in depth. |
@jprotze, I did notice that even for the sample program (readme.md) the master thread has a different (although similar) backtrace. I actually get the leaf point of execution from the backtrace itself, i.e. the top of the stack. @dongahn, I'll surely try to get a quick prototype up and running. I have my exams in the coming week, so I'll try to post the prototype ASAP. Thanks, |
@SumedhArani: Yes, let's go with the same leaf point as the criterion to begin with and then try to find ways to program different criteria. It feels like this project will be pretty interesting! |
Yeah me too @dongahn !! Already getting to learn so much!! Will deliver my best and it's been a pleasure to interact with everyone!:) |
Quick update: I'm using function overloading to handle four scenarios as of now.
The developer can make a suitable call, choosing the report function overload that fits their needs.
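A rough sketch of what overload-based selection could look like; the tag types and ReportUniqueRace are hypothetical, chosen only to illustrate picking a reduction scenario at the call site:

```cpp
class ReportDesc;  // TSan's report descriptor (declared in tsan_report.h)

// Hypothetical tag types naming the reduction scenario.
struct ByMemoryLocation {};
struct ByLeafFrame {};

// Overloads: the tag argument selects which reduction is applied.
void ReportUniqueRace(const ReportDesc *rep, ByMemoryLocation) {
  // Suppress the report if this memory location was already reported.
}
void ReportUniqueRace(const ReportDesc *rep, ByLeafFrame) {
  // Suppress the report if a race with the same leaf frame was reported.
}

// Call site in the runtime picks the desired behavior, e.g.:
//   ReportUniqueRace(rep, ByMemoryLocation{});
```
|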
A working prototype is ready for reduction on the same memory location. With a little tweaking, I can now get it working for the other scenarios. https://github.com/SumedhArani/Unique-Races Files to be looked at:
These are the three files that I've changed. As of now I've uploaded the code to my repository. Right now it handles the case where the data race occurs at a specific location.
Here, TSan would normally report three warnings (my system uses four threads) about different threads trying to update the location of the variable; I've filtered them down to one. I've used function overloading as of now to allow one to register a function call as per their needs. If you find the work satisfactory, I'll code the rest of the scenarios in a similar manner. Thanks, |
Hey @SumedhArani: Thank you for the quick response! It is kind of hard to review what has changed without diffs. How about we do this:
This will help us review your code by looking at diffs, and you can get more meaningful reviews. At first glance, some important software maintenance considerations:
|
Hey @SumedhArani, I just forked "compiler-rt" from the llvm-mirror. I sent you an invitation as a collaborator on that repository; let me know if you are able to read and write to the repo. |
Hey @simoatze, thanks!! I did get the read and write permissions. I did put in a PR, only to realise later that I had not seen your comment before. @dongahn, I'll surely try to stick to the software maintenance considerations you've pointed out. I'm not familiar with Travis CI, but I'll look it up and let's see what test cases I can come up with. The prototype is up and running. Have a look and let me know. :) One more question: Thanks, |
@SumedhArani @dongahn Should we close this on Archer since we have everything in the new repo now? |
I prefer keeping this open until we get a complete solution merged into Archer. I think the current solution needs to grow a bit to meet the use case we described here. If this issue becomes too long, we may split it into multiple manageable topics, but let's keep this open for a while. |
@dongahn, if you find the solution for the scenario that reduces on the basis of memory location acceptable, then along similar lines I'll code for the remaining scenarios, which are:
The remaining two scenarios that are handled as of now are: My point is that we can extend the solution to handle the first two cases as well, with a design similar to that of the PR submitted now. And @simoatze, either way is good for me! 👍 Thanks, |
I am moving this issue from #3:
This has been common feedback from our early adopters. Instead of seeing individual data race instances, they want to be able to see "unique data races." This runs along multiple dimensions: e.g., they want to be able to aggregate and reduce races from many reports gathered from running a regression test suite, and they want to be able to aggregate and reduce races across multiple processes from a single MPI application run.
So far, the suggestion has been to look at the current reporting routine within ThreadSanitizer and see if we can open up its protocol to add a custom reduction engine.