Refine types based on debug metadata #191
base: master
Issues with this right now:
auto inttype{llvm::cast<llvm::DIBasicType>(ditype)};
sign =
    inttype->getSignedness() == llvm::DIBasicType::Signedness::Signed;
// TODO(frabert): this path will not be taken when arguments will have
Can you file an issue for each of these cases, with an example of C code that reproduces it?
if (ditype) {
  auto difunctype{llvm::cast<llvm::DISubroutineType>(ditype)};
  auto arr{difunctype->getTypeArray()};
  if (arr.size() == ditype_array.size()) {
Can you leave a comment on why the size check is needed? Is it a sanity check on the debug info? Is it to avoid the issue of dead argument elimination?
Regarding the overall method: yes; thinking about it, the current way of trying to match IR and debug metadata is not very robust. The "correct" solution will probably involve some kind of more thorough, "holistic" approach. Question: where should it fit? Could it be that it would be easier to do that before trying to lift the bitcode, as a preprocessing step?
I think doing so before might be a good place to start, but I don't know what that looks like. What is evident is that right now, in the middle of doing one thing, we're trying to reverse engineer debug info types, and integrate that info. I think that attempts to integrate more "smarts" into that process are going to lead to issues in trying to manage the complexity of what's going on. Some kind of pass, or multiple passes, that interprets bitcode values, types, and debug info locations/types ahead-of-time seems prudent. Perhaps we can formulate this problem as the type of info that we think we should be able to present. For example, at each LLVM instruction, what logical source variables are "live" and where are their values, and what are their types? The "where are their values" is tricky, because their values may be embedded in other values (e.g. high bits, low bits, mid-bits [for the case of a bitfield], in a structure value, in a vector value, at some byte offset of an alloca). I think it would be prudent to work toward the ability to output this information, as a proof-of-capability for getting it, and a way of forcing it into a coherent API.
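To make the "where are their values" part concrete, here is a minimal sketch of what one record in such an API could look like for the bitfield case. Every name in it (`BitSlice`, `LiveVariable`, `ExtractBits`) is hypothetical and not taken from Rellic or LLVM, and it only covers variables that fit inside a single 64-bit container value.

```cpp
#include <cstdint>
#include <string>

// Hypothetical description of where a source variable's bits live inside a
// containing value. These names are illustrative only.
struct BitSlice {
  unsigned bit_offset;  // Offset of the lowest bit in the container (0..63).
  unsigned bit_width;   // Number of bits that belong to the variable (1..64).
};

// Hypothetical answer to "what is live here, where is it, and what type?".
struct LiveVariable {
  std::string source_name;  // Name taken from the debug info.
  std::string source_type;  // Printed source-level type.
  BitSlice slice;           // Location of the bits inside the container.
};

// Extracts the variable's bits from a 64-bit container value.
// Assumes slice.bit_offset < 64; larger offsets are not handled.
inline std::uint64_t ExtractBits(std::uint64_t container, BitSlice slice) {
  const std::uint64_t shifted = container >> slice.bit_offset;
  if (slice.bit_width >= 64) {
    return shifted;  // Avoid an out-of-range shift when building the mask.
  }
  return shifted & ((std::uint64_t{1} << slice.bit_width) - 1);
}
```

A real design would also need cases for structure members, vector lanes, and byte offsets into an alloca, which this sketch leaves out.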
This might open up opportunities. For example, if the debug info "tells" us that two LLVM values represent the same local variable, and if the two values have the same LLVM type, then we might be able to keep track of this as saying: these two values are in a "storage equivalence class."
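One possible way to track such classes is a union-find structure keyed by value index. The sketch below is an assumption about how this could be done, not existing Rellic code: `ValueInfo`, `variable_id`, and `type_id` are hypothetical stand-ins for whatever the debug info and LLVM types would actually provide.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Union-find over value indices; each set is one "storage equivalence class".
class StorageClasses {
 public:
  explicit StorageClasses(std::size_t num_values) : parent_(num_values) {
    std::iota(parent_.begin(), parent_.end(), std::size_t{0});
  }

  std::size_t Find(std::size_t v) {
    while (parent_[v] != v) {
      parent_[v] = parent_[parent_[v]];  // Path halving.
      v = parent_[v];
    }
    return v;
  }

  void Merge(std::size_t a, std::size_t b) {
    const std::size_t root_a = Find(a);
    const std::size_t root_b = Find(b);
    parent_[root_a] = root_b;
  }

  bool SameStorage(std::size_t a, std::size_t b) { return Find(a) == Find(b); }

 private:
  std::vector<std::size_t> parent_;
};

// Hypothetical per-value facts recovered from debug info and the IR.
struct ValueInfo {
  int variable_id;  // Id of the source local variable, or -1 if unknown.
  int type_id;      // Id standing in for the value's LLVM type.
};

// Merges values that describe the same variable and share the same type.
inline StorageClasses BuildStorageClasses(const std::vector<ValueInfo> &values) {
  StorageClasses classes(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    for (std::size_t j = i + 1; j < values.size(); ++j) {
      if (values[i].variable_id >= 0 &&
          values[i].variable_id == values[j].variable_id &&
          values[i].type_id == values[j].type_id) {
        classes.Merge(i, j);
      }
    }
  }
  return classes;
}
```

The quadratic pairing loop keeps the sketch short; grouping by a `(variable_id, type_id)` key first would scale better.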
Regarding this specific PR: due to the way the Also, most tests need additional debug info for function prototypes, and I still haven't figured out a way to convince clang to consistently emit info for those. Even
@@ -60,6 +60,9 @@ bool StructFieldRenamer::VisitRecordDecl(clang::RecordDecl *decl) {
// FIXME(frabert): Is a clash between field names actually possible?
// Can this mechanism actually be left out?
auto name{di_field->getName().str()};
if (di_field->getTag() == llvm::dwarf::DW_TAG_inheritance) {
Does this make an explicit field out of the base type in the case of inheritance? Can you add a comment here that shows what a simple C++ example would look like, and what we would generate as a result?
Can you add some examples that use multiple inheritance and virtual inheritance?
struct Base1 {
  int foo;
};
struct Base2 {
  float bar;
};
struct Derived : Base1, Base2 {
};
struct Base1 {
  int foo;
};
struct Base2 : Base1 {
  float bar;
};
struct Base3 : Base1 {
  float bar;
};
struct Derived : virtual Base2, virtual Base3 {
};
Also, here is a particularly thorny example which shows when this method of embedding the base within the structure of the parent is going to break down:
C++: https://godbolt.org/z/bM4vrq6fW
C: https://godbolt.org/z/fYarYo5he
See this SO post for more detail: https://stackoverflow.com/questions/52818411/will-the-padding-of-base-class-be-copied-into-the-derived-class
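As a rough illustration of the kind of mismatch at stake (this is a separate sketch, not a copy of the linked examples): under ABIs such as the Itanium C++ ABI, a derived class may place its own members in the tail padding of a base that is not POD for layout purposes, whereas a struct that embeds the base as an explicit field cannot. The type names and the `Base_base` field name below are assumptions chosen to mirror the naming used in this PR.

```cpp
#include <cstddef>

struct Base {
  Base() {}  // User-provided constructor: not POD for layout under Itanium.
  int a;
  char b;    // Followed by tail padding up to the alignment of int.
};

struct Derived : Base {
  char c;    // May be placed inside Base's tail padding.
};

// What embedding the base as an explicit field would look like.
struct Embedded {
  Base Base_base;
  char c;    // Always placed at or after sizeof(Base).
};

// The embedded form can never be smaller than the inherited form; on ABIs
// that reuse tail padding it is typically larger, so field offsets differ.
static_assert(sizeof(Derived) <= sizeof(Embedded),
              "embedding the base should not shrink the layout");
```

On a typical x86-64 Linux toolchain one would expect the two sizes to differ, but the exact values depend on the ABI, so the sketch only asserts the inequality.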
@@ -60,6 +60,9 @@ bool StructFieldRenamer::VisitRecordDecl(clang::RecordDecl *decl) {
// FIXME(frabert): Is a clash between field names actually possible?
// Can this mechanism actually be left out?
auto name{di_field->getName().str()};
if (di_field->getTag() == llvm::dwarf::DW_TAG_inheritance) {
  name = di_field->getBaseType()->getName().str() + "_base";
}
if (seen_names.find(name) == seen_names.end()) {
What if you have seen_names be a map from name to unsigned count? Then you could have:
auto &name_count = seen_names[name];
if (name_count) {
  name = name + "_" + std::to_string(name_count);
}
++name_count;
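Wrapped into a standalone helper, the suggestion could look like the sketch below. The function name `UniquifyFieldName` and the choice of `std::unordered_map` are assumptions for illustration; the PR's actual container type is not shown here.

```cpp
#include <string>
#include <unordered_map>

// Returns `name` unchanged the first time it is seen, and `name_<n>` on the
// n-th repeat. `seen_names` maps each base name to how often it has occurred.
inline std::string UniquifyFieldName(
    std::unordered_map<std::string, unsigned> &seen_names, std::string name) {
  // operator[] value-initializes the count to 0 for a new name.
  auto &name_count = seen_names[name];
  if (name_count) {
    name = name + "_" + std::to_string(name_count);
  }
  ++name_count;
  return name;
}
```

Calling it three times with "foo" yields "foo", "foo_1", and "foo_2". Note that a field literally named "foo_1" arriving later would still collide with a generated name, so this scheme may need an extra check if such names can occur.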
Solves #190