-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++: Generate IR for destruction of unconditionally constructed temporaries #16125
C++: Generate IR for destruction of unconditionally constructed temporaries #16125
Conversation
…n if destructors are called.
…rations since these are also conditional.
have multiple parents (the 'new' expression, the call to 'operator new', and the size expression). This happens because the latter two are 'TranslatedExpr's that return the 'new' expression as their expression even though they don't technically represent the translation of this expression. To prevent this bug we tell the IR construction that the latter two handle their destructors explicitly which means that IR construction doesn't try to synthesize them.
…PhiOperand' consistency errors.
…onversion to the temporary object conversion.
50c6b3b
to
d40fa4c
Compare
…ary object expression.
…t expression at the top-level.
670ca5a
to
9c25ce4
Compare
cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/TranslatedElement.qll
Outdated
Show resolved
Hide resolved
cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/TranslatedElement.qll
Outdated
Show resolved
Hide resolved
…slatedElement.qll Co-authored-by: Jeroen Ketema <[email protected]>
cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/TranslatedExpr.qll
Show resolved
Hide resolved
…e reused expression is a temporary.
@@ -724,15 +724,15 @@ void iterate(const std::vector<T>& v) { | |||
} | |||
|
|||
std::vector<int>& ref_to_first_in_returnValue_1() { | |||
return returnValue()[0]; // BAD [NOT DETECTED] (see *) | |||
return returnValue()[0]; // BAD | |||
} | |||
|
|||
std::vector<int>& ref_to_first_in_returnValue_2() { | |||
return returnValue()[0]; // BAD [NOT DETECTED] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this case not detected? The function signature and implementation is identical to ref_to_first_in_returnValue_1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't investigated why yet. Note that, while function signature and implementation is identical to ref_to_first_in_returnValue_1
, they differ in how they're used. See here for this. Specifically:
In the one we detect (i.e., ref_to_first_in_returnValue_1
) the range-based for loop looks like:
for (auto x : ref_to_first_in_returnValue_1()) {}
and in the one we fail to detect the range-based for loop looks like:
{
auto value = ref_to_first_in_returnValue_2();
for (auto x : value) {}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could improve the UX on this by moving the alert location to the for-loop (instead of at the place where the temporary is destructed). The problem with this is that the elements in the DB for range-based for loops don't always have a location (which makes for an even worse UX 😂)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but @rdmarsh2 should probably also approve this before we merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really appreciate the curation into small commits. It made this much easier to review than it otherwise would have been.
I have two main questions about the approach, but no objections to merging as-is:
Determining if an object is conditionally destructed
The extractor must already know whether a given implicit destructor call is conditional or not. Do we just not have this info in the DB?
Approaching it from another direction, don't we already know when we're in a "conditional context" in IR construction, since we need to know whether to turn short-circuit expressions into Boolean values or vice-versa? Would it be easier and more reliable to share that logic to detect conditional destructors? (The answer to the latter may very well be "no").
Destruction due to exception
First, I thought we were already ignoring destruction on exception edges anyway, even for named locals, so I'm not sure it's important to get temporaries destructed on a throw
right now either.
That said, if I understand the new code correctly, a throw
will now have an Exception
edge to any needed destructor calls, after which there will be another Exception
edge to wherever the exception is first handled (whether a Catch
or an Unwind
). Does that mean that the last destructor call will have an exception successor but no fall-through successor? If so, that looks like the destructor call is throwing an exception, rather than completing successfully and "falling through" to the handler.
My original plan for the IR was to handle destructors in Finally
blocks. Control could enter a Finally
block via an Exception
edge or a Goto
edge, and then would exit the finally block via either an Unwind
edge or a Goto
edge. The actual outgoing edge taken would be determined by the kind of the incoming edge that was taken. This would introduce an annoying diamond in the CFG, though.
If you wanted to do control-flow splitting to get rid of the diamond, you could duplicate the Finally
block into a regular block (for the non-exceptional case) and a Fault
block (basically, a Finally
that is only ever reached via an exception). I think that splitting winds up being very similar to what you've done in this PR, except with the addition of Fault
and EndFault
instructions to make the control flow more explicit.
In any case, I think what you've got in this PR is fine, so maybe don't worry about the above until you run into something your current approach can't handle. I suspect that if you ever do attempt to handle conditionally-destructed temporaries, you'll have to revisit this.
It's not quite correct to say that the throw will have an So yes, this is probably not the best possible modeling since it looks like the destructor is throwing the exception.
Good suggestion. That would indeed be a better way to model this. There are many things about exceptional control-flow that we could improve, and I'll your comment to that internal issue.
That's a good point. We've already agreed that we want to introduce control-flow splitting to handle conditional destruction at some point in the future as well, and introducing a split here would align perfectly well with this. |
(what a mouthful 😂)
This PR adds destructor calls for temporaries that are "unconditionally constructed". For example, we now get a destructor call for
S()
in:The "unconditionally constructed" part refers to the fact that we still don't get destructor calls in examples such as:
since the destruction has to happen "at the semicolon", but only if we actually evaluated
S().a
. Generating destructor calls in such cases is for a subsequent PR in the Glorious Future.Once this PR is merged we should be able to pull the query added in #15939 out of experimental (as it will then actually have results).
Commit-by-commit review recommended. Each commit is either a "fix things" or a "accept test changes" commit which represents the test changes caused by the the previous "fix things" commit.
I don't think we should add a change note for this just yet. We can do so once we've pulled the
cpp/iterator-to-expired-container
query out of experimental.