C++: Generate IR for destruction of unconditionally constructed temporaries #16125

Merged
MathiasVP merged 39 commits into github:main from destructors-for-unconditional-unnamed
Apr 12, 2024

Conversation

MathiasVP

@MathiasVP MathiasVP commented Apr 4, 2024

(what a mouthful 😂)

This PR adds destructor calls for temporaries that are "unconditionally constructed". For example, we now get a destructor call for S() in:

struct S {
  int a;
  S();
  ~S();
};

void test() {
  int x = S().a;
}
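
For reference, here is a minimal sketch of the runtime behaviour that the new IR models. The `events` log, the initial value of `a`, and the logged strings are illustrative assumptions and are not part of the PR. The temporary is destroyed at the end of the full-expression, before the next statement runs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative event log; not part of the PR.
static std::vector<std::string> events;

struct S {
  int a;
  S() : a(42) { events.push_back("S()"); }
  ~S() { events.push_back("~S()"); }
};

void test() {
  int x = S().a;  // the temporary dies at the end of this full-expression
  events.push_back("next statement");
  assert(x == 42);  // the member was read before the destructor ran
}
```

Calling `test()` should leave `events` holding the constructor entry, then the destructor entry, then the entry for the next statement.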

The "unconditionally constructed" part refers to the fact that we still don't get destructor calls in examples such as:

struct S {
  int a;
  S();
  ~S();
};

void test(bool b) {
  int x = b ? S().a : 0;
}

since the destruction has to happen "at the semicolon", but only if we actually evaluated S().a. Generating destructor calls in such cases is for a subsequent PR in the Glorious Future.
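
A small sketch of why the conditional case is harder, again with an illustrative `events` log and an assumed initial value that are not part of the PR. The destructor runs "at the semicolon" only when the branch that constructs the temporary was actually evaluated.

```cpp
#include <string>
#include <vector>

// Illustrative event log; not part of the PR.
static std::vector<std::string> events;

struct S {
  int a;
  S() : a(7) { events.push_back("S()"); }
  ~S() { events.push_back("~S()"); }
};

int test(bool b) {
  int x = b ? S().a : 0;  // ~S() runs at the ';' only if the true branch ran
  events.push_back("next statement");
  return x;
}
```

With `b == true` the log contains a constructor and destructor entry before the next statement; with `b == false` it contains neither.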

Once this PR is merged we should be able to pull the query added in #15939 out of experimental (as it will then actually have results).

Commit-by-commit review recommended. Each commit is either a "fix things" commit or an "accept test changes" commit that records the test changes caused by the previous "fix things" commit.

I don't think we should add a change note for this just yet. We can do so once we've pulled the cpp/iterator-to-expired-container query out of experimental.

@github-actions github-actions bot added the C++ label Apr 4, 2024
@MathiasVP MathiasVP changed the title C++: Generate IR for destruction of unconditionally constructed unnamed temporaries C++: Generate IR for destruction of unconditionally constructed temporaries Apr 4, 2024
MathiasVP added 15 commits April 4, 2024 16:01
have multiple parents (the 'new' expression, the call to 'operator new',
and the size expression). This happens because the latter two are
'TranslatedExpr's that return the 'new' expression as their expression
even though they don't technically represent the translation of this
expression.
To prevent this bug we tell the IR construction that the latter two
handle their destructors explicitly which means that IR construction
doesn't try to synthesize them.
@MathiasVP MathiasVP marked this pull request as ready for review April 7, 2024 00:19
@MathiasVP MathiasVP requested a review from a team as a code owner April 7, 2024 00:19
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Apr 7, 2024
@MathiasVP MathiasVP force-pushed the destructors-for-unconditional-unnamed branch from 50c6b3b to d40fa4c Compare April 7, 2024 14:50
@MathiasVP MathiasVP force-pushed the destructors-for-unconditional-unnamed branch from 670ca5a to 9c25ce4 Compare April 8, 2024 14:36
@MathiasVP MathiasVP requested a review from rdmarsh2 April 8, 2024 14:48
@@ -724,15 +724,15 @@ void iterate(const std::vector<T>& v) {
}

std::vector<int>& ref_to_first_in_returnValue_1() {
-  return returnValue()[0]; // BAD [NOT DETECTED] (see *)
+  return returnValue()[0]; // BAD
}

std::vector<int>& ref_to_first_in_returnValue_2() {
return returnValue()[0]; // BAD [NOT DETECTED]

Why is this case not detected? The function signature and implementation are identical to those of ref_to_first_in_returnValue_1.

@MathiasVP MathiasVP Apr 9, 2024

I haven't investigated why yet. Note that, while the function signature and implementation are identical to those of ref_to_first_in_returnValue_1, the two functions differ in how they're used. See here for this. Specifically:

In the one we detect (i.e., ref_to_first_in_returnValue_1) the range-based for loop looks like:

for (auto x : ref_to_first_in_returnValue_1()) {}

and in the one we fail to detect the range-based for loop looks like:

{
auto value = ref_to_first_in_returnValue_2();
for (auto x : value) {}
}
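
In both usages the temporary is already gone by the time the loop runs. A minimal sketch of that timing, assuming a hypothetical `Tracker` type that hands out a reference to static storage so the demo itself stays well-defined (in the real test the reference points into the destroyed vector). The names `Tracker`, `backing`, `ref_to_first`, `loop_directly`, and `loop_via_copy` are illustrative and not from the test file.

```cpp
#include <string>
#include <vector>

static std::vector<std::string> events;  // illustrative log
static std::vector<int> backing{1, 2};   // outlives everything, keeps the demo free of UB

// Hypothetical stand-in for the temporary returned by returnValue().
struct Tracker {
  ~Tracker() { events.push_back("temporary destroyed"); }
  std::vector<int>& first() { return backing; }
};

std::vector<int>& ref_to_first() {
  return Tracker().first();  // the temporary is destroyed before the caller sees the reference
}

// Mirrors the usage that is detected.
void loop_directly() {
  for (int x : ref_to_first()) events.push_back("body " + std::to_string(x));
}

// Mirrors the usage that is not detected: the copy reads through the returned reference.
void loop_via_copy() {
  auto value = ref_to_first();
  for (int x : value) events.push_back("body " + std::to_string(x));
}
```

In both functions the "temporary destroyed" entry is logged before any loop body entry.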

Contributor Author

We could improve the UX on this by moving the alert location to the for-loop (instead of at the place where the temporary is destructed). The problem with this is that the elements in the DB for range-based for loops don't always have a location (which makes for an even worse UX 😂)

jketema previously approved these changes Apr 9, 2024

@jketema jketema left a comment

LGTM, but @rdmarsh2 should probably also approve this before we merge.

@MathiasVP MathiasVP requested review from dbartol and removed request for rdmarsh2 April 10, 2024 16:00

@dbartol dbartol left a comment

I really appreciate the curation into small commits. It made this much easier to review than it otherwise would have been.

I have two main questions about the approach, but no objections to merging as-is:

Determining if an object is conditionally destructed

The extractor must already know whether a given implicit destructor call is conditional or not. Do we just not have this info in the DB?

Approaching it from another direction, don't we already know when we're in a "conditional context" in IR construction, since we need to know whether to turn short-circuit expressions into Boolean values or vice-versa? Would it be easier and more reliable to share that logic to detect conditional destructors? (The answer to the latter may very well be "no").

Destruction due to exception

First, I thought we were already ignoring destruction on exception edges anyway, even for named locals, so I'm not sure it's important to get temporaries destructed on a throw right now either.

That said, if I understand the new code correctly, a throw will now have an Exception edge to any needed destructor calls, after which there will be another Exception edge to wherever the exception is first handled (whether a Catch or an Unwind). Does that mean that the last destructor call will have an exception successor but no fall-through successor? If so, that looks like the destructor call is throwing an exception, rather than completing successfully and "falling through" to the handler.

My original plan for the IR was to handle destructors in Finally blocks. Control could enter a Finally block via an Exception edge or a Goto edge, and then would exit the finally block via either an Unwind edge or a Goto edge. The actual outgoing edge taken would be determined by the kind of the incoming edge that was taken. This would introduce an annoying diamond in the CFG, though.

If you wanted to do control-flow splitting to get rid of the diamond, you could duplicate the Finally block into a regular block (for the non-exceptional case) and a Fault block (basically, a Finally that is only ever reached via an exception). I think that splitting winds up being very similar to what you've done in this PR, except with the addition of Fault and EndFault instructions to make the control flow more explicit.
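
As a rough illustration of the two shapes described above, here is a toy model. It is not the CodeQL IR: `EdgeKind`, `finally_exit`, `Block`, `regular_copy`, and `fault_copy` are hypothetical names chosen for this sketch. A shared Finally block picks its outgoing edge from the incoming one, while the split version has two copies with fixed entry and exit edges.

```cpp
#include <stdexcept>

// Hypothetical edge kinds, named after the ones used in the discussion.
enum class EdgeKind { Goto, Exception, Unwind };

// Shared Finally block: the outgoing edge depends on how control entered.
EdgeKind finally_exit(EdgeKind incoming) {
  switch (incoming) {
    case EdgeKind::Goto:
      return EdgeKind::Goto;    // normal completion falls through
    case EdgeKind::Exception:
      return EdgeKind::Unwind;  // keep propagating the exception
    default:
      throw std::invalid_argument("a Finally block is not entered via Unwind");
  }
}

// After splitting: two copies of the cleanup code, each with one fixed exit edge.
struct Block {
  EdgeKind entry;
  EdgeKind exit;
};
constexpr Block regular_copy{EdgeKind::Goto, EdgeKind::Goto};
constexpr Block fault_copy{EdgeKind::Exception, EdgeKind::Unwind};
```

The split removes the runtime choice in `finally_exit`, at the cost of duplicating the destructor calls in both copies.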

In any case, I think what you've got in this PR is fine, so maybe don't worry about the above until you run into something your current approach can't handle. I suspect that if you ever do attempt to handle conditionally-destructed temporaries, you'll have to revisit this.

@MathiasVP

> That said, if I understand the new code correctly, a throw will now have an Exception edge to any needed destructor calls, after which there will be another Exception edge to wherever the exception is first handled (whether a Catch or an Unwind). Does that mean that the last destructor call will have an exception successor but no fall-through successor? If so, that looks like the destructor call is throwing an exception, rather than completing successfully and "falling through" to the handler.

It's not quite correct to say that the throw will have an Exception edge to any needed destructor calls. Rather, there will be a Goto edge from the throw to the destructor call(s), and an Exception edge from the final destructor call to the exceptional successor (see for example here). In some ways, I guess that's even worse than what you describe since it looks like the throw actually doesn't throw an exception 😂

So yes, this is probably not the best possible modeling since it looks like the destructor is throwing the exception.
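
For comparison, this is the source-level order of events that the edges are meant to capture, as a minimal sketch. The `events` log and the `get_or_throw` member are illustrative assumptions and not part of the PR. The exception is thrown while the temporary is alive, the temporary is destroyed during unwinding, and only then does the handler run.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative event log; not part of the PR.
static std::vector<std::string> events;

struct S {
  S() { events.push_back("S()"); }
  ~S() { events.push_back("~S()"); }
  // Hypothetical member that always throws, so the throw happens
  // while the temporary it was called on is still alive.
  int get_or_throw() { throw std::runtime_error("boom"); }
};

void test() {
  try {
    int x = S().get_or_throw();
    events.push_back("not reached, x=" + std::to_string(x));
  } catch (const std::runtime_error&) {
    events.push_back("handler");
  }
}
```

After `test()` the log reads constructor, destructor, handler, which matches the destructor call sitting between the throw and the exceptional successor.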

> My original plan for the IR was to handle destructors in Finally blocks. Control could enter a Finally block via an Exception edge or a Goto edge, and then would exit the finally block via either an Unwind edge or a Goto edge. The actual outgoing edge taken would be determined by the kind of the incoming edge that was taken. This would introduce an annoying diamond in the CFG, though.

Good suggestion. That would indeed be a better way to model this. There are many things about exceptional control-flow that we could improve, and I'll add your comment to that internal issue.

> If you wanted to do control-flow splitting to get rid of the diamond, you could duplicate the Finally block into a regular block (for the non-exceptional case) and a Fault block (basically, a Finally that is only ever reached via an exception). I think that splitting winds up being very similar to what you've done in this PR, except with the addition of Fault and EndFault instructions to make the control flow more explicit.

That's a good point. We've already agreed that we want to introduce control-flow splitting to handle conditional destruction at some point in the future as well, and introducing a split here would align perfectly well with this.

@MathiasVP MathiasVP merged commit 1166645 into github:main Apr 12, 2024
14 checks passed