-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataflow: Remove tracked types from Access Paths, track tainted object type, and tweak type pruning. #18179
Dataflow: Remove tracked types from Access Paths, track tainted object type, and tweak type pruning. #18179
Conversation
f7932dc
to
5feb737
Compare
5feb737
to
b65a4e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great; a nice overall simplification of our code.
@@ -3006,6 +3050,12 @@ module MakeImpl<LocationSig Location, InputSig<Location> Lang> { | |||
) | |||
} | |||
|
|||
private string ppStored() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This predicate is duplicated, so perhaps pull it out via
private class TypOption extends TypOption::Option {
string pp() {
exists(string ppt | ppt = this.toString() |
if this.isNone() or ppt = "" then result = "" else result = " : " + ppt
)
}
}
@@ -891,6 +891,8 @@ module MakeImplCommon<LocationSig Location, InputSig<Location> Lang> { | |||
nodeDataFlowType(this.asNode(), result) | |||
or | |||
nodeDataFlowType(this.asParamReturnNode(), result) | |||
or | |||
isTopType(result) and this.isImplicitReadNode(_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we certain that isTopType
always has results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonably sure, but not completely. There are two options: Either we make it part of the language-specific input, or we make it a consistency check. I think I like the latter to avoid add yet another predicate in the input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A consistency check sounds fine to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found to fail on C# with
DataFlowType should have one top type that's compatible with all other types, but has 3.
That's harmless, but not intentional, so agreed to add this to the input signature instead as a followup.
29d3c51
to
c187a7a
Compare
This PR changes how type pruning is done in the data flow library. The primary change is that access paths no longer carry information about tracked types. On its own, this is a precision reduction, so it's combined with a separate precision improvement in that we now track the type of the originally tainted object - this corresponds to the innermost type in an access path, if available, prior to this PR. These two changes mostly balance each other in terms of overall precision, but of course they also affect performance. For Java on MRVA top 1000, this effect was slightly negative with two projects having big degradations in terms of number of computed tuples in certain queries. However, without types in access paths, this now allows other changes, so this PR contains a third tweak that postpones type pruning until stage 5. And this latter tweak regains the lost performance and then some, so the overall effect looks like a net win on almost all of MRVA top 1000.
I also checked impact on C# MRVA top 1000, but there the effects were negligible. And other languages are likely even less affected, since this only deals with type pruning, which isn't used much beyond Java and C#.
Commit-by-commit review is encouraged. The first two commits are simple refactorings. And the subsequent 3 commits implement the 3 changes described above.