-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Go: extract and expose struct tags, interface method IDs #17357
Go: extract and expose struct tags, interface method IDs #17357
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Longer review to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly looks good! Thank you for improving this and moving it out into it's own PR. I just have a few suggestions in addition to @owen-mc's comments, which also make sense.
Also, to sanity check: in the PR description you discuss that part of the motivation here is to be able to distinguish types better. That makes sense and I found the relevant part of the Go specification for this in https://go.dev/ref/spec#Type_identity. For structs:
Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names, and identical types, and identical tags. Non-exported field names from different packages are always different.
For interfaces:
Two interface types are identical if they define the same type set.
Looking over the tests here, I can see that the tests exercise the new functionality and that seems to behave as expected. Do the tests cover the new ability to decide (in)equality that you are hoping for? Could you comment on how the tests cover that?
Tests failing:
|
Retargeted this against |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work spotting these problems and fixing them. A few small suggestions for improvement.
Also, shouldn't the label for struct types include the tag of each field? Since differing tags make it a different struct type? Ideally this would have a test as well. This could be done as a follow-up, but it also fits in pretty naturally with this PR.
Note: at #17341 I added a stats update since there are new db tables and they ought to have associated stats. This evidently caused join-order problems since DCA flipped from just fine to catastrophic, so join order fixery (or simply accepting missing stats -- I note C# recently totally removed them and simply manually hacked their join orders where necessary) will be necessary before this can be merged. |
fe2ef27
to
41ffbdc
Compare
Oh, except, @mbg no there is no direct test of distinguishing two types using these functions -- there isn't a direct identical-type predicate in this PR to test, and @owen-mc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few optional suggestions and one change I feel more strongly about.
go/ql/lib/semmle/go/Types.qll
Outdated
* For example, `interface { Exported() int; notExported() int }` declared in two | ||
* different packages defines two distinct types, but they appear identical according to | ||
* `getMethodType`. If the packages were named `a` and `b`, `getMethodType` would yield | ||
* `notExported -> int` for both, whereas this method would yield `a.notExported -> int` | ||
* and `b.notExported -> int` respectively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Optional] I think this example would be a bit clearer without the exported method.
* For example, `interface { Exported() int; notExported() int }` declared in two | |
* different packages defines two distinct types, but they appear identical according to | |
* `getMethodType`. If the packages were named `a` and `b`, `getMethodType` would yield | |
* `notExported -> int` for both, whereas this method would yield `a.notExported -> int` | |
* and `b.notExported -> int` respectively. | |
* For example, `interface { notExported() int }` declared in two different packages | |
* defines two distinct types, but they appear identical according to `getMethodType`. | |
* If the packages were named `a` and `b`, `getMethodType` would yield | |
* `notExported -> int` for both, whereas this method would yield `a.notExported -> int` | |
* and `b.notExported -> int` respectively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a mention of Exported
instead.
(I'll hold off on merging this for now since there are still moderate performance issues brought about by the stats update) |
DCA is now pretty good after the latest wave. Considering all the join-order tweaks needed based on the new stats, I'll do a QA run to get a larger performance sample. |
QA results were broadly very strong -- analysis time reductions were much more common than increases -- but I've debugged a few of the notably somewhat-slower projects and made one last DCA run to retest the usual suite + those projects showing a slowdown on QA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy with this for when you're happy with the performance results
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merge when performance is good enough.
This enables us to distinguish all database types in QL. Previously structs with the same field names and types but differing tags, and interface types with matching method names and at least one non-exported method but declared in differing packages, were impossible or only sometimes possible to distinguish in QL. With this change these types can be distinguished, as well as permitting queries to examine struct field tags, e.g. to read JSON field name associations.
…der guard -> guardFunction case to work backwards from interesting return sites, allowing us to go backwards not forwards through BasicBlock::dominates
bc49db4
to
837387a
Compare
At long last -- performance results are good. QA shows an overall -8.5% time spent running queries, though over half of that comes from one leviathan project whose 2h30 analysis now takes 15 minutes. There are a small number of QA projects that recurred across a few runs showing moderate-sized (20-second or so) increases in runtime, but which didn't have an obvious cause looking at predicate timing tables, or comparing the RA of the most expensive predicates against the RA generated on |
This enables us to distinguish all database types in QL. Previously structs with the same field names and types but differing tags, and interface types with matching method names and at least one non-exported method but declared in differing packages, were impossible or only sometimes possible to distinguish in QL. With this change these types can be
distinguished, as well as permitting queries to examine struct field tags, e.g. to read JSON field name associations.
This is a pre-requisite to (some approaches to) dealing with Go 1.23's more direct exposure of type aliases, since it enables us to distinguish all types that are distinct in the database in QL, and therefore implement up-to-aliasing type matching, known in the Go spec as identical types.