Go: extract and expose struct tags, interface method IDs #17357

smowton · 2024-09-03T11:19:44Z

This enables us to distinguish all database types in QL. Previously, structs with the same field names and types but differing tags, and interface types with matching method names and at least one non-exported method but declared in differing packages, were impossible or only sometimes possible to distinguish in QL. With this change these types can be distinguished, and queries can also examine struct field tags, e.g. to read JSON field name associations.

This is a pre-requisite to (some approaches to) dealing with Go 1.23's more direct exposure of type aliases, since it enables us to distinguish all types that are distinct in the database in QL, and therefore implement up-to-aliasing type matching, known in the Go spec as identical types.
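To illustrate the struct half of this on the Go side (not the extractor code itself), here is a minimal sketch using the standard `go/types` package. The types `A`, `B` and `C` and the helper names are hypothetical, chosen only for this example: `A` and `B` differ only in a field tag, so their struct types are not identical, and the tag text itself can be read back to recover the JSON field name.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"reflect"
)

// src declares three hypothetical struct types: A and B differ only in the
// tag on X, while A and C match field for field, tags included.
const src = `package p

type A struct { X int "json:\"x\"" }
type B struct { X int "json:\"y\"" }
type C struct { X int "json:\"x\"" }
`

// check type-checks src as package p and returns its package scope.
func check() (*types.Scope, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	pkg, err := new(types.Config).Check("p", fset, []*ast.File{file}, nil)
	if err != nil {
		return nil, err
	}
	return pkg.Scope(), nil
}

// underlying returns the struct type underlying the named type.
func underlying(scope *types.Scope, name string) *types.Struct {
	return scope.Lookup(name).Type().Underlying().(*types.Struct)
}

// sameStruct reports whether the struct types underlying a and b are
// identical in the Go spec's sense (types.Identical takes tags into account).
func sameStruct(scope *types.Scope, a, b string) bool {
	return types.Identical(underlying(scope, a), underlying(scope, b))
}

// jsonName returns the json key recorded in the tag of the first field.
func jsonName(scope *types.Scope, name string) string {
	return reflect.StructTag(underlying(scope, name).Tag(0)).Get("json")
}

func main() {
	scope, err := check()
	if err != nil {
		panic(err)
	}
	fmt.Println(sameStruct(scope, "A", "B")) // false: tags differ
	fmt.Println(sameStruct(scope, "A", "C")) // true: same fields and tags
	fmt.Println(jsonName(scope, "B"))        // y
}
```

The comparison is made on the underlying struct types because the named types `A`, `B` and `C` are always distinct from one another regardless of their tags.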

owen-mc

Longer review to follow.

go/ql/lib/change-notes/2024-09-03-tags-and-interface-ids.md

go/extractor/dbscheme/tables.go

mbg

Broadly looks good! Thank you for improving this and moving it out into its own PR. I just have a few suggestions in addition to @owen-mc's comments, which also make sense.

Also, to sanity check: in the PR description you discuss that part of the motivation here is to be able to distinguish types better. That makes sense and I found the relevant part of the Go specification for this in https://go.dev/ref/spec#Type_identity. For structs:

Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names, and identical types, and identical tags. Non-exported field names from different packages are always different.

For interfaces:

Two interface types are identical if they define the same type set.

Looking over the tests here, I can see that the tests exercise the new functionality and that seems to behave as expected. Do the tests cover the new ability to decide (in)equality that you are hoping for? Could you comment on how the tests cover that?
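For reference, the interface case can be checked on the Go side with `go/types`. This is a sketch under assumptions rather than one of the tests in this PR: the package paths `a` and `b`, the type names `Pub` and `Priv`, and the helper names are all hypothetical. It shows that an interface with only exported methods is identical across packages, while one with a non-exported method is not.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// body is shared by two hypothetical packages: Pub has only an exported
// method, Priv additionally has a non-exported one.
const body = `
type Pub interface { Exported() int }
type Priv interface { Exported() int; notExported() int }
`

// load type-checks body as a package whose name and path are both path.
func load(path string) (*types.Package, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, path+".go", "package "+path+"\n"+body, 0)
	if err != nil {
		return nil, err
	}
	return new(types.Config).Check(path, fset, []*ast.File{file}, nil)
}

// identicalAcross reports whether the type called name has an identical
// underlying type in packages a and b.
func identicalAcross(a, b *types.Package, name string) bool {
	ta := a.Scope().Lookup(name).Type().Underlying()
	tb := b.Scope().Lookup(name).Type().Underlying()
	return types.Identical(ta, tb)
}

func main() {
	a, err := load("a")
	if err != nil {
		panic(err)
	}
	b, err := load("b")
	if err != nil {
		panic(err)
	}
	fmt.Println(identicalAcross(a, b, "Pub"))  // true: exported methods only
	fmt.Println(identicalAcross(a, b, "Priv")) // false: notExported differs per package
}
```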

go/extractor/extractor.go

go/extractor/dbscheme/tables.go

go/ql/lib/semmle/go/Types.qll

go/ql/test/library-tests/semmle/go/Types/InterfaceIds.ql

owen-mc · 2024-09-03T13:33:07Z

Tests failing:

The following files need to be reformatted using gofmt or have compilation errors:
./ql/test/library-tests/semmle/go/Types/pkg2/tst.go
Error: make: *** [Makefile:15: check-formatting] Error 1
./ql/test/library-tests/semmle/go/Types/struct_tags.go
Error: Process completed with exit code 2.

smowton · 2024-09-03T15:46:24Z

Retargeted this against main because we currently don't expect to need this for rc/3.15 if we go for a simpler alias-erasing approach in the interim.

owen-mc

Good work spotting these problems and fixing them. A few small suggestions for improvement.

Also, shouldn't the label for struct types include the tag of each field? Since differing tags make it a different struct type? Ideally this would have a test as well. This could be done as a follow-up, but it also fits in pretty naturally with this PR.

go/extractor/extractor.go

go/extractor/dbscheme/tables.go

smowton · 2024-09-20T15:53:17Z

Note: at #17341 I added a stats update since there are new db tables and they ought to have associated stats. This evidently caused join-order problems since DCA flipped from just fine to catastrophic, so join order fixery (or simply accepting missing stats -- I note C# recently totally removed them and simply manually hacked their join orders where necessary) will be necessary before this can be merged.

smowton · 2024-09-30T17:04:00Z

@mbg @owen-mc all comments applied.

I have also taken the liberty of resurrecting #9386 and including it here since we have a dbscheme change afoot anyway.

smowton · 2024-09-30T17:33:42Z

Oh, except: @mbg, no, there is no direct test of distinguishing two types using these functions, since there isn't a direct identical-type predicate in this PR to test. And @owen-mc, getTypeLabel already adds struct-type tags into the label, which is why before this PR two structs that differed only in their tags would be different QL entities, but there would be no QL predicate that could tell them apart.

owen-mc

A few optional suggestions and one change I feel more strongly about.

go/extractor/dbscheme/tables.go

go/extractor/dbscheme/dbscheme.go

owen-mc · 2024-10-01T09:52:23Z

go/ql/lib/semmle/go/Types.qll

+   * For example, `interface { Exported() int; notExported() int }` declared in two
+   * different packages defines two distinct types, but they appear identical according to
+   * `getMethodType`. If the packages were named `a` and `b`, `getMethodType` would yield
+   * `notExported -> int` for both, whereas this method would yield `a.notExported -> int`
+   * and `b.notExported -> int` respectively.


[Optional] I think this example would be a bit clearer without the exported method.

Suggested change

-   * For example, `interface { Exported() int; notExported() int }` declared in two
-   * different packages defines two distinct types, but they appear identical according to
-   * `getMethodType`. If the packages were named `a` and `b`, `getMethodType` would yield
-   * `notExported -> int` for both, whereas this method would yield `a.notExported -> int`
-   * and `b.notExported -> int` respectively.
+   * For example, `interface { notExported() int }` declared in two different packages
+   * defines two distinct types, but they appear identical according to `getMethodType`.
+   * If the packages were named `a` and `b`, `getMethodType` would yield
+   * `notExported -> int` for both, whereas this method would yield `a.notExported -> int`
+   * and `b.notExported -> int` respectively.

I've added a mention of Exported instead.
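The package-qualified naming described in the doc comment mirrors what Go's own type checker reports through `types.Func.Id`. A minimal sketch, assuming hypothetical packages `a` and `b` and a hypothetical interface `I` (this is not the QL predicate or the extractor code): exported methods keep their bare name as their ID, while non-exported ones are prefixed with the package path.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// methodIDs type-checks a one-file package at the given path that declares
// a hypothetical interface I, and maps each method name to its types.Func Id.
func methodIDs(path string) (map[string]string, error) {
	src := "package " + path + "\n" +
		"type I interface { Exported() int; notExported() int }\n"
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, path+".go", src, 0)
	if err != nil {
		return nil, err
	}
	pkg, err := new(types.Config).Check(path, fset, []*ast.File{file}, nil)
	if err != nil {
		return nil, err
	}
	iface := pkg.Scope().Lookup("I").Type().Underlying().(*types.Interface)
	ids := make(map[string]string)
	for i := 0; i < iface.NumMethods(); i++ {
		m := iface.Method(i)
		ids[m.Name()] = m.Id()
	}
	return ids, nil
}

func main() {
	for _, path := range []string{"a", "b"} {
		ids, err := methodIDs(path)
		if err != nil {
			panic(err)
		}
		fmt.Println(path, ids["Exported"], ids["notExported"])
	}
	// prints:
	// a Exported a.notExported
	// b Exported b.notExported
}
```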

go/ql/lib/semmle/go/Types.qll

smowton · 2024-10-01T21:45:28Z

(I'll hold off on merging this for now since there are still moderate performance issues brought about by the stats update)

smowton · 2024-10-02T15:55:43Z

DCA is now pretty good after the latest wave. Considering all the join-order tweaks needed based on the new stats, I'll do a QA run to get a larger performance sample.

smowton · 2024-10-02T22:30:42Z

QA results were broadly very strong -- analysis time reductions were much more common than increases -- but I've debugged a few of the notably somewhat-slower projects and made one last DCA run to retest the usual suite + those projects showing a slowdown on QA.

mbg

Happy with this for when you're happy with the performance results

owen-mc

Merge when performance is good enough.

smowton · 2024-10-09T10:05:17Z

At long last -- performance results are good. QA shows an overall -8.5% time spent running queries, though over half of that comes from one leviathan project whose 2h30 analysis now takes 15 minutes. There are a small number of QA projects that recurred across a few runs showing moderate-sized (20-second or so) increases in runtime, but which didn't have an obvious cause looking at predicate timing tables, or comparing the RA of the most expensive predicates against the RA generated on main -- my guess is that these are cases where scheduling and/or cache eviction is perturbed for the worse, and for now I'm going to have to let them go, to revisit if we see a more flagrant problem.

smowton requested a review from a team as a code owner September 3, 2024 11:19

github-actions bot added documentation Go labels Sep 3, 2024

owen-mc reviewed Sep 3, 2024

mbg reviewed Sep 3, 2024

smowton changed the base branch from rc/3.15 to main September 3, 2024 15:45

owen-mc reviewed Sep 10, 2024

smowton force-pushed the smowton/feature/go-indistinguishable-types branch from fe2ef27 to 41ffbdc Compare September 30, 2024 16:25

owen-mc reviewed Oct 1, 2024

owen-mc previously approved these changes Oct 1, 2024

smowton dismissed owen-mc’s stale review via 052ccfa October 2, 2024 14:18

mbg previously approved these changes Oct 3, 2024

smowton dismissed mbg’s stale review via 6ff35a3 October 3, 2024 21:18

owen-mc previously approved these changes Oct 4, 2024

smowton dismissed owen-mc’s stale review via 4e46b0e October 4, 2024 09:31

owen-mc previously approved these changes Oct 4, 2024

smowton dismissed owen-mc’s stale review via 4edd81e October 4, 2024 10:16

owen-mc previously approved these changes Oct 4, 2024

smowton dismissed owen-mc’s stale review via b290565 October 8, 2024 14:34

smowton added 3 commits October 8, 2024 19:23

Change note

9bb2a4b

Autoformat CodeQL

22ed2f9

smowton and others added 20 commits October 8, 2024 19:23

Fix test file

5d14070

Apply review comments

7a7ff4a

autoformat

e1963a5

Update stats

442e581

Prevent bad magic

fd615fb

Remove unnecessary table population on upgrade

1511927

Add note explaining how to regenerate dbscheme

d04a0f4

Optimise join orders

74cba90

Autoformat

c1a1edf

component_tags -> struct_tags

288e0ec

Clarify doc

0f95a8d

Rework interface for querying private interface method ids

ab99509

Further optimisation

36a0318

autoformat

365ccf4

Improve join orders for top 5 perf regressions in QA

bf5ba33

Further join order optimisations

ed9a6bd

Avoid pathological case where getExampleMethodName picks a very common method name

c79da8b

copyedit

d401891

Further optimise guardingFunction: remove redundant condition, and order guard -> guardFunction case to work backwards from interesting return sites, allowing us to go backwards not forwards through BasicBlock::dominates

629a7a6

Re-optimise isSensitive routine

837387a

smowton force-pushed the smowton/feature/go-indistinguishable-types branch from bc49db4 to 837387a Compare October 8, 2024 18:23

owen-mc approved these changes Oct 9, 2024

smowton merged commit 58fd1a2 into github:main Oct 9, 2024
19 checks passed

smowton mentioned this pull request Oct 9, 2024

Go: Improve guardingFunction join order #17238

Closed

owen-mc mentioned this pull request Nov 13, 2024

Go: Note how to generate the Go dbscheme #9386

Closed
