Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto/elliptic: regression from 1.17 in the cost of parsing and compiling p256_asm_table.go #50995

Closed
mvdan opened this issue Feb 3, 2022 · 14 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@mvdan
Copy link
Member

mvdan commented Feb 3, 2022

Please see https://go-review.googlesource.com/c/go/+/380474 for details. I'm opening this issue so we can track this for the 1.18 release.

In short, https://go-review.googlesource.com/c/go/+/339591/ changed the code generator, and since then, that file is significantly more expensive to parse and compile due to the extensive use of binary expressions:

	name           old time/op         new time/op         delta
	Gofmt                 22.8ms ± 6%        898.5ms ± 3%  +3837.32%  (p=0.000 n=8+8)
	GoToolCompile         26.9ms ± 2%        371.1ms ± 2%  +1278.36%  (p=0.000 n=7+8)

	name           old user-time/op    new user-time/op    delta
	Gofmt                 25.7ms ±65%        897.1ms ± 3%  +3383.86%  (p=0.000 n=8+8)
	GoToolCompile         35.1ms ±26%        367.2ms ± 3%   +945.80%  (p=0.000 n=8+8)

https://go-review.googlesource.com/c/go/+/380474 fixes this issue by using fewer binary expressions, and https://go-review.googlesource.com/c/go/+/380475 fixes this by using go:embed.

I don't really mind which of the two we pick, but I really think we need a fix before the final release, as the regression seems pretty significant. I don't think this affects compiled code, but it certainly affects the developer experience through build speed and tool speed.

cc @josharian @rolandshoemaker @FiloSottile @bcmills

@FiloSottile FiloSottile added this to the Go1.18 milestone Feb 3, 2022
@FiloSottile
Copy link
Contributor

Labeling release-blocker for assessment before Go 1.18.

@randall77
Copy link
Contributor

Users are probably never gofmt-ing this code. Maybe some code analysis tools parse it though? Does vet parse it, or does it use object files?
Users are probably only compiling this code once (maybe a few times, if cross-compiling, using -race, etc.). Then it lives on in the build cache and would no longer cause an issue.

@mvdan
Copy link
Member Author

mvdan commented Feb 3, 2022

Running gofmt on the entire standard library seems like a reasonable thing to do, for example when testing that a change to gofmt isn't resulting in unexpected changes. I don't see a reason why that should burn CPU for a solid second if we can avoid it :)

You're right that the build cache partially hides this slow-down from many users, but it's still noticeable whenever the standard library is rebuilt. This will be the case for cross-builds, when build parameters like build tags are changed, when building Go itself from source (which I do on a daily basis), or with -toolexec tools that use the build cache by modifying compile -V=full, such as burrowers/garble#475, where this regression accounted for 2% of total CPU cost.

@mvdan
Copy link
Member Author

mvdan commented Feb 3, 2022

I should note that we have two rather easy fixes at hand, here, with presumably no downsides. I would perhaps agree with Keith if there was a significant downside to the changes.

@toothrot toothrot added the NeedsFix The path to resolution is known, but the work has not been done. label Feb 4, 2022
@bradfitz
Copy link
Contributor

bradfitz commented Feb 4, 2022

Not release team, but change Looks Safe To Me at least.

Can you file & make the commit reference a bug for the root problem in any case? The compiler & gofmt shouldn't be so sensitive to this.

@mvdan
Copy link
Member Author

mvdan commented Feb 4, 2022

I did think about whether gofmt and the compiler are to blame for this, but I'm not so sure we can blame them; a syntax tree with a chain of tens of thousands of nodes will be slow no matter what, because you'll need to jump through tens of thousands of pointers to traverse it. I guess one fix would be "make go/ast not use a syntax tree with interfaces and pointers", but that would break most tools out there :)

The compiler could potentially prevent the slow-down, because it doesn't publicly expose its syntax tree representation, though I'm not sure whether remedying this slow-down in the compiler and not all other tools will be worthwhile. Even if the builds are fast, all other tooling including gofmt would still be slow. cc @griesemer

@josharian could you please confirm whether the go:embed route with a string var still keeps the binary/memory properties you were interested in when you did your codegen refactor? Because that route would certainly be the simplest going forward.

@mvdan
Copy link
Member Author

mvdan commented Feb 4, 2022

And also, thinking outloud: I'm not sure we need to worry too much about Go files that have ten thousand string additions chained together, because I've only seen that in generated code to embed some form of asset - and, presumably, all of those can just switch to go:embed now :)

@josharian
Copy link
Contributor

Apologies for the delay. Both CLs look fine to me, both from a code perspective and from a binary properties perspective.

The compiler has special handling for long string constant concatenation, for #16394. See

// sum efficiently handles very large summation expressions (such as
// in issue #16394). In particular, it avoids left recursion and
// collapses string literals.
func (p *noder) sum(x syntax.Expr) ir.Node {

But perhaps that is broken? Or there is another place we need that fix? Regardless of what we do about crypto/elliptic, the compiler should not have super-linear behavior on constant string addition.

@mvdan
Copy link
Member Author

mvdan commented Feb 7, 2022

Apologies for the delay. Both CLs look fine to me, both from a code perspective and from a binary properties perspective.

Thank you! I'll tidy up the go:embed CL tomorrow, then - always nice to reduce the amount of code :)

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/380475 mentions this issue: crypto/elliptic: use go:embed for the precomputed p256 table

@mvdan
Copy link
Member Author

mvdan commented Feb 7, 2022

I had some spare time, and I didn't want to delay a release blocker, so the go:embed CL is now ready :)

@mvdan
Copy link
Member Author

mvdan commented Feb 7, 2022

--- FAIL: TestDependencies (1.11s)
    deps_test.go:620: unexpected dependency: crypto/elliptic imports [embed]
FAIL
FAIL	go/build	1.167s

Could someone please confirm that this is an OK change to make in deps_test.go? embed is a tiny package, and crypto/elliptic is already manually embedding a file via code generation, so I don't imagine it should cause any problems.

@ianlancetaylor
Copy link
Member

I think it's fine for crypto/elliptic to depend on embed.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/383995 mentions this issue: misc/reboot: don't use symlinks when copying GOROOT/src

gopherbot pushed a commit that referenced this issue Feb 8, 2022
go:embed disallows using symlinked files by design.
crypto/elliptic is the first std package to use it as of CL 380475,
and unfortunately that broke the TestRepeatBootstrap long test.

The reason it uses symlinks is for speed; it wants to copy GOROOT/src,
but regular files aren't going to be modified in any way,
so a symlink, if supported, means not needing to copy the contents.

Replace the symlink attempt with hard links,
which will mean regular files remain as such, fixing go:embed.
It's worth noting that on many systems hard links won't work,
as the temporary filesystem tends to be separate,
but it doesn't hurt to try.

In my system, where /tmp is tmpfs, the test now copies more bytes.
With the added Logf, I can see overlayDir goes from ~30ms to ~100ms.
This makes sense, as GOROOT/src currently weighs around 100MiB.
To alleviate that slow-down, stop copying testdata directories,
as they currently weigh around 20MiB and aren't needed for the test.
This gets overlayDir on my system down to an acceptable ~70ms.

I briefly considered teaching overlayDir what files can be symlinks,
but that seemed fairly complex long-term, as any file could be embedded.

While here, start using testing.T.TempDir and fs.WalkDir.

For #50995.

Change-Id: I17947e6bdee96237e1ca0606ad0b95e7c5987bc1
Reviewed-on: https://go-review.googlesource.com/c/go/+/383995
Trust: Daniel Martí <[email protected]>
Trust: Bryan Mills <[email protected]>
Run-TryBot: Bryan Mills <[email protected]>
Reviewed-by: Bryan Mills <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@golang golang locked and limited conversation to collaborators Feb 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

No branches or pull requests

8 participants