GcToolchainTricks
This page documents some less well-known (perhaps advanced) tricks for the gc
toolchain (and the Go tool).
You write the assembly in GNU as(1) format, but make sure all the interface functions use Go's ABI (everything on the stack, etc.; see the Go 1.2 Assembler Introduction for details).
The most important step is compiling that file to file.syso (gcc -c -O3 -o file.syso file.S) and putting the resulting syso in the package source directory.
Then, supposing your assembly function is named Func, you need one stub cmd/asm assembly file to call it:
TEXT ·Func(SB),$0-8 // please set the correct parameter size (8) here
JMP Func(SB)
Then declare Func in your package and use it; go build will pick up the syso and link it into the package.
Notes:
- The binary produced won't use cgo, and the overhead is just an unconditional JMP that can be perfectly branch-predicted. Be aware, though, that because it doesn't use cgo, your assembly function runs on the Go stack, so it shouldn't use much stack (a safe value is less than ~100 bytes) or terrible things will happen. For compute kernels, this requirement isn't too restrictive.
- Make sure you've included all library dependencies in your C code. libc is not available and, most notably, libgcc is not available either (especially when you're using gcc __builtin_funcs; use nm(1) to double-check that your file doesn't contain any undefined symbols).
- It's also possible to call back Go functions from C code, but this is left as an exercise for the reader.
- This trick is supported on all Go 1.x releases.
- The Go linker is capable enough that you only need to prepare a .syso file for each architecture, not for each OS/Arch combination (assuming you don't use OS-specific constructs, obviously); it can link, for example, Mach-O object files into ELF binaries. So be sure to give your syso files names like file_amd64.syso and file_386.syso.
There are many ways to bundle data in a Go binary, for example:
- zip the data files, append the zip file to the end of the Go binary, then use zip -A prog to adjust the bundled zip header. You can use archive/zip to open the program as a zip file and access its contents easily. There are existing packages that help with this, for example, https://pkg.go.dev/bitbucket.org/tebeka/nrsc. This requires post-processing the program binary, which is not suitable for non-main packages that require static data. Also, you must collect all data files into one zip file, which means that it's impossible to use multiple packages that utilize this method.
- Embed the binary file as a string or []byte in the Go program. This method is not recommended, not only because the generated Go source file is much larger than the binary files themselves, but also because a large static []byte slows down compilation of the package and the gc compiler uses a lot of memory to compile it (this is a known bug of gc). For example, see the tools/godoc/static package.
- Use a similar syso technique to bundle the data: precompile the data file as a syso file using the .incbin pseudo-instruction of GNU as(1).
The key trick for the third alternative is that the linker for the gc toolchain can link COFF object files of a different architecture into the binary without problem, so you don't need to provide syso files for all supported architectures. As long as the syso file doesn't contain instructions, you can just use one to embed the data.
The assembly template to generate the COFF .syso file:
/* data.S, as -o data.syso */
.section .rdata,"dr" /* put in COFF section .rdata */
.globl _bindataA /* no longer need to prepend package name here */
.globl _ebindataA
_bindataA:
.incbin "dataA"
_ebindataA:
.globl _bindataB /* no longer need to prepend package name here */
.globl _ebindataB
_bindataB:
.incbin "dataB"
_ebindataB:
Two other files are needed. First, a Plan 9 C source file that assembles the slices for Go:
/* slice.c */
#include "runtime.h"
extern byte _bindataA[], _bindataB[], _ebindataA, _ebindataB;
void ·getDataSlices(Slice a, Slice b) {
a.array = _bindataA;
a.len = a.cap = &_ebindataA - _bindataA;
b.array = _bindataB;
b.len = b.cap = &_ebindataB - _bindataB;
FLUSH(&a);
FLUSH(&b);
}
And finally, the Go file that uses the embedded slices:
/* data.go */
package bindata
func getDataSlices() ([]byte, []byte) // defined in slice.c
var A, B = getDataSlices()
Note: you will need an as(1) capable of generating the COFF syso file; you can build one easily on Unix:
wget http://ftp.gnu.org/gnu/binutils/binutils-2.22.tar.bz2 # any newer version also works
tar xf binutils-2.22.tar.bz2
cd binutils-2.22
mkdir build; cd build
../configure --target=i386-foo-pe --enable-ld=no --enable-gold=no
make
# use gas/as-new to assemble your data.S
# all the other files can be discarded.
A drawback of this approach is that it seems incompatible with cgo, so only use it when you don't use cgo, at least for now. I (minux) am working on figuring out why they're incompatible.
The gc toolchain linker, cmd/link, provides a -X option that may be used to record arbitrary information in a Go string variable at link time. The format is -X importpath.name=val. Here importpath is the name used in an import statement for the package (or main for the main package), name is the name of the string variable defined in the package, and val is the string you want to set that variable to. When using the go tool, use its -ldflags option to pass the -X option to the linker.
Let's suppose this file is part of the package company/buildinfo:
package buildinfo
var BuildTime string
You can build a program that uses this package with go build -ldflags="-X 'company/buildinfo.BuildTime=$(date)'" to record the build time in the string. (The use of $(date) assumes you are using a Unix-style shell.)
The string variable must exist, it must be a variable, not a constant, and its value must not be initialized by a function call. There is no warning for using the wrong name in the -X option. You can often find the name to use by running go tool nm on the program, but that will fail if the package name has any non-ASCII characters, or a " or % character.
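For a variable in the main package, the same idea can be sketched as follows. This is a minimal example and an assumption about how one might consume the value, not a prescribed layout: the helper buildTimeOrDefault and the "unknown" placeholder are hypothetical. Because there is no warning when the -X name is wrong, falling back to a visible placeholder makes a misspelled name easier to notice.

```go
package main

import "fmt"

// BuildTime is meant to be set at link time, for example with
//
//	go build -ldflags="-X 'main.BuildTime=$(date)'"
//
// It stays a plain string variable with no initializer, so that the
// linker is able to overwrite it.
var BuildTime string

// buildTimeOrDefault returns v, or a placeholder when v is empty, which is
// what happens when -X was not passed or named the wrong variable.
func buildTimeOrDefault(v string) string {
	if v == "" {
		return "unknown"
	}
	return v
}

func main() {
	fmt.Println("built at:", buildTimeOrDefault(BuildTime))
}
```

Built with a plain go build and no -ldflags, this prints "built at: unknown".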