This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

Add stubs compilation workflow #11

Merged
merged 28 commits into from
May 21, 2021

Conversation

maxb2
Contributor

@maxb2 maxb2 commented Mar 22, 2021

Update

This PR has evolved a lot since it was opened. See the newer comments below.

Original

This adds a workflow that compiles the stubs and deploys them to a Release page. Related #4. This runs on both master and tags. If the run is on a tag, it also creates a draft release with the stubs uploaded automatically. A maintainer then has to go to the releases page and finalize the release. See an example release here.

I adapted the workflow from dungeon-revealer. I wasn't sure how you would want to package the stubs so I left it like the previous project.

Here is the output of the file command for each of the stubs:

linux:       ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=rIFBUnJiN8ioz3oRfkhV/thAnBoXxKQpBhl57NFet/0X5YCWWCJphv52xdwney/wGIQo3O6AS6yoMxtdOQI, not stripped
linux-arm64: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, Go BuildID=qULBnaHB11IIB8qsP7L_/7P8_VnSO7Bv1RuCQrXgh/Q9t3pXkNbZlt3AjGVQTU/g_NHKy-_pR77C6v5uHAu, not stripped
linux-armv7: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, Go BuildID=_90b4fxHK5h1-G_o2RcW/FUiZ63Ap7sGKpbXqwhVD/DDogijYO_wjJkRWtCwuW/kYrIhvCUKH8hgyCRRtTP, not stripped
macos:       Mach-O 64-bit x86_64 executable
windows.exe: PE32+ executable (console) x86-64 (stripped to external PDB), for MS Windows

TODO:

  • Cross-compile binaries
  • Add tests for the stubs
  • Add md5sums of the binaries for build transparency
  • convert actions/upload-release-asset@v1 to softprops/action-gh-release@v1

@pdcastro pdcastro left a comment

Not much of my business :-) and most likely I'm missing something, but why is this PR necessary? The way I see it, it's 172 lines of build-stubs.yml versus the following 6 lines executed on any Intel machine, any OS, without the need for docker or QEMU:

BIN=windows.exe; GOOS=windows go build -o $BIN stub.go && echo >> $BIN && echo "### CAXA ###" >> $BIN
BIN=macos;       GOOS=darwin  go build -o $BIN stub.go && echo >> $BIN && echo "### CAXA ###" >> $BIN
BIN=linux;       GOOS=linux   go build -o $BIN stub.go && echo >> $BIN && echo "### CAXA ###" >> $BIN
BIN=linux-armv6; GOOS=linux GOARCH=arm GOARM=6 go build -o $BIN stub.go && echo >> $BIN && echo "### CAXA ###" >> $BIN
BIN=linux-armv7; GOOS=linux GOARCH=arm GOARM=7 go build -o $BIN stub.go && echo >> $BIN && echo "### CAXA ###" >> $BIN
BIN=linux-arm64; GOOS=linux GOARCH=arm64       go build -o $BIN stub.go && echo >> $BIN && echo "### CAXA ###" >> $BIN

OK, I see the value in making sure that the stub binaries are up to date in relation to the Go source code, and the fact that a binary commit from some random contributor cannot be easily reviewed for including a bitcoin miner, :-) but then maybe the workflow could be just ubuntu-20.04 on Intel, running the 6 lines above?

caxa's Readme makes a point about not supporting cross compilation, but I understand that's about native Node.js modules of the end user's application, not caxa's own stubs. After all, the binary stubs for all platforms are even committed to the GitHub repo.
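As a rough sketch of how the six commands above could be folded into one script (an illustration only, not the workflow in this PR), the loop below reads a small target table and appends the same trailing marker to each binary. The names build_stub, build_all, and the GO override variable are hypothetical. The sketch also pins GOARCH=amd64 for the Intel targets, which the original commands leave to the host default.

```shell
#!/bin/sh
# Sketch: cross-compile every stub from one table.
# build_stub, build_all and the GO variable are hypothetical names.

GO="${GO:-go}"            # allows substituting another go binary
MARKER='### CAXA ###'     # trailing marker taken from the commands above

# build_stub NAME GOOS GOARCH [GOARM]
build_stub() {
  name=$1 goos=$2 goarch=$3 goarm=${4:-}
  if [ -n "$goarm" ]; then
    env GOOS="$goos" GOARCH="$goarch" GOARM="$goarm" "$GO" build -o "$name" stub.go || return 1
  else
    env GOOS="$goos" GOARCH="$goarch" "$GO" build -o "$name" stub.go || return 1
  fi
  # Same two appends as the original one-liners: a newline, then the marker.
  echo >> "$name" && echo "$MARKER" >> "$name"
}

build_all() {
  # Columns: NAME GOOS GOARCH GOARM (GOARM is only set for 32-bit ARM targets)
  while read -r name goos goarch goarm; do
    build_stub "$name" "$goos" "$goarch" "$goarm" || return 1
  done <<'EOF'
windows.exe windows amd64
macos darwin amd64
linux linux amd64
linux-armv6 linux arm 6
linux-armv7 linux arm 7
linux-arm64 linux arm64
EOF
}
```

Running build_all from the stubs directory would then produce all six files; nothing is built until that function is called.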

Contributor Author

maxb2 commented May 13, 2021

The main point of this PR is to put stub compilation in the CI/CD pipeline. I was avoiding cross-compilation because the repo owner seemed not to be a fan of it as you pointed out. From the Readme:

I believe you should have environments to work with all the operating systems you plan on supporting. They may not be your main development environment, but they should be able to build your project and let you test things. At the very least, you should use a service like GitHub Actions which lets you run build tasks and tests on Windows, macOS, and Linux.

Using your cross-compilation script would certainly simplify the workflow, but puts all your trust in the go cross-compiler. Emulation at compilation is more complex but also ensures the build environment is closest to the actual hardware. Pick your poison, I guess 🤷

After all, the binary stubs for all platforms are even committed to the GitHub repo.

This is really what I'm aiming to change. Committing binaries to a git repo is a bit sketchy in my opinion.

Owner

leafac commented May 13, 2021

Hi all,

I’m just checking in to let you know that I’m going to work on this soon. I maintain several open-source projects and go around spending some time on each; caxa is up next. I also started streaming my open-source work, which you may follow here: https://www.youtube.com/channel/UC_R-6HcHW5V9_FlZe30tnGA

@maxb2:

Committing binaries to a git repo is a bit sketchy in my opinion.

What makes you think that? How’s a CI workflow better?

Reasons why I like stubs in the Git repository:

  1. It’s the simplest thing that works.
  2. The stubs are small, around 3MB. Some images in a web application could be bigger than that.
  3. The stubs are in sync with the rest of the project.
  4. It’s simple to ship the stubs with the npm package.
  5. It’s as secure as anything else: As the administrator of the repository I could always change the assets in a GitHub Release or whatever other way we could distribute the stubs. In summary, if you use the compiled stubs, you’re trusting me. If you don’t trust me, you should inspect the source code and compile it yourself. This is true of any binary. And, ultimately, not even compiling yourself is enough, because then you’re trusting the compiler: https://dl.acm.org/doi/10.1145/358198.358210

@pdcastro

Playing devil's advocate... I see pros and cons.

How’s a CI workflow better?

There may be room for simplification as I pointed out in my previous comment, but:

  • A CI workflow can help you save time ("I maintain several open-source projects and go around spending some time on each; caxa is up next") when an external contributor submits a PR that modifies stub.go, say a minor bug fix. If a contributor's commit includes the stubs, and the contributor is not someone you know and trust, you could not possibly simply approve and merge the PR with the binary stubs. You might find yourself awkwardly asking the contributor to delete the binary stubs from their PR, as you cannot trust them and you need to generate them yourself. But if GitHub's workflow takes care of producing the binaries, the contributor's commit would include only the changes to stub.go and you could simply merge the PR and go back to the other open-source projects.

  • Automated testing of the stubs, by executing them in GitHub's sandboxes / virtualized environment. While it is easy to cross-compile the stubs on any OS and CPU, executing and testing them on a non-native operating system may not be as easy. Currently, however, this repo does not have this kind of test.

The stubs are small, around 3MB

Each. 7 x 3MB = 21MB (7 counting windows, mac-intel, mac-arm, linux-intel, linux-armv6, linux-armv7, linux-arm64). This will show in the "unpacked size" entry at https://www.npmjs.com/package/caxa and the git repo will keep the history: one-line changes to stub.go could mean another 21MB committed. Soon enough, git clone will be downloading hundreds of megabytes of old stubs no one cares about.

In summary, if you use the compiled stubs, you’re trusting me.

Even so, transparency could be increased by adopting a GitHub workflow. You might be able to add a feature bullet point to the Readme: "binary stubs reliably generated with GitHub workflows", with a link to build-stubs.yml.

Using your cross-compilation script would certainly simplify the workflow, but puts all your trust in the go cross-compiler.

I'd say that's fine until there is evidence to the contrary: "Add complexity only when justified".

Contributor Author

maxb2 commented May 13, 2021

Git was designed to handle version control of text files. It stores the diff of updated files rather than the updated file itself. When you update a binary file the diff will basically be the size of the whole file. That can lead to repo sizes growing very quickly if the binary file changes frequently. At the very least, you should use a tool specifically for storing binary blobs in a git repo, such as Git LFS.

The workflow at least shows the whole process of generating the binaries. Anyone would be able to see the exact commit of the source and the whole build process. I can include md5sums in the build process as well to mitigate administrator abuse. Of course you can never 100% trust binaries from the internet, but you can take steps to help people.

I think the current repo structure of caxa makes sense given its current scope. However, if it grows in popularity (a lot of people are excited by it), I think you'll need to rethink that structure. Having a robust workflow in place could save you time and headache in the future.

Contributor Author

maxb2 commented May 13, 2021

Automated testing of the stubs, by executing them in GitHub's sandboxes / virtualized environment. While it is easy to cross-compile the stubs on any OS and CPU, executing and testing them on a non-native operating system may not be as easy. Currently, however, this repo does not have this kind of test.

I'll add a simple smoke test.

I'd say that's fine until there is evidence to the contrary: "Add complexity only when justified".

That's a totally fair point. I think adding tests of the stubs to the workflow would justify the complexity though.

@maxb2 maxb2 marked this pull request as draft May 13, 2021 14:31
Owner

leafac commented May 13, 2021

You do make some good points and I’ll give a more detailed answer in the near future. For now, a couple quick comments:

Git was designed to handle version control of text files. It stores the diff of updated files rather than the updated file itself.

Oh, a teachable moment 😀
Git doesn’t store diffs of updated files; it does store the updated file itself (see, for example, https://github.blog/2020-12-17-commits-are-snapshots-not-diffs/)

In any case, that’s not central to your argument: You’re right that storing blobs will blow up the repository’s size.

The workflow at least shows the whole process of generating the binaries.

I believe it amounts to the same as

"stubs": "cd stubs && shx rm -f windows.exe macos linux && cross-env GOOS=windows go build -o windows.exe stub.go && shx echo >> windows.exe && shx echo \"### CAXA ###\" >> windows.exe && cross-env GOOS=darwin go build -o macos stub.go && shx echo >> macos && shx echo \"### CAXA ###\" >> macos && cross-env GOOS=linux go build -o linux stub.go && shx echo >> linux && shx echo \"### CAXA ###\" >> linux"
in terms of transparency. Am I missing something?

Contributor Author

maxb2 commented May 13, 2021

Git doesn’t store diffs of updated files; it does store the updated file itself

I stand corrected 😄

Am I missing something?

A workflow shows the user the build log that produced the binary files. The stubs command in package.json just tells the user what commands are needed to build the stub themselves. Using a metaphor, the workflow is like being in the kitchen to witness every part of a meal being made and delivered, while the stub command + binaries in the repo is like being given the recipe for a pre-packaged meal. You could certainly make it yourself, but there is still a gap in the chain of custody.

Here is the build log I am talking about. You can trace the process from source code all the way to build artifact.

Contributor Author

maxb2 commented May 13, 2021

As far as binary distribution goes, I think removing them from the repo is a net positive. The caxa dependency would have a smaller size, and all that needs to happen is that caxa fetches the appropriate binary after identifying the system it is running on. GitHub Releases provides direct links to the binaries, e.g. https://github.com/maxb2/caxa/releases/download/v1.0.0-test06/linux

The general form is https://github.com/leafac/caxa/releases/download/{version}/{filename}
where version is the git tag and filename is the binary you want.
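To make the URL pattern above concrete, here is a minimal shell sketch of picking a stub for the current machine and composing its download link. The helpers stub_filename and stub_url are hypothetical, the mapping from uname output to stub names is an assumption based on the file names used in this thread, and caxa itself would presumably do this on the JavaScript side rather than in shell.

```shell
#!/bin/sh
# Sketch: compose the release download URL for a given platform.
# stub_filename and stub_url are hypothetical helpers.

# stub_filename OS ARCH -> stub file name
# OS and ARCH are expected in the form printed by `uname -s` and `uname -m`.
stub_filename() {
  case "$1" in
    Linux)
      case "$2" in
        x86_64)  echo linux ;;
        aarch64) echo linux-arm64 ;;
        armv7l)  echo linux-armv7 ;;
        armv6l)  echo linux-armv6 ;;
        *) return 1 ;;
      esac ;;
    Darwin)
      case "$2" in
        x86_64) echo macos ;;
        *) return 1 ;;          # no darwin/arm64 stub is built in this PR
      esac ;;
    MINGW*|MSYS*|CYGWIN*) echo windows.exe ;;
    *) return 1 ;;
  esac
}

# stub_url VERSION FILENAME -> direct GitHub Releases link
stub_url() {
  echo "https://github.com/leafac/caxa/releases/download/$1/$2"
}
```

A caller could then run something like stub_url v1.0.0 "$(stub_filename "$(uname -s)" "$(uname -m)")" and pass the result to a download tool.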

@maxb2 maxb2 force-pushed the actions-stubs branch 2 times, most recently from b697c77 to 4129720 Compare May 13, 2021 20:39
@maxb2 maxb2 marked this pull request as ready for review May 13, 2021 21:01
Contributor Author

maxb2 commented May 13, 2021

I've added some new features to this that may persuade you of its usefulness.

  • Binary Testing: there are simple tests of all the binaries. It just appends a file and then verifies its contents. This kind of test requires running on native platforms or emulation. This testing can be easily extended to something that covers future features of caxa.
  • Binary Checksums: file checksums are performed during the build and are deployed with the binaries. Users can verify that the file they download is the same as was built by Actions.
  • Simplified Release Creation: there is a simplified action to create the Github Release page and upload the files to it. It's much easier to understand now.
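For the checksum bullet above, the following is a minimal sketch of how checksums could be written at build time and checked after download. It assumes a sha256sum binary is available (as in GNU coreutils or BusyBox); write_checksums, verify_checksums, and the checksums.txt file name are hypothetical and not necessarily what the workflow uses.

```shell
#!/bin/sh
# Sketch: write and verify SHA-256 checksums for a directory of stubs.
# Function names and the checksums.txt file name are hypothetical.

# write_checksums DIR: record one checksum line per regular file in DIR
write_checksums() (
  cd "$1" || exit 1
  list=$(for f in *; do
    # Skip directories and a checksum file left over from an earlier run.
    if [ -f "$f" ] && [ "$f" != checksums.txt ]; then sha256sum "$f"; fi
  done) || exit 1
  printf '%s\n' "$list" > checksums.txt
)

# verify_checksums DIR: succeed only if every listed file still matches
verify_checksums() (
  cd "$1" || exit 1
  sha256sum -c checksums.txt >/dev/null 2>&1
)
```

A user who downloads both a stub and the checksum file can run verify_checksums on the download directory; a non-zero exit status means at least one file differs from what was built.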

@maxb2 maxb2 mentioned this pull request May 14, 2021
Contributor Author

maxb2 commented May 14, 2021

@pdcastro @leafac the emulated binaries aren't working as expected for armv6 on the raspi1 (see #4). I'll convert this to cross-compiling all the binaries and possibly testing them with emulation.

@maxb2 maxb2 marked this pull request as draft May 14, 2021 16:56
Contributor Author

maxb2 commented May 14, 2021

I changed this to cross-compile all the binaries and then test them natively or emulated in the case of arm architecture.

@maxb2 maxb2 marked this pull request as ready for review May 14, 2021 18:51
Contributor Author

maxb2 commented May 17, 2021

@leafac I just watched your stream on caxa maintenance. It was good to see your vision of where caxa should be heading!

As far as the method of compilation goes, this PR is currently cross-compiling the stubs and then testing them in native or emulated environments. If you would prefer they be compiled in a native/emulated environment, I can easily revert back to that. I'd say it's a judgement call up to you which method should be used.

I hope this PR takes compiling and distributing the stubs off your plate.

@maxb2 maxb2 marked this pull request as draft May 18, 2021 20:01
Owner

leafac commented May 18, 2021

@maxb2: You’re doing a fantastic job on this pull request. Thank you very much. And thanks for joining me on the stream today.

Do you want to be a guest in tomorrow’s stream at 17:30 UTC? I can call you and we can have a pairing session in which we review your contribution and merge it.

If you’re interested, please contact me on [email protected] and we can arrange the details.

Contributor Author

maxb2 commented May 18, 2021

Here is an updated overview of this PR as of efbcc02

Features

  • Compile stubs using a Github Actions workflow
    • Compile x64 stubs natively for Linux, Windows, and MacOS
    • Compile armv6, armv7, and arm64 stubs in an emulated environment using qemu+docker
  • Test compiled stubs in the environment in which they were built
  • Report sha256 checksums of the compiled stubs
  • Deploy the compiled stubs and checksums to a draft GitHub Release page if the run was triggered by a tag
    • A maintainer needs to manually finalize the release before the public can access it. This gives you the opportunity to add release notes.
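The "only release on a tag" condition in the list above can be sketched in shell as follows. GITHUB_REF is the variable GitHub Actions sets for the ref that triggered a run (refs/tags/... for tags, refs/heads/... for branches); is_tag_ref and tag_name are hypothetical helpers, and the actual workflow may express this condition differently.

```shell
#!/bin/sh
# Sketch: decide whether a run should create a draft release.
# is_tag_ref and tag_name are hypothetical helpers.

# is_tag_ref REF: succeed only for refs of the form refs/tags/<name>
is_tag_ref() {
  case "$1" in
    refs/tags/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# tag_name REF: strip the refs/tags/ prefix, e.g. refs/tags/v1.0.0 -> v1.0.0
tag_name() {
  echo "${1#refs/tags/}"
}
```

In a workflow step this might be used as: if is_tag_ref "$GITHUB_REF"; then echo "drafting release $(tag_name "$GITHUB_REF")"; fi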

Special Considerations

  • All stubs are compiled with CGO_ENABLED=0. This forces the binaries to be statically linked.
    • This is very important for armv6 since the docker image uses alpine linux. Alpine uses musl libc instead of glibc, the C library used by most mainstream linux distros.
  • The stub smoke tests are performed manually using the instructions from the readme. I left node out of the emulation process to avoid the complication of installing both node and go in a container.
  • The arm32v7/golang docker images are used rather than a more general docker image such as balenalib/raspberry-pi-golang.
    • I chose these because they are maintained specifically for go and because balenalib/raspberry-pi-golang does not have go 1.16, which is required to compile the stubs.
    • The arm*v*/golang images do require using the --platform argument at runtime which requires enabling experimental features of docker.

Future Work (Separate PR)

  • Use the tests in src/index.test.ts rather than the manual test in this PR. This requires using architecture specific versions of node which can be achieved with docker.
  • Support more platforms, such as darwin/arm64. This requires more research on whether this can be achieved with qemu+docker. Otherwise, we would need to use go cross-compilation.

@maxb2 maxb2 marked this pull request as ready for review May 18, 2021 21:06
Contributor Author

maxb2 commented May 18, 2021

I thought a little more about using node to perform the binary tests. There isn't any reason that the docker image used to compile the binary has to be the same as the one that tests it. In short: compile the binary with a golang image, copy the binary into the node image, and run the test.

cd stubs/
# Compile stub
docker run --rm --platform linux/arm/v7 -v "$PWD":/usr/src/myapp -w /usr/src/myapp arm32v7/golang:1.16 sh -c 'CGO_ENABLED=0 go build -o linux-armv7 stub.go && echo "" >> linux-armv7 && echo "### CAXA ###" >> linux-armv7'

# Rename files to work with index.test.ts
mv linux linux-amd64
ln -s linux-armv7 linux

# Run node tests
cd ..
docker run --rm -it --platform linux/arm/v7 -v "$PWD":/usr/src/myapp -w /usr/src/myapp arm32v7/node sh -c 'npm install-ci-test'

This is currently failing with the same error as the main workflow: https://github.com/leafac/caxa/runs/2138993751?check_suite_focus=true

We could also test the binary against multiple node versions like in the main workflow by pulling the appropriate images. This may have to wait until the javascript side of caxa is aware of the arm binaries.

@leafac leafac merged commit 7632ae5 into leafac:master May 21, 2021
Owner

leafac commented May 26, 2021

@maxb2:

Check this out:

[Image: Screen Shot 2021-05-26 at 20 13 54]

https://github.com/maxb2/caxa/runs/2641839545?check_suite_focus=true#step:3:6

Fun stuff, huh?
