Redesign cosmocc toolchain
The `cosmocc` compiler is now being distributed as a self-contained
toolchain that's path-agnostic and it no longer requires you clone the
Cosmo repo to use it. The bin/ folder has been deleted from the mono
repo. The `fatcosmocc` command has been renamed to `cosmocc`. MacOS
support now works very well.
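
As a rough sketch of what the path-agnostic layout described above could look like in practice, the script below puts an already unzipped toolchain on `PATH` and compiles a small program with the renamed `cosmocc` command. The `COSMOCC_DIR` variable, its `$HOME/cosmocc` default, and the `bin/` subdirectory are assumptions made for illustration rather than details stated in this commit message, and the `hello.c` body is a stand-in.

```sh
#!/bin/sh
# Hypothetical install location (assumption); the toolchain is described
# as path-agnostic, so any directory should work equally well.
COSMOCC_DIR="${COSMOCC_DIR:-$HOME/cosmocc}"

# Assumption: the unzipped release exposes its commands under bin/.
PATH="$COSMOCC_DIR/bin:$PATH"
export PATH

# Write a stand-in test program into the current directory.
cat >hello.c <<'EOF'
#include <stdio.h>
int main() {
  printf("hello world\n");
}
EOF

# Only attempt the build if the compiler is actually reachable.
if command -v cosmocc >/dev/null 2>&1; then
  cosmocc -o hello hello.c && ./hello
else
  echo "cosmocc not found under $COSMOCC_DIR/bin" >&2
fi
```

Guarding the build with `command -v` keeps the script harmless on machines where the toolchain has not been unpacked yet.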
jart committed Nov 11, 2023
1 parent 3802428 commit 291103a
Showing 71 changed files with 2,437 additions and 1,398 deletions.
21 changes: 16 additions & 5 deletions Makefile
@@ -85,7 +85,7 @@ endif
ifeq ($(TOOLCHAIN),) # if TOOLCHAIN isn't defined
ifeq ("$(wildcard o/third_party/gcc/bin/x86_64-linux-cosmo-*)","") # if our gcc isn't unbundled
ifneq ($(UNAME_M)-$(UNAME_S), x86_64-Linux) # if this is not amd64 linux
$(error you need to download https://justine.lol/cosmocc-0.0.18.zip and unzip it inside the cosmo directory)
$(error you need to download https://cosmo.zip/pub/cosmocc/cosmocc-0.0.18.zip and unzip it inside the cosmo directory)
endif
endif
endif
@@ -436,19 +436,30 @@ COSMOPOLITAN_HEADERS = \
THIRD_PARTY_MUSL \
THIRD_PARTY_REGEX

COSMOCC_HEADERS = \
THIRD_PARTY_AARCH64 \
THIRD_PARTY_LIBCXX \
THIRD_PARTY_INTEL

o/$(MODE)/cosmopolitan.a: \
$(foreach x,$(COSMOPOLITAN_OBJECTS),$($(x)_A_OBJS))

o/cosmopolitan.h: \
o/$(MODE)/tool/build/rollup.com \
o/cosmocc.h.txt: $(foreach x,$(COSMOCC_HEADERS),$($(x)_HDRS))
$(file >$@, $^)

o/cosmopolitan.h.txt: \
libc/integral/normalize.inc \
$(foreach x,$(COSMOPOLITAN_HEADERS),$($(x)_HDRS))
$(file >$@, $^)

o/cosmopolitan.h: o/cosmopolitan.h.txt \
libc/integral/normalize.inc \
$(foreach x,$(COSMOPOLITAN_HEADERS),$($(x)_HDRS)) \
$(foreach x,$(COSMOPOLITAN_HEADERS),$($(x)_INCS))
$(file >$(TMPDIR)/$(subst /,_,$@),libc/integral/normalize.inc $(foreach x,$(COSMOPOLITAN_HEADERS),$($(x)_HDRS)))
@$(ECHO) '#ifndef __STRICT_ANSI__' >$@
@$(ECHO) '#define _COSMO_SOURCE' >>$@
@$(ECHO) '#endif' >>$@
@$(COMPILE) -AROLLUP -T$@ o/$(MODE)/tool/build/rollup.com @$(TMPDIR)/$(subst /,_,$@) >>$@
@$(COMPILE) -AROLLUP -T$@ build/bootstrap/rollup.com @$< >>$@

o/cosmopolitan.html: private .UNSANDBOXED = 1
o/cosmopolitan.html: \
308 changes: 17 additions & 291 deletions README.md
@@ -19,22 +19,17 @@ libc](https://justine.lol/cosmopolitan/index.html) website. We also have

## Getting Started

It's recommended that Cosmopolitan be installed to `/opt/cosmo` and
`/opt/cosmos` on your computer. The first has the monorepo. The second
contains your non-monorepo artifacts.
You can start by obtaining a release of our `cosmocc` compiler from
<https://cosmo.zip/pub/cosmocc/>.

```sh
sudo mkdir -p /opt
sudo chmod 1777 /opt
git clone https://github.com/jart/cosmopolitan /opt/cosmo
export PATH="/opt/cosmo/bin:/opt/cosmos/bin:$PATH"
echo 'PATH="/opt/cosmo/bin:/opt/cosmos/bin:$PATH"' >>~/.profile
ape-install # optionally install a faster systemwide ape loader
cosmocc --update # pull cosmo and rebuild toolchain
mkdir -p cosmocc
cd cosmocc
wget https://cosmo.zip/pub/cosmocc/cosmocc.zip
unzip cosmocc.zip
```

You've now successfully installed your very own cosmos. Now let's build
an example program:
Here's an example program we can write:

```c
// hello.c
@@ -45,307 +40,38 @@ int main() {
}
```

To compile the program, you can run the `cosmocc` command. It's
important to give it an output path that ends with `.com` so the output
format will be Actually Portable Executable. When this happens, a
concomitant debug binary is created automatically too.
It can be compiled as follows:

```sh
cosmocc -o hello.com hello.c
./hello.com
./hello.com.dbg
```

You can use the `cosmocc` toolchain to build conventional open source
projects which use autotools. This strategy normally works:

```sh
export CC=cosmocc
export CXX=cosmoc++
./configure --prefix=/opt/cosmos
make -j
make install
cosmocc -o hello hello.c
./hello
```

The Cosmopolitan Libc runtime links some heavyweight troubleshooting
features by default, which are very useful for developers and admins.
Here's how you can log system calls:

```sh
./hello.com --strace
./hello --strace
```

Here's how you can get a much more verbose log of function calls:

```sh
./hello.com --ftrace
```

If you don't want rich runtime features like the above included, and you
just want libc with smaller, simpler programs, then you can consider
using `MODE=tiny`, which is preconfigured by the repo in
[build/config.mk](build/config.mk). Using this mode is much more
effective at reducing binary footprint than the `-Os` flag alone. You
can change your build mode by doing the following:

```sh
export MODE=tiny
cosmocc --update
```

We can also make our program slightly smaller by using the system call
interface directly, which is fine, since Cosmopolitan polyfills these
interfaces across platforms, including Windows. For example:

```c
// hello2.c
#include <unistd.h>
int main() {
write(1, "hello world\n", 12);
}
```

Once compiled, your APE binary should be ~36kb in size.

```sh
export MODE=tiny
cosmocc -Os -o hello2.com hello2.c
./hello2.com
```

But let's say you only care about your binaries running on Linux and you
don't want to use up all this additional space for platforms like WIN32.
In that case, you can try `MODE=tinylinux` for example which will create
executables more on the order of 8kb (similar to Musl Libc).

```sh
export MODE=tinylinux
cosmocc --update
cosmocc -Os -o hello2.com hello2.c
./hello2.com # <-- actually an ELF executable
```

## ARM

Cosmo supports cross-compiling binaries for machines with ARM
microprocessors. For example:

```sh
make -j8 m=aarch64 o/aarch64/third_party/ggml/llama.com
make -j8 m=aarch64-tiny o/aarch64-tiny/third_party/ggml/llama.com
```

That'll produce ELF executables that run natively on two operating
systems: Linux Arm64 (e.g. Raspberry Pi) and MacOS Arm64 (i.e. Apple
Silicon), thus giving you full performance. The catch is you have to
compile these executables on an x86_64-linux machine. The second catch
is that MacOS needs a little bit of help understanding the ELF format.
To solve that, we provide a tiny APE loader you can use on M1 machines.

```sh
scp ape/ape-m1.c macintosh:
scp o/aarch64/third_party/ggml/llama.com macintosh:
ssh macintosh
xcode-install
cc -o ape ape-m1.c
sudo cp ape /usr/local/bin/ape
```

You can run your ELF AARCH64 executable on Apple Silicon as follows:

```sh
ape ./llama.com
```

If you want to run the `MODE=aarch64` unit tests, you need to have
qemu-aarch64 installed as a binfmt_misc interpreter. It needs to be a
static binary if you want it to work with Landlock Make's security. You
can use the build included in our `third_party/qemu/` folder.

```
doas cp o/third_party/qemu/qemu-aarch64 /usr/bin/qemu-aarch64
doas sh -c "echo ':qemu-aarch64:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/bin/qemu-aarch64:CF' > /proc/sys/fs/binfmt_misc/register"
make -j8 m=aarch64
```

Please note that the qemu-aarch64 binfmt_misc interpreter installation
process is *essential* for being able to use the `aarch64-unknown-cosmo`
toolchain to build fat APE binaries on your x86-64 machine.

## AMD64 + ARM64 fat APE binaries

If you've set up the qemu binfmt_misc interpreter, then you can use
cosmo's toolchains to build fat ape binaries. It works by compiling your
program twice, so you can have a native build for both architectures in
the same file. The two programs are merged together by apelink.com which
also embeds multiple copies of APE loader and multiple symbols tables.

The easiest way to build fat APE is using `fatcosmocc`. This compiler
works by creating a concomitant `.aarch64/foo.o` for every `foo.o` you
compile. The only exception is the C preprocessor mode, which actually
runs x86-64 GCC except with macros like `__x86_64__` undefined.

This toolchain works great for C projects that are written in a portable
way and don't produce architecture-specific artifacts. One example of a
large project that can be easily built is GNU coreutils.

```sh
cd coreutils
fatcosmocc --update ||exit
./configure CC=fatcosmocc \
AR=fatcosmoar \
INSTALL=$(command -v fatcosmoinstall) \
--prefix=/opt/cosmos \
--disable-nls \
--disable-dependency-tracking \
--disable-silent-rules
make -j8
```

You'll then have a bunch of files like `src/ls` which are fat ape
binaries. If you want to run them on Windows, then you simply need to
rename the file so that it has the `.com` suffix. Better yet, consider
making that a symlink (a.k.a. reparse point). The biggest gotcha with
`fatcosmocc` though is ensuring builds don't strip binaries. For
example, Linux's `install -s` command actually understands Windows'
Portable Executable format well enough to remove the MS-DOS stub, which
is where the APE shell script is stored. You need to ensure that
`fatcosmoinstall` is used instead. Especially if your project needs to
install the libraries built by `fatcosmoar` into `/opt/cosmos`.

## Advanced Fat APE Builds

Once you get seriously involved in creating fat APE builds of software
you're going to eventually outgrow `fatcosmocc`. One example is Emacs
which is trickier to build, because it produces architecture-specific
files, and it also depends on shared files, e.g. zoneinfo. Since we like
having everything in a neat little single-file executable container that
doesn't need an "installation wizard", this tutorial will explain how we
manage to accomplish that.

What you're going to do is, instead of using `fatcosmocc`, you're going
to use both the `x86_64-unknown-cosmo-cc` and `aarch64-unknown-cosmo-cc`
toolchains independently, and then run `apelink` and `zip` to manually
build the final files. But there's a few tricks to learn first.

The first trick is to create a symlink on your system called `/zip`.
Cosmopolitan Libc normally uses that as a synthetic folder that lets you
access the assets in your zip executable. But since that's a read-only
file system, your build system should use the normal one.

```sh
doas ln -sf /opt/cosmos /zip
./hello --ftrace
```

Now create a file named `rebuild-fat.sh` which runs the build twice:
You can use Cosmopolitan's toolchain to build conventional open
source projects which use autotools. This strategy normally works:

```sh
#!/bin/sh
set -ex
export MODE=aarch64
export COSMOS=/opt/cosmos/aarch64
rebuild-cosmos.sh aarch64
export MODE=
export COSMOS=/opt/cosmos/x86_64
rebuild-cosmos.sh x86_64
wall.com 'finished building'
```

Then create a second file `rebuild-cosmos.sh` which runs your build:

```sh
#!/bin/bash
set -ex

ARCH=${1:-x86_64}
export COSMO=${COSMO:-/opt/cosmo}
export COSMOS=${COSMOS:-/opt/cosmos/$ARCH}
export AS=$(command -v $ARCH-unknown-cosmo-as) || exit
export CC=$(command -v $ARCH-unknown-cosmo-cc) || exit
export CXX=$(command -v $ARCH-unknown-cosmo-c++) || exit
export AR=$(command -v $ARCH-unknown-cosmo-ar) || exit
export STRIP=$(command -v $ARCH-unknown-cosmo-strip) || exit
export INSTALL=$(command -v $ARCH-unknown-cosmo-install) || exit
export OBJCOPY=$(command -v $ARCH-unknown-cosmo-objcopy) || exit
export OBJDUMP=$(command -v $ARCH-unknown-cosmo-objdump) || exit
export ADDR2LINE=$(command -v $ARCH-unknown-cosmo-addr2line) || exit

$CC --update

export COSMOPOLITAN_DISABLE_ZIPOS=1

cd ~/vendor/zlib
./configure --prefix=$COSMOS --static
make clean
make -j
make install

cd ~/vendor/ncurses-6.4
./configure --prefix=$COSMOS --sysconfdir=/zip --datarootdir=/zip/share --exec-prefix=/zip/$ARCH --disable-shared
make clean
make -j
make install

cd ~/vendor/readline-8.2
./configure --prefix=$COSMOS --sysconfdir=/zip --datarootdir=/zip/share --exec-prefix=/zip/$ARCH --disable-shared
make uninstall || true
make clean
make -j
make install

# NOTES:
# 1. You'll need to patch enum { FOO = x } that fails to build into a #define FOO
# 2. You'll need to patch configure.ac so it DOES NOT define USABLE_FIONREAD to 1
# 3. You'll need to patch configure.ac so it DOES NOT define INTERRUPT_INPUT to 1
cd ~/vendor/emacs-28.2
./configure --prefix=$COSMOS --sysconfdir=/zip --datarootdir=/zip/share --exec-prefix=/zip/$ARCH \
--without-x --with-threads --without-gnutls --disable-silent-rules --with-file-notification=no
make uninstall || true
make clean
export CC=x86_64-unknown-cosmo-cc
export CXX=x86_64-unknown-cosmo-c++
./configure --prefix=/opt/cosmos/x86_64
make -j
make install
```

Once you've completed this build process, you'll have the ELF files
`/opt/cosmos/x86_64/bin/emacs` and `/opt/cosmos/aarch64/bin/emacs`. Your
next move is to combine them into a single pristine `emacs.com` file.

```sh
cd /zip
COSMO=${COSMO:-/opt/cosmo}
mkdir -p /opt/cosmos/bin
apelink \
-o /opt/cosmos/bin/emacs.com \
-l "$COSMO/o//ape/ape.elf" \
-l "$COSMO/o/aarch64/ape/ape.elf" \
-M "$COSMO/ape/ape-m1.c" \
/opt/cosmos/x86_64/bin/emacs \
/opt/cosmos/aarch64/bin/emacs
cd /zip
zip -r /opt/cosmos/bin/emacs.com \
aarch64/libexec \
x86_64/libexec \
share/terminfo \
$(find share/emacs -type f |
grep -v '\.el.gz$' |
grep -v refcards |
grep -v images)
```

You can now scp your `emacs.com` build to seven operating systems for
two distinct kinds of microprocessors without any dependencies. All the
LISP, zoneinfo, and termcap files it needs are stored inside the ZIP
structure of the binary, which has performance that's equivalent to the
Linux filesystem (even though it decompresses artifacts on the fly!) For
this reason, you might actually find that fat APE Emacs goes faster if
you're using an operating system like Windows where files are slow.

If you like to use Vim instead of Emacs, then you can build that too.
However Vim's build system makes it a bit harder, since it's configured
to always strip binaries. The `apelink` program needs the symbol tables
to still be there when it creates the fat version. Otherwise tools like
`--ftrace` won't work.

## Monolithic Source Builds

Cosmopolitan can be compiled from source on any Linux distro. First, you