Building Linux packages for different CPU architectures with Docker and QEMU
Many Linux open source projects provide only source code releases. To use them, users have to download the source code and build it themselves, usually by executing steps like ./configure, make and make install.
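For a typical autotools based project the sequence looks roughly like this (the tarball name and the configure option are just placeholders):
$ tar xzf my-software-1.0.tar.gz
$ cd my-software-1.0
$ ./configure --prefix=/usr/local
$ make
$ sudo make install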
Some users prefer this way because it gives them the chance to configure the software by passing specific arguments to the ./configure script. It is also the preferred way from a security point of view: the person responsible for managing the system can be certain that this is the original version of the source code and that no one added anything on top.
Still, many users prefer to download a binary package and install it, or to use the package management software of their favorite Linux distribution, e.g. yum for RedHat/CentOS/Fedora or apt for Debian/Ubuntu flavors. Here the benefit is that the dependencies are installed automatically for you.
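For example, installing a package together with all its dependencies is a single command (the package name is only illustrative):
$ sudo yum install varnish   # RedHat/CentOS/Fedora
$ sudo apt install varnish   # Debian/Ubuntu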
Some open source projects provide binary packages for download themselves. Others delegate the packaging task to their community or to the Linux distributions, which package the software following the best practices for the specific package type. They do this for different reasons, most often because:
1) it is an extra burden: the software developers do not want to deal with "bureaucracy" outside their domain of expertise
2) there are many Linux distributions, each with its specific packaging format, e.g. .deb, .rpm, .apk, etc. One needs to read a lot of documentation to understand each of them
3) one may need specific hardware to build a package for less common CPU architectures. Developers usually work on Intel or AMD based computers, known as the x86_64 CPU architecture. But if your software needs to run on mobile phones, tablets or Internet of Things (IoT) devices then you need to produce a binary for the ARM architecture. ARMv7 and earlier are 32-bit; ARMv8 introduced the 64-bit variant, known as aarch64 or ARM64. Lately more and more cloud providers recommend ARM64 CPUs because they offer performance similar to x86_64 while consuming less electricity, so they are cheaper to rent and friendlier to the environment. There are rumors that Apple will release ARM based laptops soon.
In the rest of this article I’m going to show you how to build and package your software for ARM on an x86_64 computer by using Docker and QEMU.
What is Docker?
From Wikipedia: Docker is a set of platform as a service (PaaS) products that uses OS-level virtualization to deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries and configuration files; they can communicate with each other through well-defined channels. All containers are run by a single operating system kernel and therefore use fewer resources than virtual machines.
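In practice this means you can start a throw-away Linux userland with a single command. On a regular developer machine the container sees the host's CPU architecture, e.g. on an x86_64 host:
$ docker run --rm centos:8 uname -m
x86_64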
What is QEMU?
From Wikipedia: QEMU (short for Quick EMUlator) is a free and open-source emulator that performs hardware virtualization. QEMU is a hosted virtual machine monitor: it emulates the machine’s processor through dynamic binary translation and provides a set of different hardware and device models for the machine, enabling it to run a variety of guest operating systems. It also can be used with KVM to run virtual machines at near-native speed (by taking advantage of hardware extensions such as Intel VT-x). QEMU can also do emulation for user-level processes, allowing applications compiled for one architecture to run on another.
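The user-level emulation is the part we rely on below. The qemu-user-static package ships statically linked qemu-<arch>-static binaries that can run a single foreign executable on your host. For example, assuming qemu-user-static is installed and hello is a binary cross-compiled for aarch64, you could run it with:
$ qemu-aarch64-static ./hello
The binfmt mechanism used in the next section simply makes the kernel invoke such an interpreter automatically whenever it sees a foreign binary.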
Most of the cloud based Continuous Integration (CI) providers (e.g. TravisCI, CircleCI, DroneCI, GitHub Actions, and more) use Docker to provide you with a throw-away Docker container (a Linux instance) which you can modify the way you need, for example by installing required dependencies of your software or even by changing kernel settings, and then use to build/test/package your software. Once your CI job finishes, the Docker container is discarded and the resources are freed for the next CI job. The jobs are fully isolated from each other, which makes your builds reproducible: they always start from the same state and nothing is left over from a previous job.
Building
The process of building your software consists of two main steps:
1) Register QEMU/binfmt
If you try to run a Docker container that is built for a different CPU architecture than the host’s, it will fail with this error:
$ docker run -it --rm arm64v8/centos:8 uname -m
standard_init_linux.go:211: exec user process caused "exec format error"
To be able to run such foreign architectures one may use QEMU! The multiarch/qemu-user-static project makes the setup as simple as executing:
$ docker run -it --rm --privileged multiarch/qemu-user-static --credential yes --persistent yes
What this does is:
1.1) It runs a Docker container that modifies the host. If it is executed inside a Docker container then it will modify the outer container.
1.2) The --privileged argument gives the Docker container permission to modify the host. In case it is run on a CI server, the host is the outer Docker container, the one allocated for our CI job.
1.3) The --credential yes argument is needed only if you need to use sudo later in step 2).
1.4) The --persistent yes argument tells it to load the interpreter when binfmt is configured and keep it in memory; all future uses clone the interpreter from memory.
If the execution of the command above is successful you should see output similar to the following:
Setting /usr/bin/qemu-alpha-static as binfmt interpreter for alpha
Setting /usr/bin/qemu-arm-static as binfmt interpreter for arm
Setting /usr/bin/qemu-armeb-static as binfmt interpreter for armeb
Setting /usr/bin/qemu-sparc-static as binfmt interpreter for sparc
Setting /usr/bin/qemu-sparc32plus-static as binfmt interpreter for sparc32plus
Setting /usr/bin/qemu-sparc64-static as binfmt interpreter for sparc64
Setting /usr/bin/qemu-ppc-static as binfmt interpreter for ppc
Setting /usr/bin/qemu-ppc64-static as binfmt interpreter for ppc64
Setting /usr/bin/qemu-ppc64le-static as binfmt interpreter for ppc64le
Setting /usr/bin/qemu-m68k-static as binfmt interpreter for m68k
Setting /usr/bin/qemu-mips-static as binfmt interpreter for mips
Setting /usr/bin/qemu-mipsel-static as binfmt interpreter for mipsel
Setting /usr/bin/qemu-mipsn32-static as binfmt interpreter for mipsn32
Setting /usr/bin/qemu-mipsn32el-static as binfmt interpreter for mipsn32el
Setting /usr/bin/qemu-mips64-static as binfmt interpreter for mips64
Setting /usr/bin/qemu-mips64el-static as binfmt interpreter for mips64el
Setting /usr/bin/qemu-sh4-static as binfmt interpreter for sh4
Setting /usr/bin/qemu-sh4eb-static as binfmt interpreter for sh4eb
Setting /usr/bin/qemu-s390x-static as binfmt interpreter for s390x
Setting /usr/bin/qemu-aarch64-static as binfmt interpreter for aarch64
Setting /usr/bin/qemu-aarch64_be-static as binfmt interpreter for aarch64_be
Setting /usr/bin/qemu-hppa-static as binfmt interpreter for hppa
Setting /usr/bin/qemu-riscv32-static as binfmt interpreter for riscv32
Setting /usr/bin/qemu-riscv64-static as binfmt interpreter for riscv64
Setting /usr/bin/qemu-xtensa-static as binfmt interpreter for xtensa
Setting /usr/bin/qemu-xtensaeb-static as binfmt interpreter for xtensaeb
Setting /usr/bin/qemu-microblaze-static as binfmt interpreter for microblaze
Setting /usr/bin/qemu-microblazeel-static as binfmt interpreter for microblazeel
Setting /usr/bin/qemu-or1k-static as binfmt interpreter for or1k
Now if we try to run the foreign Docker image it will succeed:
$ docker run -it --rm arm64v8/centos:8 uname -m
aarch64
This tells us that uname -m executed inside the arm64v8/centos:8 container reports the CPU architecture as aarch64!
If you want to understand how QEMU/binfmt works you can read its documentation, but it is not required for the purpose of this article.
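For the curious, the result of the registration is visible under /proc/sys/fs/binfmt_misc, one entry per interpreter. The output below is approximate; the flags depend on the options passed to the register script (O and C come from --credential, F from --persistent):
$ ls /proc/sys/fs/binfmt_misc/
qemu-aarch64  qemu-alpha  qemu-arm  ...  register  status
$ cat /proc/sys/fs/binfmt_misc/qemu-aarch64
enabled
interpreter /usr/bin/qemu-aarch64-static
flags: OCF
offset 0
magic 7f454c460201010000000000000000000200b700
mask ffffffffffffff00fffffffffffffffffeffffff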
2) Build your software
All we need to do now is to run the usual build steps (e.g. ./configure, make, make test, etc.) inside the foreign Docker container.
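For a quick one-off experiment you could even run those steps directly in a stock foreign-architecture container. A rough sketch (the dependencies and the mount path are only examples):
$ docker run --rm -it -v $(pwd):/my-software -w /my-software arm64v8/centos:8 \
    bash -c "yum install -y gcc make && ./configure && make && make test"
For anything beyond an experiment, a custom image with the dependencies baked in is cleaner: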
2.1) Create a Dockerfile that uses an image with the foreign architecture as its base, like arm64v8/centos:8 above.
2.2) Run the build steps
One can use Docker’s RUN commands, e.g.
RUN make
RUN make test
but this may get unwieldy if you need to execute many steps!
I prefer to put all these commands in a shell script, copy it into the custom Docker image and finally execute it.
The script may look like this:
#!/usr/bin/env bash
set -e
# the host folder with the source code is mounted here, see the docker run command below
cd /my-software
# the base image here is CentOS based; use apt instead on Debian/Ubuntu based images
yum install -y dependency1 dependency2 dependencyN
…
./configure
make
make test
…
The Dockerfile will look something like:
FROM arm64v8/centos:8
ADD build-test-and-package.sh /usr/bin
CMD ["build-test-and-package.sh"]
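If the script is not marked as executable in your repository you may also need to set the executable bit inside the image, e.g. by adding:
RUN chmod +x /usr/bin/build-test-and-package.sh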
2.3) Build the custom image
$ docker build -t my-arm-centos:8 .
2.4) Run it
$ docker run --rm -it -v $(pwd):/my-software my-arm-centos:8
Here we mount the current folder as /my-software inside the Docker container. build-test-and-package.sh needs to know this location so it can cd into it!
Any result files, like the binary packages, could be saved in this folder or in another mounted folder so that they can be consumed at the end of the CI workflow, e.g. stored as build artifacts and copied to AWS S3 or elsewhere.
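A hypothetical variant that keeps the produced packages separate from the source tree could look like this (the /packages path is only an example and build-test-and-package.sh would have to copy its output there):
$ mkdir -p packages
$ docker run --rm -it -v $(pwd):/my-software -v $(pwd)/packages:/packages my-arm-centos:8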
In action
You can see all this in action in the Varnish Cache GitHub repository.
It is a Pull Request suggesting to build, test and package Varnish Cache for CentOS 7 & 8, Ubuntu 16.04 & 18.04, Debian 8, 9 & 10, and Alpine 3, for both x86_64 and aarch64.
Extending it to more distros, versions or CPU architectures is as easy as adding an additional CircleCI job with the proper parameters.
The build graph looks like this:
The dist and tar_pkg_tools jobs run first, in parallel. The dist job packages the source distribution and tar_pkg_tools fetches the packaging recipes for RPM, DEB and APK from here. Both are stored in CircleCI’s workflow workspace and made available to the following jobs.
Once both of them finish, the distcheck and package jobs run in parallel. The distcheck jobs build Varnish Cache on the different distros and the package jobs build the respective binary packages for each arch/distro/version triple.
If everything is successful, the final collect_packages job exports all binary packages as CircleCI artifacts, which are later copied to Package Cloud.
Conclusion
Using mature tools like Docker and QEMU makes it easier to build, test and package our software for different CPU architectures.
There are a few drawbacks though:
1) it is an emulation of the foreign CPU architecture, so it is slower than building on real hardware
Note: some cloud CI services like TravisCI and DroneCI provide native support for ARM/ARM64. I have experience only with TravisCI and it is not faster than QEMU. I’ve had some small issues with it but they were easy to work around. Hopefully it will become even better in the near future!
2) you need to find a good base Docker image for the CPU architecture you need to support. There are many images on DockerHub, but depending on how exotic your needs are it may be harder to find one.
Happy hacking and stay safe in these strange times!