Latest posts for tag systemd
.gitlab-ci.yml supports 'image' to allow selecting in which environment the script gets run. The documentation says "Used to specify a Docker image to use for the job", but it's clearly a bug in the documentation, because we can do it with nspawn-runner, too.
It turns out that most of the environment variables available to CI runs are also available to custom runner scripts. In this case, the value passed as image can be found as $CUSTOM_ENV_CI_JOB_IMAGE in the custom runner scripts' environment.
After some experimentation I made this commit that makes every chroot under /var/lib/nspawn-runner available as an image:
# Set up 3 new images for CI jobs:
nspawn-runner chroot-create buster
nspawn-runner chroot-create bullseye
nspawn-runner chroot-create sid
That's it: CI scripts can now use image: buster, image: bullseye or image: sid, as they please. You can manually set up other chroots under /var/lib/nspawn-runner and they'll be automatically available.
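A minimal sketch of how a custom runner script might map the image name to a chroot directory. The `resolve_chroot` helper, the `NSPAWN_ROOT` variable and the name check are hypothetical, not nspawn-runner's actual code; only `CUSTOM_ENV_CI_JOB_IMAGE` and `/var/lib/nspawn-runner` come from the post.

```shell
# Hypothetical helper: map an image name to a chroot directory.
# Only the base directory and the variable name come from the post.
NSPAWN_ROOT="${NSPAWN_ROOT:-/var/lib/nspawn-runner}"

resolve_chroot() {
    image="$1"
    case "$image" in
        ""|.*|*/*)
            # Reject empty names, hidden directories and path traversal
            echo "invalid image name: '$image'" >&2
            return 1
            ;;
    esac
    printf '%s/%s\n' "$NSPAWN_ROOT" "$image"
}

# In a custom runner script, the image would come from the job environment:
# chroot=$(resolve_chroot "$CUSTOM_ENV_CI_JOB_IMAGE") || exit "$SYSTEM_FAILURE_EXIT_CODE"
```

Rejecting names that start with a dot or contain a slash keeps a job from selecting something outside the image directory; whether the real tool performs a similar check is an assumption.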
You can also now choose a default image in config.toml in case the CI script doesn't specify one:
prepare_args = ["--verbose", "prepare", "--default-image=buster"]
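The fallback could be sketched in shell as follows. The `select_image` function is a hypothetical illustration of the idea, not the code behind `--default-image`.

```shell
# Hypothetical sketch of the fallback logic behind a default image.
select_image() {
    default_image="$1"
    # ${VAR:-fallback} expands to the fallback when VAR is unset or empty
    printf '%s\n' "${CUSTOM_ENV_CI_JOB_IMAGE:-$default_image}"
}

# Example: select_image buster
```

A job that sets image: sid would get sid, while a job without an image key would get the default passed as the first argument.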
This post is part of a series about trying to set up a gitlab runner based on systemd-nspawn. I published the polished result as nspawn-runner on GitHub.
gitlab-runner supports adding extra arguments to the custom scripts, and I can take advantage of that to pack all the various scripts that I prototyped so far into an all-in-one nspawn-runner command:
usage: nspawn-runner [-h] [-v] [--debug]
                     {chroot-create,chroot-login,prepare,run,cleanup,gitlab-config,toml}
                     ...
Manage systemd-nspawn machines for CI runs.
positional arguments:
  {chroot-create,chroot-login,prepare,run,cleanup,gitlab-config,toml}
                        sub-command help
    chroot-create       create a chroot that serves as a base for ephemeral machines
    chroot-login        enter the chroot to perform maintenance
    prepare             start an ephemeral system for a CI run
    run                 run a command inside a CI machine
    cleanup             cleanup a CI machine after it's run
    gitlab-config       configuration step for gitlab-runner
    toml                output the toml configuration for the custom runner
optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         verbose output
  --debug               verbose output
chroot maintenance
chroot-create and chroot-login are similar to what pbuilder, cowbuilder, schroot, debspawn and similar tools do.
They only take a chroot name, and default the rest of the paths to where nspawn-runner expects things to be under /var/lib/nspawn-runner.
gitlab-runner setup
nspawn-runner toml <chroot-name> outputs a snippet to add to /etc/gitlab-runner/config.toml to configure the CI.
For example:
$ ./nspawn-runner toml buster
[[runners]]
  name="buster"
  url="TODO"
  token="TODO"
  executor = "custom"
  builds_dir = "/var/lib/nspawn-runner/.build"
  cache_dir = "/var/lib/nspawn-runner/.cache"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.custom]
    config_exec = "/home/enrico/…/nspawn-runner/nspawn-runner"
    config_args = ["gitlab-config"]
    config_exec_timeout = 200
    prepare_exec = "/home/enrico/…/nspawn-runner/nspawn-runner"
    prepare_args = ["prepare", "buster"]
    prepare_exec_timeout = 200
    run_exec = "/home/enrico/dev/nspawn-runner/nspawn-runner"
    run_args = ["run"]
    cleanup_exec = "/home/enrico/…/nspawn-runner/nspawn-runner"
    cleanup_args = ["cleanup"]
    cleanup_exec_timeout = 200
    graceful_kill_timeout = 200
    force_kill_timeout = 200
One needs to remember to set url and token, and the runner is configured.
The end, for now
This is it, it works! Time will tell what issues or ideas will come up: for now, it's a pretty decent first version.
The various prepare, run, cleanup steps are generic enough that they can be used outside of gitlab-runner: feel free to build on them, and drop me a note if you find this useful!
Updated: Issues noticed so far, that could go into a new version:
- updating the master chroot would disturb the running CI jobs that use it. Using nspawn's btrfs-specific features would prevent this problem, and possibly simplify the implementation even more.
- New step! trivially implementing support for multiple OS images
This post is part of a series about trying to set up a gitlab runner based on systemd-nspawn. I published the polished result as nspawn-runner on GitHub.
The plan
Back to custom runners, here's my plan:
- config can be a noop
- prepare starts the nspawn machine
- run runs scripts with machinectl shell
- cleanup runs machinectl stop
The scripts
Here are the scripts based on Federico's work:
base.sh with definitions sourced by all scripts:
MACHINE="run-$CUSTOM_ENV_CI_JOB_ID"
ROOTFS="/var/lib/gitlab-runner-custom-chroots/buster"
OVERLAY="/var/lib/gitlab-runner-custom-chroots/$MACHINE"
config.sh doing nothing:
#!/bin/sh
exit 0
prepare.sh starting the machine:
#!/bin/bash
source $(dirname "$0")/base.sh
set -eo pipefail
# trap errors as a CI system failure
trap "exit $SYSTEM_FAILURE_EXIT_CODE" ERR
logger "gitlab CI: preparing $MACHINE"
mkdir -p $OVERLAY
systemd-run \
  -p 'KillMode=mixed' \
  -p 'Type=notify' \
  -p 'RestartForceExitStatus=133' \
  -p 'SuccessExitStatus=133' \
  -p 'Slice=machine.slice' \
  -p 'Delegate=yes' \
  -p 'TasksMax=16384' \
  -p 'WatchdogSec=3min' \
  systemd-nspawn --quiet -D $ROOTFS \
    --overlay="$ROOTFS:$OVERLAY:/" --machine="$MACHINE" --boot --notify-ready=yes
run.sh running the provided scripts in the machine:
#!/bin/bash
logger "gitlab CI: running $@"
source $(dirname "$0")/base.sh
set -eo pipefail
trap "exit $SYSTEM_FAILURE_EXIT_CODE" ERR
systemd-run --quiet --pipe --wait --machine="$MACHINE" /bin/bash < "$1"
cleanup.sh stopping the machine and removing the writable overlay directory:
#!/bin/bash
logger "gitlab CI: cleanup $@"
source $(dirname "$0")/base.sh
machinectl stop "$MACHINE"
rm -rf $OVERLAY
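A possible refinement, sketched under assumptions: with the ERR trap in run.sh, a failing job script appears to be reported with SYSTEM_FAILURE_EXIT_CODE, while gitlab-runner also provides BUILD_FAILURE_EXIT_CODE for failures of the job itself. The `run_step` wrapper below is hypothetical and only illustrates the translation of exit statuses.

```shell
# BUILD_FAILURE_EXIT_CODE and SYSTEM_FAILURE_EXIT_CODE are set by
# gitlab-runner; the defaults here only allow running this sketch standalone.
: "${BUILD_FAILURE_EXIT_CODE:=1}"
: "${SYSTEM_FAILURE_EXIT_CODE:=2}"

# Hypothetical wrapper: run a command and report any non-zero exit status
# as a build failure instead of a system failure.
run_step() {
    if "$@"; then
        return 0
    fi
    return "$BUILD_FAILURE_EXIT_CODE"
}

# In run.sh this could wrap the systemd-run invocation, for example:
# run_step systemd-run --quiet --pipe --wait --machine="$MACHINE" /bin/bash < "$1"
# exit $?
```

Because the command runs as the condition of an `if`, neither `set -e` nor an ERR trap fires for it, so the wrapper decides which exit code reaches gitlab-runner.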
Trying out the plan
I tried a manual invocation of gitlab-runner, and it worked perfectly:
# mkdir /var/lib/gitlab-runner-custom-chroots/build/
# mkdir /var/lib/gitlab-runner-custom-chroots/cache/
# gitlab-runner exec custom \
    --builds-dir /var/lib/gitlab-runner-custom-chroots/build/ \
    --cache-dir /var/lib/gitlab-runner-custom-chroots/cache/ \
    --custom-config-exec /var/lib/gitlab-runner-custom-chroots/config.sh \
    --custom-prepare-exec /var/lib/gitlab-runner-custom-chroots/prepare.sh \
    --custom-run-exec /var/lib/gitlab-runner-custom-chroots/run.sh \
    --custom-cleanup-exec /var/lib/gitlab-runner-custom-chroots/cleanup.sh \
    tests
Runtime platform arch=amd64 os=linux pid=18662 revision=775dd39d version=13.8.0
Running with gitlab-runner 13.8.0 (775dd39d)
Preparing the "custom" executor
Using Custom executor...
Running as unit: run-r1be98e274224456184cbdefc0690bc71.service
executor not supported job=1 project=0 referee=metrics
Preparing environment
Getting source from Git repository
Executing "step_script" stage of the job script
WARNING: Starting with version 14.0 the 'build_script' stage will be replaced with 'step_script': https://gitlab.com/gitlab-org/gitlab-runner/-/issues/26426
Job succeeded
Deploy
The remaining step is to deploy all this in /etc/gitlab-runner/config.toml:
concurrent = 1
check_interval = 0
[session_server]
  session_timeout = 1800
[[runners]]
  name = "nspawn runner"
  url = "http://gitlab.siweb.local/"
  token = "…"
  executor = "custom"
  builds_dir = "/var/lib/gitlab-runner-custom-chroots/build/"
  cache_dir = "/var/lib/gitlab-runner-custom-chroots/cache/"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.custom]
    config_exec = "/var/lib/gitlab-runner-custom-chroots/config.sh"
    config_exec_timeout = 200
    prepare_exec = "/var/lib/gitlab-runner-custom-chroots/prepare.sh"
    prepare_exec_timeout = 200
    run_exec = "/var/lib/gitlab-runner-custom-chroots/run.sh"
    cleanup_exec = "/var/lib/gitlab-runner-custom-chroots/cleanup.sh"
    cleanup_exec_timeout = 200
    graceful_kill_timeout = 200
    force_kill_timeout = 200
Next steps
My next step will be polishing all this in a way that makes deploying and maintaining a runner configuration easy.
This post is part of a series about trying to set up a gitlab runner based on systemd-nspawn. I published the polished result as nspawn-runner on GitHub.
Here I try to figure out possible ways of invoking nspawn for the prepare, run, and cleanup steps of gitlab custom runners. The results might be useful invocations beyond Gitlab's scope of application.
I begin with a chroot which will be the base for our build environments:
debootstrap --variant=minbase --include=git,build-essential buster workdir
Fully ephemeral nspawn
This would be fantastic: set up a reusable chroot, mount readonly, run the CI in a working directory mounted on tmpfs. It sets up quickly, it cleans up after itself, and it would make prepare and cleanup noops:
mkdir workdir/var/lib/gitlab-runner
systemd-nspawn --read-only --directory workdir --tmpfs /var/lib/gitlab-runner "$@"
However, run gets run multiple times, so I need the side effects of run to persist inside the chroot between runs.
Also, if the CI uses a large amount of disk space, tmpfs may get into trouble.
nspawn with overlay
Federico used --overlay to keep the base chroot readonly while allowing persistent writes on a temporary directory on the filesystem.
Note that using --overlay requires systemd and systemd-container from buster-backports because of systemd bug #3847.
Example:
mkdir -p tmp-overlay
systemd-nspawn --quiet -D workdir \
  --overlay="`pwd`/workdir:`pwd`/tmp-overlay:/"
I can run this twice, and changes in the file system will persist between systemd-nspawn executions. Great! However, any process will be killed at the end of each execution.
machinectl
I can give a name to systemd-nspawn invocations using --machine, and it allows me to run multiple commands during the machine lifespan using machinectl and systemd-run.
In theory machinectl can also fully manage chroots and disk images in /var/lib/machines, but I haven't found a way with machinectl to start multiple machines sharing the same underlying chroot.
It's ok, though: I managed to do that with systemd-nspawn invocations.
I can use the --machine=name argument to systemd-nspawn to make it visible to machinectl. I can use the --boot argument to systemd-nspawn to start enough infrastructure inside the container to allow machinectl to interact with it.
This gives me any number of persistent and named running systems, that share the same underlying chroot, and can cleanup after themselves. I can run commands in any of those systems as I like, and their side effects persist until a system is stopped.
The chroot needs systemd and dbus for machinectl to be able to interact with it:
debootstrap --variant=minbase --include=git,systemd,dbus,build-essential buster workdir
Let's boot the machine:
mkdir -p overlay
systemd-nspawn --quiet -D workdir \
  --overlay="`pwd`/workdir:`pwd`/overlay:/" --machine=test --boot
Let's try machinectl:
# machinectl list
MACHINE CLASS     SERVICE        OS     VERSION ADDRESSES
test    container systemd-nspawn debian 10      -
1 machines listed.
# machinectl shell --quiet test /bin/ls -la /
total 60
[…]
To run commands, rather than machinectl shell, I need to use systemd-run --wait --pipe --machine=name, otherwise machined won't forward the exit code. The result however is pretty good, with working stdin/stdout/stderr redirection and forwarded exit code.
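The invocation can be wrapped in a small helper, sketched here. The `machine_run` function and the `DRY_RUN` switch are hypothetical conveniences; only the systemd-run options come from the post.

```shell
# Hypothetical wrapper around the systemd-run invocation described above.
# Set DRY_RUN=1 to print the command instead of executing it.
machine_run() {
    machine="$1"
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        echo systemd-run --quiet --wait --pipe --machine="$machine" "$@"
        return 0
    fi
    # --wait makes systemd-run return the exit status of the command,
    # --pipe connects stdin/stdout/stderr to the caller
    systemd-run --quiet --wait --pipe --machine="$machine" "$@"
}

# Example, assuming a running machine called "test":
# machine_run test /bin/sh -c 'exit 3'; echo "exit code: $?"
```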
Good, I'm getting somewhere.
The terminal where I ran systemd-nspawn is currently showing a nice getty for the booted system, which is cute, but not what I want for the setup process of a CI.
Spawning machines without needing a terminal
machinectl uses /lib/systemd/system/systemd-nspawn@.service to start machines. I suppose there's limited magic in there: start systemd-nspawn as a service, use --machine to give it a name, and machinectl manages it as if it started it itself.
What if, instead of installing a unit file for each CI run, I try to do the same thing with systemd-run?
systemd-run \
  -p 'KillMode=mixed' \
  -p 'Type=notify' \
  -p 'RestartForceExitStatus=133' \
  -p 'SuccessExitStatus=133' \
  -p 'Slice=machine.slice' \
  -p 'Delegate=yes' \
  -p 'TasksMax=16384' \
  -p 'WatchdogSec=3min' \
  systemd-nspawn --quiet -D `pwd`/workdir \
    --overlay="`pwd`/workdir:`pwd`/overlay:/" --machine=test --boot
It works! I can interact with it using machinectl, and fine tune DevicePolicy as needed to lock CI machines down.
This setup has a race condition where if I try to run a command inside the machine in the short time window before the machine has finished booting, it fails:
# systemd-run […] systemd-nspawn […] ; machinectl --quiet shell test /bin/ls -la /
Failed to get shell PTY: Protocol error
# machinectl shell test /bin/ls -la /
Connected to machine test. Press ^] three times within 1s to exit session.
total 60
[…]
systemd-nspawn has the option --notify-ready=yes that solves exactly this problem:
# systemd-run […] systemd-nspawn […] --notify-ready=yes ; machinectl --quiet shell test /bin/ls -la /
Running as unit: run-r5a405754f3b740158b3d9dd5e14ff611.service
total 60
[…]
On nspawn's side, I should now have all I need.
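Where --notify-ready=yes is not available, a generic polling loop is a cruder alternative. This is a different technique from the one used in the post, and the `retry` helper is hypothetical.

```shell
# Hypothetical helper: retry a command until it succeeds or the attempts
# run out. Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Example: try up to 10 times, one second apart, to run a trivial command
# in the machine before handing it over to the CI job:
# retry 10 1 systemd-run --quiet --wait --pipe --machine=test /bin/true
```

Polling only approximates readiness, which is why the notification-based option remains the cleaner solution when it is supported.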
Next steps
My next step will be wrapping it all together in a gitlab runner.
This is the first post in a series about trying to set up a gitlab runner based on systemd-nspawn. I published the polished result as nspawn-runner on GitHub.
The goal
I need to set up gitlab runners, and I try to not involve docker in my professional infrastructure if I can avoid it.
Let's try systemd-nspawn. It's widely available and reasonably reliable.
I'm not the first to have this idea: Federico Ceratto made a setup based on custom runners and Josef Kufner one based on ssh runners.
I'd like to skip the complication of ssh, and to expand Federico's version to persist not just filesystem changes but also any other side effect of CI commands. For example, one CI command may bring up a server and the next CI command may want to test interfacing with it.
Understanding gitlab-runner
First step: figuring out gitlab-runner.
Test runs of gitlab-runner
I found that I can run gitlab-runner manually without needing to go through a push to Gitlab. It needs a local git repository with a .gitlab-ci.yml file:
mkdir test
cd test
git init
cat > .gitlab-ci.yml << EOF
tests:
  script:
    - env | sort
    - pwd
    - ls -la
EOF
git add .gitlab-ci.yml
git commit -am "Created a test repo for gitlab-runner"
Then I can go in the repo and test gitlab-runner:
gitlab-runner exec shell tests
It doesn't seem to use /etc/gitlab-runner/config.toml and it needs all the arguments passed to its command line: I used the shell runner for a simple initial test.
Later I'll try to brew a gitlab-runner exec custom invocation that uses nspawn.
Basics of custom runners
A custom runner runs a few scripts to manage the run:
- config, to allow overriding the run configuration by outputting JSON data
- prepare, to prepare the environment
- run, to run scripts in the environment (might be run multiple times)
- cleanup, to clean up the environment
run gets at least one argument which is a path to the script to run. The other scripts get no arguments by default.
The runner configuration controls the paths of the scripts to run, and optionally extra arguments to pass to them.
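As an illustration of the config step, here is a minimal sketch of a script that prints JSON on standard output. The `driver` key follows the gitlab-runner custom executor documentation as far as I can tell; the name and version values are hypothetical placeholders.

```shell
# Sketch of a config script: the JSON printed on stdout is read by
# gitlab-runner. The driver name and version are placeholders.
print_config() {
    cat << EOF
{
  "driver": {
    "name": "nspawn sketch",
    "version": "0.0.1"
  }
}
EOF
}

print_config
```

A config script that prints nothing and exits with 0 is also valid, which is what makes a noop config step possible.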
Next steps
My next step will be to figure out possible ways of invoking nspawn for the prepare, run, and cleanup scripts.
This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.
We've started implementing reloading of the media player when media on disk changes.
One challenge when doing that, is that libreoffice doesn't always stop. Try this and you will see that the presentation keeps going:
$ loimpress --nodefault --norestore --nologo --nolockcheck --show example.odp
$ pkill -TERM loimpress
It turns out that loimpress forks various processes. After killing it, these processes will still be running:
/usr/lib/libreoffice/program/oosplash --impress --nodefault --norestore --nologo --nolockcheck --show talk.odp
/usr/lib/libreoffice/program/soffice.bin --impress --nodefault --norestore --nologo --nolockcheck --show talk.odp
Is there a way to run the media players in such a way that, if needed, they can easily be killed, together with any other process they might have spawned meanwhile?
systemd-run
Yes there is: systemd provides a systemd-run command to run simple commands under systemd's supervision:
$ systemd-run --scope --slice=player --user \
    loimpress --nodefault --norestore --nologo --nolockcheck --show media/talk.odp
This will run the player contained in a cgroup with a custom name, and we can simply use that name to stop all the things:
$ systemctl --user stop player.slice
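The two commands can be paired in a small sketch. The `run_cmd`, `start_player` and `stop_players` helpers and the `DRY_RUN` switch are hypothetical; the systemd-run and systemctl options are the ones shown above.

```shell
# Hypothetical helpers; set DRY_RUN=1 to print commands instead of running them.
run_cmd() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}

start_player() {
    # Run the given player command in a transient scope inside player.slice
    run_cmd systemd-run --scope --slice=player --user "$@"
}

stop_players() {
    # Stopping the slice terminates every process started inside it,
    # including children forked by the player
    run_cmd systemctl --user stop player.slice
}

# Example:
# start_player loimpress --nodefault --norestore --nologo --nolockcheck --show media/talk.odp &
# stop_players
```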
Resulting python code
The result is this patch which simplifies the code, and isolates and easily kills all subprocesses run as players.
This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.
Another nice to have in a system like Himblick is the root filesystem mounted readonly, with a volatile tmpfs overlay on top. This would more or less guarantee a clean boot without leftovers from a previous run, especially in a system where the most likely mode of shutdown is going to be pulling the plug.
This won't be a guarantee about SD issues developing over time in such a scenario, but it should at least cover the software side of things.
This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.
Time to set up ssh. We want to have admin access to the pi user, and we'd like to give broader access to a different, locked-down user, used to manage media on the boxes via sftp.
This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.
A RaspberryPi boots using a little FAT partition which contains kernel, device tree, configuration, and everything else necessary to boot.
It has the convenience of being able to plug the SD card into pretty much any system, and tweak the knobs that are exposed through it.
While we don't expect that people would want to modify the config.txt that controls the boot process, we would like to give people a convenient way to set up things like host name (which makes the device findable on the net), timezone, screen orientation, and wifi passwords.
This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.
In modern times, there are tools for provisioning systems that do useful things and allow storing an entire system configuration in text files committed to git. They are good at reproducibly setting up a system, and at letting one inspect its contents by looking at the provisioning configuration instead of wading into it.
I normally use Ansible. It does have a chroot connector, but it has some serious limitations.
The biggest issue is that ansible's chroot connector does not mount /dev, /proc and so on, which greatly limits what can be run inside it. Specifically, installing many .deb packages will fail.
We work around it by copying what Ansible needs inside the chroot (including Ansible itself), and then running it under systemd-nspawn using the local connector.
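A rough sketch of what such an invocation could look like. The `provision_cmd` helper, the chroot path and the playbook path are hypothetical placeholders, and the actual Himblick code may assemble the command differently; the function only prints the command so that it can be inspected first.

```shell
# Hypothetical sketch: print the command that would run a playbook inside
# a chroot through systemd-nspawn, using Ansible's local connection.
provision_cmd() {
    chroot_dir="$1"
    playbook="$2"
    # "localhost," is an inline inventory containing only localhost
    echo systemd-nspawn --directory "$chroot_dir" \
        ansible-playbook --connection=local --inventory localhost, "$playbook"
}

# Example with placeholder paths:
# provision_cmd /mnt/rootfs /srv/provision/site.yaml
```

Since systemd-nspawn sets up /dev, /proc and the other API filesystems inside the container, package installation scripts have what they expect to find.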