# GitLab CI & Nix

## Introduction

At $work we have been using GitLab CI in combination with Nix package builds, checks and NixOS tests for close to two years now. This article covers a loose collection of learnings and tips, starting from optimizing GitLab runners and ending at effective usage of GitLab caches.

As for scope, this article only concerns itself with self-hosted GitLab CE instances. All covered topics should also work on EE but are untested. It is also assumed that you are using flakes; however, most guidance should also apply to channel-based workflows.

## Runners

The most common GitLab runner is the docker runner. It enables you to run jobs inside an OCI container image.

There are, however, some limitations when using nix inside a normal, unprivileged docker runner. Notably, the sandbox is disabled because it requires elevated privileges. This can have unintended side effects, like build-time access to the internet.
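A quick way to see what a job actually runs with is a throwaway job that prints the effective sandbox setting; the job name here is illustrative, and on older Nix versions the subcommand is `nix show-config` instead of `nix config show`:

.gitlab-ci.yml
```yaml
check-sandbox:
  script:
    - nix config show sandbox # prints "false" in an unprivileged container
```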

There are two runner configuration types that have helped me here:

### Using the host daemon

You can make the host daemon available to the job inside the runner with the configuration below. This will also mount the host nix store inside the runner’s container.

There are advantages to this approach:

  • strong caching: all jobs on the runner share the same store. This includes all derivations and also makes it possible to use, e.g., harmonia for native cache sharing between runners.
  • shared build locks: if two jobs on the same runner try to build the same derivation, it is only built once.
  • sandboxing: as the builds are delegated to the host daemon, the builders on the host can use the sandbox of the host daemon.

There are however also disadvantages:

  • minor compatibility issues: the runner is likely only suitable for use with nix images. Other images often rely on a specific $PATH or other variables that may be altered in unexpected ways. The trivial workaround is to disable running of untagged jobs.
  • runner overload: as all builds run on the host, no direct CPU or memory limits can be enforced on the builds of any single job. This is especially a problem when multiple heavy jobs (NixOS tests) are issued to the same runner.
  • privileged access: the jobs will likely need to be added as trusted-users. This means they effectively have control over the host daemon and should be treated as trusted, since privilege escalation is significantly easier.

Based on the NixOS wiki article on GitLab runners, below is an adjusted configuration:

gitlab-runner.nix
```nix
{ pkgs, lib, config, ... }: {
  boot.kernel.sysctl."net.ipv4.ip_forward" = true;
  virtualisation.docker = {
    enable = true;
    rootless = {
      enable = true;
      setSocketVariable = true;
    };
  };
  nix.settings.trusted-users = [ "root" ];
  services.gitlab-runner = {
    enable = true;
    clear-docker-cache.enable = true;
    settings.concurrent = 3;
    # runner for building in docker via the host's nix-daemon;
    # the nix store will be readable in the runner, which might be insecure
    services.native-runner = {
      # community-managed, automatically updated nix image with flakes + commands pre-enabled
      dockerImage = "nixpkgs/nix-flakes:nixos-${config.system.nixos.release}-${pkgs.system}";
      dockerVolumes = [
        # these are ro because we write to the store via the daemon
        "/nix/store:/nix/store:ro"
        "/nix/var/nix/db:/nix/var/nix/db:ro"
        "/nix/var/nix/profiles/system/etc/ssl/:/etc/ssl/:ro"
        "/nix/var/nix/daemon-socket:/nix/var/nix/daemon-socket:ro"
        "${pkgs.bash}/bin/bash:/usr/bin/sh:ro"
        "${pkgs.bash}/bin/bash:/bin/bash:ro"
        "${pkgs.bash}/bin/bash:/usr/bin/bash:ro"
        "${pkgs.bash}/bin/bash:/bin/sh:ro"
      ];
      dockerDisableCache = true;
      registrationFlags = [
        "--docker-pull-policy=if-not-present"
        "--docker-allowed-pull-policies=if-not-present"
        "--docker-allowed-pull-policies=always"
      ];
      environmentVariables = {
        # we use the shared nix daemon of the host
        NIX_REMOTE = "daemon";
        ENV = "/etc/profile";
        USER = "root";
        # NOTE: we override the nix installation in the container because
        # it is linked to the container's original nix store. However,
        # neither that store nor the dynamic libraries in it can be found
        # because of the overlay mount. Also ensure the nix version is
        # synced with the system nix daemon.
        PATH =
          (pkgs.lib.strings.makeSearchPathOutput "bin" "bin" (
            with pkgs;
            [
              gnugrep
              coreutils
              nix
              openssh
              gitleaks
              bash
              git
            ]
          ))
          + ":/nix/var/nix/profiles/default/bin:/usr/local/bin:/usr/local/sbin:/nix/var/nix/profiles/default/sbin:/bin:/sbin:/usr/bin:/usr/sbin";
      };
      authenticationTokenConfigFile = <path to token file, use sops-nix or agenix here>;
    };
  };
}
```

### Plain runner

You can also use the daemon inside the container. The advantages are:

  • proper resource limits: you can limit CPU and memory per job
  • runners can be trivially reused for other images (and can be reliably used for untagged jobs)

The disadvantages are:

  • caching requires extra work; see “Reducing duplicate work” for strategies to address this
  • sandboxing is only possible with privileged runners
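Per-job limits for the plain runner can be set through the docker executor’s registration flags; the values below are illustrative:

```nix
{ ... }: {
  services.gitlab-runner.services.runner.registrationFlags = [
    "--docker-cpus=4"
    "--docker-memory=8g"
  ];
}
```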

A sample configuration for a plain runner is:

module.nix
```nix
{ pkgs, lib, config, ... }: {
  boot.kernel.sysctl."net.ipv4.ip_forward" = true;
  virtualisation.docker = {
    enable = true;
    enableOnBoot = true;
    autoPrune.enable = true;
  };
  services.gitlab-runner = {
    enable = true;
    settings = {
      concurrent = 4;
      listen_address = "127.0.0.1:9252";
    };
    services.runner = {
      # community-managed, automatically updated nix image with flakes + commands pre-enabled
      dockerImage = "nixpkgs/nix-flakes:nixos-${config.system.nixos.release}-${pkgs.system}";
      dockerVolumes = [
        # pass through bash & grep for gitlab ci (used inside the executor, not contained in the base image)
        "${lib.getExe pkgs.pkgsStatic.gnugrep}:/usr/bin/grep:ro"
        "${lib.getExe pkgs.pkgsStatic.bash}:/usr/bin/sh:ro"
        "${lib.getExe pkgs.pkgsStatic.bash}:/usr/bin/bash:ro"
      ];
      registrationFlags = [
        "--docker-pull-policy=if-not-present"
        "--docker-allowed-pull-policies=if-not-present"
        "--docker-allowed-pull-policies=always"
      ];
      authenticationTokenConfigFile = <path to token file, use sops-nix or agenix here>;
    };
  };
}
```

If you want to use sandboxing and are fine with a privileged runner, add the extra flags and volumes below:

module.nix
```nix
{ config, pkgs, ... }:
let
  nixJoin = list: builtins.concatStringsSep " " list;
  # enable sandbox & flakes + pass through the host's substituters
  nix-conf = pkgs.writeText "nix.conf" ''
    accept-flake-config = true
    experimental-features = nix-command flakes
    max-jobs = auto
    sandbox = true
    substituters = ${nixJoin config.nix.settings.substituters}
    trusted-public-keys = ${nixJoin config.nix.settings.trusted-public-keys}
    extra-substituters = ${nixJoin config.nix.settings.extra-substituters}
    extra-trusted-public-keys = ${nixJoin config.nix.settings.extra-trusted-public-keys}
  '';
in
{
  services.gitlab-runner.services.runner = {
    dockerVolumes = [
      # inject the host nix conf (substituters etc.) into the container;
      # this ensures our nix cache gets used for all devshells
      "${nix-conf}:/etc/nix/nix.conf:ro"
      # pass through bash & grep for gitlab ci (used inside the executor, not contained in the base image)
      ...
    ];
    registrationFlags = [ "--docker-privileged" ];
  };
}
```

Please also note that adjusting the nix.conf via dockerVolumes also enables you to adjust the default trusted-substituters inside the container. This can be useful for injecting your own public cache even if you don’t enable sandboxing.
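As a sketch, the generated nix.conf could additionally pin a public cache of your own; the URL and key below are placeholders:

```nix
nix-conf = pkgs.writeText "nix.conf" ''
  # ... settings from above ...
  extra-trusted-substituters = https://cache.example.com
  extra-trusted-public-keys = cache.example.com-1:<public key>
'';
```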

### Runners with S3 cache

The official documentation is a bit sparse here, so this section serves as an extension for both runner types. When you have runners on multiple hosts, you likely want a shared cache between runners to reliably exchange data between jobs. For the GitLab runner NixOS module this means you will have to use the CLI registration flags instead of the YAML config; below are the relevant flags:

```nix
{ ... }:
{
  services.gitlab-runner.services.runner.registrationFlags = [
    "--cache-s3-server-address=s3.example.com"
    "--cache-s3-access-key=$(cat ${access_key.path})"
    "--cache-s3-secret-key=$(cat ${secret_key.path})"
    "--cache-s3-bucket-name=$(cat ${bucket_name.path})"
    "--cache-s3-bucket-location=us-east-1"
    "--cache-s3-authentication_type=access-key"
    "--cache-type=s3"
    "--cache-shared"
  ];
}
```

Both garage and minio are suitable for self-hosting this S3 cache and are available as NixOS modules.
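On the `.gitlab-ci.yml` side nothing S3-specific is needed: jobs declare their cache as usual and the runner transparently stores it in the bucket. The job name, key and paths below are illustrative:

.gitlab-ci.yml
```yaml
some-job:
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - .cache/
  script:
    - nix build .#cake
```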

## Tips: Reducing duplicate work

GitLab CI can cache some artifacts but will not cache paths outside the project directory. This notably affects the nix store.

If you have many jobs or a complex devshell you will likely spend a significant amount of time just downloading, or worse building, artifacts before even starting the actual job task, e.g., running unit tests. Below are two ways that have worked for me:

### Specialized container images

#### Existing images

If you have 10 jobs that need, e.g., a devshell, it is very likely an improvement to use an image where it is already available instead of downloading it on every build. The advantage is that on each run you won’t have to download or build the artifact and can instead start directly (or at least faster).
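For example, a job can pin such an image directly; the job name, tag and command here are illustrative:

.gitlab-ci.yml
```yaml
unit-tests:
  image: nixpkgs/nix-flakes:latest
  script:
    - nix develop --command cargo test
```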

Some useful image collections are:

#### Building images

You can also build your own container images with nix, see the following documentation for guidance on this topic:

You can upload docker images built with nix to the GitLab container registry with, e.g., skopeo. This can either be done on a per project/repository basis or, e.g., in a shared image repository with regularly built images based on scheduled pipelines. The latter may be useful when dealing with multiple repositories that share the same tooling.

#### Streaming image into GitLab registry

This is an adjusted example from the reference documentation. It assumes you want to upload an image to the GitLab registry with name builder and tag latest.

flake.nix
```nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  outputs =
    { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in
    {
      packages.x86_64-linux.image = pkgs.dockerTools.streamLayeredImage {
        name = "hello";
        contents = [ pkgs.hello ];
      };
    };
}
```

With the job:

.gitlab-ci.yml
```yaml
build-image:
  script:
    - echo "$CI_JOB_TOKEN" | skopeo login --insecure-policy "$CI_REGISTRY" -u gitlab-ci-token --password-stdin
    - $(nix build --no-link --print-out-paths .#image) | gzip --fast | skopeo copy docker-archive:/dev/stdin "docker://$CI_REGISTRY_IMAGE/builder:latest"
```

In the `echo ...` line, this job logs into the registry with the ephemeral job token; this is required for the upload from the runner. In `$(nix ...)` the image is built and streamed to stdout; afterwards it is compressed (`gzip --fast`) and pushed to the registry (`skopeo ...`).
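The pipe shape itself (producer, `gzip --fast`, consumer) can be sanity-checked locally without a registry; this sketch replaces the image script and skopeo with plain stand-ins:

```shell
#!/bin/sh
# stand-in for: $(nix build ...) | gzip --fast | skopeo copy ...
# printf plays the producer, gunzip the consumer
printf 'layer-data' | gzip --fast | gunzip
# → layer-data
```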

### Using binary caches

Nix binary caches allow you to cache the built results of an evaluated derivation.

In a CI/CD pipeline these may be used to push the results of a build to enable derivation-level caching of results. This means you would, e.g., build an artifact in a job, push it to the cache and let successive jobs download it from the cache again (instead of building it).

In my experience, with the constraint of self-hosting most things, attic is a good fit for this. But, especially if you want commercial support/a hosted non-free offering, cachix is a popular solution.

It should be noted that neither has particularly good GitLab support; both appear to be mainly used with GitHub. They are compatible nonetheless, but you may have to put in some extra work.

#### attic

attic authenticates to the server with a signed token. If you plan to use this token in a pipeline, give it push/pull permissions, store it in a CI variable, and adjust permissions as needed.

In a job, you first need to log in and let attic configure your local nix.conf:

.gitlab-ci.yml
```yaml
build-stuff:
  before_script:
    - attic login upstream https://cache.example.com "$ATTIC_TOKEN"
    - attic use upstream:<your cache>
  script:
    - nix build .#cake
```

You can then push a derivation output (extending the example from above):

.gitlab-ci.yml
```yaml
build-stuff:
  after_script:
    - attic push upstream:<your cache> ./result
```

This will make the next run¹ fetch .#cake from the cache instead of (re)building the derivation. It may also be helpful to use rules if you only want to rebuild on particular changes.
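For instance, a `rules:` block can restrict the rebuild to changes that actually affect the derivation; the paths below are illustrative:

```yaml
build-stuff:
  rules:
    - changes:
        - flake.nix
        - flake.lock
        - src/**/*
```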

This may be extended by pushing the direct dependencies:

```yaml
build-stuff:
  after_script:
    - attic push upstream:<your cache> ./result
    - nix-store --query --references ./result | attic push upstream:<your cache> --stdin
```

Or the whole store:

```yaml
build-stuff:
  after_script:
    - nix path-info --all | attic push upstream:<your cache> --stdin
```
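A middle ground between direct references and the whole store is the runtime closure of the result, which `nix-store --query --requisites` lists:

```yaml
build-stuff:
  after_script:
    - nix-store --query --requisites ./result | attic push upstream:<your cache> --stdin
```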

## Tips: Scripting

Your pipelines will often contain scripts, e.g., for the image upload above. These scripts can be made reusable as nix packages.

Among the most helpful tools for small scripts are the trivial build helpers, in particular writeShellApplication. They enable the concise packaging of small scripts.

The main advantage of this approach is that commands can easily be invoked locally. It also allows you to, e.g. with writeShellApplication, run shellcheck over your CI scripts.

### Extended example

Building on the example from Streaming image into GitLab registry:

flake.nix
```nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  outputs =
    { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = nixpkgs.legacyPackages.${system};
    in
    {
      packages.${system} = {
        image = pkgs.dockerTools.streamLayeredImage {
          name = "hello";
          contents = [ pkgs.hello ];
        };
        upload-image = pkgs.writeShellApplication {
          name = "upload-image";
          runtimeInputs = [ pkgs.skopeo pkgs.gzip ];
          text = ''
            echo "$CI_JOB_TOKEN" | skopeo login --insecure-policy "$CI_REGISTRY" -u gitlab-ci-token --password-stdin
            ${self.packages.${system}.image} | gzip --fast | skopeo copy docker-archive:/dev/stdin "docker://$CI_REGISTRY_IMAGE/builder:latest"
          '';
        };
      };
    };
}
```

Which can now be used as:

.gitlab-ci.yml
```yaml
build-image:
  script:
    - nix run .#upload-image
```

And of course you can also invoke nix run .#upload-image locally and test with, e.g., a PAT instead of a job token when an issue occurs.

## Tips: Flake evaluation caching

GitLab CI cannot currently cache paths outside the project directory. This does not just affect the nix store but also the flake eval cache.

A workaround is to move the cache from $NIX_CACHE_HOME into the project directory in the after_script and preseed $NIX_CACHE_HOME from it in the before_script, in the same manner.

.gitlab-ci.yml
```yaml
cached-eval:
  before_script:
    - if [ ! -d .cache ]; then mkdir .cache; fi
    - export NIX_CACHE_HOME="$HOME/cache"
    - mv .cache "$NIX_CACHE_HOME"
  script:
    - nix build .#...
  after_script:
    - mv "$HOME/cache" .cache
```
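Stripped of the nix specifics, the preseed/save dance is just moving a directory in and out of the project tree; a minimal sketch with stand-in directories:

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)   # stand-in for the project directory
home=$(mktemp -d)   # stand-in for $HOME
cd "$work"
# before_script: preseed the tool's cache dir from the project tree
mkdir -p .cache
mv .cache "$home/cache"
# script: the tool populates its cache
touch "$home/cache/eval-entry"
# after_script: move the cache back so GitLab CI can archive it
mv "$home/cache" .cache
ls .cache
# → eval-entry
```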

## Tips: KVM

Running NixOS tests efficiently inside a builder requires the builder feature kvm. As the name implies, this means /dev/kvm must be present inside the container AND the kernel module must be loaded.

To pass through the device, use:

gitlab-runner.nix
```nix
services.gitlab-runner.services.${runner}.registrationFlags = [
  "--docker-devices=/dev/kvm"
];
```

Under most circumstances, the kernel module is loaded via boot.kernelModules when the configuration was generated by nixos-generate-config. If it isn’t, add kvm-amd or kvm-intel to boot.kernelModules in your host config.
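A minimal sketch of the host-side setting; pick the module matching your CPU vendor:

```nix
# hardware-configuration.nix
{ ... }: {
  boot.kernelModules = [ "kvm-intel" ]; # "kvm-amd" on AMD hosts
}
```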

## Footnotes

  1. Assuming the same inputs were used and the cache has not been garbage collected.
