Homelab

David Wagner

I own a few computers which I use to experiment with new languages, tools and technologies. I was dissatisfied with configuration management of these machines. I used to install and configure packages manually and I never remembered what I’ve changed. I tried Ansible but I found it tedious to maintain playbooks: it’s hard to remove all assumptions about the current state of the system you’re configuring and making all playbook tasks idempotent is close to impossible.

This article documents the current state of my home infrastructure which I configure primarily using Nix.

You can find all the configuration code on GitHub.

Routers

The external connectivity is provided by a DSL box of my Internet service provider (ISP). I change the default settings of this device as little as possible because, in my experience, these ISP provided boxes break or get replaced every other year.

When I have a problem with the box and I call the ISP’s support line they usually ask me to reboot the device. Then, because the reboot rarely solves anything, they ask me to perform a factory reset. When this happens all custom configuration from the box is gone and I have to start from scratch. I don’t want my home network setup to depend on specific features of the ISP-provided box.

I have a Linksys WRT3200ACM router connected to the ISP’s box which also provides WiFi for the apartment. The router runs OpenWRT and I use a script to modify the default settings: change the timezone, setup WiFi and DNS aliases, install the Prometheus OpenWRT node exporter.

NixOS servers

After a few weeks of exploration and learning I installed NixOS on all my computers at home, which include:

  • Old industrial PC (32-bit)
  • Intel NUC (64-bit)
  • Raspberry Pi 4 (64-bit ARM)
  • Thinkpad laptop (64-bit)

When it comes to deploying to physical hardware NixOS feels like the endgame. Because NixOS is designed from the ground up to be declarative I can store all the configuration of these machines in a Git repository. The deployments are atomic: if I break something I can roll back any change. I can create a virtual machine from an arbitrary machine configuration, test it locally, then deploy it to the real hardware. If I remove a service definition from the configuration files the services will be removed from the servers as well.

Previously I tried managing servers using Salt and Ansible and I’m never looking back.

Let me demonstrate with an example the level of composability Nix enables. The core part of the Nix module that configures Prometheus on one of my nodes reads like this:

services.prometheus = {
  enable = true;
  scrapeConfigs = ...;
};

services.consul.catalog = [
  {
    name = "prometheus";
    port = 9090;
  }
];

networking.firewall.allowedTCPPorts = [ 9090 ];

This snippet adjusts three separate components:

  • Enable and configure Prometheus
  • Create an entry in the Consul Service Catalog for service discovery
  • Open a port in the firewall

A typical service often relies on other services, monitoring agents, network configurations and other tools. However, these dependencies are hard to express in traditional configuration management tools. In NixOS they are described at the same place using a single, unified syntax.

The complete module also contains the full Prometheus configuration, the scrapeConfigs attribute, which I elided here.

Sensors

I installed temperature and humidity sensors in two rooms and a few smart switches. I use Nix to create the development environment for flashing the firmware and for provisioning these embedded devices.

Also, I wrote a small Nix module to make the sensor configuration more expressive. For example, instead of a cryptic JSON document:

{
  "NAME": "ZJ-ESP-IR-B-v2.3",
  "GPIO": [0,0,0,0,51,37,0,0,39,38,0,0,0],
  "FLAG":0,
  "BASE":18
}

the GPIO ports of a LED controller equipped with an infrared receiver are assigned like this:

tasmota.template {
  name = "ZJ-ESP-IR-B-v2.3";
  gpio = with tasmota.component; {
    GPIO4  = IRrecv;
    GPIO5  = PWM1;
    GPIO12 = PWM3;
    GPIO13 = PWM2;
  };
};

The sensors publish their data over MQTT. Then, a Telegraf service exports the MQTT messages for Prometheus. Finally, the Prometheus time series are displayed in Grafana dashboards. There are many moving pieces here, but the resulting configuration is short and readable: for example the module setting up MQTT-Prometheus conversion is only 66 lines long.

Summary

Today I run the following services on my home computers:

  • Consul and Consul Template: Service discovery and dynamic service configuration
  • Gitolite: Host private Git repositories
  • Grafana: Display metrics on dashboards
  • Mosquitto MQTT broker: relay messages from the sensors
  • Nginx: HTTP server and reverse proxy
  • Prometheus and its exporters: Collect metrics
  • Telegraf: Export MQTT sensor data as Prometheus metrics

In the future I’m planning to configure WireGuard VPN access and experiment with network booting and periodic, automatic reinstalling of NixOS on my servers.

Overall I’m very pleased with this setup. NixOS allows fearless tinkering in my homelab. Deploying from a Git repository is easy and requires minimal maintenance. I can reconfigure or even reinstall my whole network within minutes.

Acknowledgment

I’m thankful to Alessandro Degano for suggesting me Telegraf and showing me how to set it up.