
우병수

Posted on • Originally published at techdigestor.com

Tailscale vs Headscale: I Ran Both for My Private Journaling Setup — Here's the Honest Breakdown

TL;DR: Raw WireGuard stops scaling past three or four peers — every new node means editing every existing node's config by hand. Tailscale and Headscale both fix that with the same client and the same WireGuard data plane; the difference is who runs the coordination server. Tailscale is near-zero-ops but its hosted control plane sees your device metadata; Headscale puts that control plane on your own VPS at the cost of real operational responsibility. For a solo journaling mesh, Tailscale's free tier is usually the right call; Headscale wins when you already run infrastructure and metadata privacy is in your threat model.

📖 Reading time: ~27 min

What's in this article

  1. I Needed a Private Sync Network for My Journals — So I Tried Both
  2. What Each Tool Actually Is (Without the Marketing Fluff)
  3. Setting Up Tailscale: The Fast Path
  4. Setting Up Headscale: Where It Gets Real
  5. Head-to-Head: Where Each One Actually Falls Down
  6. Which Journaling Apps Actually Pair Well With This Setup
  7. The Moment Headscale Won Me Over (And When It Lost)
  8. When to Pick What: Specific Scenarios

I Needed a Private Sync Network for My Journals — So I Tried Both

The thing that broke my patience with raw WireGuard wasn't the first node or even the third — it was adding a VPS to a mesh that already had my home server and laptop talking to each other. Suddenly I'm juggling four private keys, four public keys, four AllowedIPs blocks, and the mental overhead of making sure every peer config references every other peer correctly. Miss one line, and your journal sync silently fails at 2am when the cron job runs.

My actual setup: plaintext Markdown journals living in ~/journals/, synced via syncthing between a home server (running on a mini-PC with Ubuntu 22.04), a Framework laptop (Fedora 38), and a Hetzner VPS. No Dropbox, no iCloud, no S3. The constraint was deliberate — these are personal notes I don't want sitting on infrastructure I don't control. WireGuard is the right protocol for this, but the manual key exchange workflow stops being sustainable the moment you add a fourth device, let alone a phone.

The specific pain: every time I provisioned a new peer, I had to SSH into each existing node, edit /etc/wireguard/wg0.conf, add the new peer block, and run wg syncconf or restart the interface. On four nodes that's four SSH sessions, four config edits, four chances to fat-finger a public key. Tailscale and Headscale both solve exactly this — they handle the control plane (key distribution, peer discovery, NAT traversal) while WireGuard stays as the data plane underneath.
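A quick back-of-envelope makes the scaling problem concrete — this is plain arithmetic, not anything from the WireGuard tooling. A full mesh needs a `[Peer]` block on every node for every other node, so the entries you maintain by hand grow as N×(N−1):

```shell
# Each of N nodes needs a [Peer] block for the other N-1 nodes,
# so hand-maintained config entries grow quadratically
for n in 2 3 4 5 8; do
  echo "$n nodes -> $(( n * (n - 1) )) peer blocks to keep in sync by hand"
done
```

Four nodes is already twelve peer blocks spread across four files; add a phone and you're at twenty.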

The fork in the road is about trust and control. Tailscale's control plane runs on their servers at controlplane.tailscale.com. Your traffic doesn't go through them — WireGuard tunnels are peer-to-peer — but your node registration, key coordination, and ACL policies do. Headscale is a community-built reimplementation of that control server that you run yourself, on your own VPS or home server. Same Tailscale clients on every device, different server they check in with. For a journaling setup where the whole point is keeping data off third-party infrastructure, that distinction matters — even if it's "just" metadata about which of your devices are online.

One scope clarification before going deeper: this comparison is purely about the network layer. The journaling app — whether that's Obsidian with its sync-via-folder setup, plain Syncthing, jrnl on the CLI, or even a self-hosted Joplin server — sits on top of whatever mesh network you build. I'll mention which apps pair naturally with each approach, but the journaling app itself isn't the variable being tested here. If you're building out a fuller self-hosted stack beyond just journals, the Ultimate Productivity Guide: Automate Your Workflow in 2026 covers the broader tooling picture that this kind of private network enables.

What Each Tool Actually Is (Without the Marketing Fluff)

The thing that trips most people up: Headscale doesn't replace the Tailscale client. You still install the exact same tailscale binary on every device. What Headscale replaces is the coordination server — the backend at login.tailscale.com that Tailscale Inc. runs as a SaaS product. Same client, different brain. That distinction matters a lot for understanding what you're actually taking on when you self-host.

Tailscale's architecture is split deliberately. The data plane is WireGuard — peer-to-peer encrypted tunnels between your devices, running directly on each machine. The control plane is a hosted service that handles everything WireGuard itself doesn't: distributing public keys to peers, pushing ACL rules, picking which DERP relay to use when direct connections fail, and running MagicDNS so your devices get hostnames like my-laptop.tail1234.ts.net. When you install Tailscale and run tailscale up, the client authenticates to that control plane and gets told who its peers are. Without a working coordination server, the mesh doesn't form.

Headscale reimplements that coordination server from scratch, open source, and lets you run it on your own infrastructure. The project reverse-engineered the control protocol well enough that official Tailscale clients — including the iOS and Android apps — can talk to a Headscale instance instead of login.tailscale.com. You point the client at your server with --login-server and it mostly just works. The coverage isn't 100% feature-parity — more on that — but the core mesh functionality is solid. Headscale is written in Go and exposes a local CLI and a gRPC API for managing nodes and users.

Here's what the coordination server actually does under the hood, because understanding this is what makes the self-hosting trade-off legible:

  • Key exchange: Each client generates a WireGuard keypair. The coordination server collects public keys and distributes them to authorized peers. Without this, devices can't establish tunnels.
  • ACL distribution: Tailscale's access control rules (which device can reach which port on which other device) are compiled and pushed from the control plane. In Headscale, you define these in a local policy file on your server.
  • DERP relay selection: When two peers can't punch through NAT directly, traffic goes through a relay. Tailscale runs a global fleet of DERP servers. Headscale lets you use Tailscale's public DERP servers, or run your own with derper.
  • MagicDNS: Hostnames for every node on your tailnet, resolved without manual DNS configuration. Headscale supports this, though with slightly more manual setup than the managed product.

The practical upshot: if Tailscale's SaaS backend goes down, your existing tunnels keep running (WireGuard stays up), but your mesh can't reconfigure — no new devices, no key rotation, no ACL changes. Same is true for Headscale. Your coordination server going offline doesn't instantly kill connectivity, but it does mean you can't make changes. That's why high availability for your Headscale instance actually matters, not just for day-to-day use but for operational resilience.
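Since a coordination-server outage is silent until a device tries to re-enroll, it's worth polling it yourself. A minimal sketch, with two assumptions flagged: `headscale.yourdomain.com` is a placeholder, and the `/health` path reflects the health endpoint recent Headscale releases expose — verify it on your version before relying on this:

```shell
#!/bin/sh
# Poll the coordination server so you notice an outage before a device does.
# HEADSCALE_URL and the /health path are assumptions — check your release.
URL="${HEADSCALE_URL:-https://headscale.yourdomain.com/health}"
if curl -fsS --max-time 5 "$URL" >/dev/null 2>&1; then
  echo "control plane reachable"
else
  echo "control plane DOWN: existing tunnels keep working, but no new nodes, key rotations, or ACL changes"
fi
```

Drop it in cron and wire the failure branch into whatever notification channel you already read.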

Setting Up Tailscale: The Fast Path

The thing that surprises most people about Tailscale is how fast you go from zero to a working mesh — we're talking under five minutes on a fresh Linux box. The install step is the classic pipe-to-shell pattern that half the industry hates and everyone does anyway:

# Yes, this is pipe-to-shell. Audit it first if that bothers you:
# curl -fsSL https://tailscale.com/install.sh | less
curl -fsSL https://tailscale.com/install.sh | sh

On Debian/Ubuntu this drops a proper apt repo and installs the tailscaled daemon. It's not just a binary dump — future apt upgrade calls will keep it current. Once installed, bring the node into your tailnet with an auth key you generate in the admin console under Settings → Keys:

# --authkey is the non-interactive path — no browser popup, good for servers
sudo tailscale up --authkey tskey-auth-xxxxx

# For ephemeral nodes (containers, CI runners) add --ephemeral
# so they auto-remove from your device list when they disconnect
sudo tailscale up --authkey tskey-auth-xxxxx --ephemeral

After that, tailscale status is your dashboard. The output is denser than it looks:

100.64.0.1      home-server          myuser@    linux   -
100.64.0.2      work-laptop          myuser@    macOS   idle, tx 1.2MB rx 800KB
100.64.0.3      phone                myuser@    iOS     offline

First column is the Tailscale IP (always somewhere in the 100.64.0.0/10 CGNAT range). Second is the hostname. A dash in the last column means no traffic has moved to that peer this session (the local node's own line always shows one). idle with traffic counters means the peer connected at some point this session. offline means their daemon isn't running or they lost internet. An active peer shows direct or relay — relay means Tailscale is routing through a DERP server because NAT traversal failed, which is common behind strict corporate firewalls and something to flag if you care about latency for journal sync.

MagicDNS is the feature I didn't know I needed until I enabled it. Flip it on in the admin panel under DNS, and suddenly every node is reachable at hostname.tail1234.ts.net. Your journal app's sync URL stops being a hardcoded IP like http://100.64.0.1:5000 and becomes http://home-server.tail1234.ts.net:5000 — which survives IP reassignments and is actually readable in logs. The subdomain suffix is unique to your tailnet and stays constant.

ACLs are where you lock down which nodes can actually talk to your journal server. The config lives in the admin UI as HuJSON (JSON with comments — don't fight it), and a policy that restricts journal sync to tagged nodes looks like this:

{
  "tagOwners": {
    // Only you can assign these tags
    "tag:journal-server": ["autogroup:owner"],
    "tag:journal-client": ["autogroup:owner"]
  },
  "acls": [
    {
      // Journal clients can reach the server on port 5000 only
      "action": "accept",
      "src":    ["tag:journal-client"],
      "dst":    ["tag:journal-server:5000"]
    }
  ]
}

Tag nodes at auth time with sudo tailscale up --authkey tskey-auth-xxxxx --advertise-tags=tag:journal-client. Without an explicit ACL rule allowing traffic, tagged nodes can't reach anything — the default-deny posture is real and it's the right call for something as personal as a journal.

Subnet routing is the sleeper feature here. If your journal server is a homelab box sitting behind a router you don't want to expose, run sudo tailscale up --advertise-routes=192.168.1.0/24 on any Tailscale node in that LAN, approve it in the admin console, and every other tailnet node can now reach 192.168.1.x addresses without installing Tailscale on the journal server itself. Exit nodes work similarly — route all traffic through a node, useful if you're traveling and want your journal traffic to egress from your home IP. On the free tier, you get 3 users and 100 devices as of my last check, but verify the current numbers at tailscale.com/pricing because they've adjusted the free tier before.

Setting Up Headscale: Where It Gets Real

The thing that caught me off guard wasn't the installation — it was realizing how much Tailscale's SaaS layer silently handles for you. Headscale makes all of that visible, which is both its strength and its friction. Before you touch any config file, confirm you have: a VPS with a static public IP (DigitalOcean, Hetzner, Vultr all work — I've been running mine on a €3.79/month Hetzner CAX11), a domain you actually control with an A record you can point at that IP, and either Go 1.21+ if you want to build from source, or just grab the binary release. The binary route is faster and I'd recommend it unless you're patching something.

Pull the latest stable from github.com/juanfont/headscale/releases — as of writing that's v0.23.x, but check the releases page because they ship fairly often:

# Replace the version and arch as needed
wget https://github.com/juanfont/headscale/releases/download/v0.23.0/headscale_linux_amd64
chmod +x headscale_linux_amd64
sudo mv headscale_linux_amd64 /usr/local/bin/headscale

# Create the config directory and a system user with no login shell
sudo mkdir -p /etc/headscale /var/lib/headscale
sudo useradd --system --no-create-home --shell /usr/sbin/nologin headscale

The config at /etc/headscale/config.yaml has a lot of fields but only a handful actually matter for a journaling-focused private network. Here's the stripped-down version that works:

server_url: https://headscale.yourdomain.com   # must be publicly reachable — clients use this
listen_addr: 0.0.0.0:8080
metrics_listen_addr: 127.0.0.1:9090

# SQLite is fine for small personal setups; switch to postgres if you're running this for a team
# (v0.23 replaced the old flat db_type/db_path keys with this nested block)
database:
  type: sqlite
  sqlite:
    path: /var/lib/headscale/db.sqlite

# Leave Tailscale's public DERP servers enabled unless you want to run your own derper binary
derp:
  server:
    enabled: false
  urls:
    - https://controlplane.tailscale.com/derpmap/default

# v0.23 renamed dns_config to dns and nested the global nameservers
dns:
  override_local_dns: true
  magic_dns: true
  base_domain: journals.internal   # clients resolve each other as hostname.journals.internal
  nameservers:
    global:
      - 1.1.1.1

Run it as a systemd service — the Restart=on-failure directive is non-optional if this is guarding access to your journal data. Without it, a crash at 2am means nothing syncs until you notice:

# /etc/systemd/system/headscale.service
[Unit]
Description=Headscale VPN controller
After=network-online.target
Wants=network-online.target

[Service]
User=headscale
Group=headscale
ExecStart=/usr/local/bin/headscale serve
Restart=on-failure
RestartSec=5
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now headscale
sudo systemctl status headscale   # look for "active (running)"

Once the server is up, create a user and generate a preauth key. In Headscale, a "user" (called a namespace in older releases) is roughly the equivalent of a Tailscale tailnet — all your journal devices should live under one user:

headscale users create journals

# --reusable lets you use the same key across all your devices without regenerating
# --expiration 90d is enough time to enroll everything without leaving a permanent key dangling
headscale preauthkeys create --user journals --reusable --expiration 90d

Connecting a Linux or macOS machine is straightforward once you have the key:

tailscale up --login-server https://headscale.yourdomain.com --authkey tskey-auth-XXXXX

Mobile is where the real rough edge lives. iOS and Android Tailscale clients technically support custom control servers via the login server field, but the OAuth redirect often breaks against self-hosted Headscale. The workaround that actually works: start the login flow on the device, grab the machine key it prints (it'll show in the Tailscale app's debug screen or your server logs), then register it manually from the server side:

# The machine key looks like mkey:xxxxxx — grab it from `headscale nodes list` or server logs
headscale nodes register --user journals --key mkey:xxxxxxxxxxxxxxxx

On the DERP relay question: by default your clients fall back to Tailscale's public DERP servers for relay when direct connections fail, which is fine and works reliably. You can run your own derper instance for full sovereignty — it needs its own TLS cert and public IP — but for a personal journaling setup the privacy gain is marginal. The metadata that leaks through Tailscale's DERP servers is just IP addresses and timing, not payload. I'd only bother with a custom DERP server if you're deploying this for a team across multiple continents and care about relay latency.

Head-to-Head: Where Each One Actually Falls Down

The performance gap people expect between Tailscale and Headscale almost never materializes in practice. Once two nodes establish a direct WireGuard tunnel — which happens in both setups — the control plane is completely out of the data path. Your journal sync traffic travels peer-to-peer at full WireGuard speed regardless of whether Tailscale's servers or your VPS brokered the connection. Where things actually diverge is uptime guarantees, metadata ownership, and how much ops work lands on your plate at 11pm on a Tuesday.

| Factor | Tailscale | Headscale |
|---|---|---|
| Setup time | ~10 minutes | 1–3 hours (includes VPS, TLS, config) |
| Control plane hosting | Tailscale's servers | Your VPS |
| MagicDNS quality | Polished, split-DNS works reliably | Basic — DNS resolves but split-DNS is manual |
| Mobile client support | First-class iOS/Android apps | Same apps, but needs a custom login URL |
| ACL complexity | Web UI + HuJSON, version history built in | File-based HuJSON pushed via CLI |
| Maintenance burden | Near-zero | Cert renewal, upgrades, backups, uptime |
| Cost | Free up to 3 users / 100 devices | ~$5–6/mo VPS + your time |

Tailscale's actual dealbreaker for a privacy-focused journaling setup: your node names, auth keys, last-seen timestamps, and ACL rules all live on their infrastructure. The WireGuard keys themselves are generated client-side and Tailscale never sees them — they've published documentation on this — but the metadata picture is different. If you're building a personal journaling system specifically because you don't want a third party to know which devices you own and when they're active, that metadata exposure is a real concern, not a paranoid one. A company with that data can receive legal process, get acquired, or just have a breach.

Headscale's dealbreaker is just as concrete: you are now responsible for the control plane's uptime. Existing nodes on established connections stay connected even if your Headscale instance goes down — WireGuard tunnels don't need the coordinator once they're up. But if your VPS goes offline, new nodes can't join, key rotations fail, and any mobile device that roamed to a new network and dropped its tunnel can't re-authenticate. I've seen this bite people when Let's Encrypt cert renewal fails silently and the Headscale HTTPS endpoint starts returning TLS errors. Set up a cert monitoring alert before you rely on this for anything daily-use critical.
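A minimal sketch of the cert alert I mean. The heavy lifting is openssl's `-checkend` flag; the hostname is a placeholder for your Headscale domain, and the alert branch is where you'd plug in mail, ntfy, or whatever you already read:

```shell
#!/bin/sh
# Warn before the Headscale TLS cert expires silently.

fetch_cert() {   # fetch_cert <host>  -> PEM cert on stdout (needs network)
  echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null | openssl x509
}

check_cert() {   # check_cert <pem-file> [days-of-warning, default 14]
  days="${2:-14}"
  # -checkend exits non-zero if the cert expires within the given seconds
  if openssl x509 -checkend $(( days * 86400 )) -noout -in "$1" >/dev/null; then
    echo "cert ok"
  else
    echo "cert expires within ${days} days - renew now"
  fi
}

# fetch_cert headscale.yourdomain.com > /tmp/hs.pem && check_cert /tmp/hs.pem 14
```

Run it daily from cron; fourteen days of warning is enough slack to fix a broken Let's Encrypt renewal before clients notice.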

# Push an updated ACL policy to Headscale — this is the entire UX
headscale policy set --policy-file acl.hujson

# Verify what got applied
headscale policy get

# On Tailscale you'd paste HuJSON into https://login.tailscale.com/admin/acls
# and get syntax highlighting, diff view, and a revert button

Both systems use the same HuJSON ACL format, which is genuinely good news — your policy files are portable. But the UX gap is real. Tailscale's web console shows you a diff when you save, highlights syntax errors inline, and keeps a history so you can revert a bad push. With Headscale you're doing headscale policy set and hoping the JSON was valid. I'd strongly recommend keeping your ACL file in a git repo with pre-commit validation if you go the Headscale route, otherwise a typo silently locks you out of your own nodes.

The one area where Tailscale's infrastructure genuinely outperforms Headscale is mobile reconnection on flaky networks. Tailscale runs a global fleet of DERP (Designated Encrypted Relay for Packets) servers that act as fallback relays when direct connections can't be established — there are nodes in North America, Europe, Asia, and elsewhere. When your phone switches from WiFi to LTE, or you're on a conference hotel network that blocks UDP, Tailscale's relay infrastructure reconnects you in a second or two. With Headscale, DERP relay support exists but you either rely on Tailscale's relay servers (which many people running Headscale specifically to avoid Tailscale find uncomfortable) or you self-host your own DERP node, adding yet another piece of infrastructure to babysit.

Which Journaling Apps Actually Pair Well With This Setup

The mesh network is only half the picture. The part that actually surprised me after setting up Headscale was how much simpler app configuration became — because once every device shares a flat IP space, you stop wrestling with dynamic DNS, port forwarding, and certificate gymnastics. Your Tailscale IP is stable, reachable from any device on the tailnet, and that changes what "self-hosted sync" actually means in practice.

Obsidian + Syncthing

This is the stack I run daily. Install the Syncthing plugin in Obsidian, then point Syncthing's listen address directly at your Tailscale IP — not 0.0.0.0, not your LAN IP. Open ~/.config/syncthing/config.xml and set the listen address to something like tcp://100.x.x.x:22000. That pins sync traffic to the tailnet only, so you're not accidentally broadcasting the Syncthing handshake on every network you join. No port forwarding. No router config. Syncthing figures out the peer via the tailnet and connects directly. The latency on initial sync is slightly higher than LAN but in practice you never notice it for a 50MB vault.
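For reference, the fragment of config.xml I mean — the element name is Syncthing's own, the IP is a placeholder for your node's tailnet address:

```xml
<options>
  <!-- pin Syncthing to the tailnet only; replace with your Tailscale IP -->
  <listenAddress>tcp://100.x.x.x:22000</listenAddress>
</options>
```

Restart Syncthing after the edit; the same setting is exposed in the web GUI under Settings → Connections if you'd rather not hand-edit XML.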

Joplin Server

Joplin has a first-party sync server you can self-host — it's a Node.js app, runs fine on a cheap VPS or a home server. After you get it running, lock it to the tailnet with one firewall rule:

# Only allow Joplin Server traffic from tailnet interface
sudo ufw allow in on tailscale0 to any port 22300
sudo ufw deny 22300

The order matters — ufw evaluates rules top to bottom. This pattern means the server port is completely invisible to the public internet. In the Joplin desktop and mobile clients, set the sync target to http://100.x.x.x:22300. Mobile works too because Tailscale runs as a VPN app on iOS and Android — your phone is on the tailnet just like your laptop.

Standard Notes Self-Hosted

Standard Notes offers a self-hosted sync server called standardnotes/self-hosted — it's Docker Compose based. Same pattern as Joplin: after the stack is up, bind it to the tailnet IP in your .env file:

# In your Standard Notes .env
EXPOSED_PORT=3000
# Then in docker-compose.yml, bind explicitly:
ports:
  - "100.x.x.x:3000:3000"

That Docker port binding is the key move. If you leave it as 0.0.0.0:3000:3000, the server is open on every interface — including whatever public IP your VPS has. Binding to the Tailscale IP means Docker won't even accept a connection from outside the tailnet. Point the Standard Notes client at http://100.x.x.x:3000 (or put a TLS-terminating reverse proxy in front if your client insists on https) and you're done.

Plain Git Over SSH

Honestly the simplest option and the one I'd recommend for anyone who's already comfortable on the command line. Once the mesh is up:

# Add your home machine as a remote using its Tailscale IP
git remote add home ssh://user@100.x.x.x/~/journals.git

# First push
git push home main

That's it. No server software, no Docker, no database. The Tailscale IP is stable across reboots (Headscale assigns them persistently), so this remote doesn't break. I keep a bare repo on a home server and push from my laptop and phone (Termius on iOS handles this fine). Conflict resolution is manual, but for a journaling workflow where you mostly write on one device at a time, it's a non-issue.

What Doesn't Work Smoothly

Apps that hardcode their sync backend are a dead end here. Day One is the obvious example — there's no "sync server URL" setting, full stop. Same story with Notion and Bear. If the app doesn't expose a server endpoint you can point at an IP, no amount of network plumbing fixes it. The irony is that some of these apps have great mobile UX, but they've made a deliberate product choice to keep sync in-house. If self-hosted sync is a requirement, filter your app choices at the start: can I set a custom server URL? If the answer isn't clearly yes in the docs, assume no and move on.

The Moment Headscale Won Me Over (And When It Lost)

The moment I actually trusted Headscale was when I cracked open a psql session and just... looked at everything. No dashboard, no abstraction layer, no wondering what some SaaS company knows about my network topology.

-- Run this against your Headscale Postgres backend
-- (table names match older releases; check \dt first — newer schemas rename machines to nodes)
SELECT
  machines.hostname,
  machines.last_seen,
  machines.expiry,
  pre_auth_keys.key,
  pre_auth_keys.used,
  pre_auth_keys.expiration
FROM machines
LEFT JOIN pre_auth_keys ON machines.auth_key_id = pre_auth_keys.id
ORDER BY machines.last_seen DESC;

That query returned exactly what I needed: every node that had ever touched my control plane, when it last phoned home, and whether its auth key was still live. My journaling setup uses Obsidian + syncthing over the Headscale tunnel, so knowing precisely which devices have valid credentials matters. With Tailscale's hosted control plane, you get the admin console UI — which is fine — but you cannot run a SELECT against their backend. You see what they choose to show you. That asymmetry bothered me more than I expected.

Then came the losing moment. I was running Headscale on a Hetzner CX21 (€5.77/month tier) and queued a kernel update during off-peak hours. The VPS rebooted, came back up, but Headscale didn't restart cleanly because I'd misconfigured my systemd service to depend on the wrong network target. Forty-five minutes of downtime. The wild part: my laptop and desktop stayed connected to each other the whole time. WireGuard is stateful — once the handshake is done and the tunnel is up, it doesn't need the coordinator anymore. The thing that broke was my partner's phone trying to re-register after she'd rebooted it to install an iOS update. Her device couldn't complete the auth flow because the control plane was dark. She couldn't sync her journal entries. That was not a fun conversation.

# The systemd unit that should have been there from day one
[Unit]
Description=Headscale VPN coordinator
After=network-online.target postgresql.service
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/headscale serve
Restart=always
RestartSec=5
# Without RestartSec, a crash loop hammers postgres immediately

[Install]
WantedBy=multi-user.target

The fix was trivial in retrospect. After=network-online.target instead of After=network.target is the difference between the service starting when the interface is actually ready versus when the network subsystem has merely initialized. I also added Restart=always with a sane RestartSec. But the damage to my credibility as the household's "infrastructure person" was already done. The failure wasn't Headscale's fault — it was mine — but that's actually the point. When you self-host, your mistakes become everyone's problem.

So here's my honest take on who should pick which option. If you're a solo developer who's already running Postgres for other projects, already has a VPS, and actually enjoys the occasional Saturday-morning debugging session — Headscale gives you something genuinely valuable: a control plane you fully own and can instrument. The operational cost amortizes across everything else you're running. But if your journaling setup involves other people — a partner, a small team, anyone who will notice and be annoyed by downtime you caused — the Tailscale free tier handles up to 3 users and 100 devices, costs nothing, and has a globally distributed control plane with a reliability track record that your single Hetzner box simply cannot match. The journaling use case doesn't push anywhere near those limits. Zero drama is genuinely worth something.

When to Pick What: Specific Scenarios

The decision isn't really about which tool is "better" — it's about matching your actual situation. I've seen people spin up Headscale for a 2-device personal setup and spend a weekend debugging cert issues when Tailscale free tier would've been running in 20 minutes. Equally, I've watched teams hit the 3-user free tier wall and reluctantly pay for something they could self-host on infrastructure they already own.

Go with Tailscale when...

You're solo or with one other person, you don't have a VPS sitting idle, and you want your journal accessible from your phone tonight. The auth flow takes maybe 15 minutes including the time you spend reading the dashboard. If your threat model is "I don't want this exposed to the public internet" rather than "I don't trust any third party with metadata about which devices talk to each other" — Tailscale free tier is genuinely the right call. Three users, 100 devices, no credit card. Also pick Tailscale if you're running on Windows or an iOS device as a primary node; Headscale client support on those platforms is functional but the experience is rougher.

Go with Headscale when...

You already have a VPS running Nginx or Caddy for something else — a $6/month Hetzner box, a DigitalOcean droplet, whatever. Adding Headscale to that machine costs you zero extra dollars and maybe 90 minutes. The other strong signal is team size: if you're coordinating journals across 4+ people (a family setup, a small research group, a dev team), you're looking at Tailscale's paid tier at $6/user/month. At 5 users that's $360/year for a coordination server. Headscale on existing infrastructure is $0/year. The metadata privacy argument is real too — Tailscale's coordination server sees device names, IP assignments, and connection timing even if it never sees your actual traffic. If that's in your threat model, Headscale eliminates it entirely.

Skip both and use raw WireGuard when...

Your topology is genuinely static — three servers in known locations that never change, no mobile devices, no new peers expected. wg-quick at this scale is maybe 20 lines of config per peer and zero moving parts:

# /etc/wireguard/wg0.conf on node A
[Interface]
PrivateKey = <node-a-private-key>
Address = 10.0.0.1/24
ListenPort = 51820

[Peer]
PublicKey = <node-b-public-key>
AllowedIPs = 10.0.0.2/32
Endpoint = node-b.example.com:51820
PersistentKeepalive = 25

Tailscale and Headscale both shine at dynamic mesh networking — devices coming and going, NAT traversal, key rotation. If you don't need any of that, you're adding complexity for no reason. Static WireGuard has no daemon, no coordination server, no TLS certs to renew. systemctl enable wg-quick@wg0 and it just runs forever.

The Headscale red flag worth taking seriously

If you've never set up cert renewal with Certbot or acme.sh, never written a systemd unit file, and have never looked at nginx reverse proxy config — the operational surface of Headscale will bite you. It's not that any individual piece is hard; it's that they all fail independently and silently. Your cert expires at 3am and your coordination server goes down. Your systemd service restarts but the socket file has wrong permissions. The Headscale binary gets an update that changes a config key name and it refuses to start with no obvious error. I'm not saying avoid it — I'm saying budget for the learning curve honestly. If you're comfortable SSHing into a box and reading journalctl -u headscale -f to debug, you'll be fine. If that sentence made you nervous, start with Tailscale.
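When something does break, this is the 60-second triage order I'd suggest — each check names its own failure mode, and the domain is a placeholder for your own:

```shell
#!/bin/sh
# Check the usual Headscale failure points, cheapest first.
systemctl is-active --quiet headscale \
  || echo "service down - read: journalctl -u headscale -n 50"
curl -fsS --max-time 5 -o /dev/null https://headscale.yourdomain.com \
  || echo "HTTPS endpoint unreachable - check DNS, reverse proxy, and cert expiry"
headscale nodes list >/dev/null 2>&1 \
  || echo "CLI cannot reach the daemon - check socket permissions and config"
```

Each line fails independently, which mirrors how the components themselves fail — a green service with a dead cert, or a live endpoint with a broken socket.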

Gotchas Worth Knowing Before You Start

The thing that catches most people off guard with Headscale isn't the setup — it's the upgrades. Headscale v0.23 introduced breaking changes that dropped compatibility with several older Tailscale clients. If you're running a mix of client versions across your nodes (which you probably are if you have phones, servers, and laptops), check the compatibility matrix in the README before you bump the server version. I've seen people upgrade Headscale on a Friday afternoon and spend the weekend debugging why half their nodes show "connected" in the admin UI but can't actually route traffic. The matrix lives at the top of the Headscale GitHub README — it's not buried, but you have to look for it deliberately.

Subnet routing is where Tailscale earns its keep for home labs — exposing a whole 192.168.1.0/24 through a single exit node without installing the client everywhere. But the documentation buries the prerequisite: IP forwarding has to be enabled at the kernel level, or packets just disappear silently with no error. Run this before you advertise routes:

# Enable IPv4 forwarding permanently — a drop-in file avoids duplicate lines on rerun
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-forwarding.conf

# Also add IPv6 if you're routing v6 traffic
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-forwarding.conf

sudo sysctl -p /etc/sysctl.d/99-forwarding.conf
# Expected output includes: net.ipv4.ip_forward = 1

Without that, your subnet router node will show as healthy, advertise routes successfully, and accept traffic — then drop every forwarded packet. It's maddening to debug if you don't know to look here first.

People treat Tailscale ACLs like a firewall replacement and that's a mistake. ACLs gate what Tailscale traffic can reach what — they do nothing about services that bind to 0.0.0.0. If your Prometheus instance starts on all interfaces and your VPS has a public IP, ACLs won't save you. Host-level firewall rules (ufw, nftables, or security groups if you're on a cloud provider) are still mandatory. Think of ACLs as logical access control inside the mesh, not perimeter security. The two layers complement each other — they don't substitute for each other.

Headscale's default key expiry of 90 days is the most common reason nodes silently stop connecting weeks after a working setup. There's no push notification, no obvious error — the node just goes offline and tailscale status reports it as disconnected. For servers and self-hosted machines you fully control, set expiration to zero when registering:

# Register a node with no key expiry (for machines you own completely)
headscale nodes register --user myuser --key <machine-key> --expiration 0

# Or check expiry on existing nodes
headscale nodes list
# Look at the EXPIRY column — a date near today needs attention; blank means the key never expires

For nodes you don't fully control (like a friend's laptop you're adding to a shared network), keep expiry on — it's a useful security boundary. For your own infrastructure nodes, zero expiry plus a controlled rotation process beats surprise outages.

Before you go hunting through application logs or restarting services, run tailscale netcheck on both ends of a broken connection. It tells you DERP relay latency, whether UDP is being blocked forcing relay-only traffic, and your NAT type. A relay-only connection (no direct path) will show maybe 40-80ms of extra latency compared to direct UDP — acceptable for SSH, noticeable for anything latency-sensitive. If netcheck shows your firewall is blocking UDP 41641, fix that first. Opening that port in both directions often flips a relay connection to direct and cuts latency in half.

tailscale netcheck

# Useful output to look for:
# * UDP: true (if false, you're relay-only everywhere)
# * IPv4: reachable (address shown)
# * Preferred DERP: fra (or whatever region is closest)
# * DERP latency: fra=18ms, ams=22ms  ← higher than expected = network issue, not app issue



