Femi A

Posted on • Originally published at transformd.co

Build a Kubernetes Log Pipeline in 5 Minutes (No YAML Required)

If you've ever tried to set up proper logging in a Kubernetes cluster, you know the drill: research log collectors, write a DaemonSet manifest, figure out the right volume mounts for /var/log/pods, configure a parsing pipeline, set up a destination, test it, break it, fix it, and eventually ship something that mostly works but nobody wants to touch again.

It doesn't have to be that complicated.

In this guide, we'll go from a bare Kubernetes cluster to a fully structured log pipeline — collecting, parsing, and routing logs — in about five minutes. No YAML to write. No VRL to learn. Just a working pipeline.

What we're building

By the end of this tutorial, you'll have:

  • A lightweight agent running as a DaemonSet on your cluster (~20–50 MB RAM, <1% CPU)
  • Automatic collection of all pod logs (stdout/stderr)
  • Structured parsing — Kubernetes metadata (pod name, namespace, container) attached to every log line
  • A destination of your choice receiving clean, structured JSON

The agent handles log collection using Vector under the hood. The control plane handles configuration. Your logs never leave your infrastructure until they hit the destination you choose.

Prerequisites

  • A Kubernetes cluster (any flavour — EKS, GKE, AKS, k3s, minikube, kind)
  • kubectl configured and pointing at the cluster
  • A Transformd account (the free tier works fine for this)

That's it. No Helm charts to customise, no CRDs to install.
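Before installing anything, it's worth a quick sanity check that kubectl is talking to the cluster you intend (a generic sketch, nothing Transformd-specific):

```shell
# Which cluster is kubectl currently pointed at?
kubectl config current-context

# Is the cluster reachable, and are the nodes Ready?
kubectl get nodes
```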


Step 1: Create an infrastructure

Log into Transformd and click New Infrastructure. Give it a name — something like staging-cluster or dev-k8s. Select Kubernetes as the type. You'll get an infrastructure token. Copy it.

What's an infrastructure? An infrastructure in Transformd represents a deployment target — a cluster, a set of VMs, a single server. Each infrastructure gets its own agent and its own set of pipelines.

Step 2: Install the agent

The infrastructure page shows a one-line install command. For Kubernetes, it's a kubectl apply that deploys the agent as a DaemonSet:

kubectl apply -f https://transformd.co/install/<your-token>

That single command creates:

  • A transformd namespace
  • A DaemonSet running the agent on every node
  • A ServiceAccount with read-only access to pods and namespaces (for metadata enrichment)
  • Volume mounts for /var/log/pods (read-only)

The agent connects to the control plane over an outbound encrypted tunnel. No inbound ports, no LoadBalancer, no ingress rules needed.

After a few seconds, your infrastructure should show as "Connected" in the dashboard.
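You can double-check from the terminal too. Assuming the installer uses the transformd namespace described above, something like this should show one agent pod per node:

```shell
# One agent pod per node, all Running (namespace per the install manifest above)
kubectl -n transformd get daemonset
kubectl -n transformd get pods -o wide
```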

Step 3: Create a pipeline

Click into your infrastructure and hit New Pipeline.

The source is already configured: Kubernetes pod logs. The agent automatically collects stdout/stderr from every pod in the cluster and attaches Kubernetes metadata (pod name, namespace, container name, node name, labels).

Now add a transform. Click the + button and choose Parse JSON. Many Kubernetes applications log in JSON; this transform parses the JSON string into structured fields. If a log line isn't valid JSON, it passes through unchanged (no crash, no data loss).
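The pass-through behaviour is the important detail. Here's a rough sketch of the semantics, with jq standing in for the transform (this is an illustration, not Transformd's implementation):

```shell
# Valid JSON gets parsed into fields; anything else is kept as a plain message.
printf '%s\n' \
  '{"level":"error","msg":"db timeout"}' \
  'plain text panic: stack trace follows' |
jq -cR 'fromjson? // {message: .}'
# → {"level":"error","msg":"db timeout"}
# → {"message":"plain text panic: stack trace follows"}
```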

Step 4: Add a destination

Click Add Destination and pick where you want your logs to land:

  • Grafana Loki: teams already using the Grafana stack
  • Elasticsearch: teams with existing ELK infrastructure
  • AWS S3: cost-effective long-term storage
  • AWS CloudWatch: AWS-native teams
  • Datadog: filter and transform before sending to Datadog
  • Kafka: event streaming architectures
  • Splunk: enterprise SIEM and log analysis

Enter your destination's connection details (URL, auth token, index name — whatever the destination needs). The agent connects directly to your destination from your infrastructure. The control plane never sees the connection credentials — they're stored encrypted on the agent.

Step 5: Deploy and verify

Hit Deploy. The control plane pushes the pipeline configuration to your agent over the encrypted tunnel. Within a few seconds, logs start flowing.

You can verify in two ways:

  1. In Transformd: The pipeline view shows live metrics — events in, events out, errors, throughput. You'll see numbers ticking up immediately.
  2. In your destination: Check your Loki/Elasticsearch/S3 for incoming structured log entries. They should have Kubernetes metadata (namespace, pod, container) as top-level fields.

That's it. You now have a production-grade Kubernetes log pipeline. Every pod in your cluster is sending structured, parsed logs to your destination. It took one kubectl command and a few clicks.


What makes this different

If you've used Fluentd, Fluent Bit, or raw Vector configs before, you might be wondering what Transformd actually does differently. A few things:

Your logs never leave your network

Transformd uses a split architecture. The control plane manages pipeline configuration — it never touches your log data. The agent runs on your nodes, processes logs locally, and sends them straight to your destination. This matters for compliance (GDPR, SOC 2, HIPAA) and for cost (no per-GB ingest fees from a middleman).

No YAML, no VRL, no config files

The pipeline builder is visual. You pick transforms from 80+ templates (parsing, filtering, enrichment, redaction, etc.), configure parameters via form fields, and preview the output against live data before deploying. The platform generates the VRL (Vector Remap Language) code for you.

Autopilot discovery

After installing the agent, Transformd's Autopilot can scan your cluster and discover every log source automatically — every pod, every namespace, every log format. It even suggests which transforms to apply. You can go from zero to full-cluster coverage in a couple of minutes.

Flat pricing, no ingest fees

Traditional log platforms charge per GB of ingest. With Transformd, pricing is based on the number of infrastructures, not data volume. The free tier gives you 1 infrastructure with unlimited pipelines.


What to do next

Now that you have a working pipeline, here are some things worth trying:

Filter noisy namespaces. Add a "Filter" transform to drop logs from kube-system or other noisy namespaces. You'll immediately reduce your log volume (and your destination bill) without losing anything you care about.
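Under the hood this becomes a Vector filter condition. If you're curious what the generated code looks like, it's along these lines (the field name assumes Vector's kubernetes_logs metadata; Transformd's generated VRL may differ):

```
.kubernetes.pod_namespace != "kube-system"
```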

Parse application-specific formats. If you have services logging in non-JSON formats (Nginx access logs, syslog, custom formats), add a "Parse with regex" or "Parse key-value" transform. You can have multiple transforms in a pipeline, and they're applied in order.
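For a sense of what the regex transform is doing for you, here's the manual version against an Nginx combined-format line (a plain shell sketch, not Transformd syntax):

```shell
# A sample Nginx combined-format access log line
line='192.0.2.1 - - [10/Oct/2024:13:55:36 +0000] "GET /api/health HTTP/1.1" 200 512 "-" "curl/8.0"'

# The kind of fields a "Parse with regex" transform would lift into structure
ip=$(echo "$line" | awk '{print $1}')
path=$(echo "$line" | awk '{print $7}')
status=$(echo "$line" | awk '{print $9}')
echo "ip=$ip path=$path status=$status"
# → ip=192.0.2.1 path=/api/health status=200
```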

Route different logs to different destinations. Security audit logs to your SIEM, application logs to Loki, debug logs to /dev/null. You can add up to 20 destinations to a single pipeline with per-destination filters.

Set up alerts. Transformd can fire alerts when specific conditions appear in your log stream — error rate spikes, crash loops, absence of expected logs, slow requests. Alerts route to Slack, PagerDuty, Microsoft Teams, or webhooks.

Download a Grafana dashboard. Transformd ships pre-built Grafana dashboards for Kubernetes, Linux, and Docker. Download the JSON from the infrastructure overview page, import into Grafana, and you have instant log visibility.


FAQ

Does this work with managed Kubernetes (EKS, GKE, AKS)?
Yes. The agent runs as a DaemonSet and reads pod logs from the node filesystem. Works identically on managed and self-hosted clusters.

What resources does the agent use?
Typically ~20–50 MB RAM and <1% CPU. We recommend 100m CPU request / 500m limit and 128Mi memory request / 512Mi limit. It's based on Vector, which is written in Rust and extremely efficient.
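In container-spec terms, that recommendation corresponds to the following (the defaults shipped in the install manifest may differ):

```yaml
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
```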

What if the control plane goes down?
Your pipelines keep running. The agent operates on its last-known configuration. You lose the ability to make changes until the control plane recovers, but log collection and routing continue uninterrupted.

Does Transformd see my logs?
No. The agent processes logs on your infrastructure and sends them directly to your destination. The control plane only handles pipeline configuration and health metrics.

Is there a free tier?
Yes — 1 infrastructure, unlimited pipelines, 2 team members, no credit card required. Sign up here.


Transformd is a log pipeline platform built on VRL — the transform engine used by Datadog, Cloudflare, and AWS. Try it free or read more on the blog.
