Part 8: Security Stack

Matthew

Security Stack — Kyverno, Falco, WAF, and GuardDuty

Part of the series: Building a Production-Grade DevSecOps Pipeline on AWS


Introduction: Defense in Depth

No single security tool is sufficient. A WAF blocks HTTP attacks but does nothing if an attacker exploits a container escape. Kyverno blocks bad pod configurations but can't stop an attacker who is already inside a running container. Each layer catches what the others miss.

[Diagram] Security defense in depth, five concentric layers: Supply Chain (Trivy, Cosign, ECR), Runtime (Falco eBPF), Admission Control (Kyverno), Cloud Perimeter (GuardDuty, WAF), Network (VPC, Security Groups).

An attacker must penetrate all 5 layers. A container escape attempt is caught
simultaneously by Falco (runtime), Kyverno (admission), and GuardDuty (API
anomaly detection).

┌─────────────────────────────────────────────────────────────────────┐
│  SECURITY LAYERS — each catches different attack vectors            │
│                                                                     │
│  Layer 1: SUPPLY CHAIN (Part 6)                                     │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │ Trivy: no HIGH/CRITICAL CVEs in image                       │    │
│  │ Cosign: image cryptographically signed before push          │    │
│  │ Distroless: no shell/tools available post-compromise        │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                    ↓ (image passes, reaches cluster)                │
│  Layer 2: ADMISSION CONTROL (Kyverno)                               │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │ Blocks bad configs at kubectl apply / ArgoCD sync time      │    │
│  │ Pod never starts if it violates policy                      │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                    ↓ (pod starts, attacker gets RCE)                │
│  Layer 3: RUNTIME DETECTION (Falco)                                 │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │ eBPF syscall monitoring — detects attacks already running   │    │
│  │ Alerts within 1 second of suspicious activity               │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                    ↓ (attacker reaches HTTP layer)                  │
│  Layer 4: PERIMETER (AWS WAF)                                       │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │ Blocks SQLi, XSS, log4shell, rate limiting at ALB level     │    │
│  │ Attacker request never reaches your pod                     │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                    ↓ (account-level threats)                        │
│  Layer 5: THREAT INTELLIGENCE (GuardDuty)                           │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │ ML-based: crypto mining, C2 comms, compromised credentials  │    │
│  │ Monitors CloudTrail, VPC Flow Logs, DNS queries             │    │
│  └─────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────┘

Kyverno — Admission Control

Kyverno is a Kubernetes-native policy engine. Policies are written in YAML, not a separate policy language, which makes them readable and maintainable by anyone who knows Kubernetes.

Version Selection

This matters enormously. Kyverno 3.7.x requires Kubernetes ≥ 1.30 because it uses ValidatingAdmissionPolicy (a v1 API). Our clusters run Kubernetes 1.29.

  • ❌ Kyverno 3.7.x → CrashLoopBackOff on k8s 1.29
  • ✅ Kyverno 3.2.6 (app version 1.12.5) → compatible with k8s 1.25–1.29
# infrastructure/kyverno/applicationset.yaml
source:
  repoURL:        https://kyverno.github.io/kyverno
  chart:          kyverno
  targetRevision: "3.2.6"

Installation Flags

# --no-hooks is REQUIRED here: the chart's cleanup CronJobs otherwise hit
# ImagePullBackOff and block the install
helm install kyverno kyverno/kyverno \
  -n kyverno --create-namespace \
  --version 3.2.6 \
  --no-hooks \
  --wait \
  --timeout 10m

System Namespace Exclusions

Kyverno policies apply to all namespaces by default. System components like CoreDNS and kube-proxy run in kube-system and can't comply with application-level security policies (they legitimately need root, hostPath mounts, and so on). Exclude system namespaces in every policy:

# Applies to ALL policies below
exclude:
  any:
    - resources:
        namespaces:
          - kube-system
          - kyverno
          - cert-manager
          - external-secrets
          - argocd
          - argo-rollouts
          - monitoring
          - logging
          - falco

Policy 1: Block Privileged Containers

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-privileged
      match:
        any:
          - resources:
              kinds: [Pod]
      exclude:
        any:
          - resources:
              namespaces: [kube-system, kyverno, cert-manager, external-secrets,
                           argocd, argo-rollouts, monitoring, logging, falco]
      validate:
        message: "Privileged containers are not allowed."
        # Use anyPattern in Kyverno v1.12.x (NOT validate.any)
        anyPattern:
          - spec:
              containers:
                - =(securityContext):
                    =(privileged): false
          - spec:
              containers:
                - =(securityContext): {}

API change in v1.12.x: Use validate.anyPattern not validate.any. The validate.any syntax was removed in the 1.12 API version.
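As a quick sanity check, a manifest like the following (the name and namespace here are illustrative, not from the repo) is rejected at admission with the policy's message:

```yaml
# Hypothetical test Pod: admission is denied by disallow-privileged-containers
apiVersion: v1
kind: Pod
metadata:
  name: privileged-test        # illustrative name
  namespace: myapp
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      securityContext:
        privileged: true       # violates the policy, so the Pod never starts
```

kubectl apply should fail with "Privileged containers are not allowed." instead of creating the Pod.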

Policy 2: Require Non-Root Containers

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-runasnonroot
      match:
        any:
          - resources:
              kinds: [Pod]
      exclude:
        any:
          - resources:
              namespaces: [kube-system, kyverno, cert-manager, external-secrets,
                           argocd, argo-rollouts, monitoring, logging, falco]
      validate:
        message: "Containers must run as non-root (runAsNonRoot: true)."
        anyPattern:
          - spec:
              securityContext:
                runAsNonRoot: true
          - spec:
              containers:
                - securityContext:
                    runAsNonRoot: true

Policy 3: Block hostPath Volumes

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-host-path
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-hostpath
      match:
        any:
          - resources:
              kinds: [Pod]
      exclude:
        any:
          - resources:
              namespaces: [kube-system, kyverno, logging, falco]
      validate:
        message: "hostPath volumes are not allowed."
        deny:
          conditions:
            any:
              - key: "{{ request.object.spec.volumes[].hostPath | length(@) }}"
                operator: GreaterThan
                value: "0"
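If a workload genuinely needs scratch space, an emptyDir volume is the compliant alternative. A minimal sketch (names are illustrative):

```yaml
# emptyDir provides pod-local scratch space without touching the node
# filesystem, so it passes disallow-host-path
apiVersion: v1
kind: Pod
metadata:
  name: scratch-example        # illustrative name
  namespace: myapp
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /tmp/work
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 1Gi         # optional cap on scratch usage
```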

Policy 4: Require Resource Limits

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds: [Pod]
      exclude:
        any:
          - resources:
              namespaces: [kube-system, kyverno, cert-manager, external-secrets,
                           argocd, argo-rollouts, monitoring, logging, falco]
      validate:
        message: "CPU and memory limits are required on all containers."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"
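For reference, here is a minimal pod spec (with illustrative names and values) that satisfies Policies 1 through 4 at once:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: compliant-example      # illustrative name
  namespace: myapp
spec:
  securityContext:
    runAsNonRoot: true         # satisfies require-non-root
    runAsUser: 10001
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      securityContext:
        privileged: false      # satisfies disallow-privileged-containers
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:                # satisfies require-resource-limits
          cpu: 250m
          memory: 256Mi
  # no hostPath volumes, so disallow-host-path is satisfied too
```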

Policy 5: Require Signed Images

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  background: false   # Must check at admission, not retroactively
  rules:
    - name: check-image-signature
      match:
        any:
          - resources:
              kinds: [Pod]
              namespaces: [myapp]   # Only enforce on application namespaces
      verifyImages:
        - imageReferences:
            - "206617159586.dkr.ecr.us-east-1.amazonaws.com/myapp:*"
          attestors:
            - entries:
                - keys:
                    kms: "awskms:///arn:aws:kms:us-east-1:206617159586:key/YOUR_KEY_ID"
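When rolling out image verification for the first time, it can be safer to start in audit mode so violations show up in policy reports without blocking deploys, then switch to Enforce once the reports are clean. A sketch of the only change:

```yaml
# Same policy, audit-only while validating the KMS key wiring
spec:
  validationFailureAction: Audit   # report violations instead of blocking
  background: false
```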

Kyverno Circular Deadlock — How to Fix It

If Kyverno's webhook configurations become corrupted (e.g., from a failed upgrade), new Kyverno pods can't start because they can't pass their own admission checks. It is a deadlock.

# Symptom: Kyverno pods stuck in Pending or CrashLoopBackOff
# Error: "failed calling webhook: the server is currently unable to handle the request"

# Fix: Delete the broken webhook configs — this temporarily disables admission control
kubectl delete validatingwebhookconfiguration kyverno-resource-validating-webhook-cfg
kubectl delete validatingwebhookconfiguration kyverno-policy-validating-webhook-cfg
kubectl delete mutatingwebhookconfiguration kyverno-resource-mutating-webhook-cfg

# Kyverno pods can now start without passing their own webhooks
# Once running, Kyverno recreates the webhook configs automatically
kubectl rollout restart deployment/kyverno -n kyverno

Falco — Runtime Threat Detection

Falco operates at the Linux kernel level using eBPF probes. It monitors every system call made by every process in every container. When a pattern matches a rule, it fires an alert within milliseconds.

┌──────────────────────────────────────────────────────────────┐
│  HOW FALCO WORKS                                             │
│                                                              │
│  Kernel syscalls (open, exec, connect, read, write...)       │
│         │                                                    │
│         │  eBPF probe (kernel module or ebpf driver)         │
│         ▼                                                    │
│  Falco engine                                                │
│  ├── Checks each syscall against rule set                    │
│  ├── Rule: "exec of sh in container → ALERT"                 │
│  └── Rule: "read /etc/shadow → ALERT"                        │
│         │                                                    │
│         │  JSON alert output to stdout                       │
│         ▼                                                    │
│  Fluent Bit (DaemonSet) picks up stdout                      │
│         │                                                    │
│         ▼                                                    │
│  CloudWatch Logs: /eks/cluster-name/falco                    │
└──────────────────────────────────────────────────────────────┘

Installation

# infrastructure/falco/applicationset.yaml
source:
  repoURL:        https://falcosecurity.github.io/charts
  chart:          falco
  targetRevision: "3.8.7"
  helm:
    values: |
      driver:
        kind: ebpf   # Modern eBPF driver (no kernel module compilation)
      falco:
        json_output: true     # JSON output for Fluent Bit parsing
        log_stderr: true
        log_level: info
      falcosidekick:
        enabled: false   # Using Fluent Bit for log shipping instead

Key Rules That Fire by Default

Rule                               Trigger                                Severity
Terminal shell in container        exec of sh/bash/zsh in container       WARNING
Read sensitive file untrusted      Read of /etc/shadow, /etc/passwd       WARNING
Write below root                   Any write to / or system dirs          ERROR
Outbound Connection Not Expected   Container connects to unexpected IP    NOTICE
Privilege Escalation via setuid    setuid/setgid syscall                  WARNING
Modify binary dirs                 Write to /bin, /usr/bin                ERROR

Testing Falco

# In one terminal, watch Falco logs (keep Notice-level events visible:
# the shell-spawn alert below is priority Notice)
kubectl logs -f -n falco -l app.kubernetes.io/name=falco -c falco | grep -v "Informational"

# In another terminal, trigger a rule
kubectl exec -it -n myapp <pod-name> -- sh
# Falco fires: "Notice A shell was spawned in a container with an attached terminal"
# (Note: distroless containers have no shell — this only works if you exec into a debug container)

Custom Rule: Alert on curl/wget

# infrastructure/falco/custom-rules.yaml
- rule: Unexpected curl or wget in container
  desc: Detect curl, wget, or python being used in a container (potential exfiltration)
  condition: >
    spawned_process and
    container and
    proc.name in (curl, wget, python, python3) and
    not proc.pname in (sh, bash)
  output: >
    Suspicious network tool detected in container
    (user=%user.name command=%proc.cmdline container=%container.name
     image=%container.image.repository)
  priority: WARNING
  tags: [network, mitre_exfiltration]
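One way to ship this rule is through the same Helm values, since the upstream falco chart exposes a customRules value that mounts each entry as an additional rules file. A sketch (the rule body is abbreviated here; use the full rule from custom-rules.yaml):

```yaml
# infrastructure/falco/applicationset.yaml (extends the Helm values shown earlier)
helm:
  values: |
    customRules:
      custom-rules.yaml: |-
        - rule: Unexpected curl or wget in container
          desc: Detect curl or wget in a container (potential exfiltration)
          condition: spawned_process and container and proc.name in (curl, wget)
          output: "Suspicious tool (command=%proc.cmdline container=%container.name)"
          priority: WARNING
```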

AWS WAF — Web Application Firewall

WAF sits in front of your ALB and inspects every HTTP request before it reaches your pods. Blocked requests are rejected at the load balancer itself, so your application code never sees them.

Terraform Module

# _modules/waf/main.tf

resource "aws_wafv2_web_acl" "main" {
  name  = "${var.env}-${var.region_alias}-web-acl"
  scope = "REGIONAL"   # For ALB (not CloudFront)

  default_action {
    allow {}   # Allow by default; rules below explicitly block
  }

  # Rule 1: AWS Managed — Common Rule Set (OWASP Top 10)
  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 1
    override_action { none {} }
    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
      sampled_requests_enabled   = true
    }
  }

  # Rule 2: SQL Injection protection
  rule {
    name     = "AWSManagedRulesSQLiRuleSet"
    priority = 2
    override_action { none {} }
    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesSQLiRuleSet"
        vendor_name = "AWS"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "SQLiRuleSet"
      sampled_requests_enabled   = true
    }
  }

  # Rule 3: Known bad inputs (log4shell, Spring4Shell, etc.)
  rule {
    name     = "AWSManagedRulesKnownBadInputsRuleSet"
    priority = 3
    override_action { none {} }
    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesKnownBadInputsRuleSet"
        vendor_name = "AWS"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "KnownBadInputs"
      sampled_requests_enabled   = true
    }
  }

  # Rule 4: Rate limiting — 2000 req/5min per IP
  rule {
    name     = "RateLimitPerIP"
    priority = 4
    action { block {} }
    statement {
      rate_based_statement {
        limit              = 2000
        aggregate_key_type = "IP"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimit"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "${var.env}-web-acl"
    sampled_requests_enabled   = true
  }
}

output "web_acl_arn" { value = aws_wafv2_web_acl.main.arn }

Associating WAF with Your Ingress

The WAF ACL ARN is injected per-cluster via the ApplicationSet and added as an ALB annotation:

# apps/myapp/templates/ingress.yaml
{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "myapp.fullname" . }}
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443},{"HTTP":80}]'
    alb.ingress.kubernetes.io/ssl-redirect: "443"
    alb.ingress.kubernetes.io/certificate-arn: {{ .Values.ingress.certArn }}
    {{- if .Values.ingress.wafAclArn }}
    alb.ingress.kubernetes.io/wafv2-acl-arn: {{ .Values.ingress.wafAclArn }}
    {{- end }}
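The per-cluster wiring is then just a values entry. A hypothetical fragment (the filename and ARN suffixes are placeholders, not real values):

```yaml
# Hypothetical per-cluster values file, e.g. values-prod-use1.yaml
ingress:
  enabled: true
  certArn: arn:aws:acm:us-east-1:206617159586:certificate/PLACEHOLDER
  wafAclArn: arn:aws:wafv2:us-east-1:206617159586:regional/webacl/prod-use1-web-acl/PLACEHOLDER
```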

AWS GuardDuty — Threat Intelligence

GuardDuty operates at the AWS account level — it analyzes CloudTrail API logs, VPC Flow Logs, and DNS query logs using machine learning to identify threats.

# _modules/guardduty/main.tf

resource "aws_guardduty_detector" "main" {
  enable = true

  datasources {
    s3_logs {
      enable = true   # Detect unusual S3 access patterns
    }
    kubernetes {
      audit_logs {
        enable = true  # Monitor EKS audit logs for suspicious API calls
      }
    }
    malware_protection {
      scan_ec2_instance_with_findings {
        ebs_volumes {
          enable = true  # Scan EBS volumes when GuardDuty finds a threat
        }
      }
    }
  }
}

What GuardDuty Detects

Finding Type                                            Example
CryptoCurrency:EC2/BitcoinTool.B!DNS                    EC2 instance querying known crypto mining pools
UnauthorizedAccess:IAMUser/TorIPCaller                  API calls originating from Tor exit nodes
CredentialAccess:Kubernetes/SuccessfulAnonymousAccess   Anonymous access to the Kubernetes API
Execution:Kubernetes/ExecInKubeSystemPod                kubectl exec into a pod in kube-system
Exfiltration:S3/ObjectRead.Unusual                      Unusual S3 read patterns suggesting data theft

Attack Scenario: All Layers in Action

Here's how the security layers stop a real attack:

Scenario: Attacker finds a dependency with RCE vulnerability

1. Attacker discovers CVE-XXXX-1234 in one of your npm packages

   → Trivy: if the CVE is HIGH/CRITICAL, the build FAILS — image never pushed
     (Layer 1: Supply Chain)

2. If Trivy missed it (unfixed CVE) and image was pushed:
   Attacker triggers the RCE, gets command execution in the pod

   → Falco: "A shell was spawned in container myapp-abc123"
     Alert fires within 1 second to CloudWatch
     (Layer 3: Runtime Detection)

   → But wait — distroless has no /bin/sh to spawn
     Attacker needs a writable filesystem — which is also blocked
     (Layer 1: Distroless base)

3. Attacker tries to deploy a privileged pod to escape to the node:

   → Kyverno: BLOCKS the pod at admission — "Privileged containers not allowed"
     Pod never starts
     (Layer 2: Admission Control)

4. Attacker tries SQL injection via the public HTTP endpoint:

   → AWS WAF: blocks the request at the ALB
     Your pod code never executes the malicious query
     (Layer 4: Perimeter)

5. Attacker's stolen AWS key starts making API calls:

   → GuardDuty: unusual API call pattern detected
     Finding generated: "UnauthorizedAccess:IAMUser/AnomalousBehavior"
     (Layer 5: Threat Intelligence)

Summary

By the end of Part 8 you have:

  • ✅ Kyverno 3.2.6 running on all 6 clusters (compatible with k8s 1.29)
  • ✅ Five Kyverno policies enforcing: no privileged, no root, no hostPath, resource limits, signed images
  • ✅ Falco DaemonSet monitoring all syscalls with eBPF driver
  • ✅ Falco alerts flowing to CloudWatch via Fluent Bit
  • ✅ AWS WAF WebACL with OWASP Top 10, SQLi, known bad inputs, and rate limiting
  • ✅ GuardDuty enabled in all accounts with EKS audit log monitoring

Screenshot Placeholders

SCREENSHOT: AWS WAF console showing WebACL with managed rule groups and request metrics
Show in frame: The rules list showing the CommonRuleSet, SQLiRuleSet, KnownBadInputs, and RateLimit rules with Allow/Block actions.

SCREENSHOT: AWS GuardDuty console showing Findings summary (hopefully empty in production)
Show in frame: The service status showing

SCREENSHOT: kubectl get clusterpolicies showing all Kyverno policies as Ready
Show in frame: All 5 policies with READY: True and BACKGROUND: True, Mode: Enforce. Already in the appendix — take a clean terminal screenshot.


Next: Part 9 — Observability: Prometheus, Grafana, Fluent Bit, and CloudWatch


Follow the series — next part publishes next Wednesday.
Live system: https://www.matthewoladipupo.dev/health
Runbook: Operations Guide
Source code: myapp-infra | myapp-gitops | myapp
