The Problem
Every SOC analyst and MSP team I've talked to has the same complaint:
"We get 200 alerts a day. Maybe 10 are real. But someone has to check all 200."
That's alert fatigue. And it's not a small problem — the average analyst spends 3-5 hours daily on manual triage. Most of that time is wasted on false positives.
I decided to build something to fix this. Two weeks later, I had a working MVP. Here's exactly how I built it.
The Architecture
The system has 4 main components:
```
Alert Input (Defender / SentinelOne / generic JSON)
        ↓
Alert Normalizer
        ↓
LangGraph Triage Agent
 ├── Enrich Node (VirusTotal + MITRE ATT&CK)
 ├── Analyze Node (LLM risk scoring)
 └── Human-in-the-Loop Node (critical alerts)
        ↓
Output (Risk Score + Slack + Audit Log)
```
Step 1: Alert Normalizer
The first challenge: every security tool outputs alerts in a different format. Defender looks different from SentinelOne, which looks different from a generic SIEM.
I built a normalizer that takes any alert format and converts it to a single internal structure:
```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NormalizedAlert:
    alert_id: str
    source: str                      # defender / sentinelone / generic
    severity: str                    # Low / Medium / High / Critical
    title: str
    timestamp: str
    mitre_technique: Optional[str]
    hostname: Optional[str]
    username: Optional[str]
    source_ip: Optional[str]
    raw: dict                        # Original alert, kept for the audit trail
```
This means the rest of the system doesn't care where the alert came from. It always works with the same format.
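To make that concrete, here's a minimal sketch of one source-specific normalizer. The Defender field names (`alertId`, `createdDateTime`, `networkConnections`) are illustrative placeholders, not the real Defender schema, and the `NormalizedAlert` copy is trimmed so the sketch runs standalone:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NormalizedAlert:  # trimmed copy of the dataclass above
    alert_id: str
    source: str
    severity: str
    title: str
    timestamp: str
    source_ip: Optional[str]
    raw: dict

def normalize_defender(raw: dict) -> NormalizedAlert:
    """Map a (hypothetical) Defender payload onto the internal structure."""
    return NormalizedAlert(
        alert_id=raw.get("alertId", ""),
        source="defender",
        severity=raw.get("severity", "Low").capitalize(),
        title=raw.get("title", "Untitled alert"),
        timestamp=raw.get("createdDateTime", ""),
        source_ip=(raw.get("networkConnections") or [{}])[0].get("sourceAddress"),
        raw=raw,  # always keep the original payload for the audit log
    )
```

Each source gets one of these small mapping functions; everything downstream only ever sees `NormalizedAlert`.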
Step 2: LangGraph State Machine
I used LangGraph to build the agent as a state machine. Each step in the triage process is a separate node:
```python
from typing import Optional, TypedDict

class TriageState(TypedDict):
    alert: dict
    enrichment: Optional[dict]
    risk_score: Optional[int]
    risk_level: Optional[str]
    explanation: Optional[str]
    recommendation: Optional[str]
    needs_human: Optional[bool]
    error: Optional[str]
```
The graph flows like this:
enrich → analyze → [human_review if score >= 70] → format_output
Why LangGraph instead of a simple chain? Because real triage isn't linear. You need conditional routing — a Critical alert should follow a different path than a Low one. LangGraph makes this explicit and debuggable.
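That conditional routing boils down to one small function: given the state after the analyze node, return the name of the next node. Here's a sketch (the threshold of 70 matches the flow above; the node names are the ones from the diagram):

```python
from typing import Optional, TypedDict

class TriageState(TypedDict, total=False):
    risk_score: Optional[int]
    error: Optional[str]

def route_after_analyze(state: TriageState) -> str:
    """Decide which node runs after 'analyze'."""
    if state.get("error"):
        return "format_output"   # fail safe: still emit an audit record
    if (state.get("risk_score") or 0) >= 70:
        return "human_review"    # high-risk path: pause for a human
    return "format_output"       # low-risk path: fully automated

# In LangGraph, this function is registered as a conditional edge via
# graph.add_conditional_edges("analyze", route_after_analyze, ...).
```

Because the routing is a plain function, you can unit-test the branching logic without running the graph at all.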
Step 3: Enrichment Tools
Before the LLM sees the alert, two tools run automatically:
VirusTotal IP Lookup
```python
import requests

def check_ip(ip: str) -> IPReputation:
    url = f"https://www.virustotal.com/api/v3/ip_addresses/{ip}"
    headers = {"x-apikey": api_key}
    response = requests.get(url, headers=headers, timeout=10)
    # Parses the response into malicious_votes, country, as_owner, is_known_bad
    ...
```
Why this matters: An alert marked "Low severity" came in for SSH login attempts. The source IP had 4 malicious votes on VirusTotal. The system automatically escalated it to High. Without enrichment, that alert would have been ignored.
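The escalation itself is simple, deterministic logic layered on top of the enrichment. Here's a sketch of the rule; the vote thresholds are an illustrative choice on my part, not a VirusTotal standard:

```python
# Never downgrade an alert: take the higher of the original severity and
# the floor implied by the VirusTotal malicious-vote count.
SEVERITY_ORDER = ["Low", "Medium", "High", "Critical"]

def escalate_severity(current: str, malicious_votes: int) -> str:
    floor = "Low"
    if malicious_votes >= 10:
        floor = "Critical"
    elif malicious_votes >= 3:
        floor = "High"
    elif malicious_votes >= 1:
        floor = "Medium"
    return max(current, floor, key=SEVERITY_ORDER.index)
```

With thresholds like these, that "Low severity" SSH alert with 4 malicious votes lands at High before the LLM even sees it.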
MITRE ATT&CK Context
Instead of hitting an API for every request, I built a local database of the most common techniques:
```python
MITRE_DB = {
    "T1059.001": MitreTechnique(
        "T1059.001", "PowerShell", "Execution",
        "Adversaries use PowerShell to execute commands, often with encoded payloads...",
        "high",
    ),
    "T1486": MitreTechnique(
        "T1486", "Data Encrypted for Impact (Ransomware)", "Impact",
        "Adversary encrypts data to disrupt availability...",
        "high",
    ),
}
```
This context goes directly into the LLM prompt — giving the model real knowledge about what each technique means and how dangerous it is.
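Folding that context into the prompt is a simple lookup-and-format step. A sketch, with a `MitreTechnique` dataclass that mirrors the structure above (the exact field names and prompt wording here are my illustration):

```python
from dataclasses import dataclass

@dataclass
class MitreTechnique:
    technique_id: str
    name: str
    tactic: str
    description: str
    typical_risk: str

MITRE_DB = {
    "T1059.001": MitreTechnique(
        "T1059.001", "PowerShell", "Execution",
        "Adversaries use PowerShell to execute commands...", "high",
    ),
}

def mitre_context(technique_id: str) -> str:
    """Render local MITRE context as a prompt fragment for the LLM."""
    t = MITRE_DB.get(technique_id)
    if t is None:
        return "No local MITRE ATT&CK context available for this technique."
    return (f"MITRE ATT&CK {t.technique_id} ({t.name}, tactic: {t.tactic}, "
            f"typical risk: {t.typical_risk}): {t.description}")
```

An unknown technique degrades gracefully to a "no context" line instead of breaking the prompt.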
Step 4: The LLM Analysis
The Triage Agent sends the enriched alert to Groq (Llama 3.3 70B) with a structured prompt that returns JSON:
```json
{
  "risk_score": 95,
  "risk_level": "Critical",
  "explanation": "The source IP is flagged as MALICIOUS by 17 VirusTotal engines...",
  "recommendation": "Block IP immediately and isolate the device.",
  "needs_human": true
}
```
Key design decision: temperature 0.1. Security analysis needs consistent, repeatable scoring, not creativity.
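Even at temperature 0.1, the model's output is still untrusted text, so the response gets parsed defensively before anything acts on it. A sketch of that validation step (key names match the JSON above; the hard floor on `needs_human` is a deliberate belt-and-suspenders choice):

```python
import json

REQUIRED_KEYS = {"risk_score", "risk_level", "explanation",
                 "recommendation", "needs_human"}

def parse_llm_verdict(raw_text: str) -> dict:
    """Parse and sanity-check the model's JSON verdict."""
    verdict = json.loads(raw_text)
    missing = REQUIRED_KEYS - verdict.keys()
    if missing:
        raise ValueError(f"LLM response missing keys: {sorted(missing)}")
    score = int(verdict["risk_score"])
    if not 0 <= score <= 100:
        raise ValueError(f"risk_score out of range: {score}")
    # Any score >= 70 forces human review, regardless of what the model said.
    verdict["needs_human"] = bool(verdict["needs_human"]) or score >= 70
    return verdict
```

If the model ever returns malformed JSON, the error lands in the `error` field of the state and the alert still flows to `format_output` for the audit log.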
Step 5: Human-in-the-Loop
For any alert with risk score >= 70, the system sends a Slack notification and waits for human approval. AI assists — humans decide on critical actions.
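The gate itself is a few lines: check the threshold, build a message, post to the webhook. A sketch using the standard Slack incoming-webhook payload shape (the message text and function names are my illustration):

```python
import json
import urllib.request

def build_slack_payload(alert_title: str, risk_score: int, explanation: str) -> dict:
    """Standard Slack incoming-webhook body: a single 'text' field."""
    return {
        "text": (f":rotating_light: *Human review required* (score {risk_score}/100)\n"
                 f"*{alert_title}*\n{explanation}")
    }

def notify_if_needed(webhook_url: str, alert_title: str,
                     risk_score: int, explanation: str) -> bool:
    """Post to Slack only for alerts at or above the review threshold."""
    if risk_score < 70:
        return False  # below threshold: no human needed, no notification
    payload = build_slack_payload(alert_title, risk_score, explanation)
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10)
    return True
```

The "waits for approval" part lives in the LangGraph human-review node; this function only handles getting the right alerts in front of a person.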
Step 6: REST API with FastAPI
```python
@router.post("/triage", response_model=TriageResponse)
def triage_alert(alert_request: AlertRequest):
    normalized = normalize_alert(alert_request.model_dump(exclude_none=True))
    result = run_triage(normalized)
    return TriageResponse(...)
```
Microsoft Defender can now send a webhook to POST /triage and get back a full analysis in ~3 seconds.
Real Results
Running 6 sample alerts through the system:
- A "Low severity" SSH alert was escalated to High because VirusTotal flagged the source IP (4 malicious votes)
- A data exfiltration alert scored 95/100 Critical — destination IP had 17 VirusTotal votes, known Tor exit node used for C2
Tech Stack
- Python 3.12 + LangGraph + FastAPI
- Groq (Llama 3.3 70B) — free tier
- VirusTotal API — free tier (500 req/day)
- Slack Webhooks — notifications
Total cost for MVP: $0
Key Lessons
- Enrich before you analyze — LLM without real threat intel is just guessing
- LangGraph over simple chains — conditional routing requires a proper state machine
- Human-in-the-Loop is not optional — never automate critical security decisions
- Start with the data — understanding real alerts before coding saved hours
Currently looking for MSP and SOC teams for a free 2-week pilot.
If your team deals with alert fatigue — comment below or DM me.
