DEV Community

# llm

Posts

Speed, caching, and the 40x cost wall

2
Comments
3 min read
Turning Server Logs into Incident Summaries with Java and Groq

Comments
8 min read
Hermes Agent Skill Authoring — SKILL.md Structure and Best Practices

Comments
10 min read
I Built a Free Daily AI News Engine Using Claude Code CLI — No API Key Needed

Comments
3 min read
Context Governance for Coding Agents

1
Comments 2
25 min read
Forget Your RAG: Build Your Own LLM Wiki in C# with Ollama + Kimi (Step‑by‑Step Guide)

2
Comments
10 min read
Agentic RAG: What It Is, Why Teams Use It, and Where It Gets Complicated

Comments
3 min read
Coding in the Age of AI Is Not What You Think

Comments
6 min read
DeepClaude: I Combined Claude Code with DeepSeek V4 Pro in My Agent Loop and the Numbers Threw Me Off

1
Comments
8 min read
Gemma 4 MTP, vibevoice.cpp for Multimodal AI, & Ollama Desktop Layer for Local Deployment

Comments
3 min read
What I learned tuning a Reddit DM agent through 8 versions in 24 hours

Comments
15 min read
PII Protection for AI Agents: Why Detection Isn't Enough and What Prevents Actual Exposure

2
Comments 1
8 min read
I rebuilt my open-source AI coding agent that routes each pipeline stage to a different LLM

Comments
5 min read
# Why `$0.0029` and `$0.0047` Can Both Be Right: Prefix Caching for API-Served LLM Judges *By Eyoel Nebiyu*

Comments
3 min read
From OOM to 262K Context: Running Qwen3-Coder 30B Locally on 8GB VRAM

Comments
8 min read