In the previous post, I showed you an AI doing something genuinely useful, helping me adapt a recipe for a dinner party. We talked about the basic ...
Thanks for the practical explanations! A few further thoughts:
The challenge of hallucination you highlight is exacerbated in voice-first interfaces. When a user asks 'mujhe bukhar hai' (I have a fever) in their mother tongue, they need accuracy, not plausible invention. There are no visual cues to flag uncertainty in a spoken response.
This makes data provenance and confidence scoring even more critical.
I focused mostly on text-based AI hallucinations, but you opened up my mental model further. You are right, voice makes it trickier because users lose visual trust signals and confidence can be mistaken for correctness.
The multilingual example makes this even more real. Provenance plus confidence scoring feels critical in these use cases. Do you think future voice assistants should say "I'm not certain" or "this information comes from medical guidelines" rather than optimizing purely for smooth conversational flow? Or something else?
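One way to picture that trade-off is a small sketch that gates the spoken response on a confidence score and appends provenance out loud, since voice has no visual trust cues. Everything here is invented for illustration: the field names, the threshold, and the phrasing are assumptions, not a real assistant's API.

```python
# Hypothetical sketch: surface uncertainty and provenance in a *spoken* answer.
# Field names ("confidence", "source"), the 0.8 threshold, and the wording
# are all invented for illustration.

def render_spoken_answer(answer: dict, confidence_threshold: float = 0.8) -> str:
    """Turn a model answer into speech that flags uncertainty aloud."""
    text = answer["text"]
    confidence = answer.get("confidence", 0.0)
    source = answer.get("source")

    if confidence < confidence_threshold:
        # No visual cues in voice, so the hedge has to be spoken.
        text = "I'm not certain, but " + text[0].lower() + text[1:]
    if source:
        text += f" This information comes from {source}."
    return text

print(render_spoken_answer(
    {"text": "A fever above 38°C usually warrants rest and fluids.",
     "confidence": 0.62,
     "source": "WHO medical guidelines"}
))
```

The interesting design question is exactly the one above: the hedged sentence is less "smooth," but in a health query that friction is arguably the feature.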
Thanks for sharing this perspective!
This is a great primer on the 'why' behind hallucinations. Most people assume AI is a database when it’s actually a reasoning engine—and those two things have very different relationship statuses with the 'Truth.'
However, from an Infrastructure Thinking perspective, the goal isn't just to understand why it lies, but to build a system where those lies can't reach the end-user. I’ve been working on a pattern I call the Sovereign Gateway, which treats the LLM as an untrusted agent. Instead of just hoping the model doesn't hallucinate, we use Versioned Snapshots and Forensic Integrity Checks to validate the output against a 'Ground Truth' database—like the SQL transactions and procedures mentioned in other foundational stacks—before the data is ever surfaced.
In my Sovereign Synapse series, I argue that the 'Staleness vs. Latency' trade-off is often where these hallucinations hide. If the data pipeline is too slow, the agent 'fills in the gaps.' By moving toward Shadow-Routing logic, we can audit the agent's forensic integrity in real time.
The 'Why' is important, but for those of us building production-grade AI, the 'How do we contain it' is the real challenge.
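The gateway idea, treating the LLM as an untrusted agent and checking its claims against a ground-truth store before anything reaches the user, might be sketched roughly like this. All names are hypothetical, and the single price check stands in for whatever integrity check fits the domain; it is not the actual Sovereign Gateway implementation.

```python
import sqlite3

# Hypothetical sketch of an "untrusted agent" gateway: the LLM's structured
# answer is validated against a ground-truth table before being surfaced.
# Table, columns, and the check itself are invented for illustration.

def surface_if_grounded(llm_answer: dict, db: sqlite3.Connection):
    """Return the answer only if its claimed price matches the database."""
    row = db.execute(
        "SELECT price FROM products WHERE sku = ?", (llm_answer["sku"],)
    ).fetchone()
    if row is None or row[0] != llm_answer["price"]:
        return None  # block the ungrounded claim; caller falls back to a safe response
    return llm_answer

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, price REAL)")
db.execute("INSERT INTO products VALUES ('A-100', 19.99)")

print(surface_if_grounded({"sku": "A-100", "price": 19.99}, db))  # grounded: passes through
print(surface_if_grounded({"sku": "A-100", "price": 24.99}, db))  # hallucinated price: blocked
```

The key property is that the model is never trusted to be right, only to propose; the database, not the model, decides what the user sees.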
Thank you for the details, Ken. The "how" is definitely the bigger challenge for production-grade systems. Evaluations and ground truth are now so much more important! I would love to read more on your pattern; can you please point me to the right links?
The "context reduces, doesn't eliminate" framing carries over neatly to AI markup on long manuscripts. We run an auto-assign pass that tags every line in a chapter by speaker (narrator vs. each character), and even with the full chapter in context, the model will occasionally invent a speaker the prose doesn't actually attribute, especially in dialogue blocks where the author drops attribution between turns and leaves the reader to infer who's talking. It's the same prediction-filling-a-gap mechanism you describe, just applied to character attribution instead of facts.

What we've found works is treating the pass as a draft the writer expects to correct, rather than a finished answer: close in shape to your grounding + evaluation + guardrails triple, with the human edit step acting as the evaluator. The model is great at producing a useful-looking attribution; the writer is the one who knows whether it's actually true.
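That draft-plus-human-evaluator loop can be sketched very roughly like this. The regex heuristic below stands in for the model's attribution pass, and all names are invented; the point is only the shape of the workflow, where anything without an explicit cue is marked as a guess for the writer to confirm.

```python
import re

# Hypothetical sketch: draft speaker attributions, flagging unattributed
# dialogue for human review instead of presenting a guess as final.
ATTRIBUTION = re.compile(r"\b(?:said|asked|replied)\s+(\w+)")

def draft_attributions(lines):
    """Return (line, speaker, status) triples; the writer reviews the drafts."""
    tagged = []
    for line in lines:
        match = ATTRIBUTION.search(line)
        if match:
            tagged.append((line, match.group(1), "attributed"))
        else:
            # No explicit cue in the prose: any speaker we assign here would
            # be exactly the invented-attribution failure mode, so flag it.
            tagged.append((line, None, "needs_review"))
    return tagged

chapter = [
    '"Where were you?" asked Mira.',
    '"Out walking."',  # author drops attribution between turns
]
for line, speaker, status in draft_attributions(chapter):
    print(status, speaker, "|", line)
```

The design choice mirrors the comment above: the system's output is a draft by contract, so the human edit step is the evaluator rather than an afterthought.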
A recent response I got from a model when I asked where it got some numbers -- "I'm gonna be honest, I made that up." 😂
Recognisable 😂
🤣 at least the model is honest!
Just yesterday I had Opus asking me after every prompt: we have been going for a long time, let me save my context and continue tomorrow 😂
lol "do what you have to do buddy"
:D I really did answer every time: "you are a computer, just continue." But it only got worse, so I had to start a new session :)
Yes! I am talking about how a long context window degrades quality in upcoming blogs!
Interesting; it kind of seems like some big corps are getting in on the "fake AI-generated junk."
As far as I'm aware, though, there are a lot of awesome options for learning, planning, and making progress with AIs.
The OP's point stands though: never believe the AI unless you are confident, and back-check frequently.