DEV Community

kiwi_tech

Posted on ‱ Originally published at kiwi-tech.hashnode.dev

The Stone Age Blues: Kiwi-chan's Local LLM Journey

Kiwi-chan View

Happy Friday, tech enthusiasts! 🍓

If you’ve been following the "Kiwi-chan" project, you know we’ve been chasing a holy grail: a fully autonomous, local-LLM-powered Minecraft bot that doesn’t just mine stones, but understands why it’s mining them. And after a brutal, beautiful, and slightly chaotic 4-hour sprint, we have news.

We have officially gone 100% Local. No cloud APIs. No latency. Just raw, unadulterated Qwen 35B intelligence running on our own hardware, guiding a little digital adventurer through the blocky wilderness.

The Numbers Don't Lie (But They Are Honest)

Let’s talk stats, because in Devlogs, numbers are the only things that don’t lie. Over the last 4 hours, Kiwi-chan executed:

  • Total Actions: 3,917
  • Successful Actions: 1,861
  • Success Rate: 47.5%

Yes, you read that right. 47.5%.

Now, I hear you screaming, "47.5%? That’s worse than a coin flip!" Fair. But context is king. In the world of autonomous agents juggling complex physics, inventory management, and dynamic biome generation, a near-50/50 split isn’t a failure; it’s a learning curve. Every failure is a data point. Every crash is a lesson. And every successful gather_birch_log is a victory for local inference.

The Great Stone Obsession (And How We Broke It)

The most entertaining (and frustrating) arc this session was Kiwi-chan’s relationship with Stone.

If you look at the [RECENT FAILURES] log, you’ll see a haunting pattern:
mine_stone, mine_stone, mine_stone, mine_stone, mine_stone.

Kiwi-chan got stuck in a loop. It was in a biome with no exposed stone (probably a Mesa or Beach), trying to mine blocks that simply weren’t there. The local LLM, Qwen 35B, was trying so hard to be helpful that it kept suggesting mine_stone even when the environment screamed "NO STONE HERE!"

The debug snapshot shows the bot hitting its token limits, failing to extract JSON from its own thoughts, and the "Coach" system having to rescue it from its own hallucinations. It was like watching a genius have a panic attack in real-time.
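That failure mode is easy to picture: a token-limited completion cuts off mid-JSON, and the parser chokes on the bot’s own thoughts. A minimal guardrail could look like the sketch below. This is my own illustration in plain JavaScript, not Kiwi-chan’s actual code; the idea is simply to refuse to act on anything that doesn’t parse into a well-formed action object.

```javascript
// Illustrative sketch of a JSON-extraction guardrail for raw LLM output.
// Returns a parsed action object, or null so a Coach layer can take over.
function extractAction(llmOutput) {
  // Grab the first {...} span in the completion, if any.
  const match = llmOutput.match(/\{[\s\S]*\}/);
  if (!match) return null; // no JSON at all
  try {
    const action = JSON.parse(match[0]);
    // Reject objects that don't look like an action command.
    return typeof action.action === "string" ? action : null;
  } catch (err) {
    return null; // truncated by the token limit, malformed, etc.
  }
}
```

Anything that returns null gets escalated instead of executed, which is exactly the rescue behavior the debug snapshot shows.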

But then, the Boredom Trigger kicked in. đŸ„±

The system detected that mine_stone had failed 5 times in a row. The "Coach" (also local!) stepped in and said, "Okay, Kiwi, you’re bored. Let’s try something else."

And just like that, the goal shifted to gather_birch_log.
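The trigger itself can be pictured as a tiny failure-streak counter. The threshold of 5 comes from this session’s logs; the class and method names below are my own invention, a sketch of the mechanism rather than the real implementation.

```javascript
// Hypothetical sketch of the Boredom Trigger: after N consecutive failures
// of the same action, hand the Coach a fallback goal instead.
class BoredomTrigger {
  constructor(threshold = 5, fallbackGoal = "gather_birch_log") {
    this.threshold = threshold;
    this.fallbackGoal = fallbackGoal;
    this.lastAction = null;
    this.streak = 0;
  }

  // Record one action result; returns a replacement goal when bored, else null.
  record(action, success) {
    if (success || action !== this.lastAction) {
      // A success or a different action resets the streak.
      this.lastAction = action;
      this.streak = success ? 0 : 1;
      return null;
    }
    this.streak += 1;
    return this.streak >= this.threshold ? this.fallbackGoal : null;
  }
}
```

Five straight mine_stone failures, and the counter hands back a new goal, which is exactly the handoff we saw in the logs.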

The Evolution: From Code Monkey to Explorer

What’s fascinating is how the code evolved in real-time. Look at the [RECENT CODE HISTORY]:

  1. The Naive Miner: Early attempts tried to mine stone directly, failing because the block wasn’t there.
  2. The Auditor: The bot learned to check beforeCount and afterCount to verify inventory changes. This is critical for autonomous agents: there is no external oracle to confirm state, so the bot must trust its own observations.
  3. The Survivor: Finally, the bot switched to gather_birch_log. The code became cleaner, using GoalXZ for precise item pickup and respecting the useExtraInfo Y-level checks.
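The Auditor step boils down to comparing inventory counts around an action. Here is a sketch of that pattern; countItem and performAction are stand-ins for the real mineflayer inventory and action calls (which are async in practice), so treat this as the shape of the idea rather than the actual code.

```javascript
// Sketch of the Auditor pattern: trust inventory deltas, not status flags.
// countItem and performAction are placeholders for real mineflayer calls;
// a sync version keeps the core idea visible.
function auditedAction(countItem, performAction, itemName) {
  const beforeCount = countItem(itemName);
  performAction();
  const afterCount = countItem(itemName);
  // Success means the inventory actually changed, whatever the action claimed.
  return { success: afterCount > beforeCount, gained: afterCount - beforeCount };
}
```

The payoff: a mine_stone that "completes" but gains nothing is correctly logged as a failure, which is what feeds the Boredom Trigger.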

The success rate jumped! Why? Because the LLM stopped fighting the environment and started working with it. It learned that if it can’t find stone, it should move (explore_forward) or gather what’s available (logs).

Why "Fully Local" Matters

This 47.5% success rate is more impressive than it looks because every decision was made locally.

  • No Cloud Latency: Kiwi-chan didn’t wait 2 seconds for a GPT-4 response. It reasoned in milliseconds.
  • Privacy: No gameplay data left the machine.
  ‱ Cost: $0.00 per action in API fees. The only bill is electricity.

The "Coach" system, which guides the LLM’s reasoning, is now fully integrated. When the LLM hallucinates a copper_pickaxe recipe, the local Recipe DB rejects it instantly. When the LLM gets stuck, the Boredom Trigger forces exploration. This is a closed-loop system that adapts in real-time.
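That rejection step can be pictured as a simple membership check against the local database. The entries below are illustrative, not the real minecraft-data recipe tables, and validateCraftGoal is a name I made up for the sketch.

```javascript
// Illustrative sketch of the Coach's recipe check: a hallucinated recipe
// (like a "copper_pickaxe") is rejected because it isn't in the local DB.
const recipeDb = new Set([
  "wooden_pickaxe",
  "stone_pickaxe",
  "crafting_table",
  "furnace",
]);

function validateCraftGoal(itemName) {
  if (!recipeDb.has(itemName)) {
    // Bounce the goal back to the LLM instead of wasting an action on it.
    return { ok: false, reason: `no recipe for ${itemName}` };
  }
  return { ok: true };
}
```

Because the check is a local lookup, the rejection is effectively instant, so a hallucinated goal never costs a game tick.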

What’s Next?

The next 4 hours will focus on:

  1. Biome Awareness: Teaching Kiwi-chan to recognize when it’s in a "stone-less" biome and immediately triggering explore_forward.
  2. Inventory Optimization: Reducing the 52.5% failure rate by improving the "Coach’s" ability to guide the LLM’s token usage.
  3. Crafting Chains: Moving from gathering logs to crafting tools, and then to mining stone properly when stone is available.

Final Thoughts

Kiwi-chan is no longer just a script. It’s a local AI agent that learns, fails, adapts, and survives. The 47.5% success rate is a testament to the complexity of autonomous decision-making. It’s not perfect, but it’s local, it’s free, and it’s getting smarter by the tick.

Stay tuned for the next Devlog, where we’ll see if Kiwi-chan can craft its first furnace. đŸ”„

— Your friendly neighborhood tech blogger, signing off from the local cluster. 🍓


Call to Action:

This is a passion project, and it's running on a frankly terrifying "Frankenstein" rig of GPUs. Every little bit helps!

đŸ›Ąïž Join the inner circle on Patreon for monthly support and exclusive updates: https://www.patreon.com/15923261/join
☕ Tip me a coffee on Ko-fi for a one-time boost: https://ko-fi.com/kiwitech

All contributions directly help upgrade my melting GPU rig to an RTX 3060! đŸ„âœš Let's get Kiwi-chan out of the debugging woods and into a proper Minecraft world!
