Ever wondered what happens when you let AI argue with itself?
I built AI Debate Arena, a terminal app where four AI agents (a moderator, a pro debater, a con debater, and a judge) run a full structured debate on any topic you give them, powered by LangGraph and Groq.
Here's how it works and what I learned building it.
## The Concept
The idea is simple: instead of one AI giving you a balanced answer on a topic, what if multiple agents each had a role and a perspective, and had to argue, rebut, and decide?
Four agents, one state machine:
| Agent | Job |
|---|---|
| Moderator | Introduces the topic, sets the rules, picks who goes first |
| Pro | Argues for the topic every round |
| Con | Rebuts and argues against the topic every round |
| Judge | Reviews the full debate history and declares a winner |
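Each role boils down to a function that reads the shared state and appends its message. As a minimal sketch (the field names `topic` and `history` and the injected `llm` callable are my assumptions for illustration, not the project's exact code), the pro debater might look like:

```python
# Minimal sketch of one agent as a pure function over the shared state.
# `state` is a plain dict here; `llm` is any callable mapping prompt -> string.
def pro_node(state, llm):
    """PRO debater: argue for the topic, rebutting the latest message."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in state["history"])
    prompt = (
        f"You are the PRO debater. Topic: {state['topic']}\n"
        f"Debate so far:\n{transcript}\n"
        "Rebut the most recent argument. Avoid repetition."
    )
    reply = llm(prompt)
    return {"history": state["history"] + [{"role": "pro", "content": reply}]}
```

Returning a fresh `history` list (instead of mutating `state`) keeps the agent easy to test and plays nicely with graph-style orchestration.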
## The Stack

- LangGraph: state machine / agent orchestration
- Groq + llama-3.1-8b-instant: fast LLM inference
- Rich: the live typewriter-style terminal UI
## Project Structure

I split the project across four files for a clean separation of concerns:

```
debate-arena/
├── main.py         # Entry point, user input, terminal display
├── agents.py       # State definition, LLM, agent functions
├── connections.py  # Graph nodes, edges, routing logic
└── prompts.py      # All prompt templates
```
## Terminal UI with Rich
After the graph finishes, the full history list is played back with a live typewriter effect using Rich:
```python
from time import sleep

from rich.live import Live
from rich.panel import Panel
from rich.text import Text


def typewriter_panel(role, content):
    colors = {
        "moderator": "cyan",
        "pro": "green",
        "con": "red",
        "judge": "magenta",
    }
    style = colors.get(role, "white")
    text = Text()
    with Live(
        Panel(text, title=role.upper(), border_style=style),
        refresh_per_second=30,
    ) as live:
        # Append one character at a time and refresh the panel for the
        # typewriter effect.
        for char in content:
            text.append(char)
            sleep(0.005)
            live.update(Panel(text, title=role.upper(), border_style=style))
```
Each role gets its own colour: cyan for the moderator, green for pro, red for con, and magenta for the judge.
## Running It

```shell
pip install -r requirements.txt
python main.py
```

```
Enter the topic: AI will replace software engineers
Enter maximum rounds: 3
```
Then watch the debate unfold in your terminal.
## What I Learned
LangGraph's conditional edges are powerful. Once I understood that routing is just a function that returns a string key, wiring up complex agent flows became intuitive.
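Stripped to its essence, that's all a conditional-edge router is: a function from state to the name of the next node (the field names here are assumptions):

```python
def route_next(state: dict) -> str:
    """Decide where the graph goes after the con debater's turn."""
    # Hand off to the judge once the round budget is spent, else loop back.
    if state["round"] >= state["max_rounds"]:
        return "judge"
    return "pro"
```

LangGraph then maps the returned key to an actual node via `add_conditional_edges`.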
Shared state is everything. All four agents read from and write to the same State dict. Keeping it well-defined upfront saved a lot of debugging later.
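For reference, a well-defined shared state in LangGraph style might look like the following (field names are my assumptions). Annotating `history` with a reducer such as `operator.add` is the LangGraph idiom that lets every agent append to the transcript rather than overwrite it:

```python
import operator
from typing import Annotated, TypedDict


class DebateState(TypedDict):
    topic: str
    round: int
    max_rounds: int
    # With LangGraph, the Annotated reducer concatenates each node's
    # returned list onto the existing history instead of replacing it.
    history: Annotated[list, operator.add]
```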
Prompt discipline matters. Telling each agent to "avoid repetition" and "rebut the previous argument" in the prompt made a real difference in output quality.
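The actual templates in prompts.py aren't shown, but as an illustration, such rules can live directly in the template (the wording and names below are hypothetical):

```python
# Hypothetical con-debater template in the spirit of prompts.py.
CON_PROMPT = (
    "You are the CON debater on the topic: {topic}.\n"
    "Previous argument:\n{last_argument}\n"
    "Rebut the previous argument directly. Avoid repetition: do not restate "
    "points you have already made."
)

prompt = CON_PROMPT.format(
    topic="AI will replace software engineers",
    last_argument="Models already write production code.",
)
```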
Groq is fast. A three-round debate with four agents means eight LLM calls (a moderator intro, three pro/con exchanges, and a verdict), and Groq handled this without any noticeable delay.
## What's Next
- Save debate transcripts to a file
- Swap in different models per agent
- Build a web UI with Flask or Streamlit
- Add a third "neutral" debater
The full code is on GitHub: github.com/Sripadh-Sujith/debate-arena
If you build something on top of this or have ideas for improvements, drop them in the comments. Happy to discuss!
Thank you!
Top comments (2)
Cool project! Is this meant to be a SaaS product, or are you building something else with it? Just curious.
I'm actually learning LangGraph, so I needed a project to learn with, and this one helped. I'm aiming to build something bigger using principles from various projects.