When I was implementing my weather station system, I asked myself: what if I built it again but this time using AI?
The idea I had in mind was to compare the same project implemented with both approaches. Is AI development as fast as the hype claims? What about code quality? Will I face the same challenges when using AI? As a developer, will I feel better or worse?
In this article I will try to explain what it was like to build my Weather Monitoring System again using AI, and how I felt, being as honest as possible.
Experiment Context
The Components
- Sensor Reader: a Python program that reads weather data from the sensor, displays the current readings, and sends the data to the web app.
- Web App (Dashboard & API): a PHP + Symfony web application that receives the data from the weather station and displays it in a dashboard.
AI Setup
I didn't want to spend money on AI, so I tried the free plans of Gemini and OpenRouter, and also Ollama locally. But with Gemini and OpenRouter I ran out of tokens too quickly, and running Ollama on my own PC didn't work properly.
So, following a colleague's recommendation, I ended up using OpenCode and its default model: Big Pickle.
I have to say that, regardless of the final result, it worked quite well.
Source Code
You can check the source code for both approaches here:
- Sensor Reader
- Webapp
Rewrite approach
It is true that when rebuilding a project you don't have to experiment: you know exactly what you want. To compensate for this advantage, I tried to act as if the "handmade" projects didn't exist, and I also acted as a "vibe coder": I only wanted it to work, without writing a single line of code and without caring about how it was built.
Time Spent
I measured the time spent by assigning 2 hours to each day on which I committed code.
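That counting rule is simple enough to sketch. The helper below is illustrative, not part of either project; in practice the timestamps would come from `git log`:

```python
from datetime import datetime

def estimate_hours(commit_timestamps, hours_per_day=2):
    """Estimate time spent as `hours_per_day` per distinct day with a commit."""
    days = {datetime.fromisoformat(ts).date() for ts in commit_timestamps}
    return len(days) * hours_per_day

# Three commits spread over two distinct days count as 2 days * 2h = 4h.
print(estimate_hours([
    "2024-03-01T10:15:00",
    "2024-03-01T18:40:00",
    "2024-03-02T09:05:00",
]))  # → 4
```

It is a coarse proxy, of course: a day with one ten-minute commit and a day of deep work both count the same.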
| Project | non-AI | AI |
|---|---|---|
| Sensor Reader | 42h | 14h |
| Webapp | 28h | 12h |
"Wow!! Four times faster using AI!!"
But the reality is that, during the non-AI development, I had to learn Python and how the Pimoroni library works. I also didn't have a clear idea of how to structure the code, so I refactored it several times as I learned how to apply good programming practices in Python.
Another factor to take into account is how I used the version control system with each approach. With AI I made fewer commits than without it because:
- Without AI, I committed each small step I considered OK.
- With AI, I made fewer but larger commits, as I waited until a complete piece of functionality was working.
Regarding the Webapp project, I had it finished in around 16h. The rest of the time, up to 28h, I spent iterating on which statistical weather data I wanted to see alongside the current values, and on how to display the information. With AI, I directly applied the solution I had finally settled on in the non-AI approach.
Taking this into account, and the fact that, as a vibe coder, I didn't spend time reviewing the code, my "spider-sense" warns me that the numbers may not be as impressive as they seem.
Code Quality
I used SonarQube to obtain code quality metrics such as maintainability, security, reliability and code duplication. Is it better with AI or without? Let's see.
Sensor Reader
The non-AI project had 45 maintainability issues for not following the snake_case convention. I had decided to use camelCase, probably because I am mainly a Java & PHP developer. For me this was not an issue, so I marked them as false positives.
I also removed some issues from the Weather HAT display class, as it is just a wrapped copy of Pimoroni's example; I only made a few changes to turn it into an implementation of DisplayInterface.
Taking this into account, this is the result without AI:

And this is the result with AI:

The reliability issue in the AI project is a useless assignment in the factory.py file.
In my opinion, the quality seems better without AI, but it is true that there is no big difference.
Regarding complexity and technical debt, these are the results.
| | non-AI | AI |
|---|---|---|
| Cyclomatic complexity | 72 | 87 |
| Cognitive complexity | 19 | 87 |
| Technical debt | 50min | 7min |
The AI code is harder to understand (cognitive complexity of 87 vs 19), but the technical debt is higher in the non-AI project due to my lack of Python knowledge: Python has no explicit interface construct, and I didn't approach the problem correctly.
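For reference, the closest idiomatic substitute for a Java-style interface in Python is an abstract base class from the standard `abc` module. A minimal sketch of what a DisplayInterface could look like this way (the `show` method and `ConsoleDisplay` are illustrative, not the project's actual code):

```python
from abc import ABC, abstractmethod

class DisplayInterface(ABC):
    """Abstract base class playing the role of an interface."""

    @abstractmethod
    def show(self, readings: dict) -> None:
        """Render the current sensor readings."""

class ConsoleDisplay(DisplayInterface):
    """A concrete implementation that prints readings to stdout."""

    def show(self, readings: dict) -> None:
        for name, value in readings.items():
            print(f"{name}: {value}")

ConsoleDisplay().show({"temperature": 21.5})
# Instantiating DisplayInterface directly raises TypeError,
# which is how Python enforces the "interface" contract.
```

With this in place, SonarQube (and type checkers) can verify that every display class actually implements the contract.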
Webapp
The results are quite similar, but the AI approach has 3.6% duplicated code.


The complexity and technical debt results are as follows:
| | non-AI | AI |
|---|---|---|
| Cyclomatic complexity | 70 | 72 |
| Cognitive complexity | 10 | 16 |
| Technical debt | 30min | 1h 43 min |
AI is clearly worse in these metrics, with more than three times the technical debt and code that is a bit harder to understand.
Development Experience
I had mixed feelings using AI. On the one hand, I was amazed at what you can accomplish with a simple set of requirements. On the other hand, I felt overwhelmed and quite distrustful of the final result.
The AI struggled at several important points and, in the end, I had to explain where the problem was myself, as it was unable to find the solution.
In the Sensor Reader project, the AI applied the temperature offset incorrectly. Instead of passing the offset value to Pimoroni's library before reading the sensor, it applied the offset after the temperature had been read. As a result, the relative humidity values were wrong. After several iterations in which I told the AI the result was wrong and asked it to read Pimoroni's documentation, I had to explain to it how to fix the issue.
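To see why applying the offset after the reading breaks the humidity values: relative humidity is defined against the saturation vapour pressure at the reported temperature, so shifting the temperature afterwards without also recomputing RH leaves the two values inconsistent. A minimal sketch of that recomputation (function names are illustrative, not from the project), using the standard Magnus approximation:

```python
import math

def saturation_vp(temp_c: float) -> float:
    """Saturation vapour pressure in hPa (Magnus approximation)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))

def corrected_rh(rh_measured: float, t_measured: float, t_offset: float) -> float:
    """Recompute relative humidity after a post-read temperature offset.

    The absolute amount of water vapour in the air is unchanged; only
    the saturation pressure at the corrected temperature differs.
    """
    t_corrected = t_measured + t_offset
    rh = rh_measured * saturation_vp(t_measured) / saturation_vp(t_corrected)
    return min(rh, 100.0)

# A sensor warmed by the Pi reads 28 °C / 40 % RH; with a -5 °C offset
# the corrected air temperature is 23 °C, so the true RH is higher.
print(round(corrected_rh(40.0, 28.0, -5.0), 1))  # → 53.8
```

A quick sanity check: with a 0 °C offset the function returns the measured RH unchanged. Letting Pimoroni's library handle the offset before compensation avoids having to do any of this by hand.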
In the Web app I had several problems with authentication:
- On the first run, user authentication didn't work. The AI had to fix several issues, one after another: 404 and 500 errors, an infinite loop when accessing the login form, etc. In one of these iterations, the AI even recommended that I remove the CSRF token from the login form!
- Once authentication was solved, logout didn't work, and I had to implement it myself, as the AI was unable to make it work.
- When introducing JWT authentication, the AI removed the session-based user authentication!
- The AI was unable to make JWT authentication work alongside session-based authentication. It insisted several times that the problem was the Apache web server configuration, but it was simply how the project needed to be configured to run on Apache. Again, I had to make it work myself.
It was very frustrating and I ended up quite angry several times.
The SDD (Spec-Driven Development) experience was also frustrating.
Natural language is imprecise by nature. A real developer can fill the gaps in a specification document by applying their own judgment and domain knowledge; AI will just hallucinate. To avoid this as much as possible, you have to be extremely precise when writing the specification. And, IMHO, that is far more complex than coding.
I ended up using the AI to do things step by step instead of telling it to follow the specification file.
Conclusion
AI is helpful for solving bounded tasks and problems, like finding a bug, implementing an interface or connecting to an endpoint. Building an entire project from a specification document is another story. With bounded tasks it is easy to review the code and feel more or less accountable for the result. With tons of code to review, I feel less in control of the final result.
Don't get me wrong, AI is a good tool, an amazing one, but you still need to know how to write code. And the only way I know to get better at using AI is to continue facing problems by ourselves.
The key point is to find which kinds of problems we should solve ourselves and which ones can be delegated to the AI.
Top comments (23)
second build always goes faster - you’ve already solved the domain problems once. the interesting comparison would be two teams starting cold, not one dev running two passes. AI gets credit for your own prior learning.
You're completely right. I mention in the article that the time comparison is maybe not as impressive as it seems, for the reason you are stating.
This is a personal project I have implemented in my free time; I don't have a team, unless my son learns to program :-)
Solo project changes the calculus completely — no team history to control for. Future pair programmer incoming.
Really enjoyed the honesty in this comparison. The part that stood out to me was your point that AI looked much faster partly because the problem was already well understood, while the non-AI version included the real learning curve and refactoring cost. Also interesting that Sonar showed lower technical debt for the AI version but much higher cognitive complexity — that feels very true in practice. As an AI founder, I think this is the right framing: AI compresses iteration time, but it doesn’t remove the need for judgment.
That is the most important thing for me when developing with AI: how to avoid the loss of judgement caused by the business pressure for velocity.
Exactly — velocity pressure is its own form of technical debt for judgment. When you're building under time pressure, you stop asking "is this the right abstraction?" and start asking "does it work right now?"
I've noticed AI amplifies whatever decision-making mode you're already in. If you're in slow/deliberate mode, it helps you build more. If you're in fast/deadline mode, it helps you skip the uncomfortable questions faster.
The discipline of stopping to think — "why did the AI suggest this?" — is what separates good AI-assisted dev from accumulated invisible debt.
100% agree!
The meta-discipline of pausing to ask that question gets harder under pressure, precisely when it matters most. One pattern I've found useful: treat it like a code review rule: never merge AI-generated code without being able to explain why it works, not just that it works. The explanation itself surfaces the judgment gaps. The projects where AI hurt the most were the ones where we shipped fast and never had that reckoning. The debt did not show up in the code; it showed up in the team's mental model of the system.
As someone once said: "With great power comes great responsibility" 😉
Great breakdown of your experience! From the rewrite approach section I knew it was going to be fair and balanced.
Thanks for your kind comment. I'm glad you liked it! 😊
This is one of the most realistic comparisons of AI vs non-AI development I’ve read. The time-saving benefits of AI are impressive, but your experience clearly shows that speed doesn’t always mean better quality or easier debugging. I especially agree with the point that vague specifications can create more confusion with AI than with human developers. AI works best as a coding assistant, not a replacement for problem-solving and engineering judgment. The balance between human expertise and AI tools will probably define the future of software development.
Exactly, finding when to use AI and when not is key.
Regarding specifications, I remember, in the early 2000s, reading hundreds of pages of software requirement specification documents. They aimed to be accurate enough for developers to translate them into code without problems. But the reality was that these documents always contained gaps, and sometimes contradictory statements depending on the page you were reading.
Thanks for your comment Jack!
This is one of the more useful AI coding comparisons because it separates “time saved” from “confidence in the result.” The part about AI being great for bounded tasks but stressful for whole-project ownership feels very accurate. The hidden cost is not just code review, it is rebuilding enough understanding to trust what was generated.
Indeed! I'm more concerned about the hidden cost in form of stress or cognitive debt than the outcome of AI.
Thanks Leonard!
Your token limit struggles on Gemini/OpenRouter are relatable. Generic LLMs lack deep context for specialized tasks, hitting a wall beyond general knowledge.
This is crucial for our drug-interaction graph.
You're right, maybe I should use Claude or Codex, but OpenCode worked quite well.
I don't know how it is compared with others but, at least for learning and for what I needed in this personal project, it was ok for me.
Really appreciated how honest and balanced this was. A lot of AI discussions focus only on speed, but you highlighted something equally important, the mental overhead of trusting and maintaining generated code. The part about AI struggling with authentication flows and repeatedly missing the actual root cause felt very real.
Also loved your point that writing extremely precise specs can sometimes feel harder than just coding the thing yourself 😅 Great read.
Cognitive debt, frustration, accountability... these hidden costs are hard to measure, but you can see and feel them, and from my point of view they are the most concerning things about using AI.
Thanks Anupam!
These side-by-side experiments are the most useful AI content right now, mostly because they expose the second-order effects nobody quantifies. The AI build often finishes first but accumulates implicit decisions the author can't reconstruct three weeks later, and that's the cost line that bites on the second iteration, not the first. The race-to-MVP comparison flatters AI, the maintenance-six-months-in comparison usually doesn't.
Technical and cognitive debt, reliability, accountability... downsides that most people who blindly follow AI don't want to take into account because... velocity!
Really enjoyed the honest comparison. AI definitely helps speed things up, but your examples show why real dev experience still matters.
Thanks!