Beyond the Hype: What I Actually Learned Building Software with AI


TL;DR

AI isn’t a magic wand that replaces developers; it’s a lever that transforms us into architects. After years of trial and error, I’ve moved beyond simple chatbots to a structured “Plan then Build” workflow using OpenCode. This article details my practical approach: how I talk to agents, why I limit their tools, and why I still write some of the code myself to stay in control.

Introduction

I’m Anthony, a Software Engineer. Like many of you, I watched the generative AI wave crash over us: first with skepticism, then with curiosity, and today with the conviction that our profession is undergoing a fundamental transformation.

This is the first article of a series about how I use AI in my daily development. I want to move away from the flashy marketing demos and talk about the reality of the field: the hallucinations, the messy PRs, and the specific techniques I’ve learned to make it actually work.

My thesis is simple: AI hasn’t changed what I code, but it has radically transformed how I build it.

From Commands to Conversation: The Art of Iterating with AI

The biggest mistake I made early on was treating AI like a command-line utility: fire off an order, get a result. This works for a simple script, but it fails for complex software engineering. To leverage AI effectively, I had to learn to iterate with it as I would with a human teammate. This meant changing the fundamental way I communicate.

State Needs, Not Commands

I moved from giving commands (“Do X”) to stating needs (“I need X”).

When you dictate the “how,” you limit the AI to your own immediate—and sometimes flawed—mental model. By focusing on the “what” and the “why,” you invite the agent to leverage its training. It stops being a mindless executor and starts suggesting better implementations, edge cases, or architectural patterns you might have overlooked.

The way you phrase requests changes everything. It’s the difference between getting code that compiles and getting code that actually solves the problem in the right way.

// ❌ The Command
"Write a Python script to parse this CSV and save it to the DB."

// ✅ The Need
"I need to ingest CSV datasets into our database.

-   It needs to handle potential duplicates safely.
-   It must be memory efficient as files can be >1GB.
-   It must handle validation errors without stopping the whole process."
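
To see what this buys you, here is roughly the shape of script the “need” version pulls the agent toward: streaming rows instead of loading the whole file, batched upserts for duplicate safety, and per-row error handling. Treat it as a minimal sketch; the measurements table, its columns, and the psycopg usage are illustrative assumptions, not my actual project.

import csv
import logging

import psycopg  # assumption: PostgreSQL accessed via psycopg 3

logger = logging.getLogger(__name__)

BATCH_SIZE = 1_000  # keeps memory flat even for multi-GB files

# ON CONFLICT DO NOTHING makes duplicate rows (and re-runs) safe.
INSERT_SQL = """
    INSERT INTO measurements (sensor_id, recorded_at, value)
    VALUES (%s, %s, %s)
    ON CONFLICT (sensor_id, recorded_at) DO NOTHING
"""

def parse_row(row: list[str]) -> tuple[str, str, float]:
    # Raises ValueError on malformed data; the caller decides what to do.
    sensor_id, recorded_at, value = row
    return sensor_id, recorded_at, float(value)

def ingest(path: str, dsn: str) -> None:
    with psycopg.connect(dsn) as conn, conn.cursor() as cur:
        with open(path, newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row
            batch: list[tuple[str, str, float]] = []
            for line_no, row in enumerate(reader, start=2):
                try:
                    batch.append(parse_row(row))
                except ValueError as exc:
                    # A bad row is logged and skipped, never fatal.
                    logger.warning("skipping line %d: %s", line_no, exc)
                    continue
                if len(batch) >= BATCH_SIZE:
                    cur.executemany(INSERT_SQL, batch)
                    batch.clear()
            if batch:
                cur.executemany(INSERT_SQL, batch)

The command version would happily hand back a read-everything-into-memory script with none of these safeguards, because I never asked for them.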

The “Do You Have Questions?” Loop

This brings me to my golden rule. At the end of every significant prompt, I add: “If you have any questions, please ask me.”

You’d be surprised how often the AI says, “Actually, should this filter be applied client-side or server-side? Also, what should happen if the list is empty?”

If I hadn’t asked, it would probably have guessed—and guessed wrong. This simple question transforms the interaction from a monologue into a dialogue. It catches assumptions before they become bugs and opens the door for the AI to contribute in ways I didn’t anticipate. It also allows me to say “I don’t know,” forcing us to investigate or make a decision together.
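
Put together, a typical prompt of mine ends up looking like this (the feature itself is just an illustration):

"I need to add filtering to the orders list.

-   Users should be able to filter by status and by date range.
-   The selected filters must survive a page refresh.

If you have any questions, please ask me before writing any code."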

The Methodology: Plan, Split, Build

Having a powerful agent in your terminal is useless if you don’t have a process. My workflow is strictly divided into three phases.

1. Plan Before You Build

I never let an agent touch the code without a plan. When you ask an AI to “just do it,” it dives straight into implementation details without considering the broader system.

But what exactly is Plan Mode? It simply means working with an agent that has no write access to the repository. It can only read your files. Its sole purpose is to prepare the ground for the “builder” agents that come later. By stripping away the ability to write code, we force the conversation to stay at the architectural level.
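
In OpenCode, this boils down to an agent whose write tools are switched off. As a sketch, the configuration can look roughly like this in opencode.json; I’m quoting the schema from memory, so treat the exact keys as an assumption and check the OpenCode docs:

{
  "agent": {
    "plan": {
      "tools": {
        "write": false,
        "edit": false,
        "bash": false
      }
    }
  }
}

With those three switched off, the agent can read and reason about the codebase, but the only artifact it can produce is the plan itself.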

I use Claude Opus 4.5 for this phase. It tends to take a thoughtful, big-picture approach when integrating a feature into an existing codebase. I ask it to create a plan, and because it’s in “plan mode,” we gain a massive advantage: Speed of Iteration.

In a normal coding loop, iteration is slow because you have to write code, compile, fix types, run tests, and debug. In the planning phase, we strip all that away. We iterate on pure logic and architecture. “If we do it this way, will it break the user session?” “Actually, let’s move this logic to the backend.”

We can go through multiple iterations of a feature’s design in the time it would take to write the boilerplate for the first attempt. The more time I spend iterating on the plan, the smoother the build becomes.

2. Think Small (The “Split” Phase)

We all know that 1,000-line Pull Requests are a nightmare to review. But for AI, big tasks are even worse because of context drift.

When an agent works on a massive task, its context window fills up with noise—previous errors, intermediate steps, and unrelated file contents. As the context fills, the agent “forgets” initial instructions or starts hallucinating.

Once a plan is ready, I almost always break it down. “Okay, this plan is good. Let’s break it down into three distinct features.”

This is strictly about engineering reliability:

  1. Focus: A task with a small scope keeps the agent focused on a specific set of files, reducing the chance of breaking unrelated code.
  2. Parallelism: If the tasks are decoupled, I can spin up multiple agents (using different worktrees or branches) to work on different parts simultaneously, as sketched below.
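
The parallel setup itself is plain git: each agent gets its own checkout, so they never step on each other’s files. The branch and directory names here are just placeholders:

# One worktree and branch per decoupled task:
git worktree add -b feature/filters ../myapp-filters
git worktree add -b feature/csv-export ../myapp-export

# Then start a separate OpenCode session in each directory:
cd ../myapp-filters && opencode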

3. Build with Focus

For the actual coding, I switch to Claude Sonnet 4.5. I find it faster and sharper on well-scoped tasks, and it’s a lot cheaper than Opus.

Crucially, I limit the tools the agent has access to. I used to think “The more tools, the better.” I was wrong. Every tool you add to an agent’s definition adds to its system prompt. A 20-line persona description followed by 200 lines of tool definitions dilutes the agent’s focus.

I now use specialized agents and sub-agents, for example:

  • The Builder: Has all file operations and can run commands. This is the core coding agent, meant to build features and test them.
  • The Reviewer: Can read the project and has access to the gh CLI plus a few tools from the GitHub MCP. It cannot edit files. It is meant to review PRs and suggest improvements.
  • The Web QA: Has access to the Chrome DevTools MCP. Its role is to verify behavior in the browser and report any errors.

By restricting the tools, I force the agent to stay in its lane, resulting in much higher accuracy.
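
For the curious, here is roughly what the Reviewer looks like as an OpenCode agent definition: a markdown file with YAML frontmatter under .opencode/agent/. Again, the exact frontmatter keys are from memory, so verify them against the OpenCode docs before copying:

---
description: Reviews pull requests and suggests improvements
mode: subagent
tools:
  write: false   # read-only: it can never touch the code
  edit: false
  bash: true     # kept on so it can call the gh CLI
---
You are a code reviewer. Read the changes, compare them against the
plan, and report concrete, actionable suggestions. Never modify files.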

The Unskippable Step: Code Reviews

AI creates code, but it doesn’t understand “value.” It doesn’t see the big picture. The thing is, I’m still ultimately responsible for what it creates. If you swing a mace and break a wall, you can’t blame the mace. If I deploy a bug written by AI, it’s my fault.

I review every single line. Every. Single. Line.

Reviewing AI code is actually harder than reviewing human code, because AI code looks perfect. It’s confident. It compiles. But it might be subtly implementing the wrong logic, or failing to take project-specific cases into account. The LLM is not human: it doesn’t know the feature the way you do, and it often misses the big picture.

That’s why reviewing AI code requires even more discipline. I read through the code slowly, line by line, asking myself if I would have done it differently and why.

Staying Hands-On: Why I Still Write Code

There is a danger in becoming a “Reviewer-Only.” If you stop writing code, your mental model of the application fades. You lose touch with the variable names, the data structures, the “glue” that holds it all together.

I aim to write at least 10% of the code myself, and on complex projects I tend to write a lot more. It might be the complex core logic, or just some glue code between two features built by AI. What it is doesn’t matter, as long as it keeps my hands dirty.

Keeping my hands in the code ensures I understand how features connect. When the app crashes, I know why because I was part of the process, not just a spectator. It keeps me grounded in the reality of the codebase.

My Current Stack

  • The Brains: Claude Opus 4.5 for planning (big-picture thinking) and Claude Sonnet 4.5 for building (speed and precision).
  • The Interface: OpenCode. It allows me to orchestrate these agents in my terminal, giving them access to LSP (Language Server Protocol) so they “see” errors just like an IDE does.
  • The Editor: VSCode or IntelliJ. I use them mostly for reading code and manual tweaks now.

Conclusion

AI hasn’t replaced me; it has promoted me. I spend less time typing boilerplate and more time acting as an architect and product owner.

It requires discipline. You have to resist the urge to let the AI “do it all.” You have to plan, you have to split tasks, and you have to review relentlessly. But when you get it right, it’s not just faster—it’s a whole new way of building software.