
Crafting an agent team that still includes me

Richard Seroter's Architecture Musings

That was fast. We’ve moved from prompting an LLM, to providing instructions to an agent, to having agents prompting other agents in a loop. “Loop engineering” is all the rage among the AI elite who are excited to spin up an agent and let it chomp tokens until it achieves a stated goal. You might rightfully wonder where you fit into all of this.

LLMs and agents basically know everything but your context. There’s a role for you in setting up the full context your agent needs: instructions, tools, examples, policies, skills, and such. It’s also up to you to set a goal for the agent to loop on. And unless you completely trust the quality of the output, you have a role in reviewing (and owning) the result.

Earlier this month, I wrote a post that showed how simple it was to spin up an agent team in Google Antigravity. I played no part in the work once I kicked off the team with a prompt. But that’s not super realistic for most people and most scenarios. You may want smaller bites of work that a human is capable of reviewing (not 5,000 lines of code at a time), and the opportunity to engage with the agent team at the right times to adjust steering. To be sure, there is a class of agentic work where you just want to fire and forget, so we will talk about that too.

Let’s see what it looks like in real life. What about a prompt that kicks off an agent team that pauses at strategic times to get my insights? And what about a subsequent process that’s entirely agent looped because I don’t care to be involved at all? I’ll show both.

First, I want to build my web application. It’s the same scenario as my last post: a hotel website. I don’t want to create a prompt (or provide context) with all the details, but rather, have the agent interview me (/grill-me). Then, we should create sprints, pausing after completing each so that I can genuinely absorb all the changes. Each sprint tackles a vertical slice of the architecture, using a team of sub-agents to do backend, frontend, and test work. The frontend engineer asks clarifying questions (using the ask_user tool) to get my opinion on visual design. Here’s my complete prompt:

/grill-me "Let's build a hotel room booking app for Seroter Hotels consisting of a backend API and a web frontend."

First, act as the **Engineering Manager** to design the API and frontend. Interview me to gather my requirements, asking only one question at a time.

-----------------------------------------
1. ROADMAP PROPOSAL (HALT FOR APPROVAL)
-----------------------------------------
Once our Q&A is complete, do NOT write any code or launch any subagents yet. Instead:
- Analyze our discussion and propose a Sprint Roadmap consisting of 2 to 4 vertical-slice sprints.
- Each sprint must represent a single, reviewable Pull Request (PR) containing a full stack slice: backend API, frontend UI, and associated tests.
- Present this roadmap to me and HALT. Ask for my feedback, additions, or changes. 

-----------------------------------------
2. SPRINT EXECUTION (HUMAN-IN-THE-LOOP)
-----------------------------------------
Once we mutually agree on the roadmap, write the final specification and sprint plan to `architecture.md`. 

Execute the agreed-upon Sprints one at a time, enforcing `architecture.md` as the living **Source of Truth**:

### SPRINT WORKFLOW:
For the active sprint:
1. Launch the **Test Manager**, **Backend Engineer**, and **Frontend Engineer** in parallel. 
2. **Read Phase:** Force each subagent to read the latest `architecture.md` file before generating code, ensuring they strictly adhere to the established design, database models, and sprint scope.
3. **Frontend Interrogation:** For the Frontend Engineer, before creating any files, it must use the 'ask_user' tool to ask 2-3 visual design questions for this sprint's UI and pause for my response.
4. **Consolidation Phase:** Once the parallel subagents finish their tasks, they must pass their final API endpoints, file lists, component choices, and test plans back to you (the Engineering Manager).
5. **Update Source of Truth:** You must append these implementation details directly to the relevant sprint section in `architecture.md` (e.g., documenting the actual DB columns, final API routes, UI components, and test coverage delivered).
6. **HALT & PR Review:** Present the updated `architecture.md` and a summary of the code changes for my review. Wait for my explicit approval before moving to the next sprint.

Here’s what happens when I plug this into the Antigravity 2.0 desktop app. First, I add that prompt into the textbox and choose my LLM (Gemini 3.5 Flash).

The primary agent is acting as my Engineering Manager and starts off by asking me its first requirements-gathering question.

We go through a handful of questions (“what types of rooms are available”, “what are key business rules”, etc). After a few questions, I get one about the preferred tech stack.

Great. After this, Antigravity shows me a proposed sprint plan. The first sprint builds out the search capability, second sprint works on room booking, and the final one is for looking up existing bookings. At this stage, I could split the work differently, alter each sprint plan, or proceed as is. I’ll proceed as is.

Antigravity starts up the agent team (see them on the top right of the screenshot), and the Frontend Engineer asks the “Engineering Manager” to get some visual design requirements from me.

Each of the sub-agents goes about its work. The Test Manager, for example, creates a test plan that’s reviewable any time.

Once all the sub-agents finish, the sprint is over and ready for review. Now I can peruse the generated code and docs. Because the sprint was a reasonable size, the review is manageable.
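The prompt asks the Engineering Manager to append what was actually delivered to the relevant sprint section of `architecture.md`, which is what makes the doc worth reading at review time. As a rough illustration of that kind of update, here is a small Python helper. This is not what the agent runs: `append_to_section`, the sample headings, and the sample details are all made up for this sketch.

```python
def append_to_section(markdown: str, heading: str, details: list[str]) -> str:
    """Add bullet lines to the end of the section introduced by `heading`.

    Assumes ATX-style headings ("## Sprint 1") and no "#" lines inside code fences.
    """
    bullets = [f"- {item}" for item in details]
    lines = markdown.splitlines()
    if heading not in lines:
        # Section not found: create it at the end of the document.
        return "\n".join(lines + ["", heading] + bullets) + "\n"

    start = lines.index(heading)
    level = len(heading) - len(heading.lstrip("#"))
    end = len(lines)
    for i in range(start + 1, len(lines)):
        hashes = len(lines[i]) - len(lines[i].lstrip("#"))
        if 0 < hashes <= level:  # next heading of the same or a higher level
            end = i
            break
    # Insert before any blank lines that trail the section.
    while end > start + 1 and not lines[end - 1].strip():
        end -= 1
    return "\n".join(lines[:end] + bullets + lines[end:]) + "\n"


# Hypothetical document and details, for illustration only.
doc = "# Plan\n\n## Sprint 1\n- search API\n\n## Sprint 2\n- booking\n"
print(append_to_section(doc, "## Sprint 1", ["GET /rooms delivered"]))
```

Keeping the delivered details under the sprint that produced them means each review covers only one section of the file.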

I proceed through sprints 2 and 3, with the Frontend Engineer stopping to get clarifying answers about the look and feel of the booking experience. As each sprint finishes, I’m asked to do a review.

Throughout each sprint, there’s plenty of looping where each sub-agent works, reviews, reacts, and repeats. I’m not involved in most of the actual build work, nor do I need to be.
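To make the shape of that workflow concrete, here is a minimal Python sketch of the pattern: sub-agents run in parallel inside a sprint, and a human approval gate sits between sprints. It says nothing about how Antigravity works internally. `run_subagent`, `run_sprint`, `run_roadmap`, and the scripted reviewer are hypothetical names invented for this example, and the "agents" only return a string.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

ROLES = ["Test Manager", "Backend Engineer", "Frontend Engineer"]


def run_subagent(role: str, sprint: str) -> str:
    """Hypothetical stand-in for a sub-agent; a real one would call a model and tools."""
    return f"{role} finished its part of {sprint}"


def run_sprint(sprint: str) -> list[str]:
    """Run every role in parallel and return the reports in ROLES order."""
    with ThreadPoolExecutor(max_workers=len(ROLES)) as pool:
        return list(pool.map(lambda role: run_subagent(role, sprint), ROLES))


def run_roadmap(
    sprints: list[str],
    approve: Callable[[str, list[str]], bool],
) -> list[str]:
    """Execute sprints in order, halting at the first one the reviewer rejects."""
    approved = []
    for sprint in sprints:
        reports = run_sprint(sprint)
        if not approve(sprint, reports):  # the human-in-the-loop gate
            break
        approved.append(sprint)
    return approved


# Scripted reviewer so the sketch runs unattended; in practice this callback
# would show the reports to a person and wait for an explicit answer.
def scripted_reviewer(sprint: str, reports: list[str]) -> bool:
    return sprint != "Sprint 3"


print(run_roadmap(["Sprint 1", "Sprint 2", "Sprint 3"], scripted_reviewer))
```

The design choice worth noticing is that the gate lives in the outer loop only. Inside a sprint the sub-agents run without interruption, and nothing from the next sprint starts until the reviewer says yes.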

After all the sprints wrap up, I’ve got a working web app.

I want to be included in the “build the app” scenarios. It’s fun work, and I don’t trust an agent to do everything I want without some involvement from me. But you can imagine that there are many tasks that can be entirely agentic without my input. Let the agent figure everything out. For instance, let’s say I want to containerize this whole web application, and test that the containers work right. I don’t care at all about being involved in this, and frankly, the agent knows more than I do in this situation.

Here, I just want to use /goal to have my agent loop until it achieves the goal.

/goal Containerize this entire hotel booking application on my local machine. Generate optimized Dockerfiles for both the frontend and backend, configure a docker-compose.yml, build the images, spin them up, and verify that the API and frontend can communicate over the network. Note that I'm accessing Docker locally using Colima. If any container build fails, analyze the logs and auto-heal the configuration until they all start successfully.

See, this is great. I don’t care about writing Dockerfiles or even reviewing them, let alone mucking around with container details like opening the right ports. Let the agent loop on that until it all works.
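The idea behind that kind of goal loop is simple: attempt, verify, feed the failure back in, and repeat until the check passes or a budget runs out. Here is a minimal Python sketch of that pattern. It is an illustration under stated assumptions, not the implementation of /goal: `loop_until_goal` is a made-up helper, and the "environment" below is simulated with two dictionaries rather than real Docker commands.

```python
from typing import Callable, Tuple


def loop_until_goal(
    attempt: Callable[[], None],
    verify: Callable[[], Tuple[bool, str]],
    heal: Callable[[str], None],
    max_attempts: int = 5,
) -> Tuple[bool, list[str]]:
    """Attempt, verify, and heal until verify() passes or attempts run out."""
    logs: list[str] = []
    for _ in range(max_attempts):
        attempt()
        ok, log = verify()
        logs.append(log)
        if ok:
            return True, logs
        heal(log)  # feed the failure log back before the next attempt
    return False, logs


# Simulated environment (hypothetical): the "build" only works once the API
# port is exposed, which the first configuration forgets to do.
config = {"expose_api_port": False}
state = {"running": False}


def build_and_start() -> None:
    state["running"] = config["expose_api_port"]


def check_connectivity() -> Tuple[bool, str]:
    if state["running"]:
        return True, "frontend reached API"
    return False, "connection refused: API port not exposed"


def fix_config(log: str) -> None:
    if "port not exposed" in log:
        config["expose_api_port"] = True


succeeded, logs = loop_until_goal(build_and_start, check_connectivity, fix_config)
print(succeeded, logs)
```

The `max_attempts` cap is the one piece of supervision left in a fire-and-forget loop: it bounds how many tokens the agent can chomp before giving up.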

After Antigravity finishes its work, I see the Dockerfiles and the Docker Compose file, and I notice containers running during the local test.

Craft agent teams that add you where you want to be involved. Figure out the moments that genuinely need you. But don’t be an agentic micromanager. Decide on key places where your input matters (if at all). And then use /goal to unleash the agent on tasks where you don’t need any supervision.