Published June 19, 2026 by with 0 comment

Vibe Coding - 02: Subagents


---


## The Idea


I'm a cloud architect by day, and I have a hobby project: a Lua training mod for Street Fighter III: 3rd Strike running on FightCade (a MAME-based emulator). The mod adds training tools that don't exist in the original game — parry training, juggle air timers, tech throw indicators, and more.


The problem is time. After work, I have maybe an hour or two to make progress before I need to sleep. But Claude Code is sitting there, capable of reading code, writing code, and doing research — all without me watching. So I asked myself: can I let it keep working while I sleep?


The answer, after a bit of setup, is yes. But the more interesting question wasn't "can AI agents work overnight?" — it was "how do I stay in control while they do?"


---


## The Architecture

The core challenge with AI agents working autonomously is context. Each agent starts cold — it doesn't know what the previous one did, what decisions were made, or what state the codebase is in. I solved this with a single shared file: `OVERNIGHT_STATE.md`.


Think of it as a baton in a relay race. Every subagent:

1. Reads `OVERNIGHT_STATE.md` first to understand the current state and task

2. Does its assigned work

3. Writes its results and any blockers back into `OVERNIGHT_STATE.md` before stopping


No agent finishes without updating the baton. This way, even if Claude Code restarts or a session expires, the next agent can pick up exactly where things left off.


### The A/B/C Pattern


I split each task into three roles:


**Agent A — Research**

Reads the existing code, searches GitHub for reference implementations, and documents what it finds. It doesn't change anything — it just produces a detailed analysis: relevant file paths, line numbers, how the existing code is structured, and a proposed implementation plan.


**Agent B — Implementation**

Takes Agent A's analysis and writes the actual code. Strict scope limits — it only touches the files and lines A identified. No exploring, no refactoring, no scope creep.


**Agent C — Verification**

Reads Agent B's output independently and checks for logical errors, inconsistencies with the existing codebase, and anything that would prevent the code from running. It doesn't fix anything — it produces a pass/fail report with specific issues listed.


This separation matters. An agent that writes code and reviews its own work will miss things. Agent C comes in fresh, with no attachment to the implementation choices Agent B made.


---


## Human in the Loop — By Design


This is the part I spent the most time thinking about. Autonomous agents are only useful if you can trust what they produce. And trust has to be earned gradually.


### Start with a dry run


Before I let any agent touch real code, I ran a dry run: Agent A only, research task, no file modifications at all. The goal was just to verify that the state-passing mechanism worked — that an agent could read `OVERNIGHT_STATE.md`, do meaningful work, and write useful output back.


It worked. That dry run gave me enough confidence to move to the next step.


### Escalate task risk slowly


I didn't start with a feature. My first real overnight task was updating the README — adding documentation for a feature that already existed and was already tested. Low risk: worst case, the text is awkward and I rewrite it in the morning. No broken code, no untested logic.


Only after that succeeded did I move to actual code: the 720 Input Trainer feature. By then I had two overnight runs worth of evidence that the system was reliable.


The progression was deliberate:


| Night | Task | Risk level |

|-------|------|------------|

| 0 | Dry run (research only, no file changes) | None |

| 1 | README update (documentation only) | Low |

| 2 | New feature implementation | Medium |


### Pre-written constraints, not runtime decisions


The guard rails aren't instructions I give each agent at runtime — they're written into the system design before I go to sleep. The agents don't decide whether to push to git; they simply cannot, because the rule is in the briefing. The agents don't decide when to stop; they stop automatically when they hit a decision point they can't resolve.


The key constraints I wrote in:


**No git push.** The mod is a public repo. Nothing goes remote without my eyes on it. Agents can write to local files, but publishing is my job.


**No git commit either.** I test first, then commit. The agent produces a diff; I decide if it ships.


**Mandatory code review.** Agent C runs on every implementation task. Not optional, not skippable. If C finds a critical issue, the run stops and I wake up to a problem report, not broken code in my repo.


**Decision points stop the run.** If an agent hits something ambiguous — a design question, missing context, something that requires judgment — it writes it to the "Waiting for Peter" section of `OVERNIGHT_STATE.md` and stops. It doesn't guess.


**Usage monitoring.** I check Claude API spend before each agent starts via `ccusage`. If a 5-hour block exceeds $12, pause. If the daily total exceeds $25, stop entirely.


**Scope is explicit.** Each agent's briefing includes specific file paths and line numbers. "Implement the 720 mode" is not a valid briefing. "Add to `special_training_mode` table at line 548, add struct at line 576, add if block after line 3543" is.


---


## Last Night's Task: 720 Input Trainer


The specific task was building a training mode that detects whether a player has completed a 720° joystick rotation — the input for Hugo's Gigas Breaker super art in SF3, one of the hardest inputs in fighting games.


I won't go deep into the implementation here, but the highlight was Agent C catching a critical bug before I ever touched the code: Agent B had used Lua 5.3 bitwise operators (`|`, `&`, `<<`) in several places. FightCade embeds Lua 5.1, which doesn't support those operators. The entire script would have failed to load.


Agent C documented the exact line numbers and the corrected code. I wake up, apply a 5-minute fix, and I'm ready to test — instead of starting my morning debugging a cryptic syntax error.


That's the code review step earning its place.


---


## What I Woke Up To


A complete `OVERNIGHT_STATE.md` with:

- Agent A's architecture analysis and implementation plan

- Agent B's implementation notes (what changed, where, any surprises)

- Agent C's code review (one critical bug, exact fix provided, everything else passed)

- A clear checklist of what I need to do to finish and ship


Total time I need to spend this morning: apply the Lua syntax fix (5 minutes), test in FightCade (10 minutes), commit if it works.


That's a full feature research-and-implement cycle done overnight. I review, test, and ship.


---


## What's Next


The system works, but the human-in-the-loop pieces are still mostly manual. I set up the task list before I sleep, I manually kick off the first agent, and I review everything in the morning. That's fine — that's the point.


Eventually I want to add a monitoring layer through 9527 (my other AI assistant) so it can check `OVERNIGHT_STATE.md` on a schedule and notify me via WhatsApp if the run completes or gets stuck. But I'll build that incrementally, the same way I built this — dry run first, escalate slowly.


---


Reference:

1. GitHub:

https://github.com/Juilin77/3rd_training_lua_peter


First Published (YYYY-MM-DD) / Last Modified (YYYY-MM-DD)

2026-06-19 / 2026-06-19

0 comments:

Post a Comment