Vibe Coding - 02: Subagents ~ 快快樂樂的Peter (Happy Peter)

---

## The Idea

I'm a cloud architect by day, and I have a hobby project: a Lua training mod for Street Fighter III: 3rd Strike running on FightCade (a MAME-based emulator). The mod adds training tools that don't exist in the original game — parry training, juggle air timers, tech throw indicators, and more.

The problem is time. After work, I have maybe an hour or two to make progress before I need to sleep. But Claude Code is sitting there, capable of reading code, writing code, and doing research — all without me watching. So I asked myself: can I let it keep working while I sleep?

The answer, after a bit of setup, is yes. But the more interesting question wasn't "can AI agents work overnight?" — it was "how do I stay in control while they do?"

---

## The Architecture

The core challenge with AI agents working autonomously is context. Each agent starts cold — it doesn't know what the previous one did, what decisions were made, or what state the codebase is in. I solved this with a single shared file: `OVERNIGHT_STATE.md`.

Think of it as a baton in a relay race. Every subagent:

1. Reads `OVERNIGHT_STATE.md` first to understand the current state and task

2. Does its assigned work

3. Writes its results and any blockers back into `OVERNIGHT_STATE.md` before stopping

No agent finishes without updating the baton. This way, even if Claude Code restarts or a session expires, the next agent can pick up exactly where things left off.

### The A/B/C Pattern

I split each task into three roles:

**Agent A — Research**

Reads the existing code, searches GitHub for reference implementations, and documents what it finds. It doesn't change anything — it just produces a detailed analysis: relevant file paths, line numbers, how the existing code is structured, and a proposed implementation plan.

**Agent B — Implementation**

Takes Agent A's analysis and writes the actual code. Strict scope limits — it only touches the files and lines A identified. No exploring, no refactoring, no scope creep.

**Agent C — Verification**

Reads Agent B's output independently and checks for logical errors, inconsistencies with the existing codebase, and anything that would prevent the code from running. It doesn't fix anything — it produces a pass/fail report with specific issues listed.

This separation matters. An agent that writes code and reviews its own work will miss things. Agent C comes in fresh, with no attachment to the implementation choices Agent B made.

---

## Human in the Loop — By Design

This is the part I spent the most time thinking about. Autonomous agents are only useful if you can trust what they produce. And trust has to be earned gradually.

### Start with a dry run

Before I let any agent touch real code, I ran a dry run: Agent A only, research task, no file modifications at all. The goal was just to verify that the state-passing mechanism worked — that an agent could read `OVERNIGHT_STATE.md`, do meaningful work, and write useful output back.

It worked. That dry run gave me enough confidence to move to the next step.

### Escalate task risk slowly

I didn't start with a feature. My first real overnight task was updating the README — adding documentation for a feature that already existed and was already tested. Low risk: worst case, the text is awkward and I rewrite it in the morning. No broken code, no untested logic.

Only after that succeeded did I move to actual code: the 720 Input Trainer feature. By then I had two overnight runs worth of evidence that the system was reliable.

The progression was deliberate:

| Night | Task | Risk level |

|-------|------|------------|

| 0 | Dry run (research only, no file changes) | None |

| 1 | README update (documentation only) | Low |

| 2 | New feature implementation | Medium |

### Pre-written constraints, not runtime decisions

The guard rails aren't instructions I give each agent at runtime — they're written into the system design before I go to sleep. The agents don't decide whether to push to git; they simply cannot, because the rule is in the briefing. The agents don't decide when to stop; they stop automatically when they hit a decision point they can't resolve.

The key constraints I wrote in:

**No git push.** The mod is a public repo. Nothing goes remote without my eyes on it. Agents can write to local files, but publishing is my job.

**No git commit either.** I test first, then commit. The agent produces a diff; I decide if it ships.

**Mandatory code review.** Agent C runs on every implementation task. Not optional, not skippable. If C finds a critical issue, the run stops and I wake up to a problem report, not broken code in my repo.

**Decision points stop the run.** If an agent hits something ambiguous — a design question, missing context, something that requires judgment — it writes it to the "Waiting for Peter" section of `OVERNIGHT_STATE.md` and stops. It doesn't guess.

**Usage monitoring.** I check Claude API spend before each agent starts via `ccusage`. If a 5-hour block exceeds $12, pause. If the daily total exceeds $25, stop entirely.

**Scope is explicit.** Each agent's briefing includes specific file paths and line numbers. "Implement the 720 mode" is not a valid briefing. "Add to `special_training_mode` table at line 548, add struct at line 576, add if block after line 3543" is.

---

## Last Night's Task: 720 Input Trainer

The specific task was building a training mode that detects whether a player has completed a 720° joystick rotation — the input for Hugo's Gigas Breaker super art in SF3, one of the hardest inputs in fighting games.

I won't go deep into the implementation here, but the highlight was Agent C catching a critical bug before I ever touched the code: Agent B had used Lua 5.3 bitwise operators (`|`, `&`, `<<`) in several places. FightCade embeds Lua 5.1, which doesn't support those operators. The entire script would have failed to load.

Agent C documented the exact line numbers and the corrected code. I wake up, apply a 5-minute fix, and I'm ready to test — instead of starting my morning debugging a cryptic syntax error.

That's the code review step earning its place.

---

## What I Woke Up To

A complete `OVERNIGHT_STATE.md` with:

- Agent A's architecture analysis and implementation plan

- Agent B's implementation notes (what changed, where, any surprises)

- Agent C's code review (one critical bug, exact fix provided, everything else passed)

- A clear checklist of what I need to do to finish and ship

Total time I need to spend this morning: apply the Lua syntax fix (5 minutes), test in FightCade (10 minutes), commit if it works.

That's a full feature research-and-implement cycle done overnight. I review, test, and ship.

---

## What's Next

The system works, but the human-in-the-loop pieces are still mostly manual. I set up the task list before I sleep, I manually kick off the first agent, and I review everything in the morning. That's fine — that's the point.

Eventually I want to add a monitoring layer through 9527 (my other AI assistant) so it can check `OVERNIGHT_STATE.md` on a schedule and notify me via WhatsApp if the run completes or gets stuck. But I'll build that incrementally, the same way I built this — dry run first, escalate slowly.

---

Reference:

1. GitHub:

https://github.com/Juilin77/3rd_training_lua_peter

First Published (YYYY-MM-DD) / Last Modified (YYYY-MM-DD)

2026-06-19 / 2026-06-19

快快樂樂的Peter (Happy Peter)

Happy Peter

Vibe Coding - 02: Subagents

0 comments:

Post a Comment

Search

Categories

Popular Posts

最新回應

Blog Archive