# ADR-0050: Auto-Agent HITL Pre-Flight Loop
- Status: Accepted
- Date: 2026-04-25
- Supersedes: none
- Superseded by: none
- Related: ADR-0002 (label state machine); ADR-0029 (caretaker loop pattern); ADR-0044 (principles audit); ADR-0045 (trust fleet + `hitl-escalation` label); ADR-0049 (`enabled_cb` kill-switch convention).
- Enforced by: `tests/test_auto_agent_preflight_loop.py`; `tests/scenarios/test_auto_agent_preflight_scenario.py`; `tests/test_loop_wiring_completeness.py`.
- Spec: docs/superpowers/specs/2026-04-25-auto-agent-hitl-preflight-design.md
- Plan: docs/superpowers/plans/2026-04-25-auto-agent-hitl-preflight.md
## Context
HydraFlow's stated operating model is dark-factory: software projects meeting
the spec run lights-off, with humans paged only for raging fires. Today this
contract is broken at one specific seam — the `hitl-escalation` label fires
for ~25 distinct failure conditions across phases and caretaker loops, and
every one of them goes straight to a human. Routine, mechanically-resolvable
failures (flaky test, drifted cassette, mergeable rebase, lint regression)
demand the same human attention as genuinely novel failures.
## Decision
Add a new caretaker loop, `AutoAgentPreflightLoop`, that intercepts every
`hitl-escalation` issue before a human sees it. The loop:

- Polls open `hitl-escalation` issues that don't already have `human-required`.
- Spawns a Claude Code subprocess (via `AutoAgentRunner`, a thin `HITLRunner` subclass) in the issue's worktree, with a sub-label-routed prompt and a parameterized "lead engineer" persona.
- Makes up to 3 attempts per issue; subsequent attempts receive prior-attempt diagnoses in their context.
- On success: removes `hitl-escalation`, posts a diagnosis comment, links the PR.
- On failure: applies `human-required` + diagnosis. Humans watch `human-required` exclusively; they no longer watch `hitl-escalation`.
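The attempt loop above can be sketched as follows. This is a minimal illustration of the retry-with-prior-diagnoses shape, not the real `PreflightAgent` API — all names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable

MAX_ATTEMPTS = 3  # per-issue cap from the ADR


@dataclass
class PreflightOutcome:
    resolved: bool
    diagnosis: str


def run_preflight_attempts(
    issue_id: str,
    attempts: dict[str, int],
    try_fix: Callable[[str, list[str]], PreflightOutcome],
) -> tuple[str, list[str]]:
    """Drive up to MAX_ATTEMPTS fix attempts; each retry sees the
    diagnoses produced by the attempts before it."""
    diagnoses: list[str] = []
    while attempts.get(issue_id, 0) < MAX_ATTEMPTS:
        attempts[issue_id] = attempts.get(issue_id, 0) + 1
        outcome = try_fix(issue_id, diagnoses)
        diagnoses.append(outcome.diagnosis)
        if outcome.resolved:
            return "resolved", diagnoses  # remove hitl-escalation, link PR
    return "human-required", diagnoses    # escalate with full diagnosis trail
```

Note the return value carries every diagnosis, resolved or not — that trail is what makes the failure comments "richer" for the human queue.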
Hard tool restrictions (no CI config, no force-push, no secrets, no self-modification of principles or auto-agent code) are enforced at the worktree-tool layer.
The deny-list (default `["principles-stuck", "cultural-check"]`) bypasses
pre-flight for sub-labels where Auto-Agent could recursively modify the system
that judges it.
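The bypass decision is simple set logic; a sketch, with a hypothetical function name (the real check lives behind `auto_agent_skip_sublabels` in config):

```python
DEFAULT_DENY_SUBLABELS = frozenset({"principles-stuck", "cultural-check"})


def bypasses_preflight(labels: set[str],
                       deny_sublabels: frozenset[str] = DEFAULT_DENY_SUBLABELS) -> bool:
    """True when the issue must go straight to a human: either a human
    already owns it, or its sub-label is on the recursion deny-list."""
    if "human-required" in labels:
        return True
    return bool(labels & deny_sublabels)
```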
Cost / wall-clock / daily-budget caps are wired but default to unlimited —
observability-first, no caps until needed. A new `AutoAgentStats` System tab
tile + `/api/diagnostics/auto-agent` endpoint surface the relevant data.
## Consequences
Positive:

- The dark-factory contract is honored at the issue-queue layer.
- Routine toil is absorbed by the auto-agent; humans only see what the agent itself bails on.
- Human queue diagnoses are richer — failed pre-flights produce structured "what was tried, what was ruled out" comments.
- The operator gets observability into "what does HydraFlow's own agent think?" via the dashboard + audit JSONL.
Negative:
- A pre-flight runs on every escalated issue, costing LLM tokens (the audit
+ dashboard make this visible).
- Pre-flight latency adds to the time-to-human for issues that genuinely
need a human (bounded by 1 cycle ≈ ~3-10 min).
- The label state machine grows (new labels: `human-required`,
  `auto-agent-fatal`, `auto-agent-exhausted`, `auto-agent-pr-failed`,
  `cost-exceeded`, `timeout`).
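One way to picture how those labels relate: a pure status → label-edit table, in the spirit of the `apply_decision()` state machine cited below. The status names and the exact add/remove sets here are assumptions for illustration, not the real mapping:

```python
# Hypothetical status -> (labels to add, labels to remove) table.
LABEL_EDITS: dict[str, tuple[set[str], set[str]]] = {
    "resolved":      (set(), {"hitl-escalation"}),
    "exhausted":     ({"human-required", "auto-agent-exhausted"}, set()),
    "fatal":         ({"human-required", "auto-agent-fatal"}, set()),
    "pr-failed":     ({"human-required", "auto-agent-pr-failed"}, set()),
    "cost-exceeded": ({"human-required", "cost-exceeded"}, set()),
    "timeout":       ({"human-required", "timeout"}, set()),
}


def apply_label_edits(status: str, labels: set[str]) -> set[str]:
    """Pure function: no I/O, just set algebra over the issue's labels."""
    add, remove = LABEL_EDITS[status]
    return (labels | add) - remove
```

Keeping this pure makes the 6-status matrix trivially unit-testable, which is presumably why the real `apply_decision()` is factored the same way.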
Risks:
- Auto-agent could "fix" something incorrectly. Mitigations: hard tool
restrictions, principles-audit deny-list, attempt cap, human review of
the resulting PR before merge.
- Recursive self-modification risk. Mitigations: tool restrictions on
  `principles_audit_loop.py` / `auto_agent_preflight_loop.py` /
  ADR-0044/0049 implementation files.
- Runaway cost. Mitigations: caps wired into code paths (default off);
audit + dashboard surface unusual spend immediately.
Production subprocess wiring — landed in a follow-up PR:

The initial PR landing this ADR shipped the full pipeline scaffolding
(poll → context → agent → decision → audit), the prompt envelope + 9
sub-label playbooks, the dashboard + `/api/diagnostics/auto-agent`
endpoint, and the adversarial corpus harness — but `_build_spawn_fn`
was a placeholder. The follow-up PR replaced that placeholder with
`AutoAgentRunner` (`src/preflight/auto_agent_runner.py`), which spawns
the real Claude Code subprocess via `stream_claude_process`, captures
cost via `prompt_telemetry`, applies the spec §5.2 tool restrictions
(`--disallowedTools=WebFetch` at the CLI; path-level restrictions
enforced post-hoc by the principles audit), and resolves to per-issue
worktrees via the injected `WorkspacePort`. The dashboard now shows
real spend and resolution rate as escalations flow through the loop.
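A minimal sketch of the shape `_build_spawn_fn` must produce — launch the agent CLI inside the issue's worktree with the tool restriction applied at the command line. The `--disallowedTools=WebFetch` flag is from the ADR text; the executable name, `-p` flag, and helper names are assumptions:

```python
import subprocess


def build_claude_argv(prompt: str) -> list[str]:
    """Assemble the agent CLI invocation with the ADR's CLI-level
    tool restriction baked in."""
    return ["claude", "-p", prompt, "--disallowedTools=WebFetch"]


def spawn_in_worktree(prompt: str, worktree: str) -> subprocess.Popen:
    """Run the agent subprocess with the issue's worktree as cwd, so
    all file edits land in the right checkout."""
    return subprocess.Popen(
        build_claude_argv(prompt),
        cwd=worktree,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
```

Note the path-level restrictions (no CI config, no principles/auto-agent self-modification) cannot be expressed as a CLI flag, which is why the ADR enforces them post-hoc at the audit layer.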
## Alternatives Considered
- Per-call-site interception (modify each of ~25 escalation sites to call a helper) — rejected: too invasive; couples the auto-agent to every loop.
- Extend `DiagnosticLoop` to handle all escalations — rejected: conflates the focused diagnostic phase with general-purpose rescue; `DiagnosticLoop` would balloon.
- Investigate-only (no fix) — rejected: too small an unlock; doesn't honor the dark-factory contract.
- Investigate + targeted fixes only (no full agent power) — rejected: doesn't capture the hardest cases (refactors, novel patches); locks the system out of its biggest unlock.
## Source-file citations
The following files carry this ADR's decisions and must be kept in sync with any supersession:
- `src/models.py` — `StateData` fields `auto_agent_attempts: dict[str, int]` and `auto_agent_daily_spend: dict[str, float]`.
- `src/state/_auto_agent.py` — `AutoAgentStateMixin` (attempts get/bump/clear + daily-spend get/add).
- `src/config.py` — `auto_agent_preflight_enabled`, `auto_agent_preflight_interval`, `auto_agent_persona`, `auto_agent_max_attempts`, `auto_agent_skip_sublabels`, `auto_agent_cost_cap_usd`, `auto_agent_wall_clock_cap_s`, `auto_agent_daily_budget_usd` fields + matching env-overrides.
- `src/preflight/audit.py` — `PreflightAuditStore` durable JSONL persistence (file_lock + fsync) + 24h/7d aggregates + top-spend.
- `src/preflight/context.py` — `PreflightContext` dataclass + `gather_context()` (handles `escalation_context=None`).
- `src/preflight/decision.py` — `PreflightResult` + `apply_decision()` pure label state machine for all 6 statuses.
- `src/preflight/agent.py` — `PreflightAgent` + `run_preflight` + cost/wall-clock cap watchers.
- `src/preflight/runner.py` — prompt rendering helpers + `parse_agent_response`.
- `src/preflight/auto_agent_runner.py` — `AutoAgentRunner` real Claude Code subprocess spawn (production `_build_spawn_fn`) + per-attempt telemetry to `inferences.jsonl` + cost estimate via `model_pricing`.
- `src/sentry/reverse_lookup.py` — `query_sentry_by_title()` (never-raises).
- `src/auto_agent_preflight_loop.py` — `AutoAgentPreflightLoop._do_work` pipeline + reconcile-on-close.
- `prompts/auto_agent/` — shared envelope (`_envelope.md`) + `_default.md` + 9 sub-label prompt files.
- `src/dashboard_routes/_diagnostics_routes.py` — `/api/diagnostics/auto-agent` endpoint.
- `src/ui/src/components/diagnostics/AutoAgentStats.jsx` — System tab tile.
- `tests/test_auto_agent_preflight_loop.py` + `tests/test_auto_agent_close_reconciliation.py` + `tests/test_auto_agent_loop_wiring.py` + `tests/test_preflight_auto_agent_runner.py` — unit + wiring + runner tests.
- `tests/scenarios/test_auto_agent_preflight.py` — full-loop scenario tests.
- `tests/auto_agent/adversarial/test_corpus.py` + `tests/auto_agent/adversarial/corpus/` — 9-entry adversarial corpus + harness (run via `make auto-agent-adversarial`).
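The `PreflightAuditStore` cited above persists a durable JSONL stream (file_lock + fsync). The core durability move can be sketched as follows; a simplified illustration, not the real store (the file lock that serializes concurrent writers is omitted):

```python
import json
import os


def append_audit_record(path: str, record: dict) -> None:
    """Append one JSON object per line, flushed and fsync'd so the
    record survives a process crash immediately after the write."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())
```

Append-only JSONL keeps writes cheap and crash-safe, and the 24h/7d aggregates can then be computed by a straight line-by-line scan.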