ADR-0073 — RunsGCLoop: Artifact Retention Enforcement¶
Status: Proposed Date: 2026-05-19
Context¶
HydraFlow's RunRecorder persists per-run artifacts (transcripts, diffs, outputs) under <data_root>/runs/. As the factory operates continuously, the runs directory grows unbounded unless something enforces the configured retention policy. Without a caretaker, operators discover oversized run directories during incident response — not before.
Two config knobs govern retention:
artifact_retention_days— time-to-live for run directories (default: 7 days).artifact_max_size_mb— total size cap for the runs directory (default: 500 MB).
These are already enforced by RunRecorder.purge_expired and RunRecorder.purge_oversized; the gap is autonomous scheduling.
Decision¶
RunsGCLoop (src/runs_gc_loop.py) subclasses BaseBackgroundLoop (ADR-0001, ADR-0029) and runs on the interval set by config.runs_gc_interval. Each tick:
- Calls
RunRecorder.purge_expired(config.artifact_retention_days). - Calls
RunRecorder.purge_oversized(config.artifact_max_size_mb). - Calls
RunRecorder.get_storage_stats()and logs the result.
The loop follows the kill-switch convention (ADR-0049): enabled_cb("runs_gc") is checked first, then config.runs_gc_loop_enabled. No external I/O beyond the local filesystem; no GitHub or LLM calls.
Consequences¶
- The runs directory never grows beyond the configured caps during normal operation.
- Purge events are logged with counts (expired, oversized) for observability.
- Operators can still manually purge by deleting directories directly; the loop does not conflict with manual cleanup.
- If
artifact_retention_daysorartifact_max_size_mbare misconfigured to zero or negative, the loop delegates error handling toRunRecorder.
Alternatives considered¶
- Cron job. Rejected: adds external scheduling infrastructure the dark-factory pattern avoids.
- Purge on startup only. Rejected: long-running deployments need continuous enforcement, not just at boot.