Ubiquitous Language¶

43 terms across 3 bounded contexts.

See ADR-0053 for the governing pattern.

ADRCouncilReviewer¶

Kind: service · Context: caretaker · Anchor: src/adr_reviewer.py:ADRCouncilReviewer · Confidence: accepted Aliases: adr council reviewer, council reviewer

ADRCouncilReviewer is the domain service that runs multi-agent council review sessions on proposed Architecture Decision Records. It scans the ADR directory for files marked Status: Proposed, gates each candidate through ADRPreValidator, detects near-duplicate ADRs via similarity scoring, orchestrates multi-round council voting, and routes each outcome to acceptance, rejection, escalation, or duplicate-flagging. ADRReviewerLoop delegates all review logic to this service on every polling cycle.

Invariants: - CreditExhaustedError and AuthenticationError propagate out of the review batch rather than being swallowed per-item, so BaseBackgroundLoop can pause on a fatal billing signal. - Every ADR that reaches Accepted status is guaranteed to carry an Enforced by: line (injected as '(none)' if absent) before it is written back. - Pre-validation must pass before a council session is started; a failing ADR is routed and counted separately without blocking the rest of the batch.

ADRReviewerLoop¶

Kind: loop · Context: caretaker · Anchor: src/adr_reviewer_loop.py:ADRReviewerLoop · Confidence: accepted Aliases: ADR reviewer loop, adr council review loop, adr review loop

Caretaker loop that polls for ADRs in Proposed status and runs council reviews via ADRCouncilReviewer. The loop is intentionally thin: all review logic and output formatting live in ADRCouncilReviewer, keeping tick scheduling and business logic separately testable. Review interval is config.adr_review_interval.

Invariants: - The loop delegates entirely to ADRCouncilReviewer.review_proposed_adrs(); no review logic lives in the loop itself. - Kill-switch is via enabled_cb("adr_reviewer") and config.adr_reviewer_loop_enabled (ADR-0049).

AdrTouchpointAuditorLoop¶

Kind: loop · Context: caretaker · Anchor: src/adr_touchpoint_auditor_loop.py:AdrTouchpointAuditorLoop · Confidence: accepted Aliases: ADR touchpoint auditor loop, adr drift auditor loop, adr touchpoint gate caretaker

Trust-fleet loop that replaces the deleted ADR touchpoint pre-merge gate with an async caretaker (ADR-0056). Periodically scans recently-merged PRs and files hydraflow-find issues when an Accepted or Proposed ADR's cited src/ modules changed without the ADR file appearing in the same diff. Issues are aggregated into one rollup per ADR (ADR-NNNN dedup key) listing all drifted PRs; subsequent ticks update the body in-place. When the ADR file itself appears in a PR diff the rollup is auto-closed. The cursor (state.adr_audit_cursor) is seeded to "now" on first deploy — pre-existing history is not retroactively scanned.

Invariants: - One rollup issue per ADR, never one issue per drifted PR. - First-deploy cursor seed is "now" — historical PRs before deploy are not scanned. - ADR self-appearance in a PR diff closes the rollup; partial fixes (some-but-not-all modules updated) do not close it. - Maximum 3 repair attempts before HITL escalation. - Kill-switch is via enabled_cb("adr_touchpoint_auditor") (ADR-0049).

AgentPort¶

Kind: port · Context: shared-kernel · Anchor: src/ports.py:AgentPort · Confidence: accepted Aliases: agent port, agent runner port

Hexagonal port for agent runner operations used by infrastructure modules. Implemented by agent.AgentRunner via base_runner.BaseRunner. The port was introduced so that infrastructure modules like merge_conflict_resolver can accept the agent runner via dependency injection without importing from the runner layer, keeping the four-layer boundary clean and making those modules independently testable with a mock.

Invariants: - Pure Protocol — no implementation, no state. - Three methods: build_command constructs the CLI invocation; execute runs the subprocess and returns the full transcript; verify_result checks that the agent produced valid commits and that make quality passes. - Parameter names and types are kept identical to the concrete implementations to satisfy structural subtype checks in tests/test_ports.py.

AgentRunner¶

Kind: runner · Context: builder · Anchor: src/agent.py:AgentRunner · Confidence: accepted Aliases: agent runner, implement runner, claude agent runner

Subprocess runner for the implement phase: launches a claude -p process inside an isolated git worktree to implement a GitHub issue. Builds the agent's self-check checklist (extended by recent review escalations), spec-match guidance, and requirements-gap context, then commits the agent's changes locally. Pushing the branch and creating the PR are deliberately left to other phases.

Invariants: - Phase name is fixed: _phase_name == 'implement'. - The runner commits inside the worktree but never pushes or opens a PR — that work belongs to downstream phases. - Self-check checklist is dynamically extended with checklist items from recurring review escalations.

BaseBackgroundLoop¶

Kind: loop · Context: shared-kernel · Anchor: src/base_background_loop.py:BaseBackgroundLoop · Confidence: accepted Aliases: base background loop, loop base class

Abstract base class for every concurrent worker loop in the HydraFlow orchestrator (ADR-0001, ADR-0029). Owns the run-loop skeleton — enabled-check, interval management, status callbacks, BACKGROUND_WORKER_STATUS event publishing, error reporting, and trigger-based early wake-up — leaving subclasses to implement only the domain-specific _do_work and _get_default_interval hooks.

Invariants: - Subclasses must implement abstract methods _do_work and _get_default_interval. - AuthenticationError, AuthenticationRetryError, and CreditExhaustedError propagate; all other exceptions are logged and the loop retries on the next cycle. - Shared dependencies (event_bus, stop_event, status_cb, enabled_cb, sleep_fn, interval_cb) are bundled into a LoopDeps record passed to init.

BotPRPort¶

Kind: port · Context: shared-kernel · Anchor: src/term_proposer_loop.py:BotPRPort · Confidence: accepted Aliases: bot pr port

Hexagonal port used by caretaker loops (TermProposerLoop, others) to open auto-merging bot PRs without coupling to the GitHub-specific PR adapter. Production wiring composes push_branch + create_pr + add_pr_labels behind this Protocol.

Invariants: - Pure Protocol — no implementation; tests use a fake; production uses a thin adapter. - open_bot_pr is the only method; one PR per call; success returns the PR number.

CIMonitorLoop¶

Kind: loop · Context: caretaker · Anchor: src/ci_monitor_loop.py:CIMonitorLoop · Confidence: accepted Aliases: CI monitor loop, ci monitor loop, continuous integration monitor loop

Caretaker loop that watches CI status on the main branch and files a hydraflow-ci-failure issue when CI goes red (ADR-0029, ADR-0065). The loop auto-closes the issue when CI recovers to green. Duplicate issue creation is prevented by tracking the open CI-failure issue number in memory, with rehydration from GitHub labels on first tick to survive restarts cleanly.

Invariants: - At most one open hydraflow-ci-failure issue exists at any time; the loop tracks _open_issue to enforce this. - On startup the loop rehydrates from existing hydraflow-ci-failure issues before its first check — a clean restart never duplicates a pre-existing issue. - Kill-switch is via enabled_cb("ci_monitor") and config.ci_monitor_loop_enabled (ADR-0049).

ContractRefreshLoop¶

Kind: loop · Context: caretaker · Anchor: src/contract_refresh_loop.py:ContractRefreshLoop · Confidence: accepted Aliases: contract refresh loop, cassette refresh loop, fake contract refresh loop

Trust-fleet loop that refreshes cassettes for fake contract tests on a weekly cadence (ADR-0045, ADR-0047, spec §4.2). Each tick: records cassettes against live gh/git/docker/claude into a tmp directory, diffs them against committed cassettes, short-circuits on hash-matching repeat drift (dedup via DedupStore), stages drifted cassettes, and opens a contract-refresh: YYYY-MM-DD (<adapters>) PR labeled contract-refresh + auto-merge. A post-refresh replay gate (make trust-contracts) runs after staging; failure opens a companion hydraflow-find + fake-drift issue so the factory dispatches a fake-repair implementer — PR auto-merge is not blocked by the replay gate. Per-loop telemetry spans (trace_collector.emit_loop_subprocess_trace) cover each recorder subprocess and the replay gate.

Invariants: - Dedup is keyed on the drift-report hash; identical drift on consecutive ticks does not refile the same PR. - Dedup is recorded only after the PR is opened, never before — transient failures do not silently block the next tick. - The replay gate failure opens a companion issue but does not block the auto-merge PR. - Kill-switch is via enabled_cb("contract_refresh") (ADR-0049); no config field.

CorpusLearningLoop¶

Kind: loop · Context: caretaker · Anchor: src/corpus_learning_loop.py:CorpusLearningLoop · Confidence: accepted Aliases: corpus learning loop, adversarial corpus loop, escape signal ingestion loop

Trust-fleet loop that autonomously grows the adversarial test corpus from escape signals (ADR-0045, spec §4.1 v2). Each tick: reads open issues tagged with the escape label from the last DEFAULT_LOOKBACK_DAYS days, synthesizes each into a SynthesizedCase, runs three self-validation gates (harness acceptance, expected catcher trips, unambiguity across all catchers), materializes passing cases to tests/trust/adversarial/cases/<slug>/, and opens auto-merge PRs. A DedupStore keyed on corpus_learning:<issue_number>:<slug> prevents re-filing the same case on subsequent ticks.

Invariants: - All three validation gates must pass before a case reaches disk: harness accepts it, expected catcher trips, no other catcher also trips. - Cases that trip more than one catcher are rejected as ambiguous before they can corrupt the corpus. - No corpus_learning_enabled config field exists — kill-switch is purely via enabled_cb("corpus_learning") (spec §12.2, ADR-0049).

DependabotMergeLoop¶

Kind: loop · Context: caretaker · Anchor: src/dependabot_merge_loop.py:DependabotMergeLoop · Confidence: accepted Aliases: dependabot merge loop, bot pr merge loop, auto-merge bot PRs loop

Caretaker loop that polls open PRs via GitHubDataCache and auto-merges those authored by Dependabot and other configured bot accounts after CI passes (ADR-0054, ADR-0057, ADR-0058). The list of bot authors is configurable via config; the loop compares pr.author.lower() against the set. Only PRs with a passing ReviewVerdict are merged — CI must be green before the loop touches a PR.

Invariants: - Author matching is case-insensitive. - CI must pass (ReviewVerdict green) before any merge is attempted; the loop never force-merges. - Kill-switch is via enabled_cb("dependabot_merge") and config.dependabot_merge_loop_enabled (ADR-0049).

DiagnosticLoop¶

Kind: loop · Context: caretaker · Anchor: src/diagnostic_loop.py:DiagnosticLoop · Confidence: accepted Aliases: diagnostic loop, diagnostic self-healing loop, escalation diagnostic loop

Caretaker loop that picks up escalated HITL issues and attempts autonomous self-healing via DiagnosticRunner (ADR-0050). For each escalated issue the loop runs a diagnosis that identifies severity, root cause, affected files, and a fix plan, then posts a structured diagnostic comment with human-guidance section. Attempts are tracked via AttemptRecord; the loop respects the attempt budget before escalating further. CreditExhaustedError is re-raised via reraise_on_credit_or_bug so attempt budgets are not silently burned against an exhausted billing signal.

Invariants: - reraise_on_credit_or_bug(exc) is called in the broad except block — credit exhaustion is never silently swallowed. - Severity is structured (P0_SECURITY through P4_HOUSEKEEPING) and appears in the diagnostic comment; reviewers do not need to parse free text to triage. - Kill-switch is via enabled_cb("diagnostic") (ADR-0049).

DiagramLoop¶

Kind: loop · Context: caretaker · Anchor: src/diagram_loop.py:DiagramLoop · Confidence: accepted Aliases: diagram loop, arch regen loop, architecture regen loop, L24

Caretaker loop (L24) that keeps docs/arch/generated/ in sync with src/ by running the arch-regen runner on each tick and opening a single idempotent bot PR (arch-regen-auto branch) when drift is detected (ADR-0029, ADR-0049). If no drift is found the tick exits silently. A secondary functional-area coverage check fires after regen; failures open a separate chore(arch): unassigned functional area issue via PRPort.find_existing_issue + create_issue, distinct from the regen PR.

Invariants: - The regen PR always targets the fixed branch arch-regen-auto; a force-push updates any existing open PR rather than opening duplicates. - The functional-area coverage issue is separate from the regen PR — one concern per artifact. - Kill-switch is HYDRAFLOW_DISABLE_DIAGRAM_LOOP=1 (ADR-0049 convention; no config field).

EdgeProposerLoop¶

Kind: loop · Context: caretaker · Anchor: src/edge_proposer_loop.py:EdgeProposerLoop · Confidence: accepted Aliases: edge proposer loop, UL edge proposer loop, term edge loop

Caretaker loop that densifies the ubiquitous-language term context map by proposing depends_on and implements edges between existing terms via static import graph analysis (ADR-0058, ADR-0060, ADR-0062). Each tick walks the import graph produced by build_import_graph, infers structural relationships between terms, and opens auto-merge bot PRs labeled hydraflow-ul-edges with updated term files. Unlike EntryEvidenceLoop, edge inference is static-analysis-driven, not LLM-driven. The hydraflow-ul-edges label causes review_phase to skip agent pipeline routing.

Invariants: - Edge inference is based on the import graph (build_import_graph), not LLM judgment — results are deterministic for a given codebase state. - The hydraflow-ul-edges label causes review_phase to skip the agent pipeline; the structural inference IS the work (ADR-0058). - Kill-switch is via enabled_cb("edge_proposer") (ADR-0049).

EntryEvidenceLoop¶

Kind: loop · Context: caretaker · Anchor: src/entry_evidence_loop.py:EntryEvidenceLoop · Confidence: accepted Aliases: entry evidence loop, wiki entry evidence loop, UL evidence loop

Caretaker loop that backfills Term.evidence links by matching wiki entries to ubiquitous-language terms via LLM (ADR-0062). Each tick processes up to entry_evidence_max_entries_per_tick unmatched entries, calls the LLM once per entry to identify genuinely related terms (not superficial name-fragment matches), and opens auto-merge bot PRs labeled hydraflow-ul-evidence with the updated term files. A DedupStore prevents re-processing entries that already have evidence on subsequent ticks. Mirrors EdgeProposerLoop (ADR-0058) and TermProposerLoop (ADR-0054) in structure but is LLM-driven rather than static-analysis-driven.

Invariants: - One LLM call per wiki entry per tick — no batching across entries within a single call. - The hydraflow-ul-evidence label causes review_phase to skip agent pipeline routing; the LLM-driven matching IS the work (ADR-0062). - Kill-switch is via enabled_cb("entry_evidence") (ADR-0049); no config field. - entry_evidence_max_entries_per_tick bounds the LLM spend per cycle.

EventBus¶

Kind: service · Context: shared-kernel · Anchor: src/events.py:EventBus · Confidence: accepted Aliases: pub/sub bus, hydraflow event bus

Async pub/sub bus that fans HydraFlowEvent objects out to subscriber asyncio.Queues, retains a bounded in-memory history for replay, and optionally persists every event through an EventLog. Auto-injects the active session_id and repo slug onto outbound events so downstream consumers always see a fully tagged event stream.

Invariants: - History length is capped at max_history (default 5000); oldest entries are evicted when full. - Slow subscribers do not block the publisher: a full subscriber queue drops its oldest entry before the new event is enqueued. - History mutation is serialized through an asyncio.Lock.

FakeCoverageAuditorLoop¶

Kind: loop · Context: caretaker · Anchor: src/fake_coverage_auditor_loop.py:FakeCoverageAuditorLoop · Confidence: accepted Aliases: fake coverage auditor loop, fake coverage gap detector, uncassetted method detector

Trust-fleet loop that detects uncovered methods on fake adapters under src/mockworld/fakes/ via ast.parse (ADR-0045, ADR-0056, ADR-0057, spec §4.7). Compares two method sets: adapter-surface (public methods, covered by cassettes under tests/trust/contracts/cassettes/<adapter>/) and test-helper (scenario drivers like script_*, fail_service, heal_service, covered by scenario tests). Files one rollup issue per (fake_class, gap_kind) labeled hydraflow-find + fake_coverage_gap. Subsequent ticks update the body via PRPort.update_issue_body — appending newly-uncovered methods and striking through methods that gained coverage. Escalates after 3 attempts to hitl_escalation + fake_coverage_stuck.

Invariants: - One rollup issue per (fake_class, gap_kind) — never one issue per missing method. - Issue bodies are updated in-place on repeat ticks, not replaced. - Maximum 3 repair attempts before HITL escalation. - Kill-switch is via enabled_cb("fake_coverage_auditor") (ADR-0049).

FlakeTrackerLoop¶

Kind: loop · Context: caretaker · Anchor: src/flake_tracker_loop.py:FlakeTrackerLoop · Confidence: accepted Aliases: flake tracker loop, flaky test detector, flake detector loop

Trust-fleet loop that detects persistently flaky tests by parsing JUnit XML from the last 20 RC runs and counting mixed pass/fail occurrences per test (spec §4.5, ADR-0065). When a test's flake count reaches flake_threshold (default 3, comparison >=), the loop files a hydraflow-find + flaky-test issue. After 3 repair attempts on the same test_name the loop escalates to a second issue labeled hitl-escalation + flaky-test-stuck. The dedup key for the escalation issue clears when the escalation issue is closed.

Invariants: - The rolling window is fixed at the last 20 RC runs; earlier history is not scanned. - Flake detection requires at least one pass AND one fail within the window — pure-fail tests are not flakes. - Maximum 3 repair attempts per test before HITL escalation; the dedup key for the hydraflow-find issue does not reset until the escalation is resolved. - Kill-switch is via enabled_cb("flake_tracker") (ADR-0049).

GitHubCacheLoop¶

Kind: loop · Context: caretaker · Anchor: src/github_cache_loop.py:GitHubCacheLoop · Confidence: accepted Aliases: github cache loop, github data cache loop, github poller

Centralized GitHub data poller that replaces the pattern where every dashboard endpoint and background worker makes its own gh api calls (ADR-0041). A single GitHubCacheLoop polls GitHub on a fixed interval and stores results in GitHubDataCache — in memory and on disk. Dashboard endpoints and background workers read from the cache instantly rather than hitting the API. Write operations (create PR, merge, comment, label swap) still call gh directly because they need immediate confirmation.

Invariants: - Only one instance per repo runtime; all read consumers share the same cache snapshot. - Write operations bypass the cache and call gh directly. - Cache staleness is observable: each CacheSnapshot carries a fetched_at timestamp; age_seconds is infinite until the first poll completes.

GitHubCacheLoop¶

Kind: loop · Context: caretaker · Anchor: src/github_cache_loop.py:GitHubCacheLoop · Confidence: accepted Aliases: github cache loop, github data cache loop, github poller

Centralized GitHub data poller that replaces the pattern where every dashboard endpoint and background worker makes its own gh api calls (ADR-0041). A single GitHubCacheLoop polls GitHub on a fixed interval and stores results in GitHubDataCache — in memory and on disk. Dashboard endpoints and background workers read from the cache instantly rather than hitting the API. Write operations (create PR, merge, comment, label swap) still call gh directly because they need immediate confirmation.

Invariants: - Only one instance per repo runtime; all read consumers share the same cache snapshot. - Write operations bypass the cache and call gh directly. - Cache staleness is observable: each CacheSnapshot carries a fetched_at timestamp; age_seconds is infinite until the first poll completes.

HydraFlowConfig¶

Kind: aggregate · Context: shared-kernel · Anchor: src/config.py:HydraFlowConfig · Confidence: accepted Aliases: hydraflow config, config aggregate, orchestrator config

Pydantic-validated runtime configuration aggregate for the HydraFlow orchestrator. Bundles issue selection (ready labels, batch size, repo), per-phase concurrency caps (max_workers, max_planners, max_reviewers, max_triagers, max_hitl_workers), required-plugin manifests, language plugins, and per-phase skill whitelists into a single object passed to every loop and runner. Edited via the dashboard or config JSON file, not environment variables.

Invariants: - Worker concurrency fields default to 1 and are bounded by ge=1, le=10 (max_hitl_workers le=5). - batch_size is bounded ge=1, le=50. - repo is auto-detected from the git remote when left empty.

IssueFetcherPort¶

Kind: port · Context: shared-kernel · Anchor: src/ports.py:IssueFetcherPort · Confidence: accepted Aliases: issue fetcher port, github issue fetching port

Hexagonal port for fetching GitHub issues from the upstream source of truth. Exposes two methods consumed by domain code (phases and background loops): fetch_issue_by_number for single-issue lookups and fetch_issues_by_labels for label-scoped batch fetches. Implemented by issue_fetcher.IssueFetcher, which shells out to the gh CLI and applies internal caching and rate-limit back-off.

Invariants: - Pure Protocol — no implementation, no state. - Only the two methods domain code actually calls are declared here; the concrete IssueFetcher carries additional infrastructure methods (PR cache, collaborator cache) that stay off the port. - fetch_issues_by_labels accepts an optional exclude_labels list and a require_complete flag so callers can narrow results without extra filtering passes.

IssueStorePort¶

Kind: port · Context: shared-kernel · Anchor: src/ports.py:IssueStorePort · Confidence: accepted Aliases: issue store port, issue queue port, work queue port

Hexagonal port for the in-memory issue work-queue — exposes only the queue accessors that domain code (phases, background loops, phase utilities) actually uses (get_triageable, get_plannable, get_implementable, get_reviewable, ...). Implemented by issue_store.IssueStore; orchestrator-only and dashboard-only methods stay on the concrete class to keep the domain surface narrow.

Invariants: - Pure Protocol — no implementation, no state. - Only domain-consumed methods are declared; orchestrator and dashboard methods deliberately stay off the port.

LiveCorpusReplayLoop¶

Kind: loop · Context: caretaker · Anchor: src/live_corpus_replay_loop.py:LiveCorpusReplayLoop · Confidence: accepted Aliases: live corpus replay loop, shadow corpus replay, shadow drift replay

Trust-fleet loop that replays live shadow-corpus samples against registered fake-adapter or shape validators. LiveCorpusReplayLoop files hydraflow-find / shadow-drift issues when live adapter output diverges from the current fake or schema contract, and escalates to HITL only after the configured drift retry budget is exhausted.

Invariants: - Empty shadow corpus is an idle tick, not an error. - Drift issues are auto-agent routed before any human escalation. - Dispatcher registration is keyed by (adapter, command) so cassette

MergeStateWatcherLoop¶

Kind: loop · Context: caretaker · Anchor: src/merge_state_watcher_loop.py:MergeStateWatcherLoop · Confidence: accepted Aliases: merge state watcher loop, merge state watcher, conflict rebase loop

Caretaker loop that periodically scans all open PRs for merge conflicts and auto-rebases or escalates them (ADR-0029). The filter is intentionally broad: RC promotion PRs, Dependabot bumps, agent PRs, and manual PRs all benefit from auto-rebase when they go DIRTY against main. PRs already labeled hydraflow-hitl (being handled by PRUnsticker) or hydraflow-review (active reviewer worktree) are skipped to avoid stepping on in-progress work. Delegates the actual conflict-detection and rebase logic to MergeStateWatcher.

Invariants: - Default tick interval is 600 seconds (10 minutes). - PRs labeled hydraflow-hitl or hydraflow-review are skipped. - Kill-switch is via enabled_cb("merge_state_watcher") and config.merge_state_watcher_loop_enabled.

ObservabilityPort¶

Kind: port · Context: shared-kernel · Anchor: src/ports.py:ObservabilityPort · Confidence: accepted Aliases: observability port, sentry port, error capture port

Hexagonal port for the observability boundary (ADR-0044 P7.7). Exposes five methods: capture_exception, capture_message, breadcrumb, set_measurement, and flush. The production adapter is SentryObservabilityAdapter in src/observability/sentry_adapter.py, which wraps sentry_sdk and silently degrades to a no-op when the SDK is not installed. The port is intentionally minimal — rich APIs drag every backend into the union.

Invariants: - Pure Protocol — no implementation, no state. - The adapter is a no-op when sentry_sdk is absent; every method returns silently so callers never need a try/except around port calls. - Domain code never imports sentry_sdk directly; all observability routes through the injected ObservabilityPort so a future OTLP, structured-log, or sidecar adapter can replace Sentry without touching call sites.

PricingRefreshLoop¶

Kind: loop · Context: caretaker · Anchor: src/pricing_refresh_loop.py:PricingRefreshLoop · Confidence: accepted Aliases: pricing refresh loop, litellm pricing poller, model pricing caretaker

Daily caretaker loop that fetches LiteLLM's model_prices_and_context_window.json via stdlib urllib, filters to Anthropic-provider entries (normalizing Bedrock keys), diffs against src/assets/model_pricing.json, and opens or updates a pricing-refresh-auto PR when drift is detected (ADR-0029, ADR-0049). A bounds guard rejects suspicious price moves. Bounds violations, parse errors, and schema errors open a single deduplicated [pricing-refresh] hydraflow-find issue. Network errors log-and-retry on the next tick without filing issues.

Invariants: - Kill-switch is the HYDRAFLOW_DISABLE_PRICING_REFRESH=1 env var. - PR is always on the fixed branch pricing-refresh-auto; no-op ticks do not open a PR. - Bounds violations are separate from network errors; each has a distinct response path.

PRPort¶

Kind: port · Context: shared-kernel · Anchor: src/ports.py:PRPort · Confidence: accepted Aliases: pr port, pull request port, github pr port

Hexagonal port for GitHub PR, label, and CI operations — branch push, PR creation/merge, RC-branch creation, and the related label manipulations consumed by domain phases and background loops. Implemented by pr_manager.PRManager; signatures are kept identical to the concrete class to enable structural subtype checks.

Invariants: - Pure Protocol — no implementation, no state. - Method signatures must match pr_manager.PRManager exactly so structural subtype checks in tests/test_ports.py pass.

PRUnstickerLoop¶

Kind: loop · Context: caretaker · Anchor: src/pr_unsticker_loop.py:PRUnstickerLoop · Confidence: accepted Aliases: pr unsticker loop, pr unsticker, hitl unsticker

Caretaker loop that polls HITL items and delegates to PRUnsticker to resolve all HITL causes — merge conflicts, CI failures, and generic stuck states. Operates only on HITL issues that currently have an open PR. The loop wraps the unsticker worker in the standard BaseBackgroundLoop tick-and-interval skeleton, keeping the HITL resolution logic in PRUnsticker separate from the polling infrastructure.

Invariants: - Only processes HITL issues with an associated open PR (item.pr > 0); issues without PRs are skipped. - Kill-switch is via enabled_cb("pr_unsticker") and config.pr_unsticker_loop_enabled (ADR-0049). - Interval is driven by config.pr_unstick_interval.

RCBudgetLoop¶

Kind: loop · Context: caretaker · Anchor: src/rc_budget_loop.py:RCBudgetLoop · Confidence: accepted Aliases: rc budget loop, rc ci wall-clock detector, rc duration regression detector

4-hour RC CI wall-clock regression detector (ADR-0045 §4.8). Reads the last 30 days of rc-promotion-scenario.yml runs via gh run list, extracts per-run wall-clock duration, and emits a hydraflow-find + rc-duration-regression issue when the newest run trips either a gradual-bloat signal (current run ≥ 1.5× rolling median) or a sudden-spike signal (current run ≥ 2.0× recent-five maximum). Both signals are independent; both may fire on the same tick with distinct dedup keys. After 3 unresolved attempts per signal the loop escalates to hitl-escalation + rc-duration-stuck.

Invariants: - Kill-switch is via enabled_cb("rc_budget") only — no rc_budget_enabled config field (ADR-0049 §12.2). - Requires at least 5 historical data points before emitting any signal. - Dedup keys clear on escalation-close.

ReportIssueLoop¶

Kind: loop · Context: builder · Anchor: src/report_issue_loop.py:ReportIssueLoop · Confidence: accepted Aliases: report issue loop, bug report loop, dashboard report processor

Background loop that dequeues pending bug reports from state, saves any attached screenshots to temp files, and invokes the Claude CLI with /hf.issue so the agent can see the image, research the codebase, and file a well-structured GitHub issue (ADR-0028). This is the same issue-filing flow triggered by dashboard bug reports. Supports base64-encoded screenshot payloads and scans them for secrets before saving. Caps retries at 5 attempts per report.

Invariants: - Screenshot payloads are scanned for secrets before being written to disk. - Reports cap at _MAX_REPORT_ATTEMPTS (5) before being abandoned. - Kill-switch is via enabled_cb("report_issue") (ADR-0049).

RepoWikiStore¶

Kind: service · Context: shared-kernel · Anchor: src/repo_wiki.py:RepoWikiStore · Confidence: accepted Aliases: repo wiki store, wiki store, per-repo wiki

File-based per-repo wiki manager (ADR-0032). Owns the on-disk layout for both the self-repo wiki (flattened directly under wiki_root so it can live at docs/wiki/ alongside code) and managed-repo wikis (nested under wiki_root/owner/repo). Provides ingest, lookup, indexing, lint, and append-only operation logging across the topic pages and structured WikiIndex.

Invariants: - Self-repo pages live directly under wiki_root; every other slug is nested under wiki_root/owner/repo. - ingest() updates topic pages, refreshes index.json/index.md, and appends to log.jsonl in a single operation. - When a tracked_root with per-entry layout is configured, reads prefer it and fall back to the legacy topic-page layout.

ReviewInsightStorePort¶

Kind: port · Context: shared-kernel · Anchor: src/ports.py:ReviewInsightStorePort · Confidence: accepted Aliases: review insight store port, review insight port

Hexagonal port for persisting and querying recurring reviewer-feedback patterns. Implemented by review_insights.ReviewInsightStore. ReviewPhase injects this port to record each review outcome and to query which feedback categories have been seen often enough to inject mandatory guidance blocks into the next agent prompt. The port decouples ReviewPhase from the JSONL file-storage backend.

Invariants: - Pure Protocol — no implementation, no state. - Methods cover the full lifecycle: append_review writes a new record, load_recent reads recent history, get_proposed_categories and mark_category_proposed gate category escalation, and record_proposal, load_proposal_metadata, and update_proposal_verified track whether a proposed mandatory block reduced the pattern.

RouteBackCounterPort¶

Kind: port · Context: shared-kernel · Anchor: src/route_back.py:RouteBackCounterPort · Confidence: accepted Aliases: route-back counter port, route back counter, precondition retry counter port

Hexagonal port for the per-issue route-back counter. Lives in src/route_back.py alongside RouteBackCoordinator. The coordinator depends on this port rather than StateTracker directly, so unit tests can wire a tiny in-memory dict implementation without pulling in the full state layer. Production wiring connects StateTracker as the concrete adapter.

Invariants: - Pure Protocol — no implementation, no state. - Three methods: get_route_back_count reads the current count; increment_route_back_count returns the new count after incrementing; decrement_route_back_count rolls back an increment when a subsequent label swap fails, preventing transient network blips from burning route-back budget without any actual route-back occurring. - decrement_route_back_count must be a no-op (returning 0) when the counter is already at zero.

SentryLoop¶

Kind: loop · Context: caretaker · Anchor: src/sentry_loop.py:SentryLoop · Confidence: accepted Aliases: sentry loop, sentry ingestion loop, sentry issue poller

Background loop that polls the Sentry API for unresolved issues across configured projects, deduplicates them against already-filed GitHub issues, and invokes a Claude agent via /hf.issue to research the codebase and file a properly triaged GitHub issue — the same flow as dashboard bug reports (ADR-0055). Requires Sentry credentials in config. Sentry captures real code bugs only; transient failures and operational noise are excluded by the loop's dedup and filtering logic.

Invariants: - Issues are deduplicated before filing; re-ingestion of the same Sentry event does not produce duplicate GitHub issues. - Kill-switch is via enabled_cb("sentry") (ADR-0049). - Sentry errors in the ERROR+ level range trigger the issue-filing path; WARNING and below are skipped.

SkillPromptEvalLoop¶

Kind: loop · Context: caretaker · Anchor: src/skill_prompt_eval_loop.py:SkillPromptEvalLoop · Confidence: accepted Aliases: skill prompt eval loop, skill prompt drift detector, corpus health auditor

Weekly background loop with two responsibilities. First, it runs the full adversarial skill-prompt corpus on a fixed schedule, catching regressions the RC-gate subset sampling misses (ADR-0045 §4.6). Second, it samples 10% of provenance: learning-loop corpus cases, flagging any that the expected_catcher skill passes — a weak-case signal the §4.1 learner uses for corpus quality improvement. Files skill-prompt-drift issues on PASS→FAIL transitions and corpus-case-weak issues for human triage.

Invariants: - Both subsystems share a single loop tick; neither runs independently. - Issues are deduplicated before filing; the same regression does not produce duplicate bead spam. - Subclasses BaseBackgroundLoop; kill-switch is via enabled_cb("skill_prompt_eval") (ADR-0049).

StaleIssueGCLoop¶

Kind: loop · Context: caretaker · Anchor: src/stale_issue_gc_loop.py:StaleIssueGCLoop · Confidence: accepted Aliases: stale issue gc loop, stale hitl gc loop, hitl stale closer

Caretaker loop that auto-closes stale HITL escalation issues (ADR-0029). Scope is specifically open issues carrying the configured HITL label that have been inactive beyond stale_issue_threshold_days. Posts a farewell comment, then closes. Caps at 10 closures per cycle to avoid GitHub rate-limiting. Distinct from StaleIssueLoop, which handles stale general issues with no HydraFlow lifecycle label — the two loops share only the BaseBackgroundLoop framework and have zero business-logic overlap.

Invariants: - Maximum 10 issues closed per tick to respect GitHub rate limits. - Only closes issues carrying the HITL label (config.hitl_label), not general issues. - StaleIssueGCLoop and StaleIssueLoop are fully separate; do not conflate them.

StateTracker¶

Kind: service · Context: shared-kernel · Anchor: src/state/__init__.py:StateTracker · Confidence: accepted Aliases: state tracker, state facade, state mixin facade

JSON-file backed state service for crash recovery. Composes ~30 domain mixins (issue, workspace, HITL, review, route-back, epic, session, worker, principles audit, sentry, trust fleet, ...) into a single facade that writes /.hydraflow/state.json after every mutation and rotates timestamped backups so a corrupt primary file can be restored from .bak.

Invariants: - Every mutating method persists state to disk before returning. - Issue/PR/epic numbers are stored as string keys; helpers convert to int on read. - On corrupt primary file, load() falls back to the most recent .bak before defaulting to an empty StateData.

Task¶

Kind: entity · Context: builder · Anchor: src/models.py:Task · Confidence: accepted Aliases: work item, ticket

A source-agnostic work item abstraction representing tasks from any source (GitHub issues, Linear tickets, etc.) that flow through HydraFlow's pipeline. Tasks carry metadata, support typed relationships via TaskLink, and serve as the unified representation for all work regardless of origin. Relationship extraction follows first-match precedence across compiled regex patterns.

Invariants: - TaskLink relationships extracted via regex patterns with first-match precedence per target_id - URLs validated as empty or http(s):// via AfterValidator - Timestamps validated as empty or ISO 8601 format

TermPrunerLoop¶

Kind: loop · Context: caretaker · Anchor: src/term_pruner_loop.py:TermPrunerLoop · Confidence: accepted Aliases: term pruner loop, UL pruner, glossary pruner

Caretaker background loop that autonomously prunes stale terms from the ubiquitous-language glossary (ADR-0057). On each tick it scans every confidence == "accepted" term in docs/wiki/terms/; for any term whose code_anchor no longer resolves in the live symbol index (built by ubiquitous_language.build_symbol_index), it opens an auto-merging bot PR that flips confidence to deprecated and records a superseded_reason with the broken anchor. The loop makes no LLM calls — detection is purely structural.

Invariants: - Kill-switch: enabled_cb("term_pruner") AND config.term_pruner_enabled — both must be true for work to proceed. - Opens at most one PR per tick, bundling all eligible terms into a single hydraflow-ul-deprecated-labelled PR. - ReviewPhase skips routing for PRs carrying TERM_PRUNER_PR_LABEL so the deprecation PR is not sent through the agent pipeline. - Companion to TermProposerLoop: together they implement the two-tick grow/prune cycle that keeps make lint-ul anchor-resolution green without human intervention.

WikiRotDetectorLoop¶

Kind: loop · Context: caretaker · Anchor: src/wiki_rot_detector_loop.py:WikiRotDetectorLoop · Confidence: accepted Aliases: wiki rot detector loop, wiki rot detector, cite freshness loop

Trust-fleet caretaker loop (ADR-0045 §4.9) that detects broken code citations in per-repo wikis. On each tick it walks every RepoWikiStore-registered repo's wiki entries, extracts code references via three patterns (path.py:symbol, dotted src.module.Class, and bare identifiers inside fenced code blocks as hints only), then verifies each hard cite. HydraFlow-self cites are checked via AST introspection; managed-repo cites are verified via grep over wiki markdown mirrors. For each broken cite the loop files a hydraflow-find + wiki-rot issue via PRManager, with a fuzzy-match suggestion from difflib.get_close_matches when the containing module still exists. After three unresolved attempts for a given slug+cite pair the loop escalates to hitl-escalation + wiki-rot-stuck.

Invariants: - Kill-switch: enabled_cb("wiki_rot_detector") only — no config field (ADR-0049 trust-fleet convention). - Calls reraise_on_credit_or_bug(exc) in its broad except block to prevent CreditExhaustedError from being silently swallowed. - Does not run on startup (run_on_startup=False) — the first tick is deferred to the normal interval.

WorkspaceGCLoop¶

Kind: loop · Context: caretaker · Anchor: src/workspace_gc_loop.py:WorkspaceGCLoop · Confidence: accepted Aliases: workspace gc loop, workspace garbage collector, worktree gc loop

Background caretaker loop that periodically garbage-collects stale worktrees and orphaned branches. Handles three leak classes: worktrees tracked in StateTracker whose PR has been merged or closed, orphaned worktree directories on disk with no StateTracker entry, and orphaned remote branches with no open PR. Catches worktrees that leak when PRs are merged manually, via HITL resolution, or when implementations fail or crash mid-cleanup.

Invariants: - Kill-switch: enabled_cb("workspace_gc") AND config.workspace_gc_loop_enabled — both must be true to run. - Caps at _MAX_GC_PER_CYCLE = 20 collections per tick to avoid long-running passes. - State removal happens before WorkspacePort.destroy() so a crash between the two steps leaves the entry gone; destroy() is idempotent. - An optional is_in_pipeline_cb guard prevents GC of issues still being actively processed by a phase.

WorkspacePort¶

Kind: port · Context: shared-kernel · Anchor: src/ports.py:WorkspacePort · Confidence: accepted Aliases: workspace port, worktree port, git workspace port

Hexagonal port for git workspace lifecycle operations — create and destroy isolated worktrees per issue, merge main into a worktree, list conflicting files, hard-reset to origin/main, abort an in-progress merge, and run post-work cleanup. Implemented by workspace.WorkspaceManager; the abstraction is what lets phases stay agnostic to the concrete worktree machinery.

Invariants: - Pure Protocol — no implementation, no state. - Each managed worktree is keyed by issue_number; create() returns the worktree path used for subsequent calls.