feat(coverage): deterministic test-coverage gate on deploy-staging->deploy edge (ORCH-027)

Introduce a deterministic (no-LLM) coverage sub-gate that blocks coverage
degradation before a task branch merges into `main`. Existing gates check only
that the tests pass (check_ci_green / check_tests_passed / merge-gate re-test), not
how much of the code they cover, so a batch autonomous run (ORCH-088) can silently
erode coverage.

Pattern mirrors the security-gate (ORCH-022): leaf src/coverage_gate.py (never-raise)
+ thin check_coverage_gate in QG_CHECKS + _handle_coverage_gate splice in advance_stage,
run AFTER merge-gate (measured on the caught-up HEAD that lands in main) and BEFORE
image-freshness (fail before the expensive docker rebuild).

- measure_coverage: pytest --cov=src --cov-report=json in the per-branch worktree ->
  line coverage %; None on tool error -> fail-open + WARNING by default (FR-6).
- compute_coverage_verdict (pure): absolute | baseline | both + epsilon (NFR-4 anti-flap);
  baseline None -> bootstrap (absolute-only).
- coverage_baseline DB table (additive, CREATE TABLE IF NOT EXISTS) + ratchet-up in
  _handle_merge_verify (deploy->done): atomic compare-and-set under merge-lease, never
  decreases; bootstrap on first merge.
- Artefact 18-coverage-report.md (coverage_status: frontmatter, single source of truth);
  GET /queue `coverage` block; FAIL -> Telegram; optional POST /coverage/baseline override.
- Flags ORCH_COVERAGE_* (kill-switch + self-hosting-only scope) -> enduro untouched;
  STAGE_TRANSITIONS / existing check_* / verdict keys byte-for-byte unchanged (NFR-5/AC-8).
- pytest-cov==5.0.0 added to requirements.txt.
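
A minimal sketch of the verdict and ratchet rules listed above, not the code in
src/coverage_gate.py. Assumptions: the function names, the "PASS"/"WARN"/"FAIL"
strings, the (repo, percent) table layout, applying epsilon to both the floor and
the baseline comparison, and treating an unknown policy as `both` are all
illustrative. The real ratchet runs under the merge-lease; here a conditional
UPDATE stands in for the compare-and-set.

```python
import sqlite3
from typing import Optional


def coverage_verdict(measured: Optional[float], baseline: Optional[float],
                     policy: str = "both", min_percent: float = 0.0,
                     epsilon: float = 0.5, tool_fail_closed: bool = False) -> str:
    """Pure verdict: returns "PASS", "WARN" or "FAIL" (hypothetical strings)."""
    if measured is None:
        # Coverage tool error: fail-open with a warning unless strict mode is on.
        return "FAIL" if tool_fail_closed else "WARN"
    # epsilon absorbs measurement jitter at the boundary (assumed for both checks).
    floor_ok = measured + epsilon >= min_percent
    # No baseline yet -> bootstrap: only the absolute floor can fail the task.
    base_ok = baseline is None or measured + epsilon >= baseline
    if policy == "absolute":
        ok = floor_ok
    elif policy == "baseline":
        ok = base_ok
    else:  # "both" (an unknown policy is treated as "both" in this sketch)
        ok = floor_ok and base_ok
    return "PASS" if ok else "FAIL"


def ratchet_baseline(conn: sqlite3.Connection, repo: str, measured: float) -> float:
    """Raise the stored baseline to `measured` if higher; never lower it."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS coverage_baseline ("
            "repo TEXT PRIMARY KEY, percent REAL NOT NULL)"
        )
        # Bootstrap on the first merge: insert only if no row exists yet.
        conn.execute(
            "INSERT OR IGNORE INTO coverage_baseline (repo, percent) VALUES (?, ?)",
            (repo, measured),
        )
        # Compare-and-set: the WHERE clause makes a decrease a no-op.
        conn.execute(
            "UPDATE coverage_baseline SET percent = ? WHERE repo = ? AND percent < ?",
            (measured, repo, measured),
        )
    row = conn.execute(
        "SELECT percent FROM coverage_baseline WHERE repo = ?", (repo,)
    ).fetchone()
    return row[0]
```

With the default epsilon of 0.5, a measurement of 80.0 against a baseline of 80.3
passes, while 79.0 fails; a later merge at lower coverage leaves the stored
baseline untouched.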

Tests: tests/test_coverage_gate.py (TC-01..TC-15). Frozen QG-registry anti-regress
tests + deploy-staging edge tests updated for the new sub-gate. Full suite green.

Docs: README / adr-0029 / PIPELINE_DOCS / 18-coverage-report.md template (architecture
stage) + CHANGELOG / CLAUDE.md / .env.example (this PR).

Refs: ORCH-027
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 01:04:21 +03:00
parent c0dc1940a6
commit b4b993cf63
16 changed files with 1496 additions and 2 deletions


@@ -259,6 +259,38 @@ class Settings(BaseSettings):
security_dep_audit_fail_closed: bool = False
security_secrets_block: bool = True
# ORCH-027: deterministic test-coverage gate on the deploy-staging -> deploy edge
# (AFTER the merge-gate, BEFORE image-freshness). Measures line coverage of src/
# under pytest-cov in the per-branch worktree, compares to an absolute floor and/or
# the ratchet baseline of `main`, and FAILs (rollback to development + developer
# retry) on degradation. Leaf src/coverage_gate.py (never-raise); machine verdict in
# 18-coverage-report.md frontmatter (coverage_status:). See ADR-001-coverage-gate.md.
# coverage_gate_enabled -> SINGLE kill-switch; False -> the pipeline behaves exactly
# as before ORCH-027 for every repo. Env ORCH_COVERAGE_GATE_ENABLED.
# coverage_gate_repos -> CSV of repos where the gate is REAL; empty -> only
# the self-hosting repo (orchestrator). Mirrors
# security_gate_repos / image_freshness_repos.
# coverage_min_percent -> absolute floor (% line coverage) for policy
# absolute/both. Default 0.0 -> safe rollout: the
# ratchet baseline drives no-regression, the floor
# never false-fails day one.
# coverage_policy -> absolute | baseline | both (default both): which
# condition(s) must hold (D3).
# coverage_epsilon -> small non-negative noise tolerance (%) so jitter at
# the boundary does not bounce a task (NFR-4).
# coverage_tool_fail_closed -> strict mode: a coverage-tool error -> FAIL instead
# of the default fail-open + warning (FR-6). Default
# False (anti-loop, precedent ORCH-061/022).
# coverage_run_timeout_s -> wall-clock budget for the pytest --cov run (mirrors
# merge_retest_timeout_s / security_scan_timeout_s).
coverage_gate_enabled: bool = True
coverage_gate_repos: str = ""
coverage_min_percent: float = 0.0
coverage_policy: str = "both"
coverage_epsilon: float = 0.5
coverage_tool_fail_closed: bool = False
coverage_run_timeout_s: int = 900
# ORCH-061: tolerate KNOWN sandbox-infra FAILs (C9a/C9b) in the staging suite.
# The self-hosting deploy-staging stage looped because scripts/staging_check.py
# exited non-zero on ANY failed check, so two infra-only failures (sandbox bot