Commit Graph

3 Commits

Author SHA1 Message Date
1a132dcff2 feat(deploy): dedicated "Confirm Deploy" status triggers prod deploy
All checks were successful
CI / test (push) Successful in 18s
CI / test (pull_request) Successful in 18s
Split the overloaded `Approved` Plane status: it served BOTH as the human BRD
gate on `analysis` AND as the silent Phase B prod-deploy trigger on `deploy`
(ORCH-036), so a routine approve could launch a self-hosting prod restart.

ORCH-059 introduces a dedicated logical status `confirm_deploy` ("Confirm
Deploy") that triggers ONLY Phase B on `deploy`; `Approved` stays purely a
pipeline gate.

- plane_sync: map "Confirm Deploy" -> "confirm_deploy" in _PLANE_NAME_TO_KEY;
  intentionally absent from _DEFAULT_STATES => fail-closed (no UUID -> .get
  yields None, no KeyError, no blind deploy).
- webhooks/plane: handle_issue_updated routes "Confirm Deploy" (fail-closed
  .get) to new handle_confirm_deploy (guarded to stage=="deploy") ->
  _try_advance_stage(confirm_deploy=True).
- stage_engine: advance_stage gains keyword-only confirm_deploy=False; Phase B
  block returns early for deploy+finished_agent is None but only initiates the
  deploy when confirm_deploy=True; a plain Approved is a deterministic no-op
  (returns before check_deploy_status -> no false БАГ-8 rollback).
- Phase A CTA now asks the operator for "Confirm Deploy", not "Approved".

Contracts unchanged: STAGE_TRANSITIONS, QG_CHECKS, check_deploy_status, hook
exit codes, Phases A/C, merge-gate, DB schema. Conditional like ORCH-35/36
(self-hosting only). Docs updated (CLAUDE.md, architecture/README.md, CHANGELOG).

Refs: ORCH-059

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-07 19:12:41 +00:00
2f4c553fd8 feat(post-deploy): post-deploy prod monitoring + degradation reaction (ORCH-021)
Extend pipeline responsibility past deploy->done: after the terminal
transition for an applicable repo, arm a ~15min observation window that
probes prod and reacts to a degradation the restart-time health-check
missed ("green deploy, red prod").

- src/post_deploy.py: new leaf module (config + lazy qg/db only).
  Sentinel-file restart-safe state (.post-deploy-state-<repo>/<wi>/),
  no DB migration. probe_signals/classify/decide_action/run_rollback,
  all never-raise.
- Reserved-agent job `post-deploy-monitor` (no-LLM, Variant B, calque of
  deploy-finalizer): self-requeues each tick via enqueue_job.
- Deterministic classify: DEGRADED iff >= fail_threshold consecutive
  health failures OR window 5xx ratio > 5xx_threshold; fail-safe HEALTHY.
- Self-hosting invariant (BR-5/AC-8): a tick NEVER restarts the prod
  orchestrator container -> orchestrator is ALWAYS ALERT_ONLY.
- Conditionality (ORCH-35/36/43/58): kill-switch + CSV repos, empty ->
  self-hosting only.
- QG_CHECKS / STAGE_TRANSITIONS / schema unchanged (AC-12).
- Docs: CHANGELOG, CLAUDE artefact list (16-post-deploy-log.md),
  architecture README, .env.example (ORCH_POST_DEPLOY_*).

Refs: ORCH-021

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-07 14:40:06 +00:00
Dev Agent
7c68d1d812 docs(orchestrator): adopt enduro doc canon + CLAUDE.md + ADR (ORCH-9)
All checks were successful
CI / test (pull_request) Successful in 9s
2026-06-05 12:33:55 +03:00