orchestrator/docs/work-items/ORCH-043/14-deploy-log.md
2026-06-06 17:45:13 +00:00


deploy_status: SUCCESS
timestamp: 2026-06-06T17:44:25Z
work_item: ORCH-043
target: prod orchestrator (8500) — self-hosting
staging_gate: SUCCESS
merge_gate: SUCCESS
rebuild_required: true
restart_required: true
mode: artifact-validated; prod rebuild+restart handed off to Owner (self-hosting safeguard)

Production Deploy Log — ORCH-043

feat(merge-gate): auto-rebase onto current main + re-test + serialise merges

Verdict

deploy_status: SUCCESS — the deployable artifact is validated and ready, and the automated deploy-stage responsibility is complete. ORCH-043 changes runtime src/ code, so the live prod rollout needs a container rebuild + restart. Per the self-hosting guardrail that step is an Owner action (see Handoff) and was deliberately NOT performed by this agent.

Precondition: staging gate (check_staging_status)

The deploy stage is reachable only because the staging gate passed:

  • 15-staging-log.md → staging_status: SUCCESS, 10/10 checks PASS on the live orchestrator-staging instance (8501), run inside the staging container (ORCH-048 canon). This is the mandatory pre-prod safeguard for self-hosting (ADR-0003 staging gate).

Precondition: merge-gate (check_branch_mergeable, ORCH-043 itself)

The new merge-gate runs on the deploy-staging → deploy edge, before this stage: it validates the branch against current origin/main (catch-up rebase + re-test + serialised merge-lease). The branch reached deploy, so the gate did not roll back or defer. Note: the branch carries this same gate code, so it is the first task to be gated by its own feature (dog-fooding), which the green staging run exercised.
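The gate sequence described above (take the merge-lease, catch up onto main, re-test) can be sketched roughly as follows. This is an illustrative sketch only, not the code in src/merge_gate.py or src/qg/checks.py: the `MergeLease` class, the `check_branch_mergeable_sketch` name, the injected callables, and the mapping of each failure to "defer" or "rollback" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MergeLease:
    """In-memory stand-in for the merge-lease (hypothetical; real helpers live in src/db.py)."""
    holder: Optional[str] = None

    def acquire(self, branch: str) -> bool:
        # Only one branch may hold the lease; the current holder may re-acquire.
        if self.holder in (None, branch):
            self.holder = branch
            return True
        return False

    def release(self, branch: str) -> None:
        # Releasing is a no-op unless this branch is the holder.
        if self.holder == branch:
            self.holder = None


def check_branch_mergeable_sketch(
    branch: str,
    lease: MergeLease,
    rebase_onto_main: Callable[[str], bool],
    run_tests: Callable[[str], bool],
) -> str:
    """Return 'pass', 'defer', or 'rollback' (outcome names are assumptions)."""
    if not lease.acquire(branch):
        return "defer"  # another merge is in flight, so try again later
    if not rebase_onto_main(branch):
        lease.release(branch)
        return "rollback"  # catch-up rebase onto current main failed
    if not run_tests(branch):
        lease.release(branch)
        return "rollback"  # rebased branch no longer passes its tests
    return "pass"  # lease stays held until the PR-merged event releases it
```

Passing the rebase and test steps in as callables keeps the sketch runnable without git or a test runner; in this model the lease is released on failure and otherwise held until the PR-merged webhook fires.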

Change scope (why a prod rebuild+restart IS required)

Unlike bind-mount-only changes (cf. ORCH-048), ORCH-043 modifies code that lives inside the prod image and is executed by the running app:

| File | Kind | Reaches prod via |
| --- | --- | --- |
| src/merge_gate.py | new runtime module | image rebuild |
| src/config.py | runtime config (merge-gate flags, retest target/timeout) | image rebuild |
| src/db.py | merge-lease helpers (schema-compatible, no migration) | image rebuild |
| src/qg/checks.py | new check_branch_mergeable gate | image rebuild |
| src/stage_engine.py | sub-gate dispatch on the deploy edge | image rebuild |
| src/webhooks/gitea.py | PR-merged → release merge-lease | image rebuild |
| tests/*, docs/* | tests + docs | n/a (not deployed) |

Because src/ changed, the running prod process picks up ORCH-043 only after a rebuild + restart of the shared prod orchestrator (8500).
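The rule applied here (anything under src/ needs an image rebuild, while tests and docs do not) can be expressed as a small predicate. This is a sketch under that single assumption; the function name `rebuild_required` and the `RUNTIME_PREFIXES` constant are hypothetical and are not taken from the orchestrator code.

```python
from typing import Iterable

# Path prefixes assumed to be baked into the prod image (assumption for this sketch).
RUNTIME_PREFIXES = ("src/",)


def rebuild_required(changed_paths: Iterable[str]) -> bool:
    """True if any changed path lives in code that ships inside the image."""
    return any(path.startswith(RUNTIME_PREFIXES) for path in changed_paths)


# Changed files as listed in the table above.
orch_043_changes = [
    "src/merge_gate.py",
    "src/config.py",
    "src/db.py",
    "src/qg/checks.py",
    "src/stage_engine.py",
    "src/webhooks/gitea.py",
]
needs_rebuild = rebuild_required(orch_043_changes)
```

With the ORCH-043 file list the predicate is true, matching `rebuild_required: true` in the log header; a change touching only tests/ or docs/ would yield false.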

Deploy action

  • Prod container rebuild/restart: required, not performed (guardrail: never rebuild/restart the shared prod orchestrator within an ORCH task — it serves all projects incl. enduro-trails from one instance with a shared DB/queue; an in-task restart is a group risk for every project).
  • Real docker/SSH deploy hook (scripts/orchestrator-deploy-hook.sh): not triggered by this agent (not explicitly instructed; reserved for the Owner per ORCH-36 / DEPLOY_HOOK.md).
  • Effective delivery: merge of this branch to main lands the source of truth; the prod cut-over (rebuild + restart) is the documented Owner step below.

Handoff — Owner prod cut-over (DEPLOY_HOOK.md, INFRA.md §Self-hosting)

Perform only in a quiet window and in this order:

  1. P-4 (BLOCKER) — confirm GET http://localhost:8500/status shows no active tasks before touching prod (shared instance with enduro-trails).
  2. Host git pull on main under uid 1000 (/home/slin/repos/orchestrator).
  3. Prod cut-over via the deploy hook (conscious prod override — defaults are staging):
    TARGET_SERVICE=orchestrator TARGET_PORT=8500 \
    TARGET_IMAGE=orchestrator-orchestrator COMPOSE_PROFILE="" \
    PREV_IMAGE_FILE=/home/slin/repos/orchestrator/.deploy-prev-image-prod \
    bash scripts/orchestrator-deploy-hook.sh --deploy
    
    The hook snapshots the previous image, runs a 60s health loop on :8500/health, and auto-rolls back if the new container is unhealthy.
  4. Post-deploy smoke: GET /health → 200 {"status":"ok"}, GET /queue returns counts; confirm a subsequent ORCH/ET task transitions cleanly through the new merge-gate (no spurious defer/rollback).
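The hook's behaviour in step 3 (bounded health loop, then automatic rollback) follows a common pattern that can be sketched as below. This is not the contents of scripts/orchestrator-deploy-hook.sh, which is a shell script; `wait_healthy`, `cut_over`, the polling interval, and the injected callables are assumptions for illustration. Only the 60s window comes from the log.

```python
import time
from typing import Callable


def wait_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,  # polling interval is an assumption
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe() until it succeeds or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def cut_over(
    start_new: Callable[[], None],
    probe: Callable[[], bool],
    rollback: Callable[[], None],
    **wait_kwargs,
) -> str:
    """Start the new container, then roll back if it never becomes healthy."""
    start_new()
    if wait_healthy(probe, **wait_kwargs):
        return "deployed"
    rollback()  # restore the previously snapshotted image
    return "rolled_back"
```

In practice `probe` would wrap something like a GET against :8500/health and check for a 200 response; injecting `clock` and `sleep` lets the loop be exercised without real waiting.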

Summary

| Item | State |
| --- | --- |
| Staging gate (check_staging_status) | SUCCESS (10/10) |
| Merge-gate (check_branch_mergeable) | SUCCESS (branch reached deploy) |
| DB schema migration | none (lease is schema-compatible) |
| In-task prod rebuild/restart | NOT performed (self-hosting safeguard, by design) |
| Prod cut-over | handed off to Owner (P-4 + deploy hook, prod override) |
| Deploy stage verdict | SUCCESS |