Round-3 review follow-up on c53d625 (P1/P2):
- P1: --build-staging now runs staging_check via parametrized
STAGING_CONTAINER / STAGING_CHECK_PATH / STAGING_CHECK_MODE (default
orchestrator-staging / bind-mount path / stub) instead of hardcoding
$TARGET_SERVICE + the script path. docker exec runs INSIDE the staging
container (ORCH-048 canonical: B6 registry isolation), after health,
before exit 0. Fail-closed: any non-zero -> exit 1. STAGING only (8501).
- P2a: rebuild_staging_image now passes the STAGING target EXPLICITLY
(TARGET_SERVICE/TARGET_PORT/COMPOSE_PROFILE/STAGING_CONTAINER) so the
self-rebuild can never drift onto prod 8500 if hook defaults change (AC-9).
- P2b: TC-09 caller<->hook contract tests assert the ssh command carries
GIT_SHA + BUILD_CONTEXT + the staging target and never the prod 8500 one;
no-ssh-host fails closed.
- P3: consolidated the three duplicate README footers into one.
- Docs (golden source): DEPLOY_HOOK.md step 4 + env rows, README footer,
CHANGELOG, Dockerfile ARG GIT_SHA="" comment, .env.example freshness block.
Validates exactly the artefact that BUILD-ONCE later retags to prod (AC-4,
ADR-001 step 3). 632 tests pass, ruff clean, bash -n OK.
Refs: ORCH-058
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
104 lines
6.1 KiB
Plaintext
ORCH_PLANE_API_URL=http://plane-app-api-1:8000
# External (browser) web URL of Plane for clickable issue links in notifications
# (ORCH-017). Falls back to ORCH_PLANE_API_URL; a loopback fallback is treated as
# "no web URL" and the Plane link is omitted. Example: https://plane.example.org
ORCH_PLANE_WEB_URL=
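The ORCH-017 fallback rule above could be sketched roughly as follows. This is an illustration only, not the orchestrator's actual code: the function names `is_loopback_url` and `resolve_plane_web_url` are hypothetical, and treating `localhost` names as loopback is an assumption.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """True when the URL's host is localhost or a loopback IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return True  # assumption: localhost names count as loopback
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a regular DNS name such as plane.example.org


def resolve_plane_web_url(web_url: str, api_url: str) -> Optional[str]:
    """Prefer ORCH_PLANE_WEB_URL; otherwise fall back to ORCH_PLANE_API_URL.

    Returns None when only a loopback fallback is available, meaning the
    Plane link should be omitted from the notification.
    """
    if web_url.strip():
        return web_url.strip().rstrip("/")
    fallback = api_url.strip()
    if not fallback or is_loopback_url(fallback):
        return None
    return fallback.rstrip("/")
```

With an empty ORCH_PLANE_WEB_URL and an API URL of `http://127.0.0.1:8000`, this sketch returns `None`, so no link would be rendered.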
ORCH_PLANE_API_TOKEN=
ORCH_PLANE_WORKSPACE_SLUG=
ORCH_PLANE_WEBHOOK_SECRET=
ORCH_GITEA_URL=http://localhost:3000
ORCH_GITEA_TOKEN=
ORCH_GITEA_WEBHOOK_SECRET=
ORCH_CLAUDE_BIN=/usr/bin/claude
ORCH_REPOS_DIR=/home/slin/repos
ORCH_DB_PATH=/app/data/orchestrator.db
# ORCH-042: live-tracker mode. edit (DEFAULT) -> the task card is edited in place
# (editMessageText). bump -> on every update the old card is deleted and a fresh
# one is sent silently to the BOTTOM of the chat (deleteMessage + sendMessage +
# repoint). One card per task in both modes. Any value other than "bump" -> edit.
ORCH_TRACKER_MODE=edit
# ORCH-043: merge-gate (auto-rebase onto current origin/main + re-test + merge-lock)
# on the deploy-staging -> deploy edge. Deterministic sub-gate (no LLM) that brings
# the branch up to date with the CURRENT origin/main, re-tests it, and serialises
# merges so two green parallel branches can't break main.
# ENABLED -> global kill-switch (false -> whole gate is a no-op pass).
# REPOS -> CSV of repos where the gate is REAL; empty -> only the self-hosting
# repo (orchestrator); other repos -> conditional no-op (mirrors ORCH-35).
# RETEST_TIMEOUT_S -> wall-clock budget for the post-rebase re-test.
# RETEST_TARGET -> pytest target for the re-test.
# LOCK_TIMEOUT_S -> max merge-lease age before a stale lease is reclaimed.
# DEFER_DELAY_S -> delay before re-running the gate when the lock is busy.
# DEFER_MAX_ATTEMPTS -> defer retries before escalation (avoids livelock).
ORCH_MERGE_GATE_ENABLED=true
ORCH_MERGE_GATE_REPOS=
ORCH_MERGE_RETEST_TIMEOUT_S=600
ORCH_MERGE_RETEST_TARGET=tests/
ORCH_MERGE_LOCK_TIMEOUT_S=300
ORCH_MERGE_DEFER_DELAY_S=60
ORCH_MERGE_DEFER_MAX_ATTEMPTS=5
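One possible reading of how the ORCH-043 knobs interact is sketched below. All function names are hypothetical, and two points are assumptions rather than documented behaviour: that a non-empty REPOS list is taken literally (the self-hosting repo is included only if listed), and that the action labels are "defer" and "escalate".

```python
from typing import Tuple


def gate_is_real(repo: str, enabled: bool, repos_csv: str,
                 self_repo: str = "orchestrator") -> bool:
    """Decide whether the merge-gate really runs for this repo."""
    if not enabled:
        return False  # kill-switch: the whole gate is a no-op pass
    allowed = {r.strip() for r in repos_csv.split(",") if r.strip()}
    if not allowed:
        return repo == self_repo  # empty CSV -> only the self-hosting repo
    return repo in allowed  # assumption: non-empty CSV is taken literally


def lease_is_stale(acquired_at: float, now: float,
                   lock_timeout_s: float = 300) -> bool:
    """A merge lease older than LOCK_TIMEOUT_S may be reclaimed."""
    return now - acquired_at > lock_timeout_s


def on_lock_busy(defers_so_far: int, max_attempts: int = 5,
                 delay_s: int = 60) -> Tuple[str, int]:
    """Defer while the retry budget lasts, then escalate (avoids livelock)."""
    if defers_so_far >= max_attempts:
        return ("escalate", 0)
    return ("defer", delay_s)
```

With the defaults shown in this file, a busy lock would be retried every 60 seconds for at most five defers before escalation.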
# ORCH-036: executable self-deploy of the `deploy` stage. For the self-hosting repo
# (orchestrator) the stage REALLY restarts prod (8500) via a detached host hook;
# deploy_status: SUCCESS means proven health-ok, not an LLM declaration. Three
# deterministic phases (A: request approve, B: human Approved -> detached deploy,
# C: finalizer maps hook exit-code -> deploy_status). Non-self repos: unchanged
# synchronous ssh deploy. SECRETS / host paths live ONLY on the host; do NOT commit.
# SELF_DEPLOY_ENABLED -> global kill-switch (false -> legacy synchronous deploy for all).
# SELF_DEPLOY_REPOS -> CSV of repos where Phase A/B/C is REAL; empty -> only the
# self-hosting repo (orchestrator); others -> no-op (mirrors ORCH-35).
# DEPLOY_REQUIRE_MANUAL_APPROVE -> require a human Plane "Approved" before the prod
# deploy (true on rollout; full auto is ORCH-54).
# DEPLOY_FINALIZE_DELAY_S -> delay before the first/each finalize poll (>= hook+health).
# DEPLOY_FINALIZE_MAX_ATTEMPTS -> bounded finalize-defer budget (anti-livelock).
# DEPLOY_SSH_USER / DEPLOY_SSH_HOST -> ssh target for the host hook (DEPLOY_SSH_HOST
# empty -> detached deploy will NOT launch; set on the host).
# DEPLOY_HOOK_SCRIPT -> path to the hook ON THE HOST (relative to the repo).
# DEPLOY_HOST_REPO_PATH -> orchestrator clone path on the host.
# DEPLOY_PROD_SOURCE_IMAGE -> staging-validated image, retagged build-once (no rebuild).
# DEPLOY_PROD_TARGET_SERVICE / _PORT / _IMAGE / _COMPOSE_PROFILE -> prod compose profile.
# DEPLOY_PROD_PREV_IMAGE_FILE -> prod prev-image snapshot (separate from staging's).
ORCH_SELF_DEPLOY_ENABLED=true
ORCH_SELF_DEPLOY_REPOS=
ORCH_DEPLOY_REQUIRE_MANUAL_APPROVE=true
ORCH_DEPLOY_FINALIZE_DELAY_S=90
ORCH_DEPLOY_FINALIZE_MAX_ATTEMPTS=10
ORCH_DEPLOY_SSH_USER=slin
ORCH_DEPLOY_SSH_HOST=
ORCH_DEPLOY_HOOK_SCRIPT=scripts/orchestrator-deploy-hook.sh
ORCH_DEPLOY_HOST_REPO_PATH=/home/slin/repos/orchestrator
ORCH_DEPLOY_PROD_SOURCE_IMAGE=orchestrator-orchestrator-staging
ORCH_DEPLOY_PROD_TARGET_SERVICE=orchestrator
ORCH_DEPLOY_PROD_TARGET_PORT=8500
ORCH_DEPLOY_PROD_TARGET_IMAGE=orchestrator-orchestrator
ORCH_DEPLOY_PROD_COMPOSE_PROFILE=
ORCH_DEPLOY_PROD_PREV_IMAGE_FILE=.deploy-prev-image-prod

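Phase C (the finalizer) might look something like the bounded poll below. This is a sketch under stated assumptions: `finalize_deploy` and `read_exit_code` are hypothetical names, and only `SUCCESS` comes from the comments above; the `FAILED` and `ESCALATED` labels are invented for illustration.

```python
import time
from typing import Callable, Optional


def finalize_deploy(read_exit_code: Callable[[], Optional[int]],
                    delay_s: float = 90,
                    max_attempts: int = 10,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Map the detached hook's exit code to a deploy status.

    read_exit_code returns None while the hook is still running.
    """
    for _ in range(max_attempts):
        sleep(delay_s)  # the delay applies before the first poll as well
        code = read_exit_code()
        if code is None:
            continue  # hook has not finished yet; spend another attempt
        # Exit 0 means the hook proved health-ok; anything else is a failure.
        return "SUCCESS" if code == 0 else "FAILED"
    return "ESCALATED"  # budget exhausted without an exit code (anti-livelock)
```

With the defaults in this file (90 s delay, 10 attempts) the finalizer would give up after at most 900 seconds of waiting.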
# ORCH-058: staging-image provenance before the BUILD-ONCE prod retag (INV-FRESH).
# Guarantees the staging image promoted to prod is the EXACT artefact rebuilt from the
# validated commit. Two layers, self-hosting only:
# A (liveness): QG sub-check `check_staging_image_fresh` on the deploy-staging->deploy
# edge rebuilds orchestrator-orchestrator-staging from the validated commit + recreates
# 8501; FAIL -> rollback to development. (Builds/recreates STAGING only, never prod.)
# B (safety): the Dockerfile stamps `org.opencontainers.image.revision`; the prod hook
# fail-closes (exit 1) before `docker tag` if SOURCE_IMAGE's label != EXPECTED_REVISION.
# ENABLED -> single kill-switch for A+B as a WHOLE (never "B without A"); false -> legacy.
# REPOS -> CSV of repos where the gate is REAL; empty -> only self-hosting (orchestrator).
ORCH_IMAGE_FRESHNESS_ENABLED=true
ORCH_IMAGE_FRESHNESS_REPOS=

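The layer B comparison can be illustrated with a short sketch. The real check lives in the host hook script; the helper names here are hypothetical, and treating an empty label or empty expectation as a mismatch is an assumption about what "fail-closed" should mean when the image was built without a revision.

```python
import subprocess

REVISION_LABEL = "org.opencontainers.image.revision"


def read_image_revision(image: str) -> str:
    """Read the revision label from a local image via `docker inspect`."""
    fmt = '{{ index .Config.Labels "%s" }}' % REVISION_LABEL
    out = subprocess.run(
        ["docker", "inspect", "--format", fmt, image],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()


def revision_matches(label: str, expected: str) -> bool:
    """Fail closed: an empty label or empty expectation never matches."""
    return bool(label) and bool(expected) and label == expected
```

A caller would run `revision_matches(read_image_revision(source_image), expected_revision)` and refuse to retag when it returns `False`.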
# ORCH-053: stuck-task reconciler (sweeper for lost webhooks). A background daemon
# replays a missed stage transition through the SAME gates/handlers a webhook would,
# fixing tasks that got stuck on a dropped event (502 on rebuild, no Plane/Gitea
# retries, unresolved sha->branch).
# ENABLED -> global kill-switch (self-hosting safety / staged rollout).
# PLANE_ENABLED -> separate flag for the F-2 Plane-API poll (mute only F-2).
# INTERVAL_S -> background sweep period (seconds).
# GRACE_DEFAULT_S -> default "stuck" threshold on tasks.updated_at (seconds).
# GRACE_OVERRIDES_JSON -> per-stage thresholds, e.g. {"development":300}; bad JSON -> default.
# NOTIFY_UNBLOCK -> send a Telegram message when a stuck task is unblocked.
ORCH_RECONCILE_ENABLED=true
ORCH_RECONCILE_PLANE_ENABLED=true
ORCH_RECONCILE_INTERVAL_S=120
ORCH_RECONCILE_GRACE_DEFAULT_S=600
ORCH_RECONCILE_GRACE_OVERRIDES_JSON=
ORCH_RECONCILE_NOTIFY_UNBLOCK=true
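The grace-threshold rule for the reconciler could be parsed along these lines. The function names are hypothetical, and dropping non-numeric entries silently is an assumption; the comments above only state that bad JSON falls back to the default.

```python
import json
from typing import Dict, Optional


def load_grace_overrides(raw: str) -> Dict[str, float]:
    """Parse ORCH_RECONCILE_GRACE_OVERRIDES_JSON; bad input -> no overrides."""
    if not raw.strip():
        return {}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {}  # bad JSON -> every stage uses GRACE_DEFAULT_S
    if not isinstance(data, dict):
        return {}
    # Assumption: keep only numeric thresholds, ignore anything else.
    return {
        str(stage): float(seconds)
        for stage, seconds in data.items()
        if isinstance(seconds, (int, float)) and not isinstance(seconds, bool)
    }


def is_stuck(stage: str, updated_at: float, now: float,
             default_s: float = 600,
             overrides: Optional[Dict[str, float]] = None) -> bool:
    """A task is stuck once it has been idle longer than its stage threshold."""
    threshold = (overrides or {}).get(stage, default_s)
    return now - updated_at > threshold
```

With `{"development":300}`, a task in `development` would count as stuck after 300 seconds of inactivity, while every other stage keeps the 600-second default.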