feat(ORCH-058): staging-image provenance before BUILD-ONCE prod retag (INV-FRESH) #57

Merged
admin merged 9 commits from feature/ORCH-058-self-deploy-retag-staging into main 2026-06-07 13:04:07 +03:00

Summary

Enforces INV-FRESH for the BUILD-ONCE self-deploy (ORCH-036): the staging image promoted to prod is provably built from the validated commit. Two layers, self-hosting only, single kill-switch ORCH_IMAGE_FRESHNESS_ENABLED.

  • Strategy A (liveness): QG sub-check check_staging_image_fresh on the deploy-staging -> deploy edge rebuilds orchestrator-orchestrator-staging from the validated commit (--build-staging, --build-arg GIT_SHA), recreates the staging container on port 8501, and runs staging_check.py --mode stub against the FRESH image (AC-4). STAGING only (8501), never prod (8500); FAIL -> rollback to development.
  • Strategy B (safety): Dockerfile stamps org.opencontainers.image.revision=$GIT_SHA; the prod hook fail-closes (exit 1) before docker tag if SOURCE_IMAGE's label != EXPECTED_REVISION.
  • Round-3 review follow-up (6ddff55): parametrized STAGING_CONTAINER/STAGING_CHECK_PATH/STAGING_CHECK_MODE, explicit staging target in rebuild_staging_image (P2a), TC-09 caller<->hook contract tests (P2b), README footer dedup (P3).
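
The Strategy A flow above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the signature of `check_staging_image_fresh_sketch`, the `CheckResult` type, the injected `rebuild` callable, and the way `ORCH_IMAGE_FRESHNESS_ENABLED` is parsed (including treating an unset flag as disabled) are all assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical result type for a quality-gate sub-check."""
    passed: bool
    detail: str
    rollback_to: Optional[str] = None


def freshness_enabled(env: Mapping[str, str]) -> bool:
    # Assumption: common truthy spellings enable the check, and an unset
    # flag counts as disabled. The real parsing and default may differ.
    raw = env.get("ORCH_IMAGE_FRESHNESS_ENABLED", "")
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def check_staging_image_fresh_sketch(
    validated_sha: str,
    rebuild: Callable[[str], int],
    env: Mapping[str, str] = os.environ,
) -> CheckResult:
    """Rebuild staging from the validated commit; fail -> rollback."""
    if not freshness_enabled(env):
        # Kill-switch off: the sub-check is skipped and nothing is rebuilt.
        return CheckResult(True, "freshness check disabled by kill-switch")
    # `rebuild` stands in for the rebuild + recreate + staging_check step
    # and returns a process-style exit code.
    rc = rebuild(validated_sha)
    if rc != 0:
        return CheckResult(
            False,
            f"staging rebuild/check failed (rc={rc})",
            rollback_to="development",
        )
    return CheckResult(True, f"staging image rebuilt from {validated_sha}")
```

Injecting `rebuild` keeps the gate logic testable without Docker or ssh; the real sub-check presumably calls `rebuild_staging_image` directly.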

Test plan

  • [x] pytest tests/ — 632 passed
  • [x] ruff check clean on changed files
  • [x] bash -n scripts/orchestrator-deploy-hook.sh
  • [ ] Reviewer re-check P1/P2/P3 (round 4)

Refs: ORCH-058

Generated with Claude Code

admin added 9 commits 2026-06-07 12:25:25 +03:00
docs: init ORCH-058 business request
All checks were successful
CI / test (push) Successful in 17s
e5f9c38e65
analyst(ET): auto-commit from analyst run_id=262
All checks were successful
CI / test (push) Successful in 16s
282636fedb
architect(ET): auto-commit from architect run_id=263
All checks were successful
CI / test (push) Successful in 16s
dbc32fc106
developer(ET): auto-commit from developer run_id=264
Some checks failed
CI / test (push) Failing after 17s
83397570fe
- deploy-hook: REVISION_LABEL/EXPECTED_REVISION (default unset -> backward-compat)
- deploy-hook: fail-closed guard inspects SOURCE_IMAGE revision label before docker tag, normalises <no value>, exit 1 on empty/mismatch
- deploy-hook: new --build-staging mode rebuilds staging image stamping GIT_SHA
- Dockerfile: ARG GIT_SHA + LABEL org.opencontainers.image.revision=$GIT_SHA

Closes TC07/TC08 (tests/test_deploy_hook_provenance.py).
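
The decision logic of the fail-closed guard described above could look roughly like this. The real guard lives in the bash deploy hook; this Python version only illustrates the rules listed in the commit message (unset `EXPECTED_REVISION` keeps the old behaviour, `<no value>` is treated as empty, empty or mismatching labels exit 1). The function names are hypothetical.

```python
from typing import Optional

# Placeholder that the commit message says the hook normalises.
NO_VALUE = "<no value>"


def normalise_label(raw: str) -> str:
    """Map the '<no value>' placeholder and surrounding whitespace to ''."""
    value = raw.strip()
    return "" if value == NO_VALUE else value


def guard_exit_code(label_raw: str, expected_revision: Optional[str]) -> int:
    """Return the exit code the guard would produce before `docker tag`."""
    if not expected_revision:
        # EXPECTED_REVISION unset: guard inactive (backward-compatible path).
        return 0
    actual = normalise_label(label_raw)
    if not actual or actual != expected_revision:
        # Missing or mismatching revision label: fail closed, no retag.
        return 1
    return 0
```

For example, `guard_exit_code("<no value>", "83397570fe")` yields 1, so an unlabelled image is never retagged once the caller supplies an expected revision.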
docs(ORCH-058): add CHANGELOG entry, .env.example flags, fix README status
All checks were successful
CI / test (push) Successful in 17s
3b3d587300
Close AC-11 documentation gap left by the prior developer run: the
ORCH-058 feature (staging-image provenance before BUILD-ONCE retag) was
implemented and green but never recorded in the golden-source docs.

- CHANGELOG.md: add the ORCH-058 [Unreleased]/Added entry (layers A+B,
  validated_revision anchor, check_staging_image_fresh, EXPECTED_REVISION
  hook guard, new ORCH_IMAGE_FRESHNESS_* flags, ADR/test refs).
- .env.example (canon): document ORCH_IMAGE_FRESHNESS_ENABLED /
  ORCH_IMAGE_FRESHNESS_REPOS, mirroring the ORCH-036/043/053 precedent.
- docs/architecture/README.md: footer note design -> implemented, aligning
  it with the already-updated section.

Refs: ORCH-058

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes reviewer P0/P1 (ORCH-058 attempt 3): the committed --build-staging hook
recomputed GIT_SHA=$(git rev-parse HEAD) in $REPO (prod clone on `main`) and built
`docker build ... "$REPO"`, ignoring the caller-supplied BUILD_CONTEXT/GIT_SHA. On
the deploy-staging -> deploy edge the PR is not yet merged, so `main` HEAD != the
validated SHA -> the staging image got the wrong revision label and Strategy-B's
guard fail-closed on EVERY valid self-deploy (AC-6 deadlock). It also only did
`docker build` + exit 0 — never recreating 8501 nor health-checking — so
rebuild_staging_image's rc=0 ("rebuilt and healthy") was a lie (AC-4 unmet).

- Hook --build-staging now honours caller BUILD_CONTEXT (validated worktree) and
  GIT_SHA, recreates orchestrator-staging on the fresh image and runs the 10x6s
  health-check; build/health failure -> exit 1 (FAILED contract preserved).
- image_freshness.rebuild_staging_image: document why COMPOSE_PROFILE/TARGET_SERVICE/
  TARGET_PORT are intentionally omitted (hook STAGING defaults -> 8501 only, P2).
- tests: assert the caller<->hook contract (builds from $BUILD_CONTEXT, no
  `git rev-parse HEAD` recompute, recreates + health-checks 8501) so the P0
  regression can't pass green again (P1).
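
As a rough illustration of the fixed contract, the sketch below builds the `docker build` arguments strictly from the caller-supplied `GIT_SHA` and `BUILD_CONTEXT` and models the 10x6s health-check as a bounded retry loop. It is written in Python for readability; the actual hook is bash, and the helper names and the injected `probe`/`sleep` callables are assumptions.

```python
import time
from typing import Callable, List


def staging_build_argv(image: str, git_sha: str, build_context: str) -> List[str]:
    """Build the docker command from caller inputs only.

    Deliberately no `git rev-parse HEAD` fallback: recomputing the SHA in
    the prod clone was the P0 bug described above.
    """
    if not git_sha or not build_context:
        raise ValueError("GIT_SHA and BUILD_CONTEXT must be supplied by the caller")
    return [
        "docker", "build",
        "--build-arg", f"GIT_SHA={git_sha}",
        "-t", image,
        build_context,
    ]


def wait_healthy(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay_s: float = 6.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` up to `attempts` times, sleeping between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

A caller would map `wait_healthy(...) is False` to exit 1, which preserves the FAILED contract mentioned above.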

Refs: ORCH-058

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ORCH-058: --build-staging runs staging_check.py --mode stub vs fresh 8501 (AC-4)
All checks were successful
CI / test (push) Successful in 16s
c53d625744
Per ADR-001 step 3 / AC-4: after the freshly rebuilt staging container is
healthy, run staging_check.py --mode stub against the fresh 8501 stand BEFORE
reporting success, so the EXACT artefact BUILD-ONCE retagged to prod is the one
validated on staging. Fail-closed: staging_check rc!=0 -> exit 1 (not promoted).

- Invoked inside the container (docker exec $TARGET_SERVICE) per the canonical
  signature in scripts/staging_check.py header, --base-url http://localhost:$TARGET_PORT.
- Targets ONLY 8501 (staging), never 8500 (prod) - AC-9.
- --mode stub: fast, deterministic, no LLM spend (ADR).
- Static regression test test_tc07_build_staging_runs_staging_check_stub_after_health:
  asserts staging_check.py + --mode stub present, runs after health, before exit 0,
  fail-closed, and never hard-codes prod 8500.
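
A static ordering check of this kind might be written along the following lines. This is not the body of `test_tc07_build_staging_runs_staging_check_stub_after_health`; the `wait_for_health` marker and the sample script text are invented for the illustration, and the real test may locate the relevant hook section differently.

```python
def staging_check_is_ordered(script: str, health_marker: str = "wait_for_health") -> bool:
    """Statically verify: health -> staging_check.py --mode stub -> exit 0.

    `health_marker` is a hypothetical name for whatever the hook uses to
    wait for the container to become healthy.
    """
    health = script.find(health_marker)
    check = script.find("staging_check.py")
    success = script.rfind("exit 0")
    if -1 in (health, check, success):
        return False
    return (
        health < check < success      # runs after health, before success
        and "--mode stub" in script   # stub mode only
        and "|| exit 1" in script     # some fail-closed path exists
        and "8500" not in script      # prod port is never hard-coded
    )


# Invented sample standing in for the --build-staging branch of the hook.
SAMPLE_BRANCH = """
wait_for_health "$TARGET_PORT" || exit 1
docker exec "$TARGET_SERVICE" python scripts/staging_check.py --mode stub \\
    --base-url "http://localhost:$TARGET_PORT" || exit 1
exit 0
"""
```

A text-level check like this is cheap and needs no Docker, at the cost of being sensitive to how the hook is formatted.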
fix(ORCH-058): parametrize staging_check in --build-staging + explicit staging target
All checks were successful
CI / test (push) Successful in 19s
CI / test (pull_request) Successful in 18s
6ddff5583d
Round-3 review follow-up on c53d625 (P1/P2):

- P1: --build-staging now runs staging_check via parametrized
  STAGING_CONTAINER / STAGING_CHECK_PATH / STAGING_CHECK_MODE (default
  orchestrator-staging / bind-mount path / stub) instead of hardcoding
  $TARGET_SERVICE + the script path. docker exec runs INSIDE the staging
  container (ORCH-048 canonical: B6 registry isolation), after health,
  before exit 0. Fail-closed: any non-zero -> exit 1. STAGING only (8501).
- P2a: rebuild_staging_image now passes the STAGING target EXPLICITLY
  (TARGET_SERVICE/TARGET_PORT/COMPOSE_PROFILE/STAGING_CONTAINER) so the
  self-rebuild can never drift onto prod 8500 if hook defaults change (AC-9).
- P2b: TC-09 caller<->hook contract tests assert the ssh command carries
  GIT_SHA + BUILD_CONTEXT + the staging target and never the prod 8500 one;
  no-ssh-host fails closed.
- P3: consolidated the three duplicate README footers into one.
- Docs (golden source): DEPLOY_HOOK.md step 4 + env rows, README footer,
  CHANGELOG, Dockerfile ARG GIT_SHA="" comment, .env.example freshness block.

Validates exactly the artefact later BUILD-ONCE retagged to prod (AC-4,
ADR-001 step 3). 632 tests pass, ruff clean, bash -n OK.
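
The caller side of the P2a/P2b contract can be pictured with the sketch below. It is an assumption-laden illustration rather than the code of `rebuild_staging_image`: the helper names, the ssh invocation shape, and the concrete values for `TARGET_SERVICE` and `COMPOSE_PROFILE` are hypothetical; only the variable names, port 8501, the `orchestrator-staging` default, and the hook path come from this PR.

```python
import shlex
from typing import Dict, List

HOOK_PATH = "scripts/orchestrator-deploy-hook.sh"


def staging_hook_env(git_sha: str, build_context: str) -> Dict[str, str]:
    """Environment passed explicitly so hook defaults cannot drift to prod."""
    return {
        "GIT_SHA": git_sha,
        "BUILD_CONTEXT": build_context,
        "TARGET_SERVICE": "orchestrator-staging",   # assumed value
        "TARGET_PORT": "8501",                      # staging port
        "COMPOSE_PROFILE": "staging",               # assumed value
        "STAGING_CONTAINER": "orchestrator-staging",
    }


def build_staging_ssh_argv(host: str, env: Dict[str, str]) -> List[str]:
    """Compose the ssh command; an empty host fails closed."""
    if not host:
        raise ValueError("no ssh host configured; refusing to continue")
    assignments = " ".join(
        f"{key}={shlex.quote(value)}" for key, value in sorted(env.items())
    )
    remote = f"{assignments} {shlex.quote(HOOK_PATH)} --build-staging"
    return ["ssh", host, remote]
```

A contract test in the spirit of TC-09 would then assert that the remote command carries `GIT_SHA`, `BUILD_CONTEXT` and port 8501, and never mentions 8500.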

Refs: ORCH-058

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
admin merged commit 094b5e2f96 into main 2026-06-07 13:04:07 +03:00

Reference: admin/orchestrator#57