Commit Graph

257 Commits

Author SHA1 Message Date
3b3d587300 docs(ORCH-058): add CHANGELOG entry, .env.example flags, fix README status
All checks were successful
CI / test (push) Successful in 17s
Close AC-11 documentation gap left by the prior developer run: the
ORCH-058 feature (staging-image provenance before BUILD-ONCE retag) was
implemented and green but never recorded in the golden-source docs.

- CHANGELOG.md: add the ORCH-058 [Unreleased]/Added entry (layers A+B,
  validated_revision anchor, check_staging_image_fresh, EXPECTED_REVISION
  hook guard, new ORCH_IMAGE_FRESHNESS_* flags, ADR/test refs).
- .env.example (canon): document ORCH_IMAGE_FRESHNESS_ENABLED /
  ORCH_IMAGE_FRESHNESS_REPOS, mirroring the ORCH-036/043/053 precedent.
- docs/architecture/README.md: footer note design -> реализовано, aligning
  it with the already-updated section.

Refs: ORCH-058

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-07 08:27:57 +00:00
Dev Agent
f0c2986477 ORCH-058: implement fail-closed provenance guard in deploy-hook + GIT_SHA OCI label in Dockerfile
All checks were successful
CI / test (push) Successful in 16s
- deploy-hook: REVISION_LABEL/EXPECTED_REVISION (default unset -> backward-compat)
- deploy-hook: fail-closed guard inspects SOURCE_IMAGE revision label before docker tag, normalises <no value>, exit 1 on empty/mismatch
- deploy-hook: new --build-staging mode rebuilds staging image stamping GIT_SHA
- Dockerfile: ARG GIT_SHA + LABEL org.opencontainers.image.revision=$GIT_SHA

Closes TC07/TC08 (tests/test_deploy_hook_provenance.py).
2026-06-07 11:20:38 +03:00
83397570fe developer(ET): auto-commit from developer run_id=264
Some checks failed
CI / test (push) Failing after 17s
2026-06-07 07:46:19 +00:00
dbc32fc106 architect(ET): auto-commit from architect run_id=263
All checks were successful
CI / test (push) Successful in 16s
2026-06-07 07:27:38 +00:00
282636fedb analyst(ET): auto-commit from analyst run_id=262
All checks were successful
CI / test (push) Successful in 16s
2026-06-07 07:06:10 +00:00
e5f9c38e65 docs: init ORCH-058 business request
All checks were successful
CI / test (push) Successful in 17s
2026-06-07 10:01:11 +03:00
stream
e4c6401633 docs(history): LESSONS self-deploy bootstrap — каскад 4 инфра-багов (passwd/env/log-perms/stale-staging-image)
Some checks failed
CI / test (push) Has been cancelled
2026-06-07 09:52:39 +03:00
stream
115519ebb4 fix(compose): ORCH_DEPLOY_* env for self-deploy (prefix ORCH_, orchestrator hook, host-repo path) — ORCH-36 Phase B 2026-06-07 09:39:51 +03:00
stream
64e031a37f fix(docker): passwd entry for uid 1000 (slin) — fixes ssh/whoami, unblocks ORCH-36 self-deploy Phase B 2026-06-07 09:27:04 +03:00
01ff71978f docs(ORCH-036): staging gate SUCCESS — 10/10 checks pass (re-run inside orchestrator-staging)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 21:48:50 +00:00
stream
d5915a89b9 docs(history): LESSONS ORCH-036+053 — bootstrap-деплой, merge-конфликт, reconciler в проде 2026-06-07 00:34:36 +03:00
1ff8d85bb9 Merge pull request 'feat: executable self-deploy — stage deploy triggers host hook (ORCH-036)' (#55) from feature/ORCH-036-orch-36-deploy-b into main 2026-06-07 00:23:40 +03:00
stream
36c1898fac Merge remote-tracking branch 'origin/main' into feature/ORCH-036-orch-36-deploy-b
All checks were successful
CI / test (push) Successful in 16s
CI / test (pull_request) Successful in 14s
# Conflicts:
#	.env.example
#	CHANGELOG.md
#	docs/architecture/README.md
#	docs/operations/INFRA.md
#	src/config.py
2026-06-07 00:22:19 +03:00
e2dc9d6df6 Merge pull request 'ORCH-053: sweeper потерянных webhook (реконсиляция застрявших стадий)' (#56) from feature/ORCH-053-sweeper-webhook-stuck-task into main 2026-06-07 00:20:53 +03:00
c0bcb544cf tester(ET): auto-commit from tester run_id=201
All checks were successful
CI / test (push) Successful in 17s
CI / test (pull_request) Successful in 15s
2026-06-06 21:07:35 +00:00
2be39b398b reviewer(ET): auto-commit from reviewer run_id=199 2026-06-06 21:07:35 +00:00
d79defeadd fix(deploy): clear stale self-deploy markers on rollback; document env
Re-deploy after a FAILED prod deploy wedged the task on `deploy`: the
sentinel markers (approve-requested/initiated/result) are keyed by the
stable work_item_id, so after the БАГ-8 rollback (deploy -> development)
and a developer fix, Phase B's idempotency-guard saw a STALE `initiated`
and became a no-op — the detached hook never re-launched and the
finalizer was never enqueued. Add self_deploy.clear_state (never-raise,
idempotent) and call it on the check_deploy_status FAILED rollback and at
the start of Phase A, so every fresh prod-deploy pass starts clean.

Also document the new ORCH_SELF_DEPLOY_* / ORCH_DEPLOY_* descriptors in
the canonical .env.example (CLAUDE.md rule #8, ТЗ §2.6), modelled on the
ORCH-043 merge-gate block (placeholders only, secrets not committed).

Contracts untouched: STAGE_TRANSITIONS, QG_CHECKS, _parse_deploy_status,
БАГ-8, merge-gate.

Refs: ORCH-036
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 21:07:35 +00:00
9f43e6a0ae reviewer(ET): auto-commit from reviewer run_id=195 2026-06-06 21:07:35 +00:00
10f2a39a58 feat(deploy): build-once SOURCE_IMAGE retag in hook + deploy-stage docs
Add the optional, backward-compatible SOURCE_IMAGE branch to
orchestrator-deploy-hook.sh: when set, retag the staging-validated image
onto TARGET_IMAGE (docker tag) before `up -d --no-build` instead of
rebuilding — guarantees prod runs the exact artefact that passed staging
(AC-7 / TC-14). Unset -> prior behaviour; exit-code contract (0/1/2) and
health-loop untouched.

Update golden-source docs (AC-13): rewrite deployer.md `deploy` stage from
"paper SUCCESS" to the executable self-deploy (Phase A/B/C, no self-restart
from inside the container) and add the ORCH-036 CHANGELOG entry.

Refs: ORCH-036

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 21:07:35 +00:00
63187ff102 developer(ET): auto-commit from developer run_id=192 2026-06-06 21:07:35 +00:00
5c5525548d architect(ET): auto-commit from architect run_id=190 2026-06-06 21:07:35 +00:00
0d0cd6e281 analyst(ET): auto-commit from analyst run_id=189 2026-06-06 21:07:35 +00:00
480b203a9d docs: init ORCH-036 business request 2026-06-06 21:07:35 +00:00
7705552f08 docs(ORCH-036): staging gate log — staging_status SUCCESS (10/10 PASS)
Re-run of deploy-staging gate (merge-gate defer cycle). Canonical
staging_check.py (mode=stub) ran inside orchestrator-staging (8501);
all 10 checks passed (exit 0). No prod (8500) container touched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 21:07:20 +00:00
c1196e34e8 deployer(ET): auto-commit from deployer run_id=204
All checks were successful
CI / test (push) Successful in 16s
CI / test (pull_request) Successful in 15s
2026-06-06 21:04:39 +00:00
d43603b224 docs(ORCH-053): deploy gate log — deploy_status SUCCESS
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 21:04:04 +00:00
682ae09316 docs(ORCH-036): staging gate SUCCESS log (10/10 checks PASS)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 20:58:12 +00:00
5089f99bb1 tester(ET): auto-commit from tester run_id=200
All checks were successful
CI / test (push) Successful in 15s
CI / test (pull_request) Successful in 16s
2026-06-06 20:55:25 +00:00
32161a180a reviewer(ET): auto-commit from reviewer run_id=198 2026-06-06 20:55:25 +00:00
7d2d77217a feat(reconciler): sweeper потерянных webhook (реконсиляция застрявших стадий)
Конвейер продвигается только входящими webhook; потерянное событие (502 на
ребилде, отсутствие ретраев у Plane/Gitea, неразрезолвленный sha→branch)
оставляет задачу молча застрявшей (класс инцидента ORCH-044). Новый фоновый
daemon-поток src/reconciler.py (паттерн queue_worker) доигрывает пропущенный
переход через те же штатные гейты/обработчики, что и webhook:

- F-1 gate-side: для задач stage≠done, без активного job и age(updated_at) ≥
  grace_for_stage(stage) — read-only пред-оценка канонического QG; зелёный →
  stage_engine.advance_stage(..., finished_agent=None); красный → тишина (спам
  нотификаций структурно невозможен). analysis F-1 не трогает (человеческий гейт).
- F-2 plane-side: опрос Plane API per-project (plane_sync.list_issues_by_state,
  курсорная пагинация, never-raise) → реплей In Progress/Approved/Rejected через
  существующие handle_status_start/handle_verdict (async из sync-потока, asyncio.run).
- F-3: усиление sha→branch в handle_ci_status — БД-fallback по единственной
  development-задаче repo (неоднозначность → не резолвим), debug→info.
- Анти-дубль на создании (db.create_task_atomic под process-wide Lock): гонка
  reconcile↔webhook не плодит второй task/branch/worktree/analyst-job (AC-4).
- F-4 observability: лог-строка разблокировки + Telegram + блок reconcile в /queue.

Старт/стоп в main.lifespan (после worker.start() / перед worker.stop()),
restart-safe, never-raise на единицу работы. Kill-switches ORCH_RECONCILE_ENABLED
/ ORCH_RECONCILE_PLANE_ENABLED + grace-настройки. Схема БД и реестры
STAGE_TRANSITIONS/QG_CHECKS не менялись.

Тесты: test_reconciler.py, test_reconciler_plane.py, test_gitea_sha_resolve.py,
test_config.py (33 новых, 563 всего зелёные). Документация обновлена (golden source):
architecture/README.md, INFRA.md, README.md, CHANGELOG.md, adr-0007 → accepted.

Refs: ORCH-053

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 20:55:25 +00:00
f5aae50514 architect(ET): auto-commit from architect run_id=194 2026-06-06 20:55:25 +00:00
a083ed8495 analyst(ET): auto-commit from analyst run_id=191 2026-06-06 20:55:25 +00:00
eac0eb4b3a docs: init ORCH-053 business request 2026-06-06 20:55:25 +00:00
434bd6243d docs(ORCH-053): staging gate SUCCESS log
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 20:55:10 +00:00
c21a279565 Merge pull request 'feat(merge-gate): auto-rebase onto current main + re-test + serialise merges (ORCH-043)' (#54) from feature/ORCH-043-merge-gate-auto-rebase-re-test into main
Some checks failed
CI / test (push) Has been cancelled
2026-06-06 21:24:33 +03:00
d9afb3a10d docs(ORCH-043): deploy gate log — deploy_status SUCCESS
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 17:45:13 +00:00
8447853db8 deployer(ET): auto-commit from deployer run_id=187
All checks were successful
CI / test (push) Successful in 16s
CI / test (pull_request) Successful in 16s
2026-06-06 17:41:23 +00:00
5dc5893a49 docs(ORCH-043): staging gate log — staging_status SUCCESS
Live staging-stand suite (scripts/staging_check.py, stub mode) ran inside
orchestrator-staging: 10/10 checks PASS, exit code 0. Merge-gate edge
(deploy-staging → deploy) cleared for ORCH-043.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 17:41:06 +00:00
581a8b595a tester(ET): auto-commit from tester run_id=186
All checks were successful
CI / test (push) Successful in 17s
CI / test (pull_request) Successful in 15s
2026-06-06 17:38:38 +00:00
ba51aa17bc reviewer(ET): auto-commit from reviewer run_id=185
All checks were successful
CI / test (push) Successful in 19s
CI / test (pull_request) Successful in 18s
2026-06-06 17:37:05 +00:00
00d69d9e27 feat(merge-gate): auto-rebase onto current main + re-test + serialise merges
All checks were successful
CI / test (push) Successful in 15s
CI / test (pull_request) Successful in 17s
Deterministic (no-LLM) sub-gate on the deploy-staging -> deploy edge that
catches a feature branch up to the CURRENT origin/main, re-tests the combined
tree, and serialises merges with a per-repo file lease — so two green parallel
branches can no longer break main (self-hosting safety for the orchestrator repo).

- src/merge_gate.py: branch_is_behind_main, auto_rebase_onto_main (push
  --force-with-lease ONLY the task branch, NEVER main), retest_branch, and a
  file merge-lease (atomic O_CREAT|O_EXCL, holder-aware release, stale reclaim).
  Strict never-raise contract; all git ops in the per-branch worktree.
- src/qg/checks.py: check_branch_mergeable composes the primitives under the
  lease; registered in QG_CHECKS. Conditional rollout (merge_gate_enabled /
  merge_gate_repos, default self-hosting only).
- src/stage_engine.py: sub-gate hook on deploy-staging (not a new stage). PASS ->
  advance; "merge-lock busy" -> DEFER (re-queue with available_at, anti-deadlock
  at max_concurrency=1, capped); conflict/red re-test -> rollback to development
  + developer retry (capped by MAX_DEVELOPER_RETRIES). Lease released on
  deploy->done / rollback / PR-merged webhook.
- src/db.py: enqueue_job(available_at_delay_s=...) for the defer (no schema change).
- src/webhooks/gitea.py: holder-aware lease release on PR-merged.
- src/config.py + .env.example: ORCH_MERGE_* settings.

Docs: README + adr-0006 (architect) already cover the design; CHANGELOG updated.
Tests: test_merge_gate.py, test_qg_merge_gate.py, test_merge_gate_race.py,
test_stage_engine.py::TestMergeGate, test_config.py, QG-registry snapshot.
Full suite: 535 passed.

Refs: ORCH-043

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 17:32:50 +00:00
ad1589084b architect(ET): auto-commit from architect run_id=183
All checks were successful
CI / test (push) Successful in 14s
2026-06-06 17:16:00 +00:00
77e7205ce8 analyst(ET): auto-commit from analyst run_id=182
All checks were successful
CI / test (push) Successful in 14s
2026-06-06 16:39:20 +00:00
445807dd90 docs: init ORCH-043 business request
All checks were successful
CI / test (push) Successful in 14s
2026-06-06 19:31:37 +03:00
39cb5dde70 Merge pull request 'fix(infra): ORCH-040 run containers as host uid 1000:1000 (not root)' (#53) from feature/ORCH-040-root-git into main
Some checks failed
CI / test (push) Has been cancelled
2026-06-06 19:26:35 +03:00
7b748b7ac5 docs(ORCH-040): deploy gate log — deploy_status SUCCESS
Self-hosting deploy verdict: artifact validated (staging gate green, compose
user=1000:1000 with МИНА 1 group_add intact). Prod cut-over handed to Owner
(P-1…P-4 + deploy hook) — in-task prod restart not performed by design.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 15:11:08 +00:00
bcf5256731 deployer(ET): auto-commit from deployer run_id=180
All checks were successful
CI / test (push) Successful in 15s
CI / test (pull_request) Successful in 14s
2026-06-06 15:09:01 +00:00
80275a3336 docs(ORCH-040): staging gate log — staging_status SUCCESS (10/10)
Staging check suite passed 10/10 (exit 0), run canonically inside
orchestrator-staging via the Docker Engine API (docker exec equivalent).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 15:08:50 +00:00
59e47ba067 tester(ET): auto-commit from tester run_id=179
All checks were successful
CI / test (push) Successful in 14s
CI / test (pull_request) Successful in 14s
2026-06-06 15:07:07 +00:00
be64761654 reviewer(ET): auto-commit from reviewer run_id=178
All checks were successful
CI / test (push) Successful in 13s
CI / test (pull_request) Successful in 13s
2026-06-06 15:05:26 +00:00