feat(staging): deterministic staging-runner replacing LLM deployer on deploy-staging (ORCH-115) #141

Merged
admin merged 8 commits from feature/ORCH-115-orch-replace-llm-deployer-with into main 2026-06-16 02:21:03 +03:00

ORCH-115: deterministic staging runner instead of the LLM deployer on deploy-staging

The first implemented slice of the determinization roadmap (ORCH-118 A6, replace-deterministic-now): on the deploy-staging stage for the self-hosting orchestrator, the LLM agent deployer is replaced with deterministic code (src/staging_runner.py).

What was done

  • Intercept in launch_job before _spawn (_run_staging_runner_job, mirroring _run_deploy_finalizer_job): the "staging vs prod" discriminator is the task stage deploy-staging, not the role name; should_intercept is never-raise → False → the regular _spawn (fail-safe back to the LLM).
  • Leaf module src/staging_runner.py (never-raise, following the self_deploy/proc_group pattern): runs the same staging suite through proc_group (tree-kill + timeout), maps the exit code with the single shared contract self_deploy.map_exit_code_to_status, writes 15-staging-log.md (the same staging_status: key + 52c schema), does a best-effort push to the feature branch, and calls the existing advance_stage(finished_agent="deployer").
  • Two-level outcome (anti ORCH-110): suite executed → verdict → advance (FAILED → the same rollback + developer-retry); tool error → bounded DEFER → fail-closed FAILED + alert once the budget is exhausted.
  • ORCH_STAGING_RUNNER_* flags (kill-switch + self-hosting scope) and a staging_runner block in GET /queue.
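
The intercept described in the first bullet could look roughly like the sketch below. It is illustrative only: the job dictionary shape, the parameter names, and the injected run_staging_runner/spawn callables are assumptions, not the signatures used in the repository.

```python
def should_intercept(job, enabled, scoped_repos):
    """Never-raise predicate: any internal failure degrades to False."""
    try:
        return bool(
            enabled
            and job["stage"] == "deploy-staging"  # discriminator is the stage, not the role
            and job["repo"] in scoped_repos
        )
    except Exception:
        # Fail-safe: a broken predicate must fall through to the LLM path.
        return False


def launch_job(job, *, enabled, scoped_repos, run_staging_runner, spawn):
    """Hypothetical launch_job: intercept BEFORE the regular spawn."""
    if should_intercept(job, enabled, scoped_repos):
        return run_staging_runner(job)
    return spawn(job)
```

Because the predicate keys on the stage, a deployer job on any other stage still goes through the regular spawn path in this sketch.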

Invariant (NFR-1)

STAGE_TRANSITIONS / QG_CHECKS / check_staging_status / _parse_staging_status / the machine-verdict key staging_status: / the DB schema are byte-for-byte untouched (the change replaces the producer of the artifact, not the gate). Rollback: ORCH_STAGING_RUNNER_ENABLED=false.
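
To illustrate the kill-switch rollback, here is a minimal sketch of a flag loader. Only ORCH_STAGING_RUNNER_ENABLED is named in this PR; the ORCH_STAGING_RUNNER_REPOS key name, the accepted truthy strings, and the "missing means disabled" default are assumptions made for the example.

```python
import os


def load_staging_runner_flags(env=None):
    """Return (enabled, scoped_repos) parsed from environment-style mapping."""
    env = os.environ if env is None else env
    raw_enabled = env.get("ORCH_STAGING_RUNNER_ENABLED", "false")  # assumed default
    raw_repos = env.get("ORCH_STAGING_RUNNER_REPOS", "")  # hypothetical key name
    enabled = raw_enabled.strip().lower() in {"1", "true", "yes", "on"}
    # Repo scope is a CSV list; blank entries are ignored.
    repos = frozenset(r.strip() for r in raw_repos.split(",") if r.strip())
    return enabled, repos
```

With the switch set to false, a caller that checks the first element would skip the runner and use the LLM path, whatever the repo scope says.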

Documentation (in this same PR)

llm-call-sites.md/llm-determinization-roadmap.md/llm-usage-policy.md (A6 marked implemented; machine blocks and the "single transport S0" invariant intact), .openclaw/agents/deployer.md (LLM branch → fallback), CLAUDE.md, CHANGELOG.md, the docs/overview/ showcase, .env.example.

Tests

tests/test_orch115_staging_runner.py (TC-01…TC-13) and the LLM anti-drift check (TC-14) are green. Full pytest tests/ -q: 2105 passed.

ADR: docs/work-items/ORCH-115/06-adr/ADR-001-deterministic-staging-runner.md, cross-cutting docs/architecture/adr/adr-0048-deterministic-staging-runner.md.

Refs: ORCH-115

🤖 Generated with Claude Code

admin added 4 commits 2026-06-16 02:00:18 +03:00
docs: init ORCH-115 business request
All checks were successful
CI / test (push) Successful in 1m8s
a353a72f20
analyst(ET): auto-commit from analyst run_id=732
All checks were successful
CI / test (push) Successful in 1m6s
ac203c0ccf
architect(ET): auto-commit from architect run_id=733
All checks were successful
CI / test (push) Successful in 1m9s
f120e4bd8f
feat(staging): deterministic staging-runner replacing LLM deployer on deploy-staging (ORCH-115)
All checks were successful
CI / test (push) Successful in 1m8s
CI / test (pull_request) Successful in 1m8s
b50cf1dd08
Replace the LLM `deployer` agent on the `deploy-staging` stage (self-hosting
orchestrator) with a deterministic staging-runner intercepted in launch_job
BEFORE _spawn (the deploy-finalizer / post-deploy-monitor reserved-agent
precedent). The runner executes the SAME staging suite, maps the exit-code to
`staging_status:` via the existing self_deploy.map_exit_code_to_status contract,
writes 15-staging-log.md, and initiates the UNCHANGED check_staging_status gate
exactly as a finished LLM-deployer would.

Invariant (NFR-1): this replaces only the *producer* of the artifact — the
artifact contract, the gate / _parse_staging_status / check_staging_status name,
STAGE_TRANSITIONS, the machine-verdict key `staging_status:` and the DB schema are
byte-for-byte unchanged. Additive, under a kill-switch + repo-scope CSV,
never-raise, fail-safe back to the LLM path.

Two-level outcome (D5, anti ORCH-110): suite executed -> verdict -> advance
(FAILED -> the existing deploy-staging -> development rollback + developer-retry,
same as a FAILED LLM verdict); tool-error (suite did not execute) -> bounded DEFER
-> fail-closed FAILED + alert on exhaustion (infra != code fault; never a silent
advance / false green).
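
The two-level outcome above can be sketched as a small pure decision function. This is a hedged illustration: the Decision type, the parameter names, and the "exit code 0 means SUCCESS" rule are assumptions standing in for the real self_deploy.map_exit_code_to_status contract, which is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    action: str             # "advance" or "defer"
    status: Optional[str]   # "SUCCESS" / "FAILED" when advancing, else None
    alert: bool = False


def decide(executed, exit_code, infra_attempts, infra_budget):
    """Level 1: the suite ran, so its verdict drives the gate.
    Level 2: tool error, so defer within a budget, then fail closed."""
    if executed:
        # Assumed mapping; the real contract lives in self_deploy.
        status = "SUCCESS" if exit_code == 0 else "FAILED"
        return Decision("advance", status)
    if infra_attempts < infra_budget:
        return Decision("defer", None)
    # Budget exhausted: never a silent advance or false green.
    return Decision("advance", "FAILED", alert=True)
```

Keeping the decision free of side effects makes the "never a false green" property easy to check: no branch returns SUCCESS unless the suite actually executed.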

First implemented slice of the LLM determinization roadmap (ORCH-118 A6,
replace-deterministic-now).

- New leaf src/staging_runner.py (never-raise; proc_group tree-kill + timeout)
- launch_job intercept + _run_staging_runner_job (mirror _run_deploy_finalizer_job)
- config: ORCH_STAGING_RUNNER_* keys (enabled/repos/timeout/infra-retry budget)
- GET /queue staging_runner observability block
- docs: llm-call-sites/roadmap/usage-policy (A6 implemented; machine blocks +
  single-transport invariant intact), deployer.md (LLM branch -> fallback),
  CLAUDE.md, CHANGELOG.md, overview (tech-pipeline/tech-agents/tech-quality-security),
  .env.example
- tests/test_orch115_staging_runner.py (TC-01..TC-13); LLM anti-drift green (TC-14)
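
For readers unfamiliar with the "tree-kill + timeout" behaviour listed for the new leaf, the general technique can be sketched with the standard library alone. This is not the repository's proc_group module; run_in_group is a hypothetical helper, and the sketch is POSIX-only.

```python
import os
import signal
import subprocess


def run_in_group(cmd, timeout_s):
    """Run cmd in its own process group; kill the whole group on timeout.

    Returns (exit_code, timed_out).
    """
    # start_new_session=True makes the child a session and group leader,
    # so its process-group id equals its pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)  # kills children of the suite too
        except ProcessLookupError:
            pass  # the group exited between the timeout and the kill
        return proc.wait(), True
```

A caller would treat timed_out as a tool error rather than a suite verdict, which matches the infra-versus-code distinction drawn in this commit message.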

Refs: ORCH-115

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
admin added 1 commit 2026-06-16 02:08:19 +03:00
reviewer(ET): auto-commit from reviewer run_id=735
All checks were successful
CI / test (push) Successful in 1m16s
CI / test (pull_request) Successful in 1m13s
e3ce01b824
admin added 1 commit 2026-06-16 02:11:40 +03:00
tester(ET): auto-commit from tester run_id=736
All checks were successful
CI / test (push) Successful in 1m10s
CI / test (pull_request) Successful in 1m7s
aed3ba0cbb
admin added 1 commit 2026-06-16 02:15:23 +03:00
deploy-staging(ORCH-115): staging gate SUCCESS (8/10 PASS, C9a/C9b infra-waived)
All checks were successful
CI / test (push) Successful in 1m12s
CI / test (pull_request) Successful in 1m14s
a975591a3c
admin merged commit 65c17d85e3 into main 2026-06-16 02:21:03 +03:00
Reference: admin/orchestrator#141