feat(staging): deterministic staging-runner replacing LLM deployer on deploy-staging (ORCH-115)

Replace the LLM `deployer` agent on the `deploy-staging` stage (self-hosting orchestrator) with a deterministic staging-runner intercepted in launch_job BEFORE _spawn (the deploy-finalizer / post-deploy-monitor reserved-agent precedent). The runner executes the SAME staging suite, maps the exit-code to `staging_status:` via the existing self_deploy.map_exit_code_to_status contract, writes 15-staging-log.md, and initiates the UNCHANGED check_staging_status gate exactly as a finished LLM-deployer would. Invariant (NFR-1): this replaces only the *producer* of the artifact — the artifact contract, the gate / _parse_staging_status / check_staging_status name, STAGE_TRANSITIONS, the machine-verdict key `staging_status:` and the DB schema are byte-for-byte unchanged. Additive, under a kill-switch + repo-scope CSV, never-raise, fail-safe back to the LLM path. Two-level outcome (D5, anti ORCH-110): suite executed -> verdict -> advance (FAILED -> the existing deploy-staging -> development rollback + developer-retry, same as a FAILED LLM verdict); tool-error (suite did not execute) -> bounded DEFER -> fail-closed FAILED + alert on exhaustion (infra != code fault; never a silent advance / false green). First implemented slice of the LLM determinization roadmap (ORCH-118 A6, replace-deterministic-now). - New leaf src/staging_runner.py (never-raise; proc_group tree-kill + timeout) - launch_job intercept + _run_staging_runner_job (mirror _run_deploy_finalizer_job) - config: ORCH_STAGING_RUNNER_* keys (enabled/repos/timeout/infra-retry budget) - GET /queue staging_runner observability block - docs: llm-call-sites/roadmap/usage-policy (A6 implemented; machine blocks + single-transport invariant intact), deployer.md (LLM branch -> fallback), CLAUDE.md, CHANGELOG.md, overview (tech-pipeline/tech-agents/tech-quality-security), .env.example - tests/test_orch115_staging_runner.py (TC-01..TC-13); LLM anti-drift green (TC-14) Refs: ORCH-115 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:59:43 +03:00
parent f120e4bd8f
commit b50cf1dd08
16 changed files with 1235 additions and 7 deletions
--- a/docs/overview/tech-pipeline.md
+++ b/docs/overview/tech-pipeline.md
@@ -34,6 +34,16 @@ created → analysis → architecture → development → review → testing →
 | `done` | — | — | терминал |
 | `cancelled` | — | — | терминал (сток отмены) |

+> **Детерминированный staging-раннер (ORCH-115).** На стадии `deploy-staging` для self-hosting
+> `orchestrator` работу ведёт **детерминированный код** (`src/staging_runner.py`), а не LLM-агент
+> `deployer`: он перехватывается в `launch_job` до запуска агента (как Phase A/B/C прод-деплоя),
+> исполняет ту же staging-сюиту, маппит exit-код в `staging_status:` и инициирует **тот же** гейт
+> `check_staging_status`. Это замена *продюсера* артефакта, а не гейта: контракт `15-staging-log.md`,
+> имя/семантика `check_staging_status`, `STAGE_TRANSITIONS` — не изменились. Под kill-switch
+> `staging_runner_enabled` (скоуп `staging_runner_repos`, пусто → self-hosting only); при выключении
+> на стадии снова работает LLM-`deployer` байт-в-байт. Это первый реализованный срез
+> determinization-roadmap (см. `docs/architecture/llm-determinization-roadmap.md`).
+
 ## Под-гейты деплойного ребра — врезки, не стадии

 На переходе `deploy-staging → deploy` исполняются четыре под-гейта в нормативном порядке