Commit Graph

4 Commits

Author SHA1 Message Date
9d16ee473a feat(testing): deterministic test-runner replacing LLM tester on the testing stage (ORCH-116)
Second realised slice of the determinization-roadmap (ORCH-118 A5,
needs-hybrid-fallback): on the `testing` stage for the self-hosting
`orchestrator` repo the LLM `tester` agent is replaced by a deterministic
test-runner (src/test_runner.py), intercepted in launch_job BEFORE _spawn
(deploy-finalizer / post-deploy-monitor / staging-runner precedent).

It runs the regression `python -m pytest <target>` in the task worktree via
proc_group (tree-kill) + an optional read-only smoke (/health, /status, /queue
+ serial_gate), maps the exit-code -> result: PASS|FAIL via the existing
self_deploy.map_exit_code_to_status contract, writes 13-test-report.md and
initiates the EXISTING check_tests_passed gate exactly as a finished LLM-tester.
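
The exit-code -> result step above can be sketched as follows. This is an illustration only: `classify_pytest_exit` and `render_report_line` are hypothetical stand-ins, not the real self_deploy.map_exit_code_to_status contract or the real 13-test-report.md layout. It relies only on pytest's documented exit codes (0 = all tests passed, 1 = some tests failed, 2..5 = interrupted / internal error / usage error / no tests collected).

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    TOOL_ERROR = "TOOL_ERROR"  # the suite did not produce a usable verdict


def classify_pytest_exit(code: int) -> Outcome:
    """Hypothetical mapping from a pytest exit code to a coarse outcome."""
    if code == 0:
        return Outcome.PASS
    if code == 1:
        return Outcome.FAIL
    # Anything else (including negative codes from a killed process) is
    # treated as a tool error, not as a code fault.
    return Outcome.TOOL_ERROR


def render_report_line(outcome: Outcome) -> Optional[str]:
    """Return the machine-verdict line, or None when there is no verdict."""
    if outcome is Outcome.TOOL_ERROR:
        return None
    return f"result: {outcome.value}"
```

Keeping the tool-error case out of the PASS/FAIL verdict is what lets a caller defer instead of reporting a false red.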

Invariant (NFR-1): only the *producer* changes — the artifact contract
(13-test-report.md / result:), the gate check_tests_passed / _parse_tests_verdict,
STAGE_TRANSITIONS and the DB schema are byte-for-byte UNCHANGED. Additive, under
a kill-switch (test_runner_enabled), never-raise, fail-closed, self-hosting scope,
two-level outcome (tool-error DEFER, anti ORCH-110), hybrid (LLM strictly
off-control-path). 52c-`status:` is aligned with the verdict (D6.1) so the
three-field _parse_tests_verdict never false-negatives a PASS.

Docs (ORCH-118 NFR-6, atomic with code): llm-call-sites.md (A5 implemented),
llm-determinization-roadmap.md (rank 2 implemented), llm-usage-policy.md,
README/internals/overview, tester.md, CLAUDE.md, CHANGELOG.md. Coverage:
tests/test_orch116_test_runner.py (TC-01..TC-14); LLM anti-drift tests green.
Full suite: 2137 passed.

Refs: ORCH-116
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 09:37:40 +03:00
b50cf1dd08 feat(staging): deterministic staging-runner replacing LLM deployer on deploy-staging (ORCH-115)
All checks were successful
CI / test (push) Successful in 1m8s
CI / test (pull_request) Successful in 1m8s
Replace the LLM `deployer` agent on the `deploy-staging` stage (self-hosting
orchestrator) with a deterministic staging-runner intercepted in launch_job
BEFORE _spawn (the deploy-finalizer / post-deploy-monitor reserved-agent
precedent). The runner executes the SAME staging suite, maps the exit-code to
`staging_status:` via the existing self_deploy.map_exit_code_to_status contract,
writes 15-staging-log.md, and initiates the UNCHANGED check_staging_status gate
exactly as a finished LLM-deployer would.

Invariant (NFR-1): this replaces only the *producer* of the artifact — the
artifact contract, the gate / _parse_staging_status / check_staging_status name,
STAGE_TRANSITIONS, the machine-verdict key `staging_status:` and the DB schema are
byte-for-byte unchanged. Additive, under a kill-switch + repo-scope CSV,
never-raise, fail-safe back to the LLM path.
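
A minimal sketch of a kill-switch + repo-scope CSV check with a fail-safe default. Only the ORCH_STAGING_RUNNER_ prefix comes from this commit; the full key names `ORCH_STAGING_RUNNER_ENABLED` and `ORCH_STAGING_RUNNER_REPOS`, the accepted truthy spellings, and the function name are assumptions made for illustration.

```python
from typing import Mapping

_TRUTHY = {"1", "true", "yes", "on"}


def staging_runner_applies(repo: str, env: Mapping[str, str]) -> bool:
    """Return True only if the runner is enabled AND the repo is in scope.

    Any unexpected error returns False, i.e. the caller falls back to the
    LLM path instead of raising.
    """
    try:
        # Assumed key names; only the ORCH_STAGING_RUNNER_ prefix is sourced.
        enabled = env.get("ORCH_STAGING_RUNNER_ENABLED", "").strip().lower() in _TRUTHY
        raw = env.get("ORCH_STAGING_RUNNER_REPOS", "")
        repos = {item.strip() for item in raw.split(",") if item.strip()}
        return enabled and repo in repos
    except Exception:
        return False
```

Passing the environment in as a mapping (for example `os.environ`) keeps the check easy to unit-test.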

Two-level outcome (D5, anti ORCH-110): suite executed -> verdict -> advance
(FAILED -> the existing deploy-staging -> development rollback + developer-retry,
same as a FAILED LLM verdict); tool-error (suite did not execute) -> bounded DEFER
-> fail-closed FAILED + alert on exhaustion (infra != code fault; never a silent
advance / false green).
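
The two-level outcome can be sketched as a small pure decision function. All names here (`Decision`, `decide`, the action strings, the budget parameter) are hypothetical and only illustrate the routing described above; the real runner's signatures are not shown in this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str          # "advance" | "rollback" | "defer" | "fail_closed"
    alert: bool = False  # raise an infra alert only on budget exhaustion


def decide(suite_executed: bool, verdict_ok: bool,
           infra_attempts: int, infra_budget: int) -> Decision:
    """Route a staging run: level 1 = did the suite run, level 2 = verdict."""
    if suite_executed:
        # A real verdict exists: green advances, red takes the existing
        # rollback + developer-retry path.
        return Decision("advance" if verdict_ok else "rollback")
    if infra_attempts < infra_budget:
        # Tool error: infra fault, not a code fault, so retry within budget.
        return Decision("defer")
    # Budget exhausted: fail closed and alert, never a silent advance.
    return Decision("fail_closed", alert=True)
```

For example, `decide(False, False, infra_attempts=3, infra_budget=3)` yields `Decision("fail_closed", alert=True)`, while the same call with `infra_attempts=1` yields a plain `defer`.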

First implemented slice of the LLM determinization roadmap (ORCH-118 A6,
replace-deterministic-now).

- New leaf src/staging_runner.py (never-raise; proc_group tree-kill + timeout)
- launch_job intercept + _run_staging_runner_job (mirror _run_deploy_finalizer_job)
- config: ORCH_STAGING_RUNNER_* keys (enabled/repos/timeout/infra-retry budget)
- GET /queue staging_runner observability block
- docs: llm-call-sites/roadmap/usage-policy (A6 implemented; machine blocks +
  single-transport invariant intact), deployer.md (LLM branch -> fallback),
  CLAUDE.md, CHANGELOG.md, overview (tech-pipeline/tech-agents/tech-quality-security),
  .env.example
- tests/test_orch115_staging_runner.py (TC-01..TC-13); LLM anti-drift green (TC-14)

Refs: ORCH-115

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:59:43 +03:00
651b9af7c3 fix(merge-gate): tolerate re-test infra-timeout + tree-kill spawned pytest
Eliminate the false `deploy-staging -> development` rollback that fired when the
merge-gate local re-test timed out (infra/resource) on a green CI + tester +
staging branch (incident ORCH-109/PR #129: a 516.7s suite blew its 600s budget
under CPU starvation from orphaned pytest processes -> timeout misrouted as a
code fault -> developer-retry loop -> manual gate).

Additive, 5 independent kill-switches, never-raise, self-hosting scope. Untouched
byte-for-byte: STAGE_TRANSITIONS, the QG_CHECKS registry, check_branch_mergeable
name/semantics, machine-verdict keys, the DB schema. INV-4 (never push/force-push
main) and the no-prod-restart rule are preserved.

- D1: new stdlib-only leaf src/proc_group.py runs the spawned re-test/coverage
  pytest in its own process group (start_new_session) and tree-kills the WHOLE
  group on timeout (os.killpg SIGTERM->grace->SIGKILL); used by
  merge_gate.retest_branch and coverage_gate.measure_coverage. No orphan leak.
  Fallback never-break: subprocess_tree_kill_enabled=False / non-POSIX -> the
  prior subprocess.run.
- D2/D3: merge_gate.classify_retest_failure distinguishes timeout/red/lock-busy/
  other; an infra timeout routes to _handle_merge_gate_infra_retry (bounded
  re-queue, task stays on deploy-staging, no rollback / no developer-retry); a
  red re-test / conflict still rolls back (BR-6). Exhaustion -> one infra alert.
- D4: skip the local re-test when the pre-merge rebase was a proven no-op (HEAD
  already CI/tester/staging-validated); fail-safe runs the re-test on any
  uncertainty. Flag merge_retest_skip_when_current_enabled.
- D5: merge_retest_timeout_s 600 -> 900 + _resolve_retest_timeout validation;
  reaper_max_running_s invariant preserved without change.
- D6: in-process counters + read-only merge_gate block in GET /queue; appended
  ("ORCH-110","classify_retest_failure","src/merge_gate.py") to
  MAIN_REGRESSION_MARKERS. Docs (README/internals overview/CLAUDE/CHANGELOG/
  .env.example) updated in the same PR.
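
The D1 tree-kill can be sketched with the standard library alone. This is a POSIX-only illustration, not the actual src/proc_group.py: the function names, the return shape and the default grace period are assumptions.

```python
import os
import signal
import subprocess
import sys


def _signal_group(pgid, sig):
    """Signal a whole process group; ignore an already-gone group."""
    try:
        os.killpg(pgid, sig)
    except (ProcessLookupError, PermissionError):
        pass


def run_with_tree_kill(cmd, timeout_s, grace_s=5.0):
    """Run cmd in its own process group; kill the WHOLE group on timeout.

    Returns (returncode, timed_out).
    """
    # start_new_session=True runs setsid() in the child, so the child leads
    # a new process group whose id equals its pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        pass

    pgid = proc.pid
    _signal_group(pgid, signal.SIGTERM)
    try:
        proc.wait(timeout=grace_s)  # grace period for a clean shutdown
    except subprocess.TimeoutExpired:
        pass
    # Escalate unconditionally: descendants may have ignored SIGTERM even
    # if the group leader itself already exited.
    _signal_group(pgid, signal.SIGKILL)
    return proc.wait(), True


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_tree_kill(sleeper, timeout_s=0.5, grace_s=2.0))
```

Signalling the group rather than the single child pid is what prevents worker processes spawned by pytest from outliving a timed-out run.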

Tests: tests/test_orch110_*.py (TC-01..TC-12, incl. the red-before/green-after
incident regression). Full suite green (1988 passed).

Refs: ORCH-110

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 10:42:34 +03:00
6d798c01ef docs(overview): system showcase in docs/overview/: business + tech, 3 audiences, presentation (ORCH-011)
A single entry point into the platform documentation (ADR-001 D1–D9):
- docs/overview/: 10 files: an index (routes "I am a customer / I am a manager /
  I am a developer" plus the norm "changed functionality → update the showcase in
  the same PR"), business.md (jargon-free, 6 scenarios), 7 tech blocks (link-first),
  presentation.md (16 slides + a build procedure in the "command + Проверка:"
  ("Check:") format).
- scripts/build_presentation.py: a .pptx generator with a dark design (python-pptx;
  a pure-stdlib parser parse_slides + a lazy import of pptx; the binary is not
  committed, build/ is in .gitignore; the dependency is NOT in the prod image,
  enforced by the machine guard TC-09).
- tests/test_system_docs.py: structural anti-drift: derive-checks of stages/
  gates/agents by importing STAGE_TRANSITIONS/QG_CHECKS/globbing prompts/config,
  link validity, a FORBIDDEN scan + secret heuristic, slides parsed with the
  canonical parser, NFR-2, pointers.
- reviewer.md: the ORCH-079 overview-docs axis is extended to the showcase (D7;
  canon 52d byte-for-byte, only the text inside sections changes) + an
  anti-regression assert in test_agent_prompts_canon.py.
- Pointers: README.md, CLAUDE.md (rules No. 2/No. 6, "Structure"),
  PRODUCT_VISION.md (inset link), CHANGELOG.md.

Runtime byte-for-byte: src/**, docker-compose.yml, Dockerfile, requirements*:
zero changes (docs + tests + dev script, the ORCH-102/103 pattern). pytest: 1873 passed.

Refs: ORCH-011

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 09:36:40 +03:00