work_item: ORCH-116
stage: analysis
author_agent: analyst
status: ready-for-review
created_at: 2026-06-16
model_used: claude-opus-4-8
title: "Deterministic test runner instead of the LLM tester (testing stage)"
framework: pytest
scope: >
  Covers Phase 1: intercepting the tester job at the testing stage before _spawn, executing
  the test contract (pytest via proc_group plus an optional read-only smoke check), mapping
  the exit code to result: PASS|FAIL, writing and pushing 13-test-report.md, triggering the
  existing check_tests_passed gate, the two-level outcome (tool-error DEFER, anti-ORCH-110),
  kill switch / scope / backward compatibility for repos without a test contract,
  never-raise/fail-safe behavior, process isolation and timeout (tree-kill), the hybrid design
  (the LLM is not on the verdict control path), observability, and the anti-drift invariant
  (STAGE_TRANSITIONS, QG_CHECKS, check_tests_passed, _parse_tests_verdict, and the DB schema
  are untouched). Out of scope: Phase 2 (project test contract for non-self repos,
  off-control-path LLM triage), the ORCH-115 staging/deploy runner, the LLM reviewer/developer
  roles, and the live Claude CLI and live production environment (both are mocked).
notes: >
  Tests do not require a live Claude CLI or network access: subprocess/pytest-run (proc_group)
  and advance_stage are mocked; the pure mapping and the frontmatter rendering are tested
  directly; smoke GETs are mocked. The full tests/ regression suite must stay green. The
  anti-drift test (TC-09) protects the critical invariant NFR-1. The reference for
  implementation and coverage is tests/test_orch115_staging_runner.py (ORCH-115).

tests:
  - id: TC-01
    type: unit
    description: "applies(repo): enabled=False -> False (fallback to the LLM); empty CSV -> True only for orchestrator; non-empty CSV -> membership check; repo without a resolvable test contract -> False (BR-9 backward compat); error -> False (never-raise, fail-safe)."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-02
    type: unit
    description: "Exit-code mapping: 0 -> PASS; 1/2/any non-zero -> FAIL; None/non-numeric/launch error -> FAIL (fail-closed). The PASS/FAIL tokens are consistent with _parse_tests_verdict; no second, inconsistent mapping is introduced."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-03
    type: unit
    description: "Rendering of 13-test-report.md: the frontmatter contains result: PASS|FAIL (UPPERCASE) plus the 52c schema (work_item/stage=testing/author_agent=test-runner/status/created_at/model_used=n/a); the tail of pytest stdout and the smoke summary are copied into the body."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-04
    type: integration
    description: "The 13-test-report.md generated by the runner is read by the UNMODIFIED _parse_tests_verdict -> correct (bool, reason) for both PASS and FAIL (the artifact contract and the check_tests_passed gate are unchanged)."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-05
    type: integration
    description: "launch_job intercepts the tester job at the testing stage for an in-scope repo BEFORE _spawn (like D1/D2/ORCH-115): _spawn is NOT called, no agent_runs row is created, None is returned, and the jobs row is maintained via mark_job. _spawn is mocked."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-06
    type: integration
    description: "Discriminator: a tester job at a stage OTHER than testing (guard) and any non-tester job are NOT intercepted by the runner; should_intercept is never-raise -> False on a DB failure (fall-through to _spawn)."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-07
    type: integration
    description: "After a PASS verdict the runner triggers advance_stage(finished_agent='tester') exactly as a completed LLM tester would (advance_stage is mocked/observed) -> check_tests_passed -> testing->deploy-staging; after FAIL, the existing rollback testing->development plus developer-retry applies (stage_engine.py:849)."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-08
    type: integration
    description: "Kill switch: test_runner_enabled=False -> at testing for orchestrator, _spawn is called (the previous LLM path, byte-for-byte) and the runner is not activated."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-09
    type: unit
    description: "Anti-drift NFR-1: STAGE_TRANSITIONS (src/stages.py), the QG_CHECKS registry/names, check_tests_passed/_parse_tests_verdict, and the result:/verdict:/status: tokens are unchanged; the DB schema has no new table or column from ORCH-116. Structural check."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-10
    type: integration
    description: "Two-level outcome (anti-ORCH-110): the suite did NOT execute (spawn error/timeout/returncode None) -> bounded DEFER (re-queue of the tester job + restart-safe marker), WITHOUT a rollback to development and WITHOUT consuming a developer-retry; once test_runner_infra_max_retries is exhausted -> fail-closed FAIL + advance + INFRA alert. Never a silent advance or a false green."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-11
    type: unit
    description: "never-raise/fail-safe: pytest-run raises/times out/hits a worktree error -> the runner does not crash, the outcome is FAIL (fail-closed) or bounded DEFER, never a silent advance or a false green; the worker and queue do not get stuck. All public functions of test_runner.py are never-raise."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-12
    type: unit
    description: "Isolation/timeout: pytest runs in the task branch's worktree via proc_group.run_in_process_group (tree-kill); test_runner_timeout_s is applied; a malformed or non-positive value -> default 900 + WARNING (never-break); reaper_max_running_s is not modified."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-13
    type: unit
    description: "Self-hosting safety: the runner's commands contain no forbidden literals (restart of 8500 / docker compose up orchestrator / --build / force-push main / edits to .env); smoke requests are strictly read-only GETs (/health, /status, /queue); the log is pushed only to the feature branch."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-14
    type: integration
    description: "Observability + hybrid: GET /queue contains a test_runner block (enabled/repos/target/counters runs/pass/fail/tool_error/deferred); each run writes one structured log verdict that distinguishes a code failure from a tool error; the LLM is not called to produce result: in the happy or fail path."
    module: tests/test_orch116_test_runner.py
    expected: PASS

  - id: TC-15
    type: integration
    description: "Anti-drift of the LLM map: llm-call-sites.md (A5)/roadmap (rank 2)/policy are updated to match the implementation (the invariant 'exactly one first_slice=yes' holds); tests/test_llm_call_site_inventory.py and tests/test_llm_determinization_docs.py stay green after the edits."
    module: tests/test_llm_call_site_inventory.py
    expected: PASS