diff --git a/CHANGELOG.md b/CHANGELOG.md index 7b1e1bc..c008b1f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,7 @@ ## [Unreleased] ### Added +- **Дословный текст findings reviewer/tester встраивается в `task_desc` заворота** (ORCH-046): при откате на `development` строка `task_desc` (попадает в `.task-dev.md` developer-агента) теперь несёт суть претензий, а не только ссылку на файл — устраняет «испорченный телефон», из-за которого агент шёл «читать файл», терял ключевые P0/P1 / причину FAIL и заворачивался снова, выжигая `MAX_DEVELOPER_RETRIES` и токены. Новый defensive-модуль `src/review_parse.py` (контракт «never raise», как `src/frontmatter.py`): `extract_review_findings(path)` — дословные пункты P0/P1 из секции `## Findings` файла `12-review.md`; `extract_test_failures(path)` — релевантный фрагмент тела `13-test-report.md` (приоритет `## Вывод pytest` → FAIL-строки `## Результаты` → `## Итог`). Обе функции усекают результат до `MAX_FINDINGS_CHARS`/`MAX_FAILURES_CHARS` (≈2000) с маркером `…(truncated)`. Две rollback-ветки `src/stage_engine.py` (reviewer REQUEST_CHANGES, tester `check_tests_passed` FAIL) встраивают извлечённый текст и **сохраняют ссылку** на полный файл («Полный контекст»); при пустом/битом артефакте — graceful-фоллбэк на прежнюю ссылку-строку (никаких исключений в `advance_stage`). Tester-ветка дополнительно всегда включает `reason` гейта. Последовательность отката, `_developer_retry_count`, поля `AdvanceResult` и реестр `QG_CHECKS` не менялись. ADR `docs/work-items/ORCH-046/06-adr/ADR-001-embed-findings-in-task-desc.md`. Тесты: `tests/test_review_parse.py`, `tests/test_stage_engine.py::TestRollbackTaskDescEmbedding`. - **Поллинг с ретраем в quality-gate `check_ci_green`** (ORCH-045): гейт CI превращён из single-shot в polling, чтобы устранить race condition — раньше один опрос combined commit-status сразу после пуша developer-а ловил транзиентный `pending` (типично 1-3с, реальный кейс ORCH-017: опрос 17:58:54 → pending, CI дозеленел 17:58:55) и задача застревала насмерть без повторного опроса. Теперь: `success` → пропуск сразу; `failure`/`error` → провал сразу (терминально, ретрай бессмыслен); `pending`/unknown → `time.sleep` и повторный опрос до `ci_poll_max_attempts` раз; истечение попыток → явный `(False, "CI still pending after s")` (тупик больше не молчаливый); 404 → как раньше; транзиентная `httpx.HTTPError` на попытке логируется и ретраится в рамках бюджета. Параметры — новые настройки `ORCH_CI_POLL_MAX_ATTEMPTS` (12) и `ORCH_CI_POLL_INTERVAL_S` (10) в `src/config.py` (~2 мин ожидания pending). Сигнатура `check_ci_green(repo, branch)` и реестр `QG_CHECKS` не менялись; `check_tests_passed` не затронут. ADR `docs/architecture/adr/adr-0004-ci-poll-retry.md`. Тесты: `tests/test_qg.py::TestCheckCIGreen`. - **Прямые ссылки на BRD и Plane-таску в Telegram-уведомлении об апруве** (ORCH-017): пингующее сообщение `notify_approve_requested` теперь встраивает две HTML-``-ссылки — на `docs/work-items//01-brd.md` (Gitea branch-view: `gitea_public_url`→`gitea_url`) и на issue в Plane (`{web_base}/{workspace}/projects/{project_id}/issues/{plane_issue_id}/`). Новая настройка `ORCH_PLANE_WEB_URL` (внешний браузерный web-URL Plane; фолбэк на `plane_api_url`). **Loopback-guard:** если итоговый Plane web-base указывает на localhost/127.0.0.1/0.0.0.0/::1 или пуст — Plane-ссылка опускается (не выпускаем битый localhost-URL). Graceful degradation: каждая ссылка строится независимо и опускается при нехватке данных, сообщение и призыв «Переведите задачу в статус Approved …» сохраняются всегда; ровно одно пингующее сообщение, разделяемая `send_telegram` не тронута. Динамические подписи экранируются `html.escape`, `parse_mode=HTML` сохранён. ADR `docs/work-items/ORCH-017/06-adr/ADR-001-telegram-approve-links.md`. Тесты: `test_notify_approve_links.py`, `test_analysis_approve_flow_links.py`. - **Конфигурируемые модель LLM и режим работы (`--effort`) агентов** (ORCH-41): модель/effort каждого агента вынесены из хардкода `launcher.py` в конфиг — глобально per-agent (`ORCH_AGENT_MODEL_` / `ORCH_AGENT_EFFORT_`, дефолты `ORCH_AGENT_MODEL_DEFAULT=claude-opus-4-8`, `ORCH_AGENT_EFFORT_DEFAULT=high`) и per-project (`agent_models` / `agent_efforts` в `ORCH_PROJECTS_JSON`). Резолверы `resolve_agent_model` / `resolve_agent_effort` (приоритет project > per-agent env > default > пусто), валидация effort `{low,medium,high,xhigh,max}`, опц. `ORCH_AGENT_FALLBACK_MODEL` (`--fallback-model`). Хардкод `"model":"opus"` (architect/reviewer) удалён. Тесты: `test_resolve_agent_model.py`, `test_resolve_agent_effort.py`. diff --git a/docs/architecture/README.md b/docs/architecture/README.md index 4baebe5..9fdfe85 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -7,6 +7,7 @@ - **Webhook Receivers** (`src/webhooks/plane.py`, `gitea.py`) — приём событий, HMAC-проверка, дедупликация (`_dedup.py`). Роуты: `POST /webhook/plane`, `POST /webhook/gitea`. - **State Machine** (`src/stages.py`) — `STAGE_TRANSITIONS`: переходы, агент и QG каждой стадии. Хелперы: `get_next_stage`, `get_agent_for_stage`, `get_qg_for_stage`, `get_previous_stage`. - **Stage Engine** (`src/stage_engine.py`) — исполнение переходов, диспетчеризация QG (`_run_qg`), откаты, синхронизация с Plane. +- **Review/Test Parsers** (`src/review_parse.py`, ORCH-046) — defensive-извлечение дословного must-fix текста из артефактов для встраивания в `task_desc` заворота: `extract_review_findings` (P0/P1 из `12-review.md`), `extract_test_failures` (фрагмент тела `13-test-report.md`). Контракт «never raise»: любая ошибка → `""`. - **Quality Gates** (`src/qg/checks.py`) — проверки выхода со стадии, реестр `QG_CHECKS`. - **Agent Launcher** (`src/agents/launcher.py`) — запуск Claude CLI агентов в изолированном git worktree, мониторинг, auto-advance. - **Queue** (`src/queue_worker.py`, ORCH-1) — персистентная очередь задач (SQLite `jobs`), atomic claim, max_concurrency, ретраи, restart-safe. @@ -46,6 +47,13 @@ created → analysis → architecture → development → review → testing → - Deploy / deploy-staging FAILED → откат на `development`. - `get_previous_stage` использует порядок ключей `STAGE_TRANSITIONS`. +### Обогащение `task_desc` при заворотах (ORCH-046) +При откате на `development` `task_desc` (попадает в `.task-dev.md` developer-агента) несёт **дословный must-fix текст**, а не только ссылку — чтобы агент видел суть претензий сразу и не повторял ту же ошибку: +- **reviewer REQUEST_CHANGES** → дословные пункты P0/P1 из секции `## Findings` файла `12-review.md` (`extract_review_findings`); +- **tester `check_tests_passed` FAIL** → `reason` гейта + фрагмент тела `13-test-report.md` (приоритет: `## Вывод pytest` → FAIL-строки `## Результаты` → `## Итог`; `extract_test_failures`). + +Ссылка на полный файл-артефакт сохраняется всегда («Полный контекст»). Парсеры `src/review_parse.py` — defensive (never-raise); при отсутствующем/битом артефакте `task_desc` graceful-фоллбэк на прежнюю ссылку-строку, последовательность отката и retry-счётчик не меняются (ADR `docs/work-items/ORCH-046/06-adr/ADR-001-embed-findings-in-task-desc.md`). + ### Plane Sync: единый status-коммент агентов (ORCH-016) Все агенты (analyst / architect / developer / reviewer / tester / deployer) пишут финальный коммент через **один хелпер** `usage.build_status_comment(...)` (ADR `docs/work-items/ORCH-016/06-adr/ADR-001-unified-status-comment.md`). Формат HTML, разделители `
`: diff --git a/src/review_parse.py b/src/review_parse.py new file mode 100644 index 0000000..db57778 --- /dev/null +++ b/src/review_parse.py @@ -0,0 +1,205 @@ +"""Defensive extractors for reviewer / tester artifact bodies (ORCH-046). + +When a task is rolled back to ``development`` the stage engine builds the +``task_desc`` that ends up in the developer agent's ``.task-dev.md``. Historically +that text only carried a *link* to the artifact file (12-review.md / +13-test-report.md); the developer agent had to go read the file, and the key +must-fix points (reviewer P0/P1 findings, tester failure reason) were lost in +transit — "испорченный телефон" that burns the retry budget. + +This module extracts the **verbatim** must-fix text so the stage engine can embed +it directly in ``task_desc`` (ADR docs/work-items/ORCH-046/06-adr/ADR-001-*). + +Contract — **never raises** (mirrors ``src/frontmatter.py`` and +``src/qg/checks.py::_parse_tests_verdict``): any error — missing file, IOError, +malformed markdown/YAML, missing section — yields ``""``. The caller then falls +back to the previous link-only ``task_desc``. No network calls; disk reads only. +""" + +import logging +import re + +logger = logging.getLogger("orchestrator.review_parse") + +# Truncation limits (module-level per ТЗ §2.3). The full context always stays in +# the artifact file; the embedded text is a focused excerpt. +MAX_FINDINGS_CHARS = 2000 +MAX_FAILURES_CHARS = 2000 + +_TRUNCATED_MARKER = "\n…(truncated)" + +# Recognize a `### P0`/`### P1` subsection header by the presence of the P0/P1 +# token, tolerant to case and the dash/em-dash that follows it. +_P01_HEADER_RE = re.compile(r"(? str | None: + """Read a file as UTF-8. Never raises; returns None on any OS error.""" + try: + with open(path, "r", encoding="utf-8", errors="replace") as f: + return f.read() + except OSError as e: + logger.debug(f"review_parse: cannot open {path}: {e}") + return None + + +def _strip_frontmatter(content: str) -> str: + """Drop a leading ``--- … ---`` YAML frontmatter block, if present.""" + if content.startswith("---"): + parts = content.split("---", 2) + if len(parts) >= 3: + return parts[2] + return content + + +def _truncate(text: str, limit: int) -> str: + """Trim ``text`` to ``limit`` chars, appending a truncation marker if cut.""" + if len(text) <= limit: + return text + return text[:limit].rstrip() + _TRUNCATED_MARKER + + +def _section_body(md: str, heading_token: str) -> str: + """Return the body lines under the first ``## <…heading_token…>`` heading. + + Capture stops at the next level-2 (``## ``) heading. Matching is + case-insensitive substring match on the heading line, so callers pass a token + like ``"Вывод pytest"`` or ``"Findings"``. ``### ``-level headers do NOT + delimit the section (they start with ``"### "``, not ``"## "``). + """ + out: list[str] = [] + capturing = False + for line in md.splitlines(): + if line.startswith("## "): + if capturing: + break + if heading_token.lower() in line.lower(): + capturing = True + continue + if capturing: + out.append(line) + return "\n".join(out) + + +def _is_placeholder_item(text: str) -> bool: + """True for empty or template-placeholder list items (non-substantive). + + The canonical reviewer template seeds each severity with + ``- [ ] <описание> (если есть)``. Such lines must be ignored so an empty P0/P1 + subsection does not leak the placeholder into ``task_desc``. + """ + t = text.strip() + if not t: + return True + if "(если есть)" in t: + return True + # An item whose entire payload is an angle-bracket placeholder, e.g. "<описание>". + if t.startswith("<") and t.endswith(">"): + return True + return False + + +def _item_payload(line: str) -> str | None: + """If ``line`` is a markdown list item, return its payload text; else None. + + Handles ``- foo``, ``* foo`` and checkbox forms ``- [ ] foo`` / ``- [x] foo``. + """ + m = re.match(r"\s*[-*]\s+(?:\[[ xX]?\]\s*)?(.*)$", line) + if not m: + return None + return m.group(1) + + +def _findings_subsections(findings_body: str): + """Yield ``(header_line, body_lines)`` for each ``### `` subsection.""" + header: str | None = None + body: list[str] = [] + for line in findings_body.splitlines(): + if line.startswith("### "): + if header is not None: + yield header, body + header = line + body = [] + elif header is not None: + body.append(line) + if header is not None: + yield header, body + + +def extract_review_findings(path: str) -> str: + """Дословный текст P0/P1 findings из 12-review.md. Never raises; '' при ошибке/пусто. + + Reads the ``## Findings`` section of a reviewer report and returns the verbatim + P0 (Blocker) and P1 (Must fix) subsection items, suitable for embedding in a + rollback ``task_desc``. P2/P3 are ignored. Empty/placeholder-only subsections + are skipped; if no substantive P0/P1 item exists, returns ``""``. The result is + truncated to ``MAX_FINDINGS_CHARS``. + """ + content = _read(path) + if content is None: + return "" + + try: + body = _strip_frontmatter(content) + findings_body = _section_body(body, "Findings") + if not findings_body.strip(): + return "" + + blocks: list[str] = [] + for header, sub_body in _findings_subsections(findings_body): + if not _P01_HEADER_RE.search(header): + continue + kept: list[str] = [] + for line in sub_body: + payload = _item_payload(line) + if payload is None: + continue + if _is_placeholder_item(payload): + continue + kept.append(line.rstrip()) + if kept: + blocks.append("\n".join([header.rstrip(), *kept])) + + if not blocks: + return "" + return _truncate("\n\n".join(blocks), MAX_FINDINGS_CHARS) + except Exception as e: # defensive: never raise out of the extractor + logger.debug(f"review_parse: extract_review_findings failed for {path}: {e}") + return "" + + +def extract_test_failures(path: str) -> str: + """Релевантный фрагмент тела 13-test-report.md (причина FAIL). Never raises; '' при ошибке/пусто. + + Picks the first non-empty source, in priority order: + 1. ``## Вывод pytest`` — the pytest run output (shows failing tests); + 2. rows of the ``## Результаты`` table that contain ``FAIL``; + 3. ``## Итог`` — the verdict summary. + The result is truncated to ``MAX_FAILURES_CHARS``. The gate ``reason`` is added + by the caller; this returns the report-body excerpt on top of it. + """ + content = _read(path) + if content is None: + return "" + + try: + # 1. pytest output. + pytest_out = _section_body(content, "Вывод pytest").strip() + if pytest_out: + return _truncate(pytest_out, MAX_FAILURES_CHARS) + + # 2. FAIL rows from the results table. + results = _section_body(content, "Результаты") + fail_rows = [ln.rstrip() for ln in results.splitlines() if "FAIL" in ln.upper()] + if fail_rows: + return _truncate("\n".join(fail_rows).strip(), MAX_FAILURES_CHARS) + + # 3. Verdict summary. + itog = _section_body(content, "Итог").strip() + if itog: + return _truncate(itog, MAX_FAILURES_CHARS) + + return "" + except Exception as e: # defensive: never raise out of the extractor + logger.debug(f"review_parse: extract_test_failures failed for {path}: {e}") + return "" diff --git a/src/stage_engine.py b/src/stage_engine.py index 9d9ce13..008790b 100644 --- a/src/stage_engine.py +++ b/src/stage_engine.py @@ -32,6 +32,7 @@ from dataclasses import dataclass, field from .db import get_db, update_task_stage, enqueue_job from .stages import get_next_stage, get_qg_for_stage, get_agent_for_stage from .git_worktree import get_worktree_path +from .review_parse import extract_review_findings, extract_test_failures from .qg.checks import QG_CHECKS from .notifications import ( notify_stage_change, @@ -416,12 +417,24 @@ def _handle_qg_failure_rollbacks( result.rolled_back_to = "development" retry_count = _developer_retry_count(task_id) if retry_count < MAX_DEVELOPER_RETRIES: - task_desc = ( + # ORCH-046: embed the verbatim P0/P1 findings into task_desc so the + # developer agent sees the must-fix points directly (not just a link). + # extract_review_findings never raises; "" -> graceful link-only fallback. + review_ref = f"docs/work-items/{work_item_id}/12-review.md" + review_path = os.path.join(get_worktree_path(repo, branch), review_ref) + findings = extract_review_findings(review_path) + head = ( f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n" f"Stage: development\nNote: REQUEST_CHANGES from reviewer " - f"(attempt {retry_count+1}/3). Fix findings in " - f"docs/work-items/{work_item_id}/12-review.md" + f"(attempt {retry_count+1}/3)." ) + if findings: + task_desc = ( + f"{head}\nFindings (P0/P1):\n{findings}\n" + f"Полный контекст: {review_ref}" + ) + else: + task_desc = f"{head} Fix findings in {review_ref}" new_job = enqueue_job("developer", repo, task_desc, task_id=task_id) result.enqueued_agent = "developer" result.enqueued_job_id = new_job @@ -452,11 +465,23 @@ def _handle_qg_failure_rollbacks( ) retry_count = _developer_retry_count(task_id) if retry_count < MAX_DEVELOPER_RETRIES: - task_desc = ( + # ORCH-046: embed the gate `reason` plus a verbatim excerpt of the + # test-report body (pytest output / FAIL rows / Итог) into task_desc. + # extract_test_failures never raises; "" -> graceful reason+link fallback. + report_ref = f"docs/work-items/{work_item_id}/13-test-report.md" + report_path = os.path.join(get_worktree_path(repo, branch), report_ref) + failures = extract_test_failures(report_path) + head = ( f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n" - f"Stage: development\nNote: Tests FAILED. " - f"Fix failures described in docs/work-items/{work_item_id}/13-test-report.md" + f"Stage: development\nNote: Tests FAILED. Причина: {reason}." ) + if failures: + task_desc = ( + f"{head}\nДетали:\n{failures}\n" + f"Полный контекст: {report_ref}" + ) + else: + task_desc = f"{head} Fix failures described in {report_ref}" new_job = enqueue_job("developer", repo, task_desc, task_id=task_id) result.enqueued_agent = "developer" result.enqueued_job_id = new_job diff --git a/tests/test_review_parse.py b/tests/test_review_parse.py new file mode 100644 index 0000000..bd8b49c --- /dev/null +++ b/tests/test_review_parse.py @@ -0,0 +1,237 @@ +"""Unit tests for src/review_parse (ORCH-046). + +Covers the defensive extractors that pull verbatim must-fix text out of the +reviewer / tester artifacts for embedding into the rollback ``task_desc``: + + - extract_review_findings (12-review.md, ## Findings -> P0/P1) + - extract_test_failures (13-test-report.md, pytest/FAIL/Итог excerpt) + +Both must NEVER raise (return "" on missing/broken/empty input) and must ignore +template placeholders / non-must-fix severities. See 04-test-plan.yaml (TC-01..08). +""" + +import os +import tempfile + +import pytest + +from src.review_parse import ( + extract_review_findings, + extract_test_failures, + MAX_FINDINGS_CHARS, + MAX_FAILURES_CHARS, +) + + +@pytest.fixture +def write_file(tmp_path): + def _w(name: str, content: str) -> str: + p = tmp_path / name + p.write_text(content, encoding="utf-8") + return str(p) + return _w + + +# --------------------------------------------------------------------------- +# extract_review_findings +# --------------------------------------------------------------------------- +_REVIEW_WITH_FINDINGS = """--- +type: review +work_item_id: ORCH-046 +verdict: REQUEST_CHANGES +version: 1 +--- + +# Review ORCH-046 + +## Summary +Несколько проблем. + +## Findings + +### P0 — Blocker +- [ ] Документация не обновлена при изменении src/review_parse.py + +### P1 — Must fix +- [ ] extract_test_failures не обрабатывает пустой отчёт +- [ ] Отсутствует docstring у _section_body + +### P2 — Should fix +- [ ] Переименовать переменную blocks в more descriptive + +## Документация +Требует обновления README. +""" + + +class TestExtractReviewFindings: + def test_tc01_returns_verbatim_p0_p1(self, write_file): + """TC-01: P0/P1 findings present -> verbatim text returned (AC-1, AC-5).""" + path = write_file("12-review.md", _REVIEW_WITH_FINDINGS) + out = extract_review_findings(path) + # P0 + P1 verbatim items present. + assert "Документация не обновлена при изменении src/review_parse.py" in out + assert "extract_test_failures не обрабатывает пустой отчёт" in out + assert "Отсутствует docstring у _section_body" in out + # Subsection headers preserved. + assert "P0" in out and "P1" in out + # P2 must NOT leak in. + assert "Переименовать переменную" not in out + + def test_tc02_only_p2_p3_returns_empty(self, write_file): + """TC-02: only P2/P3 (no must-fix P0/P1) -> '' (AC-5).""" + content = """--- +verdict: REQUEST_CHANGES +--- + +## Findings + +### P0 — Blocker +- [ ] <описание> (если есть) + +### P1 — Must fix +- [ ] <описание> (если есть) + +### P2 — Should fix +- [ ] Косметика в naming +""" + path = write_file("12-review.md", content) + assert extract_review_findings(path) == "" + + def test_tc03_missing_file_returns_empty(self): + """TC-03: non-existent path -> '' without raising (AC-4).""" + missing = os.path.join(tempfile.gettempdir(), "no-such-review-orch046.md") + assert extract_review_findings(missing) == "" + + def test_tc04_broken_or_no_findings_section_returns_empty(self, write_file): + """TC-04: empty file / markdown without ## Findings -> '' (AC-4, AC-5).""" + # Empty file. + assert extract_review_findings(write_file("empty.md", "")) == "" + # No Findings section. + no_section = "# Review\n\n## Summary\nвсё хорошо\n" + assert extract_review_findings(write_file("nofind.md", no_section)) == "" + # Broken YAML frontmatter (unterminated) — body parsing still graceful. + broken = "---\nverdict: [unclosed\n# Review\nno findings here\n" + assert extract_review_findings(write_file("broken.md", broken)) == "" + + def test_tc05_long_findings_truncated(self, write_file): + """TC-05: very long findings truncated to limit with marker (AC-1).""" + big_item = "- [ ] " + ("x" * 5000) + content = f"## Findings\n\n### P0 — Blocker\n{big_item}\n" + path = write_file("12-review.md", content) + out = extract_review_findings(path) + assert len(out) <= MAX_FINDINGS_CHARS + len("\n…(truncated)") + assert "…(truncated)" in out + + def test_case_insensitive_and_dash_tolerant_header(self, write_file): + """P0/P1 recognized regardless of case / dash style.""" + content = """## Findings + +### p0 - blocker +- [ ] Нижний регистр заголовка + +### P1 — Must fix +- [ ] Em-dash заголовок +""" + out = extract_review_findings(write_file("12-review.md", content)) + assert "Нижний регистр заголовка" in out + assert "Em-dash заголовок" in out + + def test_never_raises_on_directory_path(self, tmp_path): + """Passing a directory path must not raise -> ''.""" + assert extract_review_findings(str(tmp_path)) == "" + + +# --------------------------------------------------------------------------- +# extract_test_failures +# --------------------------------------------------------------------------- +_REPORT_FAIL = """--- +type: test-report +work_item_id: ORCH-046 +result: FAIL +--- + +# Test Report — ORCH-046 + +## Окружение +- Python: 3.12 + +## Результаты + +| TC ID | Описание | Результат | +|-------|----------|-----------| +| TC-01 | парсер findings | PASS | +| TC-09 | rollback task_desc | FAIL | + +## Вывод pytest +FAILED tests/test_stage_engine.py::TestReviewerRequestChanges::test_embed - AssertionError +1 failed, 40 passed in 2.13s + +## Итог +FAIL +""" + + +class TestExtractTestFailures: + def test_tc06_extracts_pytest_output(self, write_file): + """TC-06: relevant body excerpt (pytest output) from FAIL report (AC-2, AC-5).""" + path = write_file("13-test-report.md", _REPORT_FAIL) + out = extract_test_failures(path) + assert "FAILED tests/test_stage_engine.py" in out + assert "1 failed, 40 passed" in out + + def test_priority_falls_back_to_fail_rows(self, write_file): + """No pytest section -> FAIL rows of the results table are used.""" + content = """--- +result: FAIL +--- + +## Результаты + +| TC ID | Описание | Результат | +|-------|----------|-----------| +| TC-01 | ok | PASS | +| TC-09 | broken | FAIL | + +## Итог +FAIL +""" + out = extract_test_failures(write_file("13-test-report.md", content)) + assert "TC-09" in out + assert "broken" in out + # PASS rows are not failure-relevant. + assert "TC-01" not in out + + def test_priority_falls_back_to_itog(self, write_file): + """No pytest section and no FAIL rows -> Итог summary is used.""" + content = """--- +result: FAIL +--- + +## Итог +Регресс упал: смотрите CI лог. +""" + out = extract_test_failures(write_file("13-test-report.md", content)) + assert "Регресс упал" in out + + def test_tc07_missing_file_returns_empty(self): + """TC-07: non-existent path -> '' without raising (AC-4).""" + missing = os.path.join(tempfile.gettempdir(), "no-such-report-orch046.md") + assert extract_test_failures(missing) == "" + + def test_tc08_broken_or_empty_report_returns_empty(self, write_file): + """TC-08: empty / section-less report -> '' without raising (AC-4, AC-5).""" + assert extract_test_failures(write_file("empty.md", "")) == "" + no_sections = "---\nresult: FAIL\n---\n\n# Test Report\nничего полезного\n" + assert extract_test_failures(write_file("nosec.md", no_sections)) == "" + + def test_long_failures_truncated(self, write_file): + """Long pytest output is truncated to the limit with a marker.""" + big = "x" * 5000 + content = f"## Вывод pytest\n{big}\n" + out = extract_test_failures(write_file("13-test-report.md", content)) + assert len(out) <= MAX_FAILURES_CHARS + len("\n…(truncated)") + assert "…(truncated)" in out + + def test_never_raises_on_directory_path(self, tmp_path): + assert extract_test_failures(str(tmp_path)) == "" diff --git a/tests/test_stage_engine.py b/tests/test_stage_engine.py index b07720a..03378d9 100644 --- a/tests/test_stage_engine.py +++ b/tests/test_stage_engine.py @@ -101,6 +101,14 @@ def _jobs(): return [dict(r) for r in rows] +def _job_contents(): + """task_content of every enqueued job, oldest first (ORCH-046 task_desc check).""" + conn = get_db() + rows = conn.execute("SELECT task_content FROM jobs ORDER BY id").fetchall() + conn.close() + return [r[0] for r in rows] + + def _add_developer_runs(task_id, n): conn = get_db() for _ in range(n): @@ -335,6 +343,179 @@ class TestTesterFail: assert _jobs() == [] +# --------------------------------------------------------------------------- +# ORCH-046: rollback task_desc carries verbatim reviewer/tester must-fix text +# --------------------------------------------------------------------------- +_REVIEW_MD = """--- +type: review +work_item_id: ET-001 +verdict: REQUEST_CHANGES +version: 1 +--- + +# Review ET-001 + +## Summary +Есть блокеры. + +## Findings + +### P0 — Blocker +- [ ] Гонка в claim_next_job: отсутствует guard в WHERE + +### P1 — Must fix +- [ ] Нет обработки OSError при чтении отчёта + +### P2 — Should fix +- [ ] Переименовать blocks +""" + +_REPORT_MD = """--- +type: test-report +work_item_id: ET-001 +result: FAIL +--- + +# Test Report — ET-001 + +## Результаты + +| TC ID | Описание | Результат | +|-------|----------|-----------| +| TC-09 | rollback | FAIL | + +## Вывод pytest +FAILED tests/test_stage_engine.py::TestTaskDescEmbedding - AssertionError +1 failed, 50 passed in 3.01s + +## Итог +FAIL +""" + + +class TestRollbackTaskDescEmbedding: + """ORCH-046 AC-1/AC-2/AC-3/AC-4: the rollback task_desc embeds verbatim + must-fix text (reviewer P0/P1, tester reason + report excerpt) plus the link. + """ + + def _patch_worktree(self, monkeypatch, tmp_path, work_item_id, filename, body): + """Make get_worktree_path resolve to tmp_path and seed the artifact file.""" + artifact = tmp_path / "docs" / "work-items" / work_item_id + artifact.mkdir(parents=True, exist_ok=True) + (artifact / filename).write_text(body, encoding="utf-8") + monkeypatch.setattr( + stage_engine, "get_worktree_path", lambda repo, branch: str(tmp_path) + ) + + def test_tc09_reviewer_embeds_p0_p1_and_link(self, monkeypatch, tmp_path): + """TC-09: reviewer REQUEST_CHANGES -> task_desc has verbatim P0/P1 + link.""" + monkeypatch.setattr( + stage_engine, "QG_CHECKS", + {**stage_engine.QG_CHECKS, + "check_reviewer_verdict": _fail("verdict: REQUEST_CHANGES")}, + ) + self._patch_worktree(monkeypatch, tmp_path, "ET-001", "12-review.md", _REVIEW_MD) + task_id = _make_task("review") + advance_stage(task_id, "review", "enduro-trails", "ET-001", + "feature/ET-001-x", finished_agent="reviewer") + contents = _job_contents() + assert len(contents) == 1 + desc = contents[0] + # AC-1: verbatim P0/P1 findings. + assert "Гонка в claim_next_job: отсутствует guard в WHERE" in desc + assert "Нет обработки OSError при чтении отчёта" in desc + # P2 must not leak. + assert "Переименовать blocks" not in desc + # AC-3: link to full file preserved. + assert "docs/work-items/ET-001/12-review.md" in desc + + def test_tc10_tester_embeds_reason_excerpt_and_link(self, monkeypatch, tmp_path): + """TC-10: tester FAIL -> task_desc has reason + report excerpt + link.""" + monkeypatch.setattr( + stage_engine, "QG_CHECKS", + {**stage_engine.QG_CHECKS, + "check_tests_passed": _fail("1 test failed")}, + ) + self._patch_worktree( + monkeypatch, tmp_path, "ET-001", "13-test-report.md", _REPORT_MD + ) + task_id = _make_task("testing") + advance_stage(task_id, "testing", "enduro-trails", "ET-001", + "feature/ET-001-x", finished_agent="tester") + contents = _job_contents() + assert len(contents) == 1 + desc = contents[0] + # AC-2: gate reason present. + assert "1 test failed" in desc + # AC-2: report body excerpt (pytest output) present. + assert "FAILED tests/test_stage_engine.py::TestTaskDescEmbedding" in desc + # AC-3: link to full file preserved. + assert "docs/work-items/ET-001/13-test-report.md" in desc + + def test_tc11_reviewer_graceful_fallback_when_no_file(self, monkeypatch, tmp_path): + """TC-11: missing/broken 12-review.md -> graceful link-only fallback (AC-4).""" + monkeypatch.setattr( + stage_engine, "QG_CHECKS", + {**stage_engine.QG_CHECKS, + "check_reviewer_verdict": _fail("verdict: REQUEST_CHANGES")}, + ) + # Worktree resolves but the review file does not exist. + monkeypatch.setattr( + stage_engine, "get_worktree_path", lambda repo, branch: str(tmp_path) + ) + task_id = _make_task("review") + res = advance_stage(task_id, "review", "enduro-trails", "ET-001", + "feature/ET-001-x", finished_agent="reviewer") + # Rollback still happens exactly as before. + assert res.rolled_back_to == "development" + assert _stage(task_id) == "development" + contents = _job_contents() + assert len(contents) == 1 + desc = contents[0] + # Falls back to the previous link-string behavior (no findings block). + assert "Fix findings in docs/work-items/ET-001/12-review.md" in desc + assert "Findings (P0/P1):" not in desc + + def test_tc11_tester_graceful_fallback_keeps_reason(self, monkeypatch, tmp_path): + """AC-2/AC-4: missing report -> reason still present, link fallback.""" + monkeypatch.setattr( + stage_engine, "QG_CHECKS", + {**stage_engine.QG_CHECKS, + "check_tests_passed": _fail("2 tests failed")}, + ) + monkeypatch.setattr( + stage_engine, "get_worktree_path", lambda repo, branch: str(tmp_path) + ) + task_id = _make_task("testing") + advance_stage(task_id, "testing", "enduro-trails", "ET-001", + "feature/ET-001-x", finished_agent="tester") + desc = _job_contents()[0] + assert "2 tests failed" in desc + assert "docs/work-items/ET-001/13-test-report.md" in desc + + def test_tc12_retry_and_rollback_behavior_unchanged(self, monkeypatch, tmp_path): + """TC-12 (AC-6): embedding does not change retry/rollback semantics. + + 4th developer attempt still alerts instead of enqueueing, even with a + valid review file present. + """ + monkeypatch.setattr( + stage_engine, "QG_CHECKS", + {**stage_engine.QG_CHECKS, + "check_reviewer_verdict": _fail("verdict: REQUEST_CHANGES")}, + ) + self._patch_worktree(monkeypatch, tmp_path, "ET-001", "12-review.md", _REVIEW_MD) + task_id = _make_task("review") + _add_developer_runs(task_id, 3) # already at the cap + res = advance_stage(task_id, "review", "enduro-trails", "ET-001", + "feature/ET-001-x", finished_agent="reviewer") + assert res.rolled_back_to == "development" + assert res.alerted is True + assert stage_engine.send_telegram.called + # No new developer job past the cap, regardless of embedding. + assert _jobs() == [] + + # --------------------------------------------------------------------------- # BUG 8: deploy verdict gates deploy -> done (not the LLM exit code) # ---------------------------------------------------------------------------