fix(qg): testing gate reads documented tester `result:` frontmatter key (ORCH-017)

check_tests_passed/_parse_tests_verdict gated the testing -> deploy-staging transition on `verdict:`/`status:` in 13-test-report.md, but the tester agent prompt (.openclaw/agents/tester*) documents `result: PASS | FAIL` as THE machine-readable field. A report that followed the contract literally (ORCH-017: only `result: PASS`, no verdict:/status:) was bounced back to development with a misleading "Tests FAILED". ORCH-016 only passed because its report redundantly carried both `verdict:` and `result:`.

Treat `result:` as a first-class machine field alongside verdict/status; a negative token in any field stays authoritative (ET-013 contract preserved).

Self-hosting QG fix: unblocks every project whose tester emits only `result:`.

Docs updated in-PR: CHANGELOG, architecture README machine-keys note.
Tests: test_qg.py::TestCheckTestsPassed::test_result_pass_only_passes / _fail_only_fails.

Refs: ORCH-017
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
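
The precedence rule described in this message (any negative token wins, otherwise any positive token passes) can be sketched roughly as follows. This is an illustration only, not the code in `src/qg/checks.py`: the name `gate_verdict` and the token tuples are assumptions (the real `_TESTS_NEGATIVE_TOKENS` / `_TESTS_POSITIVE_TOKENS` are only partly visible in the diff), and it takes an already-parsed frontmatter dict rather than parsing YAML.

```python
# Hypothetical token sets; the real tuples in src/qg/checks.py may differ.
NEGATIVE_TOKENS = ("BLOCKED", "FAILED", "FAIL")
POSITIVE_TOKENS = ("PASSED", "PASS", "READY-TO-DEPLOY", "READY_TO_DEPLOY")
MACHINE_FIELDS = ("verdict", "status", "result")


def gate_verdict(frontmatter: dict) -> tuple[bool, str]:
    """Illustrative stand-in for the gate decision on parsed frontmatter."""
    # Normalize each machine field the same way: missing/None -> "".
    values = {
        key: str(frontmatter.get(key, "") or "").upper().strip()
        for key in MACHINE_FIELDS
    }
    if not any(values.values()):
        return False, "no machine-readable verdict/status/result"

    combined = " ".join(values.values())
    # Negative tokens are checked first, so FAIL/BLOCKED in any field wins.
    for neg in NEGATIVE_TOKENS:
        if neg in combined:
            return False, f"negative token {neg}"
    for pos in POSITIVE_TOKENS:
        if pos in combined:
            return True, f"positive token {pos}"
    return False, "no recognized token"


print(gate_verdict({"result": "PASS"}))
print(gate_verdict({"verdict": "PASS", "result": "FAIL"}))
```

Under these assumptions, a report carrying only `result: PASS` passes, while a contradictory report (`verdict: PASS` plus `result: FAIL`) is rejected because the negative check runs before the positive one.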
2026-06-05 18:34:25 +00:00
parent 0e999d289d
commit e62d51aa77
4 changed files with 51 additions and 13 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -21,6 +21,7 @@
 - Stage chain: `... testing → deploy-staging → deploy → done` (previously lacked `deploy-staging`).

 ### Fixed
+- **The `check_tests_passed` gate now reads the documented `result:` key** (ORCH-017): `_parse_tests_verdict` recognized only `verdict:`/`status:` in the `13-test-report.md` frontmatter, whereas the tester agent prompt (`.openclaw/agents/tester*`) instructs the tester to write `result: PASS | FAIL`. A report that followed the contract literally (only `result: PASS`, no `verdict:`/`status:`) failed the gate with a misleading "Tests FAILED" and was rolled back to `development` (ORCH-016 passed only because it duplicated `verdict:` and `result:`). `result:` is now a first-class machine field alongside `verdict:`/`status:`; a negative token in any of the fields remains authoritative. Tests: `tests/test_qg.py::TestCheckTestsPassed::test_result_pass_only_passes`, `…::test_result_fail_only_fails`.
 - BUG-8: a deploy/deploy-staging failure now rolls back correctly to `development`.
 - Test isolation from the live Plane API (PR #27): an autouse fixture that resets settings.

--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -58,7 +58,7 @@ created → analysis → architecture → development → review → testing →
 ```

 - **Duration** is computed by the launcher (`_monitor_agent`) and passed through to `_post_usage_comments`; for the analyst (whose comment is built in `stage_engine`), the DB fallback `usage.get_agent_duration(task_id, agent)` is used.
- **Verdict parser**: `src/frontmatter.read_frontmatter_value(...)` (defensive, never raises). Machine keys: `verdict:` (reviewer/tester), `deploy_status:` (14-deploy-log.md), `staging_status:` (15-staging-log.md).
+- **Verdict parser**: `src/frontmatter.read_frontmatter_value(...)` (defensive, never raises). Machine keys: `verdict:` (reviewer/tester), `result:` (tester `13-test-report.md`, the primary key per the tester agent prompt; `check_tests_passed` reads `verdict:`/`status:`/`result:`), `deploy_status:` (14-deploy-log.md), `staging_status:` (15-staging-log.md).
 - The comment format does **not** change the gate and stage registry; the comment is a display, not a control mechanism.

 ## Database (SQLite)
--- a/src/qg/checks.py
+++ b/src/qg/checks.py
@@ -179,15 +179,24 @@ _TESTS_POSITIVE_TOKENS = ("PASSED", "PASS", "READY-TO-DEPLOY", "READY_TO_DEPLOY"

 def _parse_tests_verdict(content: str) -> tuple[bool, str]:
    """Map a 13-test-report.md body to a quality-gate verdict by reading ONLY the
-    machine-readable `verdict:` (and corroborating `status:`) YAML frontmatter fields.
+    machine-readable `verdict:` / `status:` / `result:` YAML frontmatter fields.
+
+    ORCH-017: the tester agent prompt (`.openclaw/agents/tester*`) documents
+    `result: PASS | FAIL` as THE machine-readable field, but this gate previously
+    read only `verdict:`/`status:`. A tester that followed the documented contract
+    literally (e.g. ORCH-017's report: `result: PASS`, no verdict:/status:) was
+    bounced back to development with a misleading "Tests FAILED". We now treat
+    `result:` as a first-class machine field alongside verdict/status so the gate
+    matches the contract the tester is actually told to emit. (ORCH-016 only passed
+    before because its report redundantly carried both `verdict:` AND `result:`.)

    Rules:
-      - No frontmatter / bad YAML / neither field present -> (False, reason).
-      - A negative token (BLOCKED/FAILED/...) in verdict OR status -> (False) and is
+      - No frontmatter / bad YAML / none of the three fields present -> (False, reason).
+      - A negative token (BLOCKED/FAILED/...) in any field -> (False) and is
        authoritative (ET-013 main case: verdict BLOCKED wins over any prose PASS).
-      - Otherwise a positive token (PASS/PASSED/READY-TO-DEPLOY/...) in verdict OR
-        status -> (True).
-      - Anything else (unrecognized / empty verdict) -> (False, reason).
+      - Otherwise a positive token (PASS/PASSED/READY-TO-DEPLOY/...) in any field
+        -> (True).
+      - Anything else (unrecognized / empty fields) -> (False, reason).
    """
    import yaml

@@ -207,19 +216,24 @@ def _parse_tests_verdict(content: str) -> tuple[bool, str]:

    verdict = str(fm.get("verdict", "") or "").upper().strip()
    status = str(fm.get("status", "") or "").upper().strip()
+    result = str(fm.get("result", "") or "").upper().strip()

-    if not verdict and not status:
-        return False, "No machine-readable verdict/status in test report frontmatter"
+    if not verdict and not status and not result:
+        return False, "No machine-readable verdict/status/result in test report frontmatter"

-    fields = f"{verdict} {status}"
+    label = verdict or status or result
+    fields = f"{verdict} {status} {result}"
    for neg in _TESTS_NEGATIVE_TOKENS:
        if neg in fields:
-            return False, f"Test verdict: {verdict or status} ({neg})"
+            return False, f"Test verdict: {label} ({neg})"
    for pos in _TESTS_POSITIVE_TOKENS:
        if pos in fields:
-            return True, f"Test verdict: {verdict or status} (PASS)"
+            return True, f"Test verdict: {label} (PASS)"

-    return False, f"No recognized PASS verdict in frontmatter (verdict={verdict!r}, status={status!r})"
+    return False, (
+        "No recognized PASS verdict in frontmatter "
+        f"(verdict={verdict!r}, status={status!r}, result={result!r})"
+    )


 def check_analysis_approved(repo: str, work_item_id: str, branch: str | None = None) -> tuple[bool, str]:
--- a/tests/test_qg.py
+++ b/tests/test_qg.py
@@ -216,6 +216,29 @@ class TestCheckTestsPassed:
        passed, reason = check_tests_passed("enduro-trails", "ET-001")
        assert passed is True

+    def test_result_pass_only_passes(self, setup_work_item_dir):
+        # ORCH-017: the tester agent prompt documents `result: PASS | FAIL` as the
+        # machine-readable field. A report that follows that contract literally
+        # (only `result: PASS`, no verdict:/status:) MUST pass the gate. Before this
+        # fix the gate ignored `result:` and bounced such reports to development.
+        self._write(
+            setup_work_item_dir,
+            "---\ntype: test-report\nwork_item_id: ET-001\nresult: PASS\n---\n\nbody\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is True
+        assert "PASS" in reason
+
+    def test_result_fail_only_fails(self, setup_work_item_dir):
+        # The negative side of the documented `result: PASS | FAIL` contract.
+        self._write(
+            setup_work_item_dir,
+            "---\ntype: test-report\nresult: FAIL\n---\n\n23 passed in body\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is False
+        assert "FAIL" in reason.upper()
+
    def test_blocked_verdict_with_pass_in_body_fails(self, setup_work_item_dir):
        # THE ET-013 BUG: verdict BLOCKED but body is full of "PASS"/"passed".
        self._write(