diff --git a/.openclaw/agents/deployer.md b/.openclaw/agents/deployer.md
index e668931..53611cb 100644
--- a/.openclaw/agents/deployer.md
+++ b/.openclaw/agents/deployer.md
@@ -21,10 +21,20 @@ On stage `deploy-staging` your job is to run the staging test suite and write a
 
 ### Steps:
 
-1. Run the staging test suite against the live staging environment:
+1. Run the staging test suite against the live staging environment.
+   **CANONICAL: run INSIDE the `orchestrator-staging` container via `docker exec`**
+   (ORCH-048, ADR-001) — NOT from the host:
    ```bash
-   python3 scripts/staging_check.py --base-url http://localhost:8501 --mode stub
+   docker exec orchestrator-staging \
+     python3 /repos/orchestrator/scripts/staging_check.py \
+     --base-url http://localhost:8501 --mode stub
    ```
+   Why: the B6 registry-isolation check reads the registry from the running
+   instance's own process-env (`.env.staging`). Running from the host leaves
+   `ORCH_PROJECTS_JSON` unset → B6 falls back to the default (ET+ORCH) registry
+   → false FAIL → spurious rollback. The script path is `/repos/orchestrator/scripts/…`
+   (bind-mount); `scripts/` is NOT copied into the image, so `/app/scripts` does
+   not exist. Details: `docs/operations/STAGING_CHECK.md`.
 
 2. Check the exit code:
    - Exit code **0** = all tests PASS → `staging_status: SUCCESS`
diff --git a/CHANGELOG.md b/CHANGELOG.md
index c008b1f..2e3cbfb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -23,6 +23,7 @@
 - Stage chain: `... testing → deploy-staging → deploy → done` (previously without `deploy-staging`).
 
 ### Fixed
+- **The B6 staging check reads the registry from the environment of the running staging instance** (ORCH-048): the B6 block "Registry: sandbox present, prod ET/ORCH absent" in `scripts/staging_check.py` produced a **false FAIL** (`prod-ET=YES(BAD!)`, `prod-ORCH=YES(BAD!)`) even though isolation was actually intact. It was the only check in the suite that did not query the instance over HTTP; instead it imported `src.projects` locally through the host-path hack `sys.path.insert(0, "/repos/orchestrator")` + `importlib.reload`, building the registry from `ORCH_PROJECTS_JSON` in the **process-env of the launching process**. When the deployer actually runs the suite from the host, the variable is unset → default `_DEFAULT_PROJECTS` (ET+ORCH) → false FAIL → unnecessary rollback `deploy-staging → development`. Fix (option "в", ADR-001): the host-path hack is removed; the suite is canonically run INSIDE the `orchestrator-staging` container via `docker exec … python3 /repos/orchestrator/scripts/staging_check.py` (`scripts/` is available only through the bind-mount, `import src.projects` resolves through `PYTHONPATH=/app` to the container's code, and the env comes from `.env.staging`) → B6 reads the registry of the running instance itself, with no HTTP bootstrap and no chicken-and-egg problem. The verdict logic is extracted into the pure function `_evaluate_b6(known) -> (passed, detail)` (invariant `passed ⟺ SANDBOX ∈ known ∧ PROD_ET ∉ known ∧ PROD_ORCH ∉ known`, detail format preserved) plus `_known_project_ids_from_registry()` / `_run_b6()`, which yield a deterministic FAIL when the source is unavailable (neither a false PASS nor an unhandled exception). `.openclaw/agents/deployer.md` (stage command via `docker exec`) and `docs/operations/STAGING_CHECK.md` are updated in sync. `src/projects.py`, `.env*`, and the other checks A/B4/B5/C are untouched; the `QG_CHECKS` registry and `check_staging_status` (ADR-0003) are unchanged. ADR `docs/work-items/ORCH-048/06-adr/ADR-001-b6-registry-via-in-container-run.md`. Tests: `tests/test_staging_check_b6.py`.
 - **The testing gate `check_tests_passed` reads `result:` on par with `verdict:`/`status:`** (ORCH-047): the parser `_parse_tests_verdict` (`src/qg/checks.py`) now accepts three equivalent machine-readable frontmatter fields in `13-test-report.md`: `result:` (the canonical field of the tester prompt `.openclaw/agents/tester.md`, `result: PASS|FAIL`), plus the legacy `verdict:` and `status:` (enduro-trails ET-001..ET-014); any single non-empty field is sufficient. This removes a contract mismatch: the tester correctly emitted `result: PASS` without `verdict:`/`status:`, the parser fell into the "no machine verdict" branch → rollback `testing → development` in a loop until `MAX_DEVELOPER_RETRIES` was exhausted (observed on ORCH-17; ORCH-016 passed only because the fields were redundantly duplicated). The priority semantics are preserved and extended to all three fields via a combined string: a negative token in any field is authoritative (it overrides a positive one), and the token sets are frozen (backward compatibility). The gate's signature, its name, and the `QG_CHECKS` registry are unchanged. ADR `docs/work-items/ORCH-047/06-adr/ADR-001-result-field-in-tests-gate.md`. Tests: `tests/test_qg.py::TestCheckTestsPassed`.
 - BUG-8: a deploy/deploy-staging failure → correct rollback to `development`.
 - Test isolation from the live Plane API (PR #27): autouse fixture that resets settings.
diff --git a/scripts/staging_check.py b/scripts/staging_check.py
index 87edf59..75ba892 100644
--- a/scripts/staging_check.py
+++ b/scripts/staging_check.py
@@ -8,8 +8,14 @@ Checks:
   Block C — E2E (create task in SANDBOX → trigger pipeline via /webhook/plane →
             verify branch + job enqueued → CLEANUP in finally)
 
-Usage (inside the container or with correct env set):
-  python3 scripts/staging_check.py [--base-url http://localhost:8501] [--mode stub|full-real]
+Usage — CANONICAL: run INSIDE the orchestrator-staging container (ORCH-048, ADR-001)
+so B6 reads the registry from the running instance's own env (.env.staging):
+  docker exec orchestrator-staging \
+    python3 /repos/orchestrator/scripts/staging_check.py \
+    --base-url http://localhost:8501 [--mode stub|full-real]
+
+Running from the host leaves ORCH_PROJECTS_JSON unset → B6 falls back to the
+default (ET+ORCH) registry → false FAIL. See docs/operations/STAGING_CHECK.md.
 
 Exit code: 0 = all PASS, non-zero = at least one FAIL.
 
@@ -214,6 +220,59 @@ SANDBOX_PROJECT_ID = "8c5a3025-4f9d-4190-b79f-fa06276bb27e"
 PROD_ET_PROJECT_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
 PROD_ORCH_PROJECT_ID = "8da6aa25-a60e-44d6-a1e2-d8ae59aa7d6a"
 
+B6_LABEL = "B6 Registry: sandbox present, prod ET/ORCH absent"
+
+
+def _evaluate_b6(known: set[str]) -> tuple[bool, str]:
+    """Pure verdict logic for the B6 registry-isolation check (ORCH-048).
+
+    PASS ⟺ SANDBOX ∈ known ∧ PROD_ET ∉ known ∧ PROD_ORCH ∉ known (TR-2).
+    ``detail`` keeps the human-readable ``sandbox=…, prod-ET=…, prod-ORCH=…``
+    format (TR-3). Isolated from any I/O so both outcomes are unit-testable
+    without a live staging instance or docker (02-trz §9, ADR-001).
+    """
+    sandbox_present = SANDBOX_PROJECT_ID in known
+    et_absent = PROD_ET_PROJECT_ID not in known
+    orch_absent = PROD_ORCH_PROJECT_ID not in known
+    passed = sandbox_present and et_absent and orch_absent
+    detail = (
+        f"sandbox={'YES' if sandbox_present else 'NO'}, "
+        f"prod-ET={'NO(good)' if et_absent else 'YES(BAD!)'}, "
+        f"prod-ORCH={'NO(good)' if orch_absent else 'YES(BAD!)'}"
+    )
+    return passed, detail
+
+
+def _known_project_ids_from_registry() -> set[str]:
+    """Registry of the *running staging instance* — its own process-env (ORCH-048).
+
+    The suite is canonically run INSIDE ``orchestrator-staging`` via
+    ``docker exec`` (ADR-001), so ``src.projects`` resolves through the
+    container's ``PYTHONPATH=/app`` to ``/app/src/projects.py`` and reads
+    ``ORCH_PROJECTS_JSON`` from ``.env.staging``. This reflects exactly the
+    registry the live instance serves webhooks with — no host-path hack, no HTTP
+    bootstrap dependency.
+    """
+    from src.projects import known_plane_project_ids
+    return known_plane_project_ids()
+
+
+def _run_b6(results: Results) -> None:
+    """Run the B6 registry-isolation check and record its verdict.
+
+    Builds the known-id set from the running instance's registry and applies
+    ``_evaluate_b6``. Any failure to obtain the registry yields a deterministic
+    FAIL with a clear detail (TR-4) — never an unhandled exception and never a
+    false PASS.
+    """
+    try:
+        known = _known_project_ids_from_registry()
+    except Exception as e:
+        results.add(B6_LABEL, False, f"registry source unavailable: {e}")
+        return
+    passed, detail = _evaluate_b6(known)
+    results.add(B6_LABEL, passed, detail)
+
 
 def block_b(results: Results):
     print(f"\n{_BOLD}[Block B] ACCESS{_RESET}")
@@ -260,28 +319,11 @@ def block_b(results: Results):
     except Exception as e:
         results.add("B5 Gitea: orchestrator-sandbox accessible, push=true", False, str(e))
 
-    # B6 — Registry: sandbox in known IDs, prod ET/ORCH NOT in known IDs
-    try:
-        # Import from inside the container (script runs in /repos/orchestrator context)
-        sys.path.insert(0, "/repos/orchestrator")
-        # Force reload to pick up container env
-        import importlib
-        if "src.projects" in sys.modules:
-            importlib.reload(sys.modules["src.projects"])
-        from src.projects import known_plane_project_ids
-        known = known_plane_project_ids()
-        sandbox_present = SANDBOX_PROJECT_ID in known
-        et_absent = PROD_ET_PROJECT_ID not in known
-        orch_absent = PROD_ORCH_PROJECT_ID not in known
-        ok = sandbox_present and et_absent and orch_absent
-        detail = (
-            f"sandbox={'YES' if sandbox_present else 'NO'}, "
-            f"prod-ET={'NO(good)' if et_absent else 'YES(BAD!)'}, "
-            f"prod-ORCH={'NO(good)' if orch_absent else 'YES(BAD!)'}"
-        )
-        results.add("B6 Registry: sandbox present, prod ET/ORCH absent", ok, detail)
-    except Exception as e:
-        results.add("B6 Registry: sandbox present, prod ET/ORCH absent", False, str(e))
+    # B6 — Registry: sandbox in known IDs, prod ET/ORCH NOT in known IDs (ORCH-048).
+    # Reads the registry of the running staging instance from its own process-env
+    # (canonical: docker exec inside orchestrator-staging — ADR-001). No host-path
+    # hack; deterministic FAIL if the registry source is unavailable (TR-4).
+    _run_b6(results)
 
 
 # ---------------------------------------------------------------------------
diff --git a/tests/test_staging_check_b6.py b/tests/test_staging_check_b6.py
new file mode 100644
index 0000000..0eb8940
--- /dev/null
+++ b/tests/test_staging_check_b6.py
@@ -0,0 +1,151 @@
+"""ORCH-048: unit tests for the B6 registry-isolation verdict in staging_check.py.
+
+B6 «Registry: sandbox present, prod ET/ORCH absent» is the staging-isolation
+safety check. Its verdict logic is isolated into the pure function
+``_evaluate_b6(known) -> (passed, detail)`` so both outcomes (clean staging
+registry → PASS, polluted registry → FAIL) can be tested without standing up a
+live staging instance or docker (02-trz §9, ADR-001).
+
+These tests target that pure function plus the deterministic-degradation path
+(``_run_b6``) and statically assert the host-path hack is gone (TR-6 / TC-06).
+"""
+
+import importlib.util
+import pathlib
+
+import pytest
+
+# ---------------------------------------------------------------------------
+# Load scripts/staging_check.py by path (scripts/ is not an importable package).
+# ---------------------------------------------------------------------------
+_SCRIPT_PATH = (
+    pathlib.Path(__file__).resolve().parent.parent / "scripts" / "staging_check.py"
+)
+
+
+def _load_module():
+    spec = importlib.util.spec_from_file_location("staging_check", _SCRIPT_PATH)
+    module = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(module)
+    return module
+
+
+sc = _load_module()
+
+SANDBOX = sc.SANDBOX_PROJECT_ID
+PROD_ET = sc.PROD_ET_PROJECT_ID
+PROD_ORCH = sc.PROD_ORCH_PROJECT_ID
+
+
+# ---------------------------------------------------------------------------
+# TC-01 — clean staging registry → PASS
+# ---------------------------------------------------------------------------
+def test_tc01_clean_registry_passes():
+    passed, detail = sc._evaluate_b6({SANDBOX})
+    assert passed is True
+    assert "sandbox=YES" in detail
+    assert "prod-ET=NO(good)" in detail
+    assert "prod-ORCH=NO(good)" in detail
+
+
+# ---------------------------------------------------------------------------
+# TC-02 — prod-ET leaked into registry → FAIL
+# ---------------------------------------------------------------------------
+def test_tc02_prod_et_present_fails():
+    passed, detail = sc._evaluate_b6({SANDBOX, PROD_ET})
+    assert passed is False
+    assert "sandbox=YES" in detail
+    assert "prod-ET=YES(BAD!)" in detail
+    assert "prod-ORCH=NO(good)" in detail
+
+
+# ---------------------------------------------------------------------------
+# TC-03 — prod-ORCH leaked into registry → FAIL
+# ---------------------------------------------------------------------------
+def test_tc03_prod_orch_present_fails():
+    passed, detail = sc._evaluate_b6({SANDBOX, PROD_ORCH})
+    assert passed is False
+    assert "sandbox=YES" in detail
+    assert "prod-ET=NO(good)" in detail
+    assert "prod-ORCH=YES(BAD!)" in detail
+
+
+# ---------------------------------------------------------------------------
+# TC-04 — sandbox absent (empty registry) → deterministic FAIL, no exception
+# ---------------------------------------------------------------------------
+def test_tc04_empty_registry_fails_without_sandbox():
+    passed, detail = sc._evaluate_b6(set())
+    assert passed is False
+    assert "sandbox=NO" in detail
+
+
+# ---------------------------------------------------------------------------
+# TC-05 — both prod projects leaked → FAIL
+# ---------------------------------------------------------------------------
+def test_tc05_both_prod_present_fails():
+    passed, detail = sc._evaluate_b6({SANDBOX, PROD_ET, PROD_ORCH})
+    assert passed is False
+    assert "prod-ET=YES(BAD!)" in detail
+    assert "prod-ORCH=YES(BAD!)" in detail
+
+
+# ---------------------------------------------------------------------------
+# TC-06 — registry source no longer depends on the host-path hack (TR-6)
+# ---------------------------------------------------------------------------
+def test_tc06_no_host_path_hack_in_source():
+    source = _SCRIPT_PATH.read_text(encoding="utf-8")
+    # The host-worktree path injection and the env-of-the-launcher reload that
+    # caused the false FAIL must be gone from the B6 mechanics.
+    assert 'sys.path.insert(0, "/repos/orchestrator")' not in source
+    assert "importlib.reload" not in source
+
+
+def test_tc06_registry_loader_uses_src_projects():
+    # The verdict input is built from src.projects.known_plane_project_ids()
+    # resolved via the running instance's own PYTHONPATH/env — not from a
+    # host-path-injected import. We verify the loader delegates to that function.
+    import src.projects as projects_mod
+
+    sentinel = {"sentinel-id"}
+    original = projects_mod.known_plane_project_ids
+    projects_mod.known_plane_project_ids = lambda: sentinel
+    try:
+        known = sc._known_project_ids_from_registry()
+    finally:
+        projects_mod.known_plane_project_ids = original
+    assert known == sentinel
+
+
+# ---------------------------------------------------------------------------
+# TC-07 — degraded registry source → deterministic FAIL (not false PASS, not raise)
+# ---------------------------------------------------------------------------
+def test_tc07_source_failure_is_deterministic_fail(monkeypatch):
+    def _boom():
+        raise RuntimeError("registry import blew up")
+
+    monkeypatch.setattr(sc, "_known_project_ids_from_registry", _boom)
+
+    results = sc.Results()
+    # Must not raise.
+    sc._run_b6(results)
+
+    assert len(results._items) == 1
+    label, passed, detail = results._items[0]
+    assert passed is False
+    assert "registry source unavailable" in detail
+    assert "registry import blew up" in detail
+
+
+# ---------------------------------------------------------------------------
+# _run_b6 happy path wiring (clean registry → PASS result recorded)
+# ---------------------------------------------------------------------------
+def test_run_b6_records_pass_for_clean_registry(monkeypatch):
+    monkeypatch.setattr(
+        sc, "_known_project_ids_from_registry", lambda: {SANDBOX}
+    )
+    results = sc.Results()
+    sc._run_b6(results)
+    assert len(results._items) == 1
+    _label, passed, detail = results._items[0]
+    assert passed is True
+    assert "sandbox=YES" in detail
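
The B6 invariant depends on only three membership facts, so it can be checked exhaustively rather than case by case. The sketch below is a self-contained stand-in, not the patched script: `evaluate_b6`, `all_subsets`, `passing_registries`, and the placeholder IDs are hypothetical names that mirror the verdict logic shown in the diff, on the assumption that membership of the three project IDs is the only input that decides the verdict.

```python
from itertools import combinations

# Placeholder IDs for illustration only (assumption); the real UUIDs are the
# constants defined in scripts/staging_check.py.
SANDBOX = "sandbox-id"
PROD_ET = "prod-et-id"
PROD_ORCH = "prod-orch-id"


def evaluate_b6(known: set[str]) -> tuple[bool, str]:
    """Hypothetical stand-in mirroring the verdict logic of _evaluate_b6."""
    sandbox_present = SANDBOX in known
    et_absent = PROD_ET not in known
    orch_absent = PROD_ORCH not in known
    passed = sandbox_present and et_absent and orch_absent
    detail = (
        f"sandbox={'YES' if sandbox_present else 'NO'}, "
        f"prod-ET={'NO(good)' if et_absent else 'YES(BAD!)'}, "
        f"prod-ORCH={'NO(good)' if orch_absent else 'YES(BAD!)'}"
    )
    return passed, detail


def all_subsets(ids: list[str]):
    """Yield every subset of ids, from the empty set to the full set."""
    for size in range(len(ids) + 1):
        for combo in combinations(ids, size):
            yield set(combo)


def passing_registries() -> list[set[str]]:
    """Return every combination of the three IDs for which B6 passes."""
    ids = [SANDBOX, PROD_ET, PROD_ORCH]
    return [known for known in all_subsets(ids) if evaluate_b6(known)[0]]
```

Of the eight possible combinations, only `{SANDBOX}` passes. IDs outside these three do not affect the verdict, so a registry such as `{SANDBOX, "some-other-project"}` also passes under this logic.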