orchestrator/tests/test_orch117_plane_write_isolation.py
claude-bot 38081c1630
All checks were successful
CI / test (push) Successful in 1m8s
CI / test (pull_request) Successful in 1m6s
fix(plane): sandbox-only fail-closed guard for Plane writes from test process (ORCH-117)
Close the root class of incident ORCH-114: a pytest/worktree process performed a
REAL write (PATCH issues state=<Done> + comment) against the PRODUCTION Plane
project, because test/staging processes inherit the live Plane token
(PLANE_HEADERS/PROJECT_ID are captured at import — a post-hoc env/token swap is a
no-op) and nothing forced them to write only to the sandbox. Symmetric to the
existing _no_telegram autouse floor.
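
The import-time capture described above can be illustrated with a minimal sketch. `HEADERS_AT_IMPORT` and `DEMO_PLANE_TOKEN` are hypothetical stand-ins, not the real `plane_sync` code; the sketch only shows that a value read from the environment once keeps its original content after the environment changes.

```python
import os

# Hypothetical stand-in for a module-level constant such as PLANE_HEADERS:
# the environment is read once, when this line executes (i.e. at import).
os.environ["DEMO_PLANE_TOKEN"] = "live-token"
HEADERS_AT_IMPORT = {"X-API-Key": os.environ["DEMO_PLANE_TOKEN"]}

# A later swap of the env var does not touch the already-built dict.
os.environ["DEMO_PLANE_TOKEN"] = "sandbox-token"


def headers_for_request() -> dict:
    """Return the headers a write would use: the import-time snapshot."""
    return HEADERS_AT_IMPORT


print(headers_for_request()["X-API-Key"])  # prints "live-token"
```

This is why the fix decides on the target project at call time rather than trying to swap the token.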

- New pure never-raise leaf src/plane_write_guard.py (decide/audit_block/
  audit_allow), wired into the 3 plane_sync write primitives (update_issue_state /
  add_comment / _set_issue_state_direct) via _guard_allows_write, AT CALL TIME,
  before any network step. Active ONLY in a test process (pytest in sys.modules /
  PYTEST_CURRENT_TEST); live + staging runtimes (uvicorn) are a strict no-op.
- In a test process: default-deny. A write is allowed iff opt-in
  (plane_test_write_enabled) AND target project in the sandbox allowlist
  (plane_test_sandbox_projects, default = the one SANDBOX id). Prod is blocked even
  with opt-in (allowlist sandbox-only); unresolved project -> block (fail-closed).
- Independent second layer: tests/conftest.py::_plane_sandbox_only autouse floor.
  Intentionally NO prod-block kill-switch (anti back-door, NFR-6).
- Audit: block -> loud ERROR; sandbox-allow -> INFO.
- Bypass fixtures for the 3 (+1) pre-existing tests that assert on the mocked
  write primitive's httpx call (header/URL/state logic).
- The guard is not a Quality Gate: STAGE_TRANSITIONS / QG_CHECKS / check_* /
  machine-verdict / DB schema are untouched.
- Tests: tests/test_orch117_plane_write_isolation.py (TC-01 mandatory ORCH-114
  regression + TC-02..TC-14). Docs: CLAUDE.md, architecture/README.md,
  operations/INFRA.md, .env.example, CHANGELOG.md.
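
The decision rules in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `src/plane_write_guard.py`: the function name `decide_sketch`, its explicit parameters (the real guard presumably reads settings and detects the test process itself), the reason strings, the comma-separated allowlist format, and the order of the opt-in and ambiguity checks are all assumed.

```python
from typing import Optional, Tuple

# Hypothetical reason labels; the real constants live in src/plane_write_guard.py
# and their exact string values are not shown in this commit message.
LIVE_RUNTIME = "live-runtime"
OPT_IN_DISABLED = "opt-in-disabled"
AMBIGUOUS = "ambiguous-target"
PROD_IN_TEST = "prod-project-in-test"
SANDBOX_ALLOWED = "sandbox-allowed"


def decide_sketch(
    project_id: Optional[str],
    in_test_process: bool,
    write_enabled: bool,
    sandbox_projects: str,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for one Plane write."""
    # Live and staging runtimes are not test processes: strict no-op.
    if not in_test_process:
        return True, LIVE_RUNTIME
    # Default-deny: without the opt-in, every target is blocked.
    if not write_enabled:
        return False, OPT_IN_DISABLED
    # Fail-closed: an empty or unresolved project is never written to.
    target = (project_id or "").strip()
    if not target:
        return False, AMBIGUOUS
    # Assumed format: a comma-separated list of sandbox project ids.
    allowlist = {p.strip() for p in sandbox_projects.split(",") if p.strip()}
    if target not in allowlist:
        return False, PROD_IN_TEST
    return True, SANDBOX_ALLOWED
```

With the opt-in on, `decide_sketch("prod-id", True, True, "sandbox-id")` still returns a block, which mirrors the "prod is blocked even with opt-in" rule.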

Refs: ORCH-117
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 21:16:28 +03:00

288 lines
15 KiB
Python

"""ORCH-117 (adr-0046): sandbox-only fail-closed isolation of Plane WRITES.
Regression of the ORCH-114 incident: a pytest/worktree process performed a REAL
``PATCH …/issues/… state=<Done>`` + comment against the PRODUCTION Plane project,
because test/staging processes inherit the live Plane token and nothing forced them
to write only to the sandbox. This suite pins the fix (the ``src/plane_write_guard.py``
hook in the three ``plane_sync`` write primitives + the conftest floor).
Covers TC-01…TC-14 (see docs/work-items/ORCH-117/04-test-plan.yaml). httpx is mocked
throughout — there are NO real network calls (a prod write is the very thing the fix
forbids). The autouse conftest fixture ``_plane_sandbox_only`` sets the safe floor
(opt-in OFF, sandbox allowlist = the one SANDBOX id) for the whole suite; ALLOW-path
tests re-enable the opt-in in their own monkeypatch AFTER it (the documented pattern).
TC-01 is the MANDATORY incident regression: it is RED before the fix (without the
guard hook the call reaches ``httpx.patch``/``httpx.post``) and GREEN after.
"""
import logging
import os
# Match the env-default convention of the other plane suites so config loads cleanly.
os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
os.environ.setdefault("ORCH_PLANE_WORKSPACE_SLUG", "test-ws")
os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
from unittest.mock import MagicMock, patch # noqa: E402
import pytest # noqa: E402
from src import config as _cfg # noqa: E402
from src import plane_sync as PS # noqa: E402
from src import plane_write_guard as PWG # noqa: E402
# Project ids (verified literals — TRZ §3 / ADR-001 / test-plan notes).
PROD = "7a79f0a9-5278-49cd-9007-9a338f238f9c" # a live (non-sandbox) project.
SANDBOX = "8c5a3025-4f9d-4190-b79f-fa06276bb27e" # the one allowed sandbox project.
# --------------------------------------------------------------------------- #
# Helpers / fixtures
# --------------------------------------------------------------------------- #
def _opt_in(monkeypatch, projects: str = SANDBOX):
    """Turn the sandbox-write opt-in ON (it is OFF by default via the conftest floor)."""
    monkeypatch.setattr(_cfg.settings, "plane_test_write_enabled", True, raising=False)
    monkeypatch.setattr(_cfg.settings, "plane_test_sandbox_projects", projects, raising=False)


def _mock_httpx():
    """Patch ``plane_sync.httpx`` so any patch/post/get is RECORDED, never sent."""
    return patch.object(PS, "httpx", MagicMock())


def _resp_ok():
    r = MagicMock()
    r.status_code = 200
    r.raise_for_status.return_value = None
    return r


@pytest.fixture
def _network_stubs():
    """Stub the network helpers so an ALLOWED write would reach httpx (not the DB/API)."""
    with patch.object(PS, "find_issue_id", return_value="issue-uuid"), \
            patch.object(PS, "stage_to_state", return_value="state-uuid"):
        yield
# --------------------------------------------------------------------------- #
# TC-01 — MANDATORY regression of the ORCH-114 incident.
# --------------------------------------------------------------------------- #
def test_tc01_notify_stage_change_prod_makes_zero_writes(monkeypatch):
    """A live prod token in PLANE_HEADERS + pytest + the incident call
    ``notify_stage_change('ORCH-114','deploy','done')`` against the prod project ->
    ZERO real httpx.patch/post. RED before the guard hook, GREEN after."""
    # Mirror the incident: a REAL prod token is captured in the module headers.
    monkeypatch.setattr(PS, "PLANE_HEADERS", {"X-API-Key": "LIVE-PROD-TOKEN"}, raising=False)
    # No opt-in (default floor) — exactly a normal `pytest tests/` run.
    with _mock_httpx() as mock_httpx, \
            patch.object(PS, "find_issue_id", return_value="issue-uuid"), \
            patch.object(PS, "stage_to_state", return_value="state-uuid"):
        PS.notify_stage_change("ORCH-114", "deploy", "done", project_id=PROD)
    mock_httpx.patch.assert_not_called()
    mock_httpx.post.assert_not_called()
# --------------------------------------------------------------------------- #
# TC-02 / TC-03 / TC-04 — each write primitive blocks a prod target in-test.
# --------------------------------------------------------------------------- #
def test_tc02_update_issue_state_prod_blocked(monkeypatch, caplog, _network_stubs):
    """update_issue_state -> prod project -> httpx.patch NOT called; reason prod-project-in-test."""
    _opt_in(monkeypatch)  # opt-in ON so the BLOCK reason is the allowlist, not opt-in-off.
    with _mock_httpx() as mock_httpx, caplog.at_level(logging.INFO, logger="orchestrator.plane_write_guard"):
        PS.update_issue_state("ORCH-1", "done", project_id=PROD)
    mock_httpx.patch.assert_not_called()
    assert PWG.R_PROD_IN_TEST in caplog.text


def test_tc03_add_comment_prod_blocked(monkeypatch, _network_stubs):
    """add_comment -> prod project -> httpx.post NOT called."""
    _opt_in(monkeypatch)
    with _mock_httpx() as mock_httpx:
        PS.add_comment("ORCH-1", "hello", project_id=PROD)
    mock_httpx.post.assert_not_called()


def test_tc04_set_issue_state_direct_prod_blocked(monkeypatch, _network_stubs):
    """_set_issue_state_direct (the primitive every set_issue_* funnels into) ->
    prod project -> httpx.patch NOT called."""
    _opt_in(monkeypatch)
    with _mock_httpx() as mock_httpx:
        PS._set_issue_state_direct("ORCH-1", "state-uuid", project_id=PROD)
    mock_httpx.patch.assert_not_called()


def test_tc04_set_issue_done_prod_blocked(monkeypatch):
    """set_issue_done -> _set_issue_state_direct -> prod -> blocked (covers the
    public set_issue_* surface, which all reduce to the guarded primitive)."""
    _opt_in(monkeypatch)
    with _mock_httpx() as mock_httpx, \
            patch.object(PS, "get_project_states", return_value={"done": "done-uuid"}), \
            patch.object(PS, "find_issue_id", return_value="issue-uuid"):
        PS.set_issue_done("ORCH-1", project_id=PROD)
    mock_httpx.patch.assert_not_called()
# --------------------------------------------------------------------------- #
# TC-05 — default-deny: without opt-in, EVERY target (incl. sandbox) is blocked.
# --------------------------------------------------------------------------- #
def test_tc05_default_deny_blocks_sandbox_and_prod(_network_stubs):
    """No opt-in (conftest floor) -> sandbox AND prod both blocked."""
    with _mock_httpx() as mock_httpx:
        PS.update_issue_state("ORCH-1", "done", project_id=SANDBOX)
        PS.update_issue_state("ORCH-1", "done", project_id=PROD)
    mock_httpx.patch.assert_not_called()
    # Verdict-level: the reason is opt-in-disabled for both.
    assert PS.plane_write_guard.decide(SANDBOX, "state")[1] == PWG.R_OPT_IN_DISABLED
    assert PS.plane_write_guard.decide(PROD, "state")[1] == PWG.R_OPT_IN_DISABLED
# --------------------------------------------------------------------------- #
# TC-06 — sandbox allow: opt-in ON + sandbox project -> real (mocked) write fires.
# --------------------------------------------------------------------------- #
def test_tc06_sandbox_optin_allows_write(monkeypatch, _network_stubs):
    """opt-in ON + SANDBOX -> httpx.patch IS called, addressed to the sandbox URL."""
    _opt_in(monkeypatch)
    with _mock_httpx() as mock_httpx:
        mock_httpx.patch.return_value = _resp_ok()
        PS.update_issue_state("ORCH-1", "done", project_id=SANDBOX)
    mock_httpx.patch.assert_called_once()
    url = mock_httpx.patch.call_args.args[0]
    assert SANDBOX in url
    assert PROD not in url
# --------------------------------------------------------------------------- #
# TC-07 — sandbox-only even with opt-in: a prod target is ALWAYS blocked.
# --------------------------------------------------------------------------- #
def test_tc07_optin_still_blocks_prod(monkeypatch):
    """opt-in ON does NOT unlock prod — the allowlist is sandbox-only (AC-3)."""
    _opt_in(monkeypatch)
    ok, reason = PS.plane_write_guard.decide(PROD, "state", "ORCH-1")
    assert ok is False
    assert reason == PWG.R_PROD_IN_TEST
# --------------------------------------------------------------------------- #
# TC-08 — fail-closed on ambiguity: empty/None target -> block.
# --------------------------------------------------------------------------- #
def test_tc08_ambiguous_target_blocked(monkeypatch):
    """opt-in ON but project_id empty/None -> block (NFR-1 'don't know => don't write')."""
    _opt_in(monkeypatch)
    assert PS.plane_write_guard.decide("", "state")[1] == PWG.R_AMBIGUOUS
    assert PS.plane_write_guard.decide(None, "comment")[1] == PWG.R_AMBIGUOUS
    assert PS.plane_write_guard.decide(" ", "state")[1] == PWG.R_AMBIGUOUS
# --------------------------------------------------------------------------- #
# TC-09 — immune to the import-time token capture (AC-7 / NFR-4).
# --------------------------------------------------------------------------- #
def test_tc09_blocks_regardless_of_captured_token(monkeypatch, _network_stubs):
    """A REAL token in PLANE_HEADERS (captured at import) does not help: the guard
    decides at CALL time on (test-process + target project), not on the token, and
    does not rely on os.environ.setdefault / a settings token swap."""
    monkeypatch.setattr(PS, "PLANE_HEADERS", {"X-API-Key": "LIVE-PROD-TOKEN"}, raising=False)
    # No opt-in: a plain pytest run with a live token still cannot mutate prod.
    with _mock_httpx() as mock_httpx:
        PS.update_issue_state("ORCH-1", "done", project_id=PROD)
        PS._set_issue_state_direct("ORCH-1", "state-uuid", project_id=PROD)
    mock_httpx.patch.assert_not_called()
    # The verdict is token-independent.
    assert PS.plane_write_guard.decide(PROD, "state")[0] is False
# --------------------------------------------------------------------------- #
# TC-10 — zero regression of the LIVE runtime: not-a-test -> guard is a no-op.
# --------------------------------------------------------------------------- #
def test_tc10_live_runtime_is_noop(monkeypatch, _network_stubs):
    """Simulate a non-pytest process -> guard ALLOWs (live-runtime) and the prod
    write goes out byte-for-byte (same URL/headers/payload as before ORCH-117)."""
    monkeypatch.setattr(PWG, "_in_test_process", lambda: False)
    monkeypatch.setattr(PS, "PLANE_HEADERS", {"X-API-Key": "LIVE-PROD-TOKEN"}, raising=False)
    with _mock_httpx() as mock_httpx:
        mock_httpx.patch.return_value = _resp_ok()
        PS.update_issue_state("ORCH-1", "done", project_id=PROD)
    mock_httpx.patch.assert_called_once()
    args, kwargs = mock_httpx.patch.call_args
    assert PROD in args[0]
    assert kwargs["headers"] == {"X-API-Key": "LIVE-PROD-TOKEN"}
    assert kwargs["json"] == {"state": "state-uuid"}
    # The verdict itself is ALLOW/live-runtime.
    assert PWG.decide(PROD, "state") == (True, PWG.R_LIVE_RUNTIME)
# --------------------------------------------------------------------------- #
# TC-11 — staging runtime (not pytest) writes to SANDBOX normally.
# --------------------------------------------------------------------------- #
def test_tc11_staging_writes_sandbox(monkeypatch, _network_stubs):
    """Staging is a real uvicorn process (not pytest) on the sandbox project ->
    the test-process detection does NOT fire, the write to SANDBOX passes."""
    monkeypatch.setattr(PWG, "_in_test_process", lambda: False)
    with _mock_httpx() as mock_httpx:
        mock_httpx.patch.return_value = _resp_ok()
        PS.update_issue_state("ORCH-1", "done", project_id=SANDBOX)
    mock_httpx.patch.assert_called_once()
    assert SANDBOX in mock_httpx.patch.call_args.args[0]
# --------------------------------------------------------------------------- #
# TC-12 — audit/observability of block (loud) and allow (info).
# --------------------------------------------------------------------------- #
def test_tc12_block_audited_loudly(monkeypatch, caplog, _network_stubs):
    """A blocked write emits a structured WARNING/ERROR carrying project_id /
    work_item / op / reason."""
    _opt_in(monkeypatch)
    with caplog.at_level(logging.INFO, logger="orchestrator.plane_write_guard"), _mock_httpx():
        PS.update_issue_state("ORCH-114", "done", project_id=PROD)
    blocks = [r for r in caplog.records if r.levelno >= logging.WARNING]
    assert blocks, "a block must emit at least one WARNING/ERROR record"
    text = caplog.text
    assert PROD in text and "ORCH-114" in text
    assert PWG.OP_STATE in text and PWG.R_PROD_IN_TEST in text


def test_tc12_sandbox_allow_audited_info(monkeypatch, caplog, _network_stubs):
    """An allowed sandbox write emits an INFO audit line."""
    _opt_in(monkeypatch)
    with caplog.at_level(logging.INFO, logger="orchestrator.plane_write_guard"), \
            _mock_httpx() as mock_httpx:
        mock_httpx.patch.return_value = _resp_ok()
        PS.update_issue_state("ORCH-1", "done", project_id=SANDBOX)
    infos = [r for r in caplog.records if r.levelno == logging.INFO and "ALLOWED" in r.message]
    assert infos, "an allowed sandbox write must emit an INFO audit line"
    assert SANDBOX in caplog.text
# --------------------------------------------------------------------------- #
# TC-13 — the autouse conftest floor protects the whole suite by default.
# --------------------------------------------------------------------------- #
def test_tc13_conftest_floor_default_deny():
    """Without any per-test opt-in, the floor leaves the opt-in OFF and the sandbox
    allowlist pinned to the one SANDBOX id -> a representative write to prod is a
    no-op (default-deny is active for every test, not just this file)."""
    assert _cfg.settings.plane_test_write_enabled is False
    assert _cfg.settings.plane_test_sandbox_projects == SANDBOX
    with _mock_httpx() as mock_httpx, \
            patch.object(PS, "find_issue_id", return_value="issue-uuid"), \
            patch.object(PS, "stage_to_state", return_value="state-uuid"):
        PS.update_issue_state("ORCH-2", "done", project_id=PROD)
    mock_httpx.patch.assert_not_called()
# --------------------------------------------------------------------------- #
# TC-14 — no kill-switch back door (NFR-6 / FR-7 / D4 anti-drift).
# --------------------------------------------------------------------------- #
def test_tc14_no_killswitch_backdoor(monkeypatch):
    """There is intentionally NO ``plane_write_guard_enabled`` kill-switch that
    re-opens a prod write from pytest. The only reversible regulator is the
    sandbox-bound opt-in; even with it ON, prod stays blocked."""
    # Anti-drift: the back-door config key must not exist (a future agent adding it
    # would reintroduce the ORCH-114 defect — see ADR-001 D4 / TR-4).
    assert not hasattr(_cfg.settings, "plane_write_guard_enabled")
    # Opt-in ON is sandbox-bound, never a prod back-door.
    _opt_in(monkeypatch)
    assert PWG.decide(PROD, "state")[0] is False
    assert PWG.decide(SANDBOX, "state")[0] is True