fix(plane): sandbox-only fail-closed guard for Plane writes from test process (ORCH-117)

Close the root-cause class behind incident ORCH-114: a pytest/worktree process
performed a REAL write (PATCH issue state=<Done> + comment) against the
PRODUCTION Plane project. Test and staging processes inherit the live Plane
token (PLANE_HEADERS/PROJECT_ID are captured at import, so a post-hoc env/token
swap is a no-op), and nothing forced them to write only to the sandbox. The
guard is symmetric to the existing _no_telegram autouse floor.
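
Why a post-hoc token swap is a no-op can be shown with a minimal sketch. The
environment variable name, the header key, and the helper below are
illustrative assumptions, not the project's actual config code:

```python
import os

# Assumed variable name, for illustration only.
os.environ["PLANE_API_TOKEN"] = "live-token"


def _build_headers() -> dict:
    """Read the token from the environment at the moment of the call."""
    return {"X-API-Key": os.environ.get("PLANE_API_TOKEN", "")}


# Evaluated once, the way a module-level constant is evaluated at import.
PLANE_HEADERS = _build_headers()

# A later swap (e.g. in a test fixture) changes the environment only;
# the dict captured above still holds the old value.
os.environ["PLANE_API_TOKEN"] = "sandbox-token"

captured = PLANE_HEADERS["X-API-Key"]   # value frozen at "import"
fresh = _build_headers()["X-API-Key"]   # value read at call time
```

This is why the guard has to run at call time: anything decided at import is
already fixed before a test gets a chance to change it.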

- New pure never-raise leaf src/plane_write_guard.py (decide/audit_block/
  audit_allow), wired into the 3 plane_sync write primitives (update_issue_state /
  add_comment / _set_issue_state_direct) via _guard_allows_write, AT CALL TIME,
  before any network step. Active ONLY in a test process (pytest in sys.modules /
  PYTEST_CURRENT_TEST); live + staging runtimes (uvicorn) are a strict no-op.
- In a test process: default-deny. A write is allowed iff opt-in
  (plane_test_write_enabled) AND target project in the sandbox allowlist
  (plane_test_sandbox_projects, default = the one SANDBOX id). Prod is blocked even
  with opt-in (allowlist sandbox-only); unresolved project -> block (fail-closed).
- Independent second layer: tests/conftest.py::_plane_sandbox_only autouse floor.
  Intentionally NO prod-block kill-switch (anti back-door, NFR-6).
- Audit: block -> loud ERROR; sandbox-allow -> INFO.
- Bypass fixtures for the 3 (+1) pre-existing tests that assert on the mocked
  write primitive's httpx call (header/URL/state logic).
- The guard is not a Quality Gate: STAGE_TRANSITIONS / QG_CHECKS / check_* /
  machine-verdict / DB schema are untouched.
- Tests: tests/test_orch117_plane_write_isolation.py (TC-01 mandatory ORCH-114
  regression + TC-02..TC-14). Docs: CLAUDE.md, architecture/README.md,
  operations/INFRA.md, .env.example, CHANGELOG.md.
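
The decision rules in the bullets above can be sketched roughly as follows.
This is not the contents of src/plane_write_guard.py: the signature, the
sandbox id, every reason constant except the name R_SANDBOX_OPT_IN, and the
choice to block on an internal error are assumptions made for illustration.

```python
import os
import sys

# Hypothetical sandbox id; the real allowlist comes from
# plane_test_sandbox_projects.
SANDBOX_PROJECTS = frozenset({"sandbox-project-id"})

# Reason codes. Only the name R_SANDBOX_OPT_IN appears in the diff;
# its value and the other names are invented here.
R_NOT_TEST = "not_test_process"
R_NO_OPT_IN = "no_opt_in"
R_UNRESOLVED = "unresolved_project"
R_NOT_SANDBOX = "not_in_sandbox_allowlist"
R_SANDBOX_OPT_IN = "sandbox_opt_in"
R_GUARD_ERROR = "guard_error"


def in_test_process(modules=None, environ=None) -> bool:
    """True if pytest is loaded or PYTEST_CURRENT_TEST is set."""
    modules = sys.modules if modules is None else modules
    environ = os.environ if environ is None else environ
    return "pytest" in modules or "PYTEST_CURRENT_TEST" in environ


def decide(project_id, *, is_test, write_enabled, allowlist=SANDBOX_PROJECTS):
    """Return (allowed, reason) and never raise."""
    try:
        if not is_test:
            return True, R_NOT_TEST          # live/staging: strict no-op
        if not write_enabled:
            return False, R_NO_OPT_IN        # default-deny without opt-in
        if not project_id:
            return False, R_UNRESOLVED       # fail-closed on unknown target
        if project_id not in allowlist:
            return False, R_NOT_SANDBOX      # prod is blocked even with opt-in
        return True, R_SANDBOX_OPT_IN
    except Exception:
        # Assumption: an internal error in a test process blocks the write.
        return False, R_GUARD_ERROR
```

Passing is_test and write_enabled explicitly keeps the sketch pure, so each
rule can be checked without touching sys.modules or the settings object.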

Refs: ORCH-117
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 21:16:28 +03:00
committed by deployer
parent 77d3a66356
commit 861b5ee984
14 changed files with 679 additions and 1 deletions


@@ -4,6 +4,7 @@ import logging
import time
import httpx
from .config import settings
from . import plane_write_guard
logger = logging.getLogger("orchestrator.plane_sync")
@@ -843,9 +844,30 @@ def find_issue_id(work_item_id: str, project_id: str = None) -> str | None:
    return None


def _guard_allows_write(work_item_id: str, project_id: str, op: str) -> bool:
    """ORCH-117: fail-closed gate in front of every Plane WRITE (state/comment).

    Returns True if the write may proceed. In the live orchestrator/staging runtime
    this is always True (the guard is a no-op: no pytest in the process). In a
    test/worktree process a non-sandbox / non-opt-in write is BLOCKED here (audited
    loudly) and this returns False, so the calling primitive returns BEFORE any
    network step (no GET, no PATCH/POST). See src/plane_write_guard.py / ORCH-114.
    """
    ok, reason = plane_write_guard.decide(project_id, op, work_item_id)
    if not ok:
        plane_write_guard.audit_block(project_id, op, work_item_id, reason)
        return False
    if reason == plane_write_guard.R_SANDBOX_OPT_IN:
        plane_write_guard.audit_allow(project_id, op, work_item_id, reason)
    return True


def update_issue_state(work_item_id: str, stage: str, project_id: str = None):
    """Update Plane issue state based on orchestrator stage."""
    project_id = _resolve_project_id(work_item_id, project_id)
    # ORCH-117: fail-closed guard, blocks prod Plane writes from a test process.
    if not _guard_allows_write(work_item_id, project_id, plane_write_guard.OP_STATE):
        return
    # ORCH-10: resolve state UUID for this specific project (not global dict).
    state_id = stage_to_state(stage, project_id)
    if not state_id:
@@ -874,6 +896,9 @@ def add_comment(work_item_id: str, text: str, project_id: str = None, author: st
    ``_headers_for``). GET/PATCH calls elsewhere keep using PLANE_HEADERS.
    """
    project_id = _resolve_project_id(work_item_id, project_id)
    # ORCH-117: fail-closed guard, blocks prod Plane comment-writes from a test process.
    if not _guard_allows_write(work_item_id, project_id, plane_write_guard.OP_COMMENT):
        return
    issue_id = find_issue_id(work_item_id, project_id)
    if not issue_id:
        logger.warning(f"Issue not found in Plane for {work_item_id}, skipping comment")
@@ -1038,6 +1063,9 @@ def set_issue_stage_state(work_item_id: str, stage: str, project_id: str = None)
def _set_issue_state_direct(work_item_id: str, state_id: str, project_id: str = None):
    """Set issue state directly by state_id."""
    project_id = _resolve_project_id(work_item_id, project_id)
    # ORCH-117: fail-closed guard, blocks prod Plane writes from a test process.
    if not _guard_allows_write(work_item_id, project_id, plane_write_guard.OP_STATE):
        return
    issue_id = find_issue_id(work_item_id, project_id)
    if not issue_id:
        logger.warning(f"Issue not found in Plane for {work_item_id}")
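
The early return shared by the three primitives can be checked without any
network access. The sketch below is a hypothetical stand-in, not one of the
tests in tests/test_orch117_plane_write_isolation.py: guarded_write, the allow
predicate, and both project ids are invented for illustration, with `send`
playing the role of the httpx call.

```python
from unittest import mock


def guarded_write(project_id, allow, send):
    """Hypothetical write primitive following the guard-then-send pattern."""
    if not allow(project_id):
        return None           # blocked: return before any network step
    return send(project_id)


def sandbox_only(project_id):
    """Stand-in for _guard_allows_write: allow only the (made-up) sandbox id."""
    return project_id == "sandbox-project-id"


# A write aimed at a non-sandbox project never reaches the transport.
blocked_send = mock.Mock(return_value="sent")
blocked_result = guarded_write("prod-project-id", sandbox_only, blocked_send)

# A write aimed at the sandbox goes through exactly once.
allowed_send = mock.Mock(return_value="sent")
allowed_result = guarded_write("sandbox-project-id", sandbox_only, allowed_send)
```

Asserting that the mocked transport was never called, rather than inspecting
what it was called with, is what catches an ORCH-114-style regression.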