Close the root class of incident ORCH-114: a pytest/worktree process performed a REAL write (PATCH issues state=<Done> + comment) against the PRODUCTION Plane project, because test/staging processes inherit the live Plane token (PLANE_HEADERS/PROJECT_ID are captured at import, so a post-hoc env/token swap is a no-op) and nothing forced them to write only to the sandbox. Symmetric to the existing _no_telegram autouse floor.

- New pure never-raise leaf src/plane_write_guard.py (decide/audit_block/audit_allow), wired into the 3 plane_sync write primitives (update_issue_state / add_comment / _set_issue_state_direct) via _guard_allows_write, AT CALL TIME, before any network step. Active ONLY in a test process (pytest in sys.modules / PYTEST_CURRENT_TEST); live + staging runtimes (uvicorn) are a strict no-op.
- In a test process: default-deny. A write is allowed iff opt-in (plane_test_write_enabled) AND target project in the sandbox allowlist (plane_test_sandbox_projects, default = the one SANDBOX id). Prod is blocked even with opt-in (allowlist sandbox-only); unresolved project -> block (fail-closed).
- Independent second layer: tests/conftest.py::_plane_sandbox_only autouse floor. Intentionally NO prod-block kill-switch (anti back-door, NFR-6).
- Audit: block -> loud ERROR; sandbox-allow -> INFO.
- Bypass fixtures for the 3 (+1) pre-existing tests that assert on the mocked write primitive's httpx call (header/URL/state logic), the guard is no Quality Gate: STAGE_TRANSITIONS / QG_CHECKS / check_* / machine-verdict / DB schema untouched.
- Tests: tests/test_orch117_plane_write_isolation.py (TC-01 mandatory ORCH-114 regression + TC-02..TC-14). Docs: CLAUDE.md, architecture/README.md, operations/INFRA.md, .env.example, CHANGELOG.md.

Refs: ORCH-117
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
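The default-deny rule described in the commit message can be sketched roughly as follows. This is an illustrative sketch only, not the contents of src/plane_write_guard.py: the function signatures, the boolean return value, and the comma-separated allowlist format are all assumptions made for the example.

```python
import os
import sys


def in_test_process(modules=None, environ=None):
    """True when either pytest signal named in the commit message is present."""
    modules = sys.modules if modules is None else modules
    environ = os.environ if environ is None else environ
    return "pytest" in modules or "PYTEST_CURRENT_TEST" in environ


def decide(project_id, *, write_enabled, sandbox_projects, in_test):
    """Return True if a Plane write may proceed, False if it must be blocked.

    Hypothetical signature; sandbox_projects is ASSUMED to be a
    comma-separated string of project ids.
    """
    if not in_test:
        return True  # live/staging runtime: the guard is a strict no-op
    if not write_enabled:
        return False  # default-deny: no opt-in, no write
    if not project_id:
        return False  # unresolved project: fail closed
    allowlist = {p.strip() for p in sandbox_projects.split(",") if p.strip()}
    return project_id in allowlist  # prod is never in the allowlist
```

With these assumptions, opting in is never sufficient on its own: a prod project id stays blocked because only the sandbox id is in the allowlist.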
189 lines
9.8 KiB
Python
"""Global pytest fixtures.

test(conftest): mute Telegram in ALL tests to stop prod leakage.

Background: a pytest run on prod was sending REAL Telegram messages to Slava,
because some tests (e.g. test_webhook_dedup advancing a stage) reach
notify_stage_change -> send_telegram, which reads the live .env
telegram_bot_token/chat_id and actually POSTs to Telegram.

This autouse fixture stubs send_telegram to a no-op for every test:

- "src.notifications.send_telegram" is the SOURCE. All the notify_* helpers in
  notifications.py call the module-global send_telegram, and every other module
  that does a *local* `from .notifications import send_telegram` inside a
  function resolves it live at call time -> covered by patching the source.

- "src.stage_engine.send_telegram" is patched too, because stage_engine binds
  send_telegram as a MODULE-LEVEL name (from .notifications import send_telegram
  at import), so a patch of the source alone would not intercept its 3 direct
  calls. webhooks/plane and launcher import it locally inside functions, so the
  source patch already covers them; they are patched defensively with
  raising=False anyway in case that ever changes.

raising=False so a module that doesn't (yet) expose the name never breaks setup.
"""

import pytest


@pytest.fixture(autouse=True)
def _no_telegram(monkeypatch):
    _noop = lambda *a, **k: None  # noqa: E731
    # Source of truth (covers notifications.notify_* and all local re-imports).
    monkeypatch.setattr("src.notifications.send_telegram", _noop, raising=False)
    # Module-level binding in stage_engine (and defensive coverage elsewhere).
    monkeypatch.setattr("src.stage_engine.send_telegram", _noop, raising=False)
    monkeypatch.setattr("src.webhooks.plane.send_telegram", _noop, raising=False)
    monkeypatch.setattr("src.agents.launcher.send_telegram", _noop, raising=False)
    monkeypatch.setattr("src.queue_worker.send_telegram", _noop, raising=False)
    # ORCH-053: the reconciler binds send_telegram as a MODULE-LEVEL name
    # (from .notifications import send_telegram), so the source patch alone would
    # not intercept its unblock notification; patch it here too.
    monkeypatch.setattr("src.reconciler.send_telegram", _noop, raising=False)
    yield


@pytest.fixture(autouse=True)
def _reset_webhook_secrets(monkeypatch):
    """Isolate settings singleton between test files (CI cross-file isolation).

    settings is a process-wide Pydantic singleton read once at import. Different
    test modules set env variables differently at import-time, so those values leak
    across files when pytest collects them together (as CI does).

    1. webhook secrets: reset to "" so HMAC is disabled by default. Tests that
       intentionally test the 401 path (test_webhook_dedup.py:268,278) re-apply
       their own monkeypatch AFTER this autouse fixture runs, which overrides the
       reset for the duration of that one test only.

    2. db_path: reset to the value from ORCH_DB_PATH env var (last written by the
       last imported test module). Without this, test_webhook_dedup.py (imported
       first, alphabetically) seeds settings.db_path = dedup.db, while
       test_webhooks.py's setup_db fixture tries to remove test_orchestrator.db,
       leaving the DB dirty across tests that share a branch name and causing
       get_task_by_repo_branch() to return a stale row with the wrong stage.
       Per-test monkeypatches in test_webhook_dedup.setup_db override this reset.
    """
    import os
    from src.webhooks import gitea as gitea_mod
    from src.webhooks import plane as plane_mod
    from src import db as db_mod
    monkeypatch.setattr(gitea_mod.settings, "gitea_webhook_secret", "", raising=False)
    monkeypatch.setattr(plane_mod.settings, "plane_webhook_secret", "", raising=False)
    db_path_env = os.environ.get("ORCH_DB_PATH", "")
    if db_path_env:
        monkeypatch.setattr(db_mod.settings, "db_path", db_path_env, raising=False)
    yield


@pytest.fixture(autouse=True)
def _isolate_runs_dir(monkeypatch, tmp_path):
    """ORCH-100: point settings.runs_dir at a per-test tmp dir in ALL tests.

    Background: ``launcher._run_log_path(run_id)`` resolves to
    ``<settings.runs_dir>/<run_id>.log`` and, on a non-zero exit,
    ``_finalize_job`` classifies the failure by reading the *tail of that log*
    (transient 429/overload/timeout -> backoff-requeue; permanent -> attempts
    requeue then 'failed'). settings.runs_dir defaults to the live prod dir
    ``/app/data/runs``, which on the self-hosting host holds REAL accumulated
    agent logs (1.log, 2.log, ...). Tests that exercise the finalize path with a
    small literal run_id (e.g. test_finalize_job_requeue_then_fail uses run_id=1/2)
    therefore read whatever a real prod run happened to log, and a real 2.log that
    contains "429" silently flips an expected 'permanent' classification to
    'transient', requeueing instead of failing. That is ambient prod pollution, not
    a code fault.

    Redirecting runs_dir to an empty tmp dir makes _run_log_path() resolve to a
    non-existent file -> classify_log_file() returns the documented 'permanent'
    default, restoring deterministic, environment-independent behaviour for the
    whole suite. settings is a process-wide singleton shared by launcher
    (``launcher.settings is config.settings``), so patching the source covers it.
    """
    from src import config as _cfg
    monkeypatch.setattr(_cfg.settings, "runs_dir", str(tmp_path), raising=False)
    yield


@pytest.fixture(autouse=True)
def _disable_merge_verify(monkeypatch):
    """ORCH-071: disable the merge-verify under-gate by default in ALL tests.

    The under-gate (deploy -> done) runs a deterministic merge-actor + a
    post-deploy merge verification that make REAL Gitea/git calls. Leaving it ON
    by default would (a) reach the network from unrelated deploy->done tests and
    (b) make them pass/fail by ACCIDENT depending on whether the live Gitea still
    has the historical PR merged (a hidden CI flake). We therefore default it to
    its documented kill-switch OFF state (``merge_verify_enabled=False`` == 1:1
    pre-ORCH-071 behaviour). Tests that specifically target the under-gate
    (test_merge_verify / test_deploy_finalizer_merge_gate / test_merge_actor /
    test_deploy_restart_merge_recovery) re-enable it via their own monkeypatch
    AFTER this autouse fixture, scoping the feature ON to just those tests.
    """
    from src import config as _cfg
    monkeypatch.setattr(_cfg.settings, "merge_verify_enabled", False, raising=False)
    # ORCH-073: the regression guard (check_main_regression) runs real git in
    # _handle_merge_verify's confirmed branch. Default it OFF too so unrelated
    # deploy->done tests stay 1:1; the dedicated ORCH-073 tests re-enable it.
    monkeypatch.setattr(_cfg.settings, "regression_guard_enabled", False, raising=False)
    # ORCH-082: the merge-verify ensure_open_pr hook makes REAL Gitea calls before
    # merge_pr. Default it OFF so unrelated deploy->done / merge-verify tests stay 1:1
    # (no network); the dedicated ORCH-082 tests re-enable it via their own monkeypatch.
    monkeypatch.setattr(
        _cfg.settings, "merge_verify_autocreate_pr_enabled", False, raising=False
    )
    yield


@pytest.fixture(autouse=True)
def _plane_sandbox_only(monkeypatch):
    """ORCH-117: fail-closed FLOOR: no test may write to a non-sandbox Plane project.

    The independent second layer of the sandbox-only Plane-write guard (ADR-001 D5),
    by the same model as ``_no_telegram``: it forces the safe defaults for EVERY
    test, OVERRIDING any live variable inherited from the container environment.

    With the opt-in OFF, ``src/plane_write_guard.decide`` blocks ALL Plane writes
    from the test process (both sandbox and prod) -> default-deny (AC-4). Even if the
    runtime leaf ever erroneously returned ALLOW, this floor keeps a prod write from
    a plain ``pytest tests/`` impossible. Sandbox-e2e tests that need a REAL write to
    SANDBOX re-enable the opt-in in their OWN fixture AFTER this autouse (exactly as
    ``test_orch114_*`` / ``test_merge_verify`` re-enable their flags); the allowlist
    already contains the SANDBOX id, so the write to SANDBOX passes while a prod write
    still blocks (allowlist sandbox-only, AC-3).
    """
    from src import config as _cfg
    monkeypatch.setattr(_cfg.settings, "plane_test_write_enabled", False, raising=False)
    monkeypatch.setattr(
        _cfg.settings,
        "plane_test_sandbox_projects",
        "8c5a3025-4f9d-4190-b79f-fa06276bb27e",
        raising=False,
    )
    yield


@pytest.fixture(autouse=True)
def _disable_transition_lease(monkeypatch):
    """ORCH-114: disable the transition-ownership lease + expected-stage CAS by
    default in ALL tests.

    The prod default is ON for the self-hosting repo (``transition_lease_enabled=True``,
    ``transition_lease_repos=""`` -> orchestrator only). Left ON, the expected-stage
    CAS (``update_task_stage_cas``) would change the stage-write semantics for every
    existing test that calls ``advance_stage`` / the gitea-plane webhook handlers with
    repo ``orchestrator`` (a CAS write needs the task row to actually BE at the
    expected stage; the bare ``update_task_stage`` did not). We therefore default the
    kill-switch OFF for the whole suite (mirrors ``_disable_merge_verify`` /
    ``_disable_*`` precedent), which makes ``commit_stage_cas`` degenerate to the prior
    unconditional ``update_task_stage`` and the lease inert -> the existing 2000+ tests
    stay byte-for-byte (AC-9). The dedicated ORCH-114 test module
    (``test_orch114_transition_ownership.py``) re-enables it via its own monkeypatch,
    scoping the feature ON to just those tests.
    """
    from src import config as _cfg
    monkeypatch.setattr(
        _cfg.settings, "transition_lease_enabled", False, raising=False
    )
    yield