Add the `watchdog/` package (thin Python-3.12 stdlib-only daemon) and the `orchestrator-watchdog` compose service: the brain half of the domain-0 observability pair.

F1a (ORCH-099) exposes the raw signal at GET /metrics; F1b reads it, augments it with host / container / dependency probes, runs each signal through a generalised pure decision function (`decide(signal_active, prev, now, cooldown)`, a strict superset of `disk_watchdog.decide_action`) with per-signal in-memory dedup/throttle/recovery, and alerts over its OWN independent Telegram channel.

Key properties (ADR-001):
- Observer separated from observed: separate container; /metrics not answering is itself the master `orch_down` alarm (debounced K ticks, so no flap on a hiccup).
- Strictly read-only: docker.sock GET-only + mounted :ro (double guard), host paths :ro, no DB/disk writes, no process control; self-hosting-safe.
- never-raise on three levels (per-source/per-tick/per-send) + WATCHDOG_ENABLED kill-switch (disabled -> inert idle-loop, not exit).
- Disk anti-duplicate (D6): disk_watchdog (ORCH-063) stays sole owner of the 85% alert; sidecar carries orch_down + an opt-in 97% ceiling (default off).
- NO import from src/** (C-1); src/**, STAGE_TRANSITIONS, QG_CHECKS, check_*, DB schema are untouched.

env_file optional so a missing .env.watchdog never breaks `docker compose up` for the prod orchestrator.

Tests: tests/watchdog/ (TC-01…TC-13) + full tests/ regression green (TC-14).
Docs: CHANGELOG, .env.example canon (WATCHDOG_*); architecture README + adr-0033 authored at the architecture stage.

Refs: ORCH-100
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
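For readers who have not seen the watchdog package, the pure decision function mentioned above might look roughly like the sketch below. Only the call shape `decide(signal_active, prev, now, cooldown)` comes from the commit message; the `SignalState` record, the action strings (`"alert"`, `"recovery"`, `"none"`) and the exact throttle rule are assumptions made for illustration, not the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SignalState:
    """Hypothetical per-signal in-memory state (not taken from the source)."""
    alerting: bool = False              # an alert is currently open for this signal
    last_sent: Optional[float] = None   # timestamp of the last alert that was sent


def decide(signal_active: bool, prev: SignalState, now: float,
           cooldown: float) -> Tuple[str, SignalState]:
    """Pure function: no I/O, no clock reads; returns (action, new_state)."""
    if signal_active:
        if not prev.alerting:
            # First sighting of the problem: alert immediately.
            return "alert", SignalState(alerting=True, last_sent=now)
        if prev.last_sent is not None and now - prev.last_sent >= cooldown:
            # Still failing and the cooldown has elapsed: send a reminder.
            return "alert", SignalState(alerting=True, last_sent=now)
        # Still failing inside the cooldown window: dedup, stay silent.
        return "none", prev
    if prev.alerting:
        # Signal cleared while an alert was open: announce recovery once.
        return "recovery", SignalState()
    return "none", prev


# Usage: drive one signal through a short timeline with a 60 s cooldown.
state = SignalState()
for t, active in [(0.0, True), (10.0, True), (70.0, True), (80.0, False)]:
    action, state = decide(active, state, now=t, cooldown=60.0)
    print(t, action)
```

Because the function takes `now` as an argument and returns the next state instead of mutating anything, each branch can be unit-tested with plain values, which fits the commit's description of a "pure" decision function.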
67 lines
2.1 KiB
Python
"""TC-12: compose invariant - orchestrator-watchdog is a separate service.

It declares its own build (watchdog/Dockerfile), restart policy, mem_limit, and
mounts docker.sock read-only (:ro). Parses the real docker-compose.yml.
"""
import pathlib

import yaml

REPO_ROOT = pathlib.Path(__file__).resolve().parents[2]


def _compose():
    with open(REPO_ROOT / "docker-compose.yml") as f:
        return yaml.safe_load(f)


def test_watchdog_service_declared():
    svc = _compose()["services"]
    assert "orchestrator-watchdog" in svc


def test_watchdog_builds_from_watchdog_dockerfile():
    wd = _compose()["services"]["orchestrator-watchdog"]
    build = wd["build"]
    assert isinstance(build, dict)
    assert build["dockerfile"] == "watchdog/Dockerfile"
    assert build["context"] == "."


def test_watchdog_has_restart_and_mem_limit():
    wd = _compose()["services"]["orchestrator-watchdog"]
    assert wd["restart"] == "unless-stopped"
    assert wd["mem_limit"] == "128m"  # thin stack, not Grafana/Prometheus


def test_docker_sock_mounted_read_only():
    wd = _compose()["services"]["orchestrator-watchdog"]
    sock = [v for v in wd["volumes"] if "docker.sock" in v]
    assert sock, "docker.sock must be mounted"
    assert all(v.endswith(":ro") for v in sock), "docker.sock must be :ro"


def test_host_paths_mounted_read_only():
    wd = _compose()["services"]["orchestrator-watchdog"]
    # Every bind mount the watchdog uses is read-only (it only reads).
    for v in wd["volumes"]:
        assert v.endswith(":ro"), f"watchdog mount must be :ro: {v}"


def test_env_file_is_optional():
    # A missing .env.watchdog must not break `docker compose up` (self-hosting).
    wd = _compose()["services"]["orchestrator-watchdog"]
    env_file = wd["env_file"]
    assert isinstance(env_file, list)
    assert env_file[0]["required"] is False


def test_watchdog_dockerfile_exists_and_is_stdlib_only():
    df = REPO_ROOT / "watchdog" / "Dockerfile"
    assert df.exists()
    text = df.read_text()
    # No pip install of third-party deps (stdlib-only, D1).
    assert "pip install" not in text
    assert "COPY requirements" not in text
    assert "requirements.txt" not in text