admin/orchestrator

ORCH-063 - Disk-watchdog: mva154 disk monitoring + Telegram alert at ≥85% #98

Merged

admin merged 7 commits from feature/ORCH-063-infra-mva154-85 into main

2026-06-09 19:13:33 +03:00

Author	SHA1	Message	Date
deploy-finalizer	2bd3bb75d4	deploy(ORCH-036): finalize SUCCESS for ORCH-063	2026-06-09 19:08:50 +03:00
claude-bot	efd744f766	tester(ET): auto-commit from tester run_id=488	2026-06-09 19:04:36 +03:00
claude-bot	fb4203b8f9	reviewer(ET): auto-commit from reviewer run_id=486	2026-06-09 19:04:36 +03:00
claude-bot	8759cb7df8	feat(disk-watchdog): host-FS fill heartbeat + Telegram alert at >=85% (ORCH-063)	2026-06-09 19:04:36 +03:00
    Adds src/disk_watchdog.py, a background daemon thread modelled on reconciler/job_reaper that measures host-FS fill via the mounted bind-paths (/repos, /app/data) with shutil.disk_usage and Telegram-alerts the operator at >= threshold (default 85%). The missing proactive signal: on 07.06.2026 the mva154 host disk silently hit 100% and stalled the whole self-hosting pipeline.
    - Pure decide_action(used_pct, threshold, prev, now, realert_s): alert on crossing up, cooldown re-alert, single recovery below threshold (unit-tested without a thread/timer; clock injected).
    - measure_paths: shutil.disk_usage per path, dedup by st_dev, per-path never-raise (a broken path never fails the tick).
    - Config flags ORCH_DISK_MONITOR_* with defensive validation (threshold 1..100, positive intervals -> default + warning). Kill-switch -> daemon does not start.
    - Additive disk_monitor block in GET /queue; start/stop in main.lifespan.
    - never-raise (per-path/per-tick/per-send); STAGE_TRANSITIONS/QG_CHECKS/check_*/DB schema untouched, no migration (anti-spam state in-memory).
    Tests: tests/test_disk_watchdog.py (TC-01..TC-12, 18 cases); full suite green (1296).
    Docs: INFRA.md, .env.example, CHANGELOG.md (architecture/README.md + ADRs authored at architecture stage).
    Refs: ORCH-063
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
claude-bot	4d9251c698	architect(ET): auto-commit from architect run_id=484	2026-06-09 19:04:36 +03:00
claude-bot	8ace9f880d	analyst(ET): auto-commit from analyst run_id=483	2026-06-09 19:04:36 +03:00
Slava	8c97a6ab1c	docs: init ORCH-063 business request	2026-06-09 19:04:36 +03:00
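
The feat commit describes a pure decide_action(used_pct, threshold, prev, now, realert_s) that alerts on crossing the threshold, re-alerts after a cooldown, and reports recovery once. The actual code is not shown on this page, so the following is only a rough sketch of how such a function might look. The AlertState dataclass, the shape of prev, and the string return values are assumptions made for illustration, not the PR's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AlertState:
    """Hypothetical in-memory anti-spam state (an assumption, not from the PR)."""
    alerting: bool = False                  # True while usage is at or above threshold
    last_alert_at: Optional[float] = None   # clock value when the last alert was sent


def decide_action(used_pct: float, threshold: float, prev: AlertState,
                  now: float, realert_s: float) -> Tuple[str, AlertState]:
    """Return (action, new_state); action is 'alert', 'recovery' or 'none'.

    Pure function: no I/O and no clock access; `now` is injected by the caller.
    """
    if used_pct >= threshold:
        if not prev.alerting:
            # Crossing up: first alert for this incident.
            return "alert", AlertState(True, now)
        if prev.last_alert_at is None or now - prev.last_alert_at >= realert_s:
            # Still above threshold and the cooldown has elapsed: re-alert.
            return "alert", AlertState(True, now)
        return "none", prev
    if prev.alerting:
        # Dropped below threshold: report recovery exactly once.
        return "recovery", AlertState(False, None)
    return "none", prev


# Usage: replay a sequence of (time in seconds, used %) samples.
state = AlertState()
actions = []
for t, pct in [(0, 80.0), (60, 86.0), (120, 90.0),
               (4000, 91.0), (4060, 70.0), (4120, 60.0)]:
    action, state = decide_action(pct, 85.0, state, t, 3600.0)
    actions.append(action)
print(actions)
```

Because the clock is passed in as now, a test can replay any timeline without threads or sleeps, which matches the commit's note that the function is unit-tested with an injected clock.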
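
The commit message also mentions measure_paths: shutil.disk_usage per path, deduplication by st_dev, and per-path error isolation. A minimal sketch under those stated constraints could look like the code below. The return shape (a list of dicts), the logger name, and computing the percentage as used / total are assumptions; the real function may differ.

```python
import logging
import os
import shutil
from typing import Dict, Iterable, List

log = logging.getLogger("disk_watchdog_sketch")  # hypothetical logger name


def measure_paths(paths: Iterable[str]) -> List[Dict[str, object]]:
    """Measure fill level once per underlying device; never raises."""
    results: List[Dict[str, object]] = []
    seen_devices = set()
    for path in paths:
        try:
            dev = os.stat(path).st_dev
            if dev in seen_devices:
                # Two bind-paths on the same filesystem: measure only once.
                continue
            usage = shutil.disk_usage(path)  # named tuple: total, used, free
            seen_devices.add(dev)
            used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
            results.append({
                "path": path,
                "device": dev,
                "used_pct": round(used_pct, 1),
                "free_bytes": usage.free,
            })
        except Exception as exc:
            # A broken or missing path is logged and skipped, so it cannot
            # fail the whole tick.
            log.warning("disk measurement failed for %s: %s", path, exc)
    return results


# Usage: the duplicate "." is collapsed, the missing path is skipped.
measurements = measure_paths([".", ".", "/no/such/path/for-sketch"])
print(measurements)
```

Deduplicating on st_dev means that /repos and /app/data produce a single measurement when both are bind-mounted from the same host filesystem, so one full disk does not trigger two alerts.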