fix(notifications): tracker orphan cleanup + effort-in-line + honest done-time (ORCH-087)

Fixes "frozen" orphaned live-tracker cards and finishes the stage line and the
final-time rendering.

G1 (orphan cleanup): additive ledger tracker_messages(task_id, message_id,
created_at, deleted_at) plus add/get_open/mark_deleted helpers in src/db.py.
bump now deletes ALL still-open mids of a task (not only the scalar
tasks.tracker_message_id, which is kept as a BC pointer). A new mid enters the
ledger only on a successful send (BR-6); a transient delete failure stays open
for retry; "already gone"/>48h is closed. Root cause of the bug: scalar
bookkeeping that lost the reference on a race or on delete-fail+send-ok
(ADR-001 G0).
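
The G1 flow can be sketched roughly as below. This is a minimal illustration, not the actual bump implementation: the in-memory database, the short helper names (add, get_open, mark_deleted), the delete/send callables and their return values ("gone" / "transient", new mid or None), and the delete-before-send ordering are all assumptions made for the example. Only the table shape and the SQL statements follow the diff.

```python
import sqlite3

# Hypothetical in-memory stand-in for the tracker_messages ledger.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracker_messages ("
    " task_id INTEGER NOT NULL, message_id INTEGER NOT NULL,"
    " created_at TEXT DEFAULT (datetime('now')), deleted_at TEXT,"
    " PRIMARY KEY (task_id, message_id))"
)


def add(task_id, mid):
    # INSERT OR IGNORE: a repeated mid neither duplicates nor reopens a row.
    conn.execute(
        "INSERT OR IGNORE INTO tracker_messages (task_id, message_id) VALUES (?, ?)",
        (task_id, mid),
    )


def get_open(task_id):
    rows = conn.execute(
        "SELECT message_id FROM tracker_messages "
        "WHERE task_id=? AND deleted_at IS NULL ORDER BY message_id ASC",
        (task_id,),
    ).fetchall()
    return [r[0] for r in rows]


def mark_deleted(task_id, mid):
    conn.execute(
        "UPDATE tracker_messages SET deleted_at=datetime('now') "
        "WHERE task_id=? AND message_id=? AND deleted_at IS NULL",
        (task_id, mid),
    )


def bump(task_id, delete, send):
    """Assumed shape: delete(mid) -> 'gone' | 'transient'; send() -> mid | None."""
    for mid in get_open(task_id):
        if delete(mid) == "gone":       # deleted now, or already gone / >48h
            mark_deleted(task_id, mid)  # transient failures stay open for retry
    new_mid = send()
    if new_mid is not None:             # ledger entry only on a successful send
        add(task_id, new_mid)
    return new_mid


add(1, 100)
add(1, 101)
outcomes = {100: "gone", 101: "transient"}
bump(1, delete=lambda m: outcomes[m], send=lambda: 102)
open_after_first = get_open(1)

# Second bump: both remaining cards are gone, and the send fails.
bump(1, delete=lambda m: "gone", send=lambda: None)
open_after_second = get_open(1)
```

In this sketch the card that failed to delete transiently (101) is retried on the next bump, and a failed send leaves no ledger row behind.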

G3 (deploy cycle): confirm_deploy key in _LIVE_BRANCH_LABELS (no base alias).

BR-EFF (effort in the stage line): agent_runs.effort column (_ensure_column,
idempotent), stamped with the actual resolve_agent_effort result in
launcher._spawn at launch time; rendered as `· {model} · {effort}`, and the
suffix is omitted when the value is empty.
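
A minimal sketch of the suffix rule, assuming a hypothetical helper named stage_suffix (the real renderer's name and signature are not shown in this commit) and assuming that an empty model is dropped the same way as an empty effort:

```python
from typing import Optional


def stage_suffix(model: Optional[str], effort: Optional[str]) -> str:
    """Build the ' · {model} · {effort}' tail, skipping empty or NULL parts."""
    parts = [p for p in (model, effort) if p]
    return "".join(f" · {p}" for p in parts)


full = stage_suffix("opus-4-8", "xhigh")
no_effort = stage_suffix("opus-4-8", "")
nothing = stage_suffix(None, None)
```

Historical rows with a NULL effort would then render exactly as they did before the column existed.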

BR-G5 (honest time): done line `⏱️ Агенты Σ · твоё {review~cap} · общее с
ожиданием {wall}` (agents total, your review time, total wall time including
waiting): three independent, labeled metrics; cap tracker_brd_review_cap_s
(ORCH_TRACKER_BRD_REVIEW_CAP_S, default 2h, marker ~).
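
One possible reading of the review cap, as a hedged sketch: the helper name review_label, the "Nh MMm" duration format, and the placement of the ~ marker as a prefix on a capped value are assumptions for illustration; only the 2h default and the ~ marker itself come from this message.

```python
def review_label(review_s: float, cap_s: float = 2 * 3600) -> str:
    """Format the human review time, clamped to cap_s and marked '~' if clamped."""
    capped = review_s > cap_s
    shown = int(min(review_s, cap_s))
    hours, rem = divmod(shown, 3600)
    minutes = rem // 60
    text = f"{hours}h {minutes:02d}m"
    return f"~{text}" if capped else text


short_review = review_label(15 * 60)
overnight_review = review_label(5 * 3600)
```

The intent of such a cap is that a review left open overnight does not inflate the "yours" metric, while the wall-clock metric still reports the full elapsed time.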

Invariants: STAGE_TRANSITIONS/QG_CHECKS/stages unchanged; migrations are
additive/idempotent (enduro is not touched); never-raise,
disable_notification, plane_issue_link (ORCH-067), disable_web_page_preview
(ORCH-080) preserved; src/reconciler.py not eroded (ORCH-086 intact).

Tests: tests/test_notifications_orphans.py (TC-01..05 + never-raise),
tests/test_tracker_effort_time.py (TC-06/11..15 + confirm_deploy),
tests/test_launcher.py::TestEffortStamp (TC-09/10). Docs: CLAUDE.md
(§Нотификации, the notifications section), docs/architecture/README.md
(Notifications), CHANGELOG.md.

Refs: ORCH-087

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 09:20:20 +03:00
committed by stream
parent 36c7a68722
commit a7b27f2235
11 changed files with 729 additions and 17 deletions

@@ -109,6 +109,12 @@ def init_db():
    # can render a short model tag per stage. Parsed from the run-log result JSON
    # (modelUsage key) by the launcher monitor; NULL when unknown. Idempotent ALTER.
    _ensure_column(conn, "agent_runs", "model", "TEXT")
    # ORCH-087 (BR-EFF): persist the REAL --effort value sent to the Claude CLI per
    # agent_runs row (low|medium|high|xhigh|max) so the tracker can render the
    # resolved effort next to the model ("· opus-4-8 · xhigh"). Stamped in
    # launcher._spawn right after resolve_agent_effort; NULL when no --effort flag
    # was passed (resolved to "") or for historical rows. Idempotent ALTER.
    _ensure_column(conn, "agent_runs", "effort", "TEXT")
    # Telegram live tracker: one editable Telegram message per task. We store its
    # message_id so each stage transition can editMessageText the same message
    # instead of spamming a new one. Idempotent ALTER (safe on the live prod DB).
@@ -141,6 +147,27 @@ def init_db():
CREATE INDEX IF NOT EXISTS idx_job_deps_task ON job_deps(task_id);
CREATE INDEX IF NOT EXISTS idx_job_deps_depends ON job_deps(depends_on_task_id);
""")
    # ORCH-087 (BR-G1, ADR-001 Р-1): authoritative ledger of EVERY tracker card
    # (Telegram message_id) ever created for a task. The scalar
    # tasks.tracker_message_id only ever knew the LAST mid, so any lost reference
    # (delete-fail+send-ok, race, restart) orphaned older cards forever. This
    # ledger lets every bump delete ALL still-open mids (deleted_at IS NULL), not
    # just the last one. tasks.tracker_message_id is KEPT (current-card pointer,
    # full BC). Purely ADDITIVE (CREATE TABLE/INDEX IF NOT EXISTS) -> idempotent,
    # restart-safe on the live shared prod DB (enduro-trails data untouched). The
    # logical FK on tasks.id is intentional (no REFERENCES, mirrors job_deps) so
    # the migration cannot fail on a pre-existing DB. See 08-data-requirements.md.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS tracker_messages (
            task_id INTEGER NOT NULL,
            message_id INTEGER NOT NULL,
            created_at TEXT DEFAULT (datetime('now')),
            deleted_at TEXT,
            PRIMARY KEY (task_id, message_id)
        );
        CREATE INDEX IF NOT EXISTS idx_tracker_messages_open
            ON tracker_messages(task_id) WHERE deleted_at IS NULL;
    """)
    conn.commit()
    conn.close()
@@ -301,6 +328,68 @@ def set_tracker_message_id(task_id: int, message_id: int) -> None:
conn.close()
# ---------------------------------------------------------------------------
# ORCH-087 (BR-G1): tracker_messages ledger — full accounting of every card mid
# ---------------------------------------------------------------------------
def add_tracker_message(task_id: int, message_id: int) -> None:
    """ORCH-087: record a freshly-created tracker card mid in the ledger.

    Called ONLY after a successful send_telegram (new_mid is not None). INSERT OR
    IGNORE keeps it idempotent: a repeat mid (race / restart replay) does not
    duplicate the row or resurrect a deleted_at stamp.
    """
    conn = get_db()
    try:
        conn.execute(
            "INSERT OR IGNORE INTO tracker_messages (task_id, message_id) "
            "VALUES (?, ?)",
            (task_id, message_id),
        )
        conn.commit()
    finally:
        conn.close()


def get_open_tracker_messages(task_id: int) -> list[int]:
    """ORCH-087: all still-open (deleted_at IS NULL) card mids for a task.

    These are the cards the next bump must clean up. Ordered oldest-first so the
    oldest orphans are deleted first. Never includes the rows already marked
    deleted.
    """
    conn = get_db()
    try:
        rows = conn.execute(
            "SELECT message_id FROM tracker_messages "
            "WHERE task_id=? AND deleted_at IS NULL ORDER BY message_id ASC",
            (task_id,),
        ).fetchall()
    finally:
        conn.close()
    return [r[0] for r in rows]


def mark_tracker_message_deleted(task_id: int, message_id: int) -> None:
    """ORCH-087: stamp deleted_at on a card mid that is confirmed gone.

    Called for mids that delete_telegram reported as gone (deleted now OR already
    gone / >48h per _DELETE_GONE_MARKERS) so they drop out of
    get_open_tracker_messages. Transient-delete mids are left untouched (NULL) for
    a retry on the next bump.
    """
    conn = get_db()
    try:
        conn.execute(
            "UPDATE tracker_messages SET deleted_at=datetime('now') "
            "WHERE task_id=? AND message_id=? AND deleted_at IS NULL",
            (task_id, message_id),
        )
        conn.commit()
    finally:
        conn.close()


def mark_brd_review_started(task_id: int) -> None:
"""Stamp when BRD review (the human approve gate) started, if not already set.