diff --git a/CHANGELOG.md b/CHANGELOG.md
index cd27836..1be9601 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,6 +3,12 @@
 Format: [Keep a Changelog](https://keepachangelog.com/). One entry per meaningful PR/task.

 ## [Unreleased]
+- **Live tracker: orphaned-card cleanup + effort on the stage line + honest total time** (ORCH-087, `fix`): "frozen" orphans periodically showed up in the chat: an old card titled `📍 To Analyse` stayed attached to a task that had actually reached `deploy` (screenshot ORCH-082). **Root cause (G0/ADR-001):** the pointer `tasks.tracker_message_id` is a scalar (it knows only the LAST `message_id`), so when bump mode desynchronized (dominant causes: a race between two `update_task_tracker` calls, and `delete` fail + `send` ok) the reference to the previous card was lost for good → the orphan was never deleted and never updated again (rendering is fine; what froze was the lost mid). **Fix (bump stays the default; it is the "card at the bottom" feature from ORCH-042/067):**
+  - **G1: full mid accounting:** an additive ledger table `tracker_messages(task_id, message_id, created_at, deleted_at)` (`src/db.py`) plus the helpers `add_tracker_message`/`get_open_tracker_messages`/`mark_tracker_message_deleted`. Every bump cleans up ALL open mids (`deleted_at IS NULL`), not just the scalar: success/"already gone" (`_DELETE_GONE_MARKERS`) → `deleted_at` is set; a transient `delete` failure → the mid stays open for a retry; the new mid goes into the ledger and `set_tracker_message_id` is called ONLY on a successful `send` (R-3/BR-6). A residual race self-heals within one transition (no lock is introduced). The scalar `tracker_message_id` is kept (BC). Known limitation: Telegram's 48h window (older orphans cannot be deleted).
+  - **G3: deploy cycle:** the key `confirm_deploy` ("⏳ Confirm Deploy — подтвердите прод-деплой", i.e. "confirm the production deploy", no base alias) was added to `_LIVE_BRANCH_LABELS` → the full chain `Awaiting Deploy → Deploying → Confirm Deploy → Monitoring → Done`.
+  - **BR-EFF: effort on the stage line:** a new column `agent_runs.effort TEXT` (`_ensure_column`, idempotent); the actual `resolve_agent_effort` value is stamped in `launcher._spawn` at launch time (the CLI does not return effort in its result JSON); rendered as `· {model} · {effort}` (developer=`xhigh`, tester/deployer=`medium`, others=`high`); empty effort → the suffix is omitted.
+  - **BR-G5: honest total time:** the done line `⏱️ Агенты {Σ agent_runs} · твоё {review~cap} · общее с ожиданием {wall}` (agents · yours · total including waiting) carries three independent labelled metrics (previously `Всего {wall}`, "Total", read like a sum, which it is not, because queue pauses are not logged). "Yours" is capped by the threshold `tracker_brd_review_cap_s` (env `ORCH_TRACKER_BRD_REVIEW_CAP_S`, default 2h; a `~` marker is shown when an anomalous stall caused by an In Review→Backlog desync is cut off); `wall` is labelled "с ожиданием" ("including waiting").
+  - **Invariants:** `STAGE_TRANSITIONS`/`QG_CHECKS`/stages are unchanged; migrations are additive/idempotent (shared prod DB, enduro is not touched); never-raise, `disable_notification`, `plane_issue_link` (ORCH-067) and `disable_web_page_preview` (ORCH-080) are preserved; `src/reconciler.py` has not been eroded (ORCH-086 is intact). Tests: `tests/test_notifications_orphans.py` (TC-01..05 + never-raise), `tests/test_tracker_effort_time.py` (TC-06/11..15 + confirm_deploy), `tests/test_launcher.py::TestEffortStamp` (TC-09/10). ADR `docs/work-items/ORCH-087/06-adr/ADR-001-tracker-orphan-cleanup.md`.
 - **Terminal skip and `state_uuid` dedup on the reconciler's F-1 path** (ORCH-086, `fix`): Telegram periodically (especially after an orchestrator restart) received a false `🔧 reconciler: ET-002 done разблокирована (потерян webhook)` ("unblocked, webhook lost") message: the task was finished long ago, nothing is being unblocked, it is noise.
**Root cause:** ORCH-068 closed the livelock only on F-2 (plane side); the F-1 path (gate side) stayed uncovered for two reasons: (A) the call `_note_unblock(work_item_id, stage)` was made without `state_uuid`, so the in-memory dedup was skipped; (B) the only "terminal filter" on F-1 was the `get_active_tasks_for_reconcile` query (`WHERE stage != 'done'`), which knows nothing about the issue status in Plane, so a task with the drift "orchestrator DB is not `done`, but Plane is already `Done`" passed the filter, the no-op conditional gates (enduro) returned green → `advance` → false notification. **Fix (ADR-001, localized in `src/reconciler.py`):** (D1) a new `_resolve_issue_status(task)` performs **one** network resolve of the task's Plane status per tick, `(states, groups, state_uuid)`, after the cheap local guards (busy/young/escalated tasks never hit Plane), never-raise → `({}, {}, None)` on failure; (D2) an unconditional terminal skip BEFORE Guard 2: a terminal task (Plane group `completed`/`cancelled`, with a fallback to the logical keys `done`/`cancelled`, OR a stage in the orchestrator DB ∈ `{done, cancelled}`, since `cancelled` is not filtered out by the query) → early `return` + `skipped_terminal_total++`, not governed by `reconcile_skip_blocked_enabled` (that flag gates only Guard 2); (D3) `_is_blocked_or_needs_input` reuses the D1 resolve (3rd/4th optional arguments; with `_UNSET` it resolves on its own for direct/legacy calls, behaviour 1:1); (D4) the `_note_unblock` call on F-1 now passes `state_uuid` → dedup works on F-1 as well (a repeat of the same `issue_id`+`state_uuid` → `deduped_total++`, no second Telegram message). Terminality uses the same `_is_terminal_state` as F-2 (the primary discriminator is the Plane group, which is robust to UUID aliasing/multi-project setups; it covers enduro and orchestrator). Anti-regression (AC-4): a legitimate unblock of a genuinely stuck non-terminal task still results in `advance` + exactly one Telegram message (`unblocked_total++`).
`STAGE_TRANSITIONS`, `QG_CHECKS`, the DB schema, the signatures of `advance_stage`/`advance_if_gate_passed`/`_note_unblock` and the shape of `status()`/`GET /queue` are unchanged, and no new config flags are added; never-raise is preserved. ADR `docs/work-items/ORCH-086/06-adr/ADR-001-reconciler-f1-terminal-skip-and-dedup.md`. Tests: `tests/test_reconciler.py` (TC-86-01..09/11: terminal by group completed/cancelled, fallback by logical key, DB-side cancelled, `state_uuid` pass-through/dedup, anti-regression, never-raise, independence from the Guard 2 flag), `tests/test_reconciler_plane.py` (TC-86-10: the shape of `status()` is unchanged). Documentation: `docs/architecture/README.md` (Reconciler F-1 section).
 - **Suppressing the Telegram link preview in the tracker card / notifications** (ORCH-080): under every tracker card (`bump` and `edit`) and under notify/alert messages Telegram expanded a "Plane — Modern project management" banner for the clickable `ORCH-NNN` link to the issue. In the default `bump` mode (ORCH-067) the card is recreated on every transition → the banner was duplicated and cluttered the feed (Owner's complaint, 08.06). **Root cause:** the JSON payload of both low-level primitives, `notifications.send_telegram` (`POST /sendMessage`) and `notifications.edit_telegram` (`POST /editMessageText`), lacked the `disable_web_page_preview` key. **Fix (ADR-001, a minimal additive change at the primitive level):** `"disable_web_page_preview": True` was added to the payload of both methods; this suppresses the banner for ALL consumers (`update_task_tracker` in both modes, `notify_approve_requested`, `notify_error`, stage alerts from `launcher`/`stage_engine`) without changing their code. Unconditional, no kill-switch (nobody needs the tracker preview, the risk is zero). `parse_mode: "HTML"` is kept in both payloads → the `ORCH-NNN` link stays clickable; `disable_notification` (the card is silent), the bump/edit logic, the "one card per task" invariant, the return contracts (`send_telegram → message_id|None`, `edit_telegram → EDIT_*`) and never-raise are not affected.
`STAGE_TRANSITIONS`, `QG_CHECKS` and the DB schema are unchanged. ADR `docs/work-items/ORCH-080/06-adr/ADR-001-disable-telegram-link-preview.md`. Tests: `tests/test_link_preview_disabled.py` (TC-01..06: the flag in both payloads, regression of `parse_mode`/fields, return contracts, never-raise). Documentation: `CLAUDE.md` + `docs/architecture/README.md` (Notifications component).
 - **Guaranteed idempotent code PR before merge-verify (fix for the false HOLD "no open PR")** (ORCH-082/ORCH-81): closes the missing invariant "by the time of merge-verify the branch has an open code PR". **Root cause (ORCH-074, 08.06):** the PR was created by a single function, `launcher._ensure_pr`, ONLY on the developer path and ONLY on a fresh worktree commit (`exit==0 → git status non-empty → commit → push → agent=="developer"`); after manual restorations of `main`, the ORCH-074 branch had no open code PR → the deterministic `merge_gate.merge_pr` returned `("False", "no open PR")` → the ORCH-073 protection correctly held the task (HOLD, not a false `done`), but that only treated the symptom. **Fix (ADR-001, additive, inside the same merge-verify sub-gate, the stage machine is untouched):** (1) a new idempotent leaf actor `merge_gate.ensure_open_pr(repo, branch) -> (status, detail)` (never-raise): `GET …/pulls?state=open` filtered by **`head.ref==branch` AND `base.ref=="main"`** (identical to `merge_pr`/ORCH-073 FR-3; an auto docs PR with `base!=main` is NOT a code PR) → `("existed", N)`; otherwise `POST …/pulls` → `("created", N)`; a `409/422` "PR exists" race → repeat the GET → `existed` (no duplicates); any other HTTP/parse/network error → `("failed", reason)`.
(2) A hook in `stage_engine._handle_merge_verify` AFTER `validated_revision` is resolved and BEFORE `merge_pr`: when `merge_verify_autocreate_pr_enabled` is set → `ensure_open_pr`; `created|existed` → proceed as usual to `merge_pr` → `verify_merged_to_main`; `failed` → an honest HOLD via the new helper `_hold_pr_create_failed` (text «PR создать не удалось», "failed to create PR"; `result.note="pr-create-failed-hold"`, textually distinguishable from the not-merged HOLD; the task stays on `deploy`, NOT `done`, with NO rollback to development). (3) `launcher._ensure_pr` now delegates to `merge_gate.ensure_open_pr` (a single PR-creation code path, shared filter `head==branch & base==main`); the trigger "create only on the developer path with a fresh commit" was NOT tightened, only the implementation under the hood changed. **The ORCH-073 protection is untouched and takes priority:** merge confirmation is still ONLY `verify_merged_to_main` (SHA in main) + `check_main_regression`; `ensure_open_pr` removes only the FALSE HOLD "no open PR", while code that is genuinely not merged → HOLD as before. Kill-switch `ORCH_MERGE_VERIFY_AUTOCREATE_PR_ENABLED` (default `true`); scope: `merge_verify_applies(repo)` (self-hosting / `merge_verify_repos`), non-self → no-op; `false` → ORCH-074 behaviour 1:1. Idempotency comes from Gitea (presence of an open PR), with no DB migration (restart-safe); `main` is never pushed/force-pushed. Invariants NOT changed: `STAGE_TRANSITIONS`, the `QG_CHECKS` registry (the sub-gate is a hook in `advance_stage`, not a new QG), the DB schema, `check_deploy_status`/`_parse_deploy_status`, hook exit codes, merge-gate (ORCH-043), image-freshness (ORCH-058), external HTTP endpoints. ADR `docs/work-items/ORCH-082/06-adr/ADR-001-ensure-open-pr-before-merge-verify.md` (+ the cross-cutting `adr-0016`). Documentation: `docs/architecture/README.md` (ORCH-082 block in merge-verify). Tests: `tests/test_orch082_ensure_pr.py` (TC-01..05: idempotent actor, base==main filter, 409/422 race, never-raise), `tests/test_orch082_merge_verify_autocreate.py` (TC-06..12: hook, ORCH-073 regression, kill-switch, conditionality, observability).
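The `ensure_open_pr` contract described in the ORCH-082 entry (find an open code PR, otherwise create one, and treat a "PR exists" conflict as a lost race) can be sketched roughly as follows. This is only an illustration under stated assumptions: the real actor has the signature `ensure_open_pr(repo, branch)` and talks to Gitea over HTTP, whereas here an injected client is used, and `FakeGitea`, `PullExistsError`, `list_open` and `create` are hypothetical names that do not appear in the diff.

```python
from dataclasses import dataclass, field


class PullExistsError(Exception):
    """Hypothetical: what a client might raise on a 409/422 'PR exists' reply."""


@dataclass
class FakeGitea:
    """Hypothetical in-memory stand-in for the Gitea pulls API."""
    pulls: list = field(default_factory=list)  # dicts: number, head, base
    next_number: int = 1

    def list_open(self):
        return list(self.pulls)

    def create(self, head, base):
        if any(p["head"] == head and p["base"] == base for p in self.pulls):
            raise PullExistsError(head)
        pr = {"number": self.next_number, "head": head, "base": base}
        self.next_number += 1
        self.pulls.append(pr)
        return pr


def _find_code_pr(client, branch):
    # A code PR is head == branch AND base == "main"; PRs with another base
    # (for example auto-generated docs PRs) are deliberately ignored.
    for pr in client.list_open():
        if pr["head"] == branch and pr["base"] == "main":
            return pr["number"]
    return None


def ensure_open_pr(client, branch):
    """Return ("existed" | "created" | "failed", detail); never raises."""
    try:
        number = _find_code_pr(client, branch)
        if number is not None:
            return ("existed", number)
        try:
            return ("created", client.create(branch, "main")["number"])
        except PullExistsError:
            # Lost a race with another creator: re-read instead of duplicating.
            number = _find_code_pr(client, branch)
            if number is not None:
                return ("existed", number)
            return ("failed", "PR reported as existing but not found")
    except Exception as exc:  # never-raise: any other error becomes "failed"
        return ("failed", str(exc))
```

Calling the function twice for the same branch yields `created` and then `existed`, which is the property that lets merge-verify call it on every pass without producing duplicate PRs.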
diff --git a/CLAUDE.md b/CLAUDE.md
index 75d809b..f375f01 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -41,16 +41,33 @@ created → analysis → architecture → development → review → testing →
 ## Plane status model (ORCH-066): indication ≠ control
 Plane statuses are **layer B (indication)**, separate from **layer A (the stage machine)** `src/stages.py::STAGE_TRANSITIONS`. Plane shows the observer a meaningful picture (`Backlog → Todo → Analysis → Architecture → Development → Code-Review → Testing → Awaiting Deploy → Deploying → Monitoring after Deploy → Done` + the human gates `In Review/Approved`, `Confirm Deploy`) but NEVER drives the pipeline. The mapping and setters live in `src/plane_sync.py` (6 new keys: `to_analyse/analysis/code_review/awaiting_deploy/deploying/monitoring`), with a project-relative alias fallback: on a partially configured project a new key degrades to the base UUID of THE SAME project (zero regression for enduro-trails). Details: `docs/architecture/README.md`.
-## Notifications / Telegram live tracker (ORCH-042/066/067)
+## Notifications / Telegram live tracker (ORCH-042/066/067/087)
 Every task = **one card** in Telegram (`src/notifications.py`). Card behaviour:
 - **Default `tracker_mode` is `bump`** (ORCH-067; `edit` is available via `ORCH_TRACKER_MODE=edit`). On every update `bump` deletes the old card and sends a fresh one to the bottom of the chat (silently); `edit` edits in place. The "one card per task" invariant holds in both modes.
+- **Orphan cleanup (ORCH-087):** bump keeps an authoritative ledger of ALL created cards
+  (table `tracker_messages`, `deleted_at IS NULL` = alive) and on every update deletes
+  ALL open mids, not just the scalar `tracker_message_id` (kept as the pointer to the
+  current card, BC). This removes the "frozen orphan" class (an old card with an
+  early-stage title that lost its reference on a race / `delete` fail + `send` ok).
+  The new mid is written to the ledger ONLY on a successful `send` (BR-6); a transient
+  `delete` failure stays open for a retry; "already gone"/>48h (`_DELETE_GONE_MARKERS`)
+  → closed. A residual race self-heals within one bump. Known limitation: Telegram's
+  48h window (older orphans cannot be deleted).
+- **Effort on the stage line (ORCH-087):** the column `agent_runs.effort` is stamped with the
+  actual `resolve_agent_effort` value in `launcher._spawn` (the CLI does not return it in its
+  result JSON); the line is rendered as `· {model} · {effort}` (developer=`xhigh`,
+  tester/deployer=`medium`, others=`high`); empty/historical effort → the suffix is omitted.
+- **Honest total time (ORCH-087):** the done line = three independent labelled metrics
+  `⏱️ Агенты {Σ agent_runs} · твоё {review~cap} · общее с ожиданием {wall}` (previously
+  `Всего {wall}` read like a sum, which it is not). "Yours" («твоё») is capped by
+  `tracker_brd_review_cap_s` (`ORCH_TRACKER_BRD_REVIEW_CAP_S`, default 2h; a `~` marker is
+  shown when an anomalous stall is cut off).
 - **Card status line** (`📍 `) shows the current Plane status per the ORCH-066 model (`plane_status_label`). The offline core (`stage → status`, In Review from the brd-clock) always works without the network; the best-effort live overlay (kill-switch `tracker_live_status`, TTL cache, short timeout) only fills in the branches that cannot be told apart offline (Needs Input /
-  Blocked / Rejected / Cancelled / Deploying / Monitoring) and **never blocks the pipeline**.
+  Blocked / Rejected / Cancelled / **Confirm Deploy** / Deploying / Monitoring) and **never
+  blocks the pipeline**.
 - **Clickable task number** (`plane_issue_link`): `ORCH-NNN` in the card AND in all notifications (`notify_*`, stage alerts) is rendered as `` pointing to the issue in Plane; fail-safe → plain `html.escape(number)` if the link cannot be built. Never fails.
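The orphan-cleanup rule from the tracker section above (delete every open mid, close only the ones confirmed gone, record the new mid only after a successful send) can be illustrated with a small self-contained sketch. It is a simplified model, not the project's code: it uses an in-memory SQLite database, omits the scalar `tracker_message_id` pointer, and the names `make_ledger`, `open_mids`, `bump`, `delete` and `send` are hypothetical stand-ins for the real helpers and Telegram primitives.

```python
import sqlite3


def make_ledger():
    """Create an in-memory ledger with the tracker_messages columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tracker_messages ("
        " task_id INTEGER NOT NULL, message_id INTEGER NOT NULL,"
        " created_at TEXT DEFAULT (datetime('now')), deleted_at TEXT,"
        " PRIMARY KEY (task_id, message_id))"
    )
    return conn


def open_mids(conn, task_id):
    """All card mids of a task that are not yet marked deleted, oldest first."""
    rows = conn.execute(
        "SELECT message_id FROM tracker_messages "
        "WHERE task_id=? AND deleted_at IS NULL ORDER BY message_id",
        (task_id,),
    ).fetchall()
    return [r[0] for r in rows]


def bump(conn, task_id, delete, send):
    """One bump: try to delete every open card, then send a new one.

    delete(mid) returns True when the card is gone and False on a transient
    failure; send() returns the new mid or None. Both are hypothetical
    callables standing in for the Telegram primitives.
    """
    for mid in open_mids(conn, task_id):
        if delete(mid):
            conn.execute(
                "UPDATE tracker_messages SET deleted_at=datetime('now') "
                "WHERE task_id=? AND message_id=?",
                (task_id, mid),
            )
        # False: the row stays open and is retried on the next bump.
    new_mid = send()
    if new_mid is not None:  # record the card only after a successful send
        conn.execute(
            "INSERT OR IGNORE INTO tracker_messages (task_id, message_id) "
            "VALUES (?, ?)",
            (task_id, new_mid),
        )
    conn.commit()
    return new_mid
```

In this model a failed delete leaves two open rows for one bump, and the next successful bump reaps both, which mirrors the "self-heals within one bump" behaviour described above.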
diff --git a/docs/architecture/README.md b/docs/architecture/README.md index 28654d5..fe43914 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -13,7 +13,7 @@ - **Queue** (`src/queue_worker.py`, ORCH-1): a persistent task queue (SQLite `jobs`), atomic claim, max_concurrency, retries, restart-safe. **ORCH-026:** `claim_next_job` gates tasks with unfinished dependencies (`job_deps`, `NOT EXISTS`) without occupying a slot; declarations/cycles live in the leaf `src/task_deps.py`. - **Job-reaper** (`src/job_reaper.py`, ORCH-065, [adr-0011](adr/adr-0011-job-reaper-lease-reclaim.md)): a background daemon thread (built on the `reconciler` skeleton) that starts/stops in `main.lifespan` (after `reconciler.start()` / before `worker.stop()`). It detects a "dead" `running` job **without restarting** the process (Tier-1: dead `jobs.pid` after `reaper_dead_ticks` ticks; Tier-2: `agent_runs.exit_code` is recorded but the job is still `running`; Tier-3: backstop `reaper_max_running_s`) and brings the row to the correct status through the same contracts (`_try_advance_stage`/`_finalize_job`, gate-driven; exit≠0/unknown → `attempts per-agent env > default), not hardcoded in AGENT_CONFIGS. model = resolve_agent_model(agent, project_id) effort = resolve_agent_effort(agent, project_id) + # ORCH-087 (BR-EFF): stamp the REAL --effort value onto this agent_runs row + # at the moment of launch. The CLI does not echo effort in its result JSON, + # so this is the only reliable source for the tracker's "· model · effort" + # line. Empty resolve (no --effort flag) -> NULL so the suffix is omitted. + # Reuses the still-open conn; never blocks the launch. + try: + conn.execute( + "UPDATE agent_runs SET effort=? 
WHERE id=?", + (effort or None, run_id), + ) + conn.commit() + except Exception as e: + logger.warning(f"effort stamp failed for run_id={run_id}: {e}") model_flag = f"--model {model} " if model else "" effort_flag = f"--effort {effort} " if effort else "" # ORCH-074 (G2): agent_fallback_model is read directly here, bypassing diff --git a/src/config.py b/src/config.py index 1b3b118..e7bf289 100644 --- a/src/config.py +++ b/src/config.py @@ -485,6 +485,14 @@ class Settings(BaseSettings): tracker_live_status_ttl_s: int = 60 tracker_live_status_timeout_s: int = 3 + # ORCH-087 (BR-G5, ADR-001 Р-6): cap for the human BRD-review time shown on the + # done card ("твоё {review}"). The brd_review clock can stay open for hours on a + # desync (In Review -> Backlog), which made "твоё время" report anomalous stalls + # (ORCH-087: 392m). Above this cap the value is shown capped with a "~" marker so + # an abnormal stall is never presented as real human review time. Env + # ORCH_TRACKER_BRD_REVIEW_CAP_S; default 7200s (2h). 0/negative -> no cap. + tracker_brd_review_cap_s: int = 7200 + # ORCH-069: QG-0 upper title-length limit (entry gate _qg0_errors). The 80-char # cap was a hygiene limit, not structural (slug is cut to [:30] independently, # DB title TEXT is unbounded). Configurable via env ORCH_QG0_TITLE_MAX; default diff --git a/src/db.py b/src/db.py index 579ec04..967f387 100644 --- a/src/db.py +++ b/src/db.py @@ -109,6 +109,12 @@ def init_db(): # can render a short model tag per stage. Parsed from the run-log result JSON # (modelUsage key) by the launcher monitor; NULL when unknown. Idempotent ALTER. _ensure_column(conn, "agent_runs", "model", "TEXT") + # ORCH-087 (BR-EFF): persist the REAL --effort value sent to the Claude CLI per + # agent_runs row (low|medium|high|xhigh|max) so the tracker can render the + # resolved effort next to the model ("· opus-4-8 · xhigh"). 
Stamped in + # launcher._spawn right after resolve_agent_effort; NULL when no --effort flag + # was passed (resolved to "") or for historical rows. Idempotent ALTER. + _ensure_column(conn, "agent_runs", "effort", "TEXT") # Telegram live tracker: one editable Telegram message per task. We store its # message_id so each stage transition can editMessageText the same message # instead of spamming a new one. Idempotent ALTER (safe on the live prod DB). @@ -141,6 +147,27 @@ def init_db(): CREATE INDEX IF NOT EXISTS idx_job_deps_task ON job_deps(task_id); CREATE INDEX IF NOT EXISTS idx_job_deps_depends ON job_deps(depends_on_task_id); """) + # ORCH-087 (BR-G1, ADR-001 Р-1): authoritative ledger of EVERY tracker card + # (Telegram message_id) ever created for a task. The scalar + # tasks.tracker_message_id only ever knew the LAST mid, so any lost reference + # (delete-fail+send-ok, race, restart) orphaned older cards forever. This + # ledger lets every bump delete ALL still-open mids (deleted_at IS NULL), not + # just the last one. tasks.tracker_message_id is KEPT (current-card pointer, + # full BC). Purely ADDITIVE (CREATE TABLE/INDEX IF NOT EXISTS) -> idempotent, + # restart-safe on the live shared prod DB (enduro-trails data untouched). The + # logical FK on tasks.id is intentional (no REFERENCES, mirrors job_deps) so + # the migration cannot fail on a pre-existing DB. See 08-data-requirements.md. 
+ conn.executescript(""" + CREATE TABLE IF NOT EXISTS tracker_messages ( + task_id INTEGER NOT NULL, + message_id INTEGER NOT NULL, + created_at TEXT DEFAULT (datetime('now')), + deleted_at TEXT, + PRIMARY KEY (task_id, message_id) + ); + CREATE INDEX IF NOT EXISTS idx_tracker_messages_open + ON tracker_messages(task_id) WHERE deleted_at IS NULL; + """) conn.commit() conn.close() @@ -301,6 +328,68 @@ def set_tracker_message_id(task_id: int, message_id: int) -> None: conn.close() +# --------------------------------------------------------------------------- +# ORCH-087 (BR-G1): tracker_messages ledger — full accounting of every card mid +# --------------------------------------------------------------------------- + +def add_tracker_message(task_id: int, message_id: int) -> None: + """ORCH-087: record a freshly-created tracker card mid in the ledger. + + Called ONLY after a successful send_telegram (new_mid is not None). INSERT OR + IGNORE keeps it idempotent: a repeat mid (race / restart replay) does not + duplicate the row or resurrect a deleted_at stamp. + """ + conn = get_db() + try: + conn.execute( + "INSERT OR IGNORE INTO tracker_messages (task_id, message_id) " + "VALUES (?, ?)", + (task_id, message_id), + ) + conn.commit() + finally: + conn.close() + + +def get_open_tracker_messages(task_id: int) -> list[int]: + """ORCH-087: all still-open (deleted_at IS NULL) card mids for a task. + + These are the cards the next bump must clean up. Ordered oldest-first so the + oldest orphans are deleted first. Never includes the rows already marked + deleted. + """ + conn = get_db() + try: + rows = conn.execute( + "SELECT message_id FROM tracker_messages " + "WHERE task_id=? AND deleted_at IS NULL ORDER BY message_id ASC", + (task_id,), + ).fetchall() + finally: + conn.close() + return [r[0] for r in rows] + + +def mark_tracker_message_deleted(task_id: int, message_id: int) -> None: + """ORCH-087: stamp deleted_at on a card mid that is confirmed gone. 
+ + Called for mids that delete_telegram reported as gone (deleted now OR already + gone / >48h per _DELETE_GONE_MARKERS) so they drop out of + get_open_tracker_messages. Transient-delete mids are left untouched (NULL) for + a retry on the next bump. + """ + conn = get_db() + try: + conn.execute( + "UPDATE tracker_messages SET deleted_at=datetime('now') " + "WHERE task_id=? AND message_id=? AND deleted_at IS NULL", + (task_id, message_id), + ) + conn.commit() + finally: + conn.close() + + def mark_brd_review_started(task_id: int) -> None: """Stamp when BRD review (the human approve gate) started, if not already set. diff --git a/src/notifications.py b/src/notifications.py index a0d6bc7..8c4dd57 100644 --- a/src/notifications.py +++ b/src/notifications.py @@ -290,6 +290,46 @@ def _duration_seconds(started, finished): return max(int((b - a).total_seconds()), 0) +def _capped_review_str(review_seconds) -> str: + """ORCH-087 (BR-G5): human BRD-review duration, capped to drop anomalous stalls. + + Returns '0м' when there was no review window. When the review exceeds + ``tracker_brd_review_cap_s`` (default 2h; <=0 disables the cap) the capped value + is shown with a leading '~' to signal the real value was longer — an open + brd_review clock from a desync (In Review -> Backlog) rather than genuine human + time (ORCH-087: 392m). Never raises. + """ + try: + if not review_seconds: + return "0м" + secs = int(review_seconds) + try: + cap = int(getattr(_get_settings(), "tracker_brd_review_cap_s", 0) or 0) + except Exception: + cap = 0 + if cap > 0 and secs > cap: + return f"~{_fmt_minutes(cap)}" + return _fmt_minutes(secs) + except Exception: + return _fmt_minutes(review_seconds) if review_seconds else "0м" + + +def _run_effort(run) -> str: + """ORCH-087 (BR-EFF): the effort tag for a stage line. Never raises -> ''. + + Returns the stamped agent_runs.effort (the REAL --effort sent at launch). 
NULL + / empty (historical row predating the column, or a launch with no --effort + flag) -> '' so the caller omits the effort suffix (the documented default, + AC-E.4). New runs are stamped in launcher._spawn, so going forward every stage + line carries its resolved effort (developer xhigh, tester/deployer medium, …). + """ + try: + effort = _row_get(run, "effort") + return str(effort) if effort else "" + except Exception: + return "" + + def render_task_tracker(task_id: int) -> str: """Build the full live-tracker text for a task from the DB (stateless render). @@ -321,7 +361,8 @@ def render_task_tracker(task_id: int) -> str: return f"task-{task_id}" runs = conn.execute( "SELECT agent, started_at, finished_at, exit_code, input_tokens, " - "output_tokens, cache_read_tokens, cache_creation_tokens, cost_usd, model " + "output_tokens, cache_read_tokens, cache_creation_tokens, cost_usd, " + "model, effort " "FROM agent_runs WHERE task_id=? ORDER BY id ASC", (task_id,), ).fetchall() @@ -413,9 +454,15 @@ def render_task_tracker(task_id: int) -> str: dur = _fmt_minutes(_duration_seconds(run["started_at"], run["finished_at"])) model = short_model_name(run["model"]) model_suffix = f" \u00b7 {model}" if model else "" + # ORCH-087 (BR-EFF): render the resolved --effort next to the model + # ("\u00b7 opus-4-8 \u00b7 xhigh"). Stamped at launch in agent_runs.effort; empty / + # missing -> suffix omitted (like the model suffix). Historical rows with + # NULL effort get no suffix either; there is no config fallback (see _run_effort). 
+ effort = _run_effort(run) + effort_suffix = f" \u00b7 {effort}" if effort else "" return ( f"\u2705 {label:<13} {dur} \u00b7 " - f"{in_tok}\u2193/{out_tok}\u2191 \u00b7 {cost}{model_suffix}" + f"{in_tok}\u2193/{out_tok}\u2191 \u00b7 {cost}{model_suffix}{effort_suffix}" ) # BRD review line: between Analysis and Architecture, only once Analysis has @@ -490,11 +537,17 @@ def render_task_tracker(task_id: int) -> str: if done: wall = _duration_seconds(task["created_at"], task["updated_at"]) wall_str = _fmt_minutes(wall) if wall is not None else "?" - review_str = _fmt_minutes(review_seconds) if review_seconds else "0м" + review_str = _capped_review_str(review_seconds) + # ORCH-087 (BR-G5): three INDEPENDENT, explicitly-labelled metrics. None is + # presented as the sum of the others \u2014 queue/wait pauses are not logged, so + # wall != agents + review; the old "\u0412\u0441\u0435\u0433\u043e {wall}" read like a (wrong) sum. + # \u0410\u0433\u0435\u043d\u0442\u044b = sum(agent_runs) (precise main metric, T-1) + # \u0442\u0432\u043e\u0451 = human BRD-review, capped to drop anomalous stalls (T-2) + # \u043e\u0431\u0449\u0435\u0435 \u0441 \u043e\u0436\u0438\u0434\u0430\u043d\u0438\u0435\u043c = wall-clock incl. queue/wait, NOT work time (T-3) lines.append( - f"\u23f1\ufe0f \u0412\u0441\u0435\u0433\u043e {wall_str} \u00b7 " - f"\u0430\u0433\u0435\u043d\u0442\u044b {_fmt_minutes(agent_seconds)} \u00b7 " - f"\u0442\u0432\u043e\u0451 {review_str}" + f"\u23f1\ufe0f \u0410\u0433\u0435\u043d\u0442\u044b {_fmt_minutes(agent_seconds)} \u00b7 " + f"\u0442\u0432\u043e\u0451 {review_str} \u00b7 " + f"\u043e\u0431\u0449\u0435\u0435 \u0441 \u043e\u0436\u0438\u0434\u0430\u043d\u0438\u0435\u043c {wall_str}" ) link = _done_link(task_id, task["work_item_id"]) if link: @@ -568,21 +621,53 @@ def update_task_tracker(task_id: int): only the dedicated alert helpers ping. 
""" try: - from .db import get_tracker_message_id, set_tracker_message_id + from .db import ( + get_tracker_message_id, set_tracker_message_id, + get_open_tracker_messages, add_tracker_message, + mark_tracker_message_deleted, + ) text = render_task_tracker(task_id) mode = (_get_settings().tracker_mode or "edit").strip().lower() mid = get_tracker_message_id(task_id) if mode == "bump": # bump: one card, always at the bottom (delete + send + repoint). + # ORCH-087 (BR-G1): clean up ALL still-open cards of this task, not + # only the last (scalar) mid. The ledger is the authoritative set of + # every card ever created; any reference lost by the scalar (race / + # delete-fail+send-ok / restart) is still tracked here and reaped now. + open_mids = set() + try: + open_mids.update(get_open_tracker_messages(task_id)) + except Exception as e: + logger.warning(f"update_task_tracker({task_id}): ledger read failed: {e}") if mid is not None: + # Scalar pointer is part of the live set (e.g. a card sent before + # the ledger existed); union avoids missing it. + open_mids.add(mid) + for old_mid in open_mids: # best-effort; result does NOT gate the send (BR-6). - delete_telegram(mid) + if delete_telegram(old_mid): + # gone (deleted now OR already gone / >48h) -> drop from ledger. + try: + mark_tracker_message_deleted(task_id, old_mid) + except Exception as e: + logger.warning( + f"update_task_tracker({task_id}): mark-deleted failed: {e}" + ) + # transient False -> leave open in the ledger for a retry next bump. new_mid = send_telegram(text, disable_notification=True) if new_mid is not None: + # R-3 / BR-6: only record the new card on a successful send. + try: + add_tracker_message(task_id, new_mid) + except Exception as e: + logger.warning( + f"update_task_tracker({task_id}): ledger insert failed: {e}" + ) set_tracker_message_id(task_id, new_mid) - # send returned None (no creds / transient) -> leave mid untouched; - # no duplicate within this call, redraws on the next transition. 
+ # send returned None (no creds / transient) -> leave mid/ledger + # untouched; no duplicate within this call, redraws next transition. return # mode == "edit" (DEFAULT): existing behaviour, unchanged. @@ -874,6 +959,11 @@ _LIVE_BRANCH_LABELS = { "blocked": "Blocked", "rejected": "Rejected", "cancelled": "Cancelled", + # ORCH-087 (G3, ADR-001 Р-4): close the deploy cycle on the card. The + # confirm_deploy logical key already exists in plane_sync (ORCH-059); drawn as + # a real, dedicated status (no base-alias) when its UUID is live in Plane so the + # card can show Awaiting Deploy → Deploying → Confirm Deploy → Monitoring → Done. + "confirm_deploy": "⏳ Confirm Deploy — подтвердите прод-деплой", "deploying": "Deploying", "monitoring": "Monitoring after Deploy", } diff --git a/tests/test_launcher.py b/tests/test_launcher.py index 970f226..c47e9ed 100644 --- a/tests/test_launcher.py +++ b/tests/test_launcher.py @@ -323,3 +323,83 @@ class TestActionStageNoChangesNote: def test_never_raises_on_bad_input(self): """never-raise: odd inputs (None stage / None repo) degrade to None.""" assert action_stage_no_changes_note(None, None) is None + + +# --------------------------------------------------------------------------- +# ORCH-087 (BR-EFF): agent_runs.effort migration + launch-time stamp +# --------------------------------------------------------------------------- +class TestEffortStamp: + """TC-09/TC-10: the effort column is idempotent and stamped at launch.""" + + def _fresh_db(self, monkeypatch): + import src.db as db_module + if os.path.exists(_test_db): + os.unlink(_test_db) + monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False) + from src.db import init_db + init_db() + + def test_effort_migration_idempotent(self, monkeypatch): + """TC-09/AC-E.1: _ensure_column twice -> no error; column present.""" + self._fresh_db(monkeypatch) + from src.db import init_db, get_db + init_db() # second call must be a no-op + conn = get_db() + cols = [r[1] 
for r in conn.execute("PRAGMA table_info(agent_runs)").fetchall()] + conn.close() + assert "effort" in cols + + def test_spawn_stamps_resolved_effort(self, tmp_path, monkeypatch): + """TC-10/AC-E.1: _spawn writes the REAL resolved --effort to agent_runs. + + developer resolves to xhigh (ORCH-081 floor); the stamp must match that. + All OS/process side-effects are faked so nothing is actually launched. + """ + self._fresh_db(monkeypatch) + from src.db import get_db + import src.agents.launcher as L + + # A real repo dir so the isdir() guard passes; worktree is faked. + repo = "orchestrator" + (tmp_path / repo).mkdir() + monkeypatch.setattr(L.settings, "repos_dir", str(tmp_path), raising=False) + monkeypatch.setattr(L, "ensure_worktree", lambda r, b: str(tmp_path / repo)) + monkeypatch.setattr("src.projects.get_project_by_repo", lambda r: None) + + # No --effort env overrides -> developer falls to its xhigh floor. + monkeypatch.setattr(L.settings, "agent_effort_developer", "", raising=False) + monkeypatch.setattr(L.settings, "agent_effort_default", "", raising=False) + + # Fake the process + threads so nothing real runs. + class _Proc: + pid = 4242 + monkeypatch.setattr(L.subprocess, "Popen", lambda *a, **k: _Proc()) + + class _T: + def __init__(self, *a, **k): + pass + def start(self): + pass + monkeypatch.setattr(L.threading, "Thread", _T) + monkeypatch.setattr(L, "notify_agent_started", lambda *a, **k: None) + + # Seed a task row so _spawn can resolve the branch. 
+        conn = get_db()
+        cur = conn.execute(
+            "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, title) "
+            "VALUES (?,?,?,?,?,?)",
+            ("p1", "ORCH-087", repo, "feature/ORCH-087-x", "development", "t"),
+        )
+        tid = cur.lastrowid
+        conn.commit()
+        conn.close()
+
+        launcher = L.AgentLauncher()
+        run_id = launcher._spawn("developer", repo, task_content=None, task_id=tid)
+
+        conn = get_db()
+        row = conn.execute(
+            "SELECT effort FROM agent_runs WHERE id=?", (run_id,)
+        ).fetchone()
+        conn.close()
+        assert row[0] == "xhigh"
diff --git a/tests/test_notifications_orphans.py b/tests/test_notifications_orphans.py
new file mode 100644
index 0000000..5537324
--- /dev/null
+++ b/tests/test_notifications_orphans.py
@@ -0,0 +1,222 @@
+"""ORCH-087 (BR-G1): tracker_messages ledger — no orphaned cards in bump mode.
+
+The scalar tasks.tracker_message_id only ever knew the LAST mid, so any lost
+reference (delete-fail+send-ok, race, restart) orphaned older cards forever. The
+additive tracker_messages ledger lets every bump delete ALL still-open mids, not
+just the last one. These tests model the dominant orphan generators (question 2 in
+ADR-001) with Telegram fully mocked (no network).
+
+Covers TC-01..TC-05 / AC-1.2, AC-1.3, AC-X.1.
+"""
+
+import os
+import tempfile
+
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_orphans.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+
+import pytest  # noqa: E402
+
+import src.db as db_module  # noqa: E402
+from src.db import (  # noqa: E402
+    init_db, get_db, get_tracker_message_id, set_tracker_message_id,
+    add_tracker_message, get_open_tracker_messages, mark_tracker_message_deleted,
+)
+from src import notifications as N  # noqa: E402
+
+
+@pytest.fixture(autouse=True)
+def setup_db(monkeypatch):
+    monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    # Keep the render cheap & deterministic (no real Telegram / Plane).
+    monkeypatch.setattr(N, "render_task_tracker", lambda task_id: "CARD")
+    _bump_mode(monkeypatch)
+    yield
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+def _bump_mode(monkeypatch):
+    monkeypatch.setattr(N._get_settings(), "tracker_mode", "bump", raising=False)
+
+
+def _mk_task(stage="development", wid="ORCH-087"):
+    conn = get_db()
+    cur = conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, title) "
+        "VALUES (?, ?, ?, ?, ?, ?)",
+        ("p1", wid, "orchestrator", "feature/ORCH-087-x", stage, "orphan test"),
+    )
+    tid = cur.lastrowid
+    conn.commit()
+    conn.close()
+    return tid
+
+
+# --------------------------------------------------------------------------- #
+# ledger helpers (direct DB contract)
+# --------------------------------------------------------------------------- #
+def test_ledger_add_get_mark(monkeypatch):
+    """add -> open set; mark_deleted -> drops out; INSERT OR IGNORE idempotent."""
+    tid = _mk_task()
+    add_tracker_message(tid, 10)
+    add_tracker_message(tid, 11)
+    add_tracker_message(tid, 10)  # duplicate -> ignored, no resurrection
+    assert get_open_tracker_messages(tid) == [10, 11]
+    mark_tracker_message_deleted(tid, 10)
+    assert get_open_tracker_messages(tid) == [11]
+    # re-add of a deleted mid is ignored (PK exists) -> stays deleted.
+    add_tracker_message(tid, 10)
+    assert get_open_tracker_messages(tid) == [11]
+
+
+# --------------------------------------------------------------------------- #
+# TC-01: bump deletes ALL known open mids, not just the last
+# --------------------------------------------------------------------------- #
+def test_bump_deletes_all_open_mids(monkeypatch):
+    """TC-01/AC-1.2: every still-open card is deleted on the next bump."""
+    tid = _mk_task()
+    # Three orphans accumulated in the ledger from earlier desyncs.
+    for m in (100, 101, 102):
+        add_tracker_message(tid, m)
+    set_tracker_message_id(tid, 102)  # scalar only knows the last one
+
+    deleted = []
+    monkeypatch.setattr(N, "delete_telegram",
+                        lambda mid: deleted.append(mid) or True)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: 200)
+
+    N.update_task_tracker(tid)
+
+    assert sorted(deleted) == [100, 101, 102]  # ALL open mids deleted
+    # Old ones marked gone; only the new card is open.
+    assert get_open_tracker_messages(tid) == [200]
+    assert get_tracker_message_id(tid) == 200
+
+
+# --------------------------------------------------------------------------- #
+# TC-02: send -> None keeps the ledger/pointer intact (BR-6 / R-3)
+# --------------------------------------------------------------------------- #
+def test_send_none_keeps_ledger_and_pointer(monkeypatch):
+    """TC-02/AC-1.3: send fails -> no new mid recorded, pointer not wiped."""
+    tid = _mk_task()
+    add_tracker_message(tid, 100)
+    set_tracker_message_id(tid, 100)
+
+    # delete fails transiently so 100 stays open (alive); send returns None.
+    monkeypatch.setattr(N, "delete_telegram", lambda mid: False)
+    sends = []
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False:
+                        sends.append(1) or None)
+
+    N.update_task_tracker(tid)  # must not raise
+
+    assert len(sends) == 1  # exactly one attempt
+    assert get_tracker_message_id(tid) == 100  # pointer preserved
+    assert get_open_tracker_messages(tid) == [100]  # 100 still tracked for retry
+
+
+# --------------------------------------------------------------------------- #
+# TC-03: delete False -> stays open; "already gone" -> dropped
+# --------------------------------------------------------------------------- #
+def test_delete_transient_stays_open_gone_dropped(monkeypatch):
+    """TC-03: transient-delete mid retried next bump; gone mid excluded."""
+    tid = _mk_task()
+    add_tracker_message(tid, 100)  # will fail transiently -> stays
+    add_tracker_message(tid, 101)  # will be 'gone' (True) -> dropped
+
+    def _del(mid):
+        return mid != 100  # 100 -> False (transient), 101 -> True (gone)
+
+    monkeypatch.setattr(N, "delete_telegram", _del)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: 300)
+
+    N.update_task_tracker(tid)
+
+    # 100 still open (retry), 101 marked deleted, 300 new card open.
+    assert set(get_open_tracker_messages(tid)) == {100, 300}
+    assert get_tracker_message_id(tid) == 300
+
+
+# --------------------------------------------------------------------------- #
+# TC-04: rapid repeats / race -> one live card, <=1 send per call
+# --------------------------------------------------------------------------- #
+def test_repeated_bumps_converge_to_one_card(monkeypatch):
+    """TC-04/AC-X.1: repeated bumps self-heal to exactly one open card."""
+    tid = _mk_task()
+
+    seq = iter([501, 502, 503, 504])
+    sends_per_call = []
+
+    def _send(text, disable_notification=False):
+        sends_per_call.append(1)
+        return next(seq)
+
+    monkeypatch.setattr(N, "delete_telegram", lambda mid: True)
+    monkeypatch.setattr(N, "send_telegram", _send)
+
+    for _ in range(4):
+        before = len(sends_per_call)
+        N.update_task_tracker(tid)
+        assert len(sends_per_call) - before == 1  # <=1 send per call
+
+    # After the last bump only the newest card is open; all earlier deleted.
+    assert get_open_tracker_messages(tid) == [504]
+    assert get_tracker_message_id(tid) == 504
+
+
+# --------------------------------------------------------------------------- #
+# TC-05: ledger survives a "restart" (read from DB) -> old cards cleaned
+# --------------------------------------------------------------------------- #
+def test_ledger_survives_restart(monkeypatch):
+    """TC-05/AC-1.3: mids persisted in DB are cleaned on the next bump."""
+    tid = _mk_task()
+    # Simulate a previous process that created two cards but lost the scalar to
+    # one of them (orphan): both are in the ledger though.
+    add_tracker_message(tid, 700)
+    add_tracker_message(tid, 701)
+    set_tracker_message_id(tid, 701)  # scalar lost 700
+
+    deleted = []
+    monkeypatch.setattr(N, "delete_telegram",
+                        lambda mid: deleted.append(mid) or True)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: 800)
+
+    # "Fresh process" reads the ledger straight from the DB.
+    N.update_task_tracker(tid)
+
+    assert sorted(deleted) == [700, 701]  # the orphan 700 is reaped too
+    assert get_open_tracker_messages(tid) == [800]
+
+
+# --------------------------------------------------------------------------- #
+# never-raise on ledger/DB explosion
+# --------------------------------------------------------------------------- #
+def test_bump_never_raises_on_ledger_error(monkeypatch):
+    """AC-X.2: a ledger read blowing up does not break the bump path."""
+    tid = _mk_task()
+    monkeypatch.setattr(N, "get_open_tracker_messages",
+                        lambda task_id: (_ for _ in ()).throw(RuntimeError("db")),
+                        raising=False)
+    # Even if the import-bound name is used, force the failure via db module too.
+    monkeypatch.setattr(db_module, "get_open_tracker_messages",
+                        lambda task_id: (_ for _ in ()).throw(RuntimeError("db")),
+                        raising=False)
+    sent = []
+    monkeypatch.setattr(N, "delete_telegram", lambda mid: True)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False:
+                        sent.append(1) or 900)
+    # Must not raise; still sends the fresh card.
+    N.update_task_tracker(tid)
+    assert sent == [1]
diff --git a/tests/test_telegram_tracker.py b/tests/test_telegram_tracker.py
index 7c5adea..dae3bc8 100644
--- a/tests/test_telegram_tracker.py
+++ b/tests/test_telegram_tracker.py
@@ -191,9 +191,13 @@ def test_render_done_has_times_and_links():
     assert "\u0413\u041e\u0422\u041e\u0412\u041e" in text
     # ⏱️ with three times
     assert "\u23f1\ufe0f" in text
-    assert "\u0412\u0441\u0435\u0433\u043e" in text
-    assert "\u0430\u0433\u0435\u043d\u0442\u044b" in text
-    assert "\u0442\u0432\u043e\u0451" in text
+    # ORCH-087 (BR-G5): three explicitly-labelled metrics
+    # "Агенты … · твоё … · общее с ожиданием …" (was "Всего … · агенты … · твоё …").
+    assert "\u0410\u0433\u0435\u043d\u0442\u044b" in text  # Агенты
+    assert "\u0442\u0432\u043e\u0451" in text  # твоё
+    # общее с ожиданием
+    assert "\u043e\u0431\u0449\u0435\u0435 \u0441 \u043e\u0436\u0438\u0434\u0430\u043d\u0438\u0435\u043c" in text
+    assert "\u0412\u0441\u0435\u0433\u043e" not in text  # old "Всего" label gone
     # 📦 deployed line
     assert "\U0001f4e6" in text
diff --git a/tests/test_tracker_effort_time.py b/tests/test_tracker_effort_time.py
new file mode 100644
index 0000000..5d0023e
--- /dev/null
+++ b/tests/test_tracker_effort_time.py
@@ -0,0 +1,183 @@
+"""ORCH-087: effort-in-stage-line (BR-EFF), honest done-time (BR-G5),
+deterministic stage labels (G2) and deploy-cycle label (G3).
+
+Telegram/Plane fully isolated (render is pure DB). Covers TC-06, TC-11..TC-15
+and the confirm_deploy live-overlay label.
+"""
+
+import os
+import tempfile
+
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_eff_time.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+
+import pytest  # noqa: E402
+
+import src.db as db_module  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import notifications as N  # noqa: E402
+
+
+@pytest.fixture(autouse=True)
+def setup_db(monkeypatch):
+    monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    # No live overlay in render-only tests unless a test opts in.
+    monkeypatch.setattr(N._get_settings(), "tracker_live_status", False, raising=False)
+    yield
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+def _mk_task(stage="development", wid="ORCH-087", title="eff/time test",
+             brd_start=None, brd_end=None, created=None, updated=None):
+    conn = get_db()
+    cur = conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, title, "
+        "brd_review_started_at, brd_review_ended_at) VALUES (?,?,?,?,?,?,?,?)",
+        ("p1", wid, "orchestrator", "feature/ORCH-087-x", stage, title,
+         brd_start, brd_end),
+    )
+    tid = cur.lastrowid
+    if created or updated:
+        conn.execute(
+            "UPDATE tasks SET created_at=COALESCE(?, created_at), "
+            "updated_at=COALESCE(?, updated_at) WHERE id=?",
+            (created, updated, tid),
+        )
+    conn.commit()
+    conn.close()
+    return tid
+
+
+def _mk_run(tid, agent, started, finished, *, effort=None, model="tokenator/claude-opus-4-8",
+            in_tok=10, out_tok=5, cost=0.0, exit_code=0):
+    conn = get_db()
+    conn.execute(
+        "INSERT INTO agent_runs (task_id, agent, started_at, finished_at, "
+        "exit_code, input_tokens, output_tokens, cost_usd, model, effort) "
+        "VALUES (?,?,?,?,?,?,?,?,?,?)",
+        (tid, agent, started, finished, exit_code, in_tok, out_tok, cost, model, effort),
+    )
+    conn.commit()
+    conn.close()
+
+
+# --------------------------------------------------------------------------- #
+# G2: plane_status_label deterministic for every stage (TC-06)
+# --------------------------------------------------------------------------- #
+def test_plane_status_label_all_stages():
+    """TC-06/AC-2.2: every stage maps to its own label; deploy -> Awaiting Deploy."""
+    cases = {
+        "created": "To Analyse",
+        "analysis": "Analysis",
+        "architecture": "Architecture",
+        "development": "Development",
+        "review": "Code-Review",
+        "testing": "Testing",
+        "done": "Done",
+    }
+    for stage, expected in cases.items():
+        assert N.plane_status_label({"stage": stage}) == expected
+    deploy = N.plane_status_label({"stage": "deploy"})
+    assert "Awaiting Deploy" in deploy
+    # In Review derives from the brd-clock on the analysis stage.
+    in_review = N.plane_status_label(
+        {"stage": "analysis", "brd_review_started_at": "2026-06-04 10:00:00",
+         "brd_review_ended_at": None}
+    )
+    assert "In Review" in in_review
+
+
+def test_confirm_deploy_label_registered():
+    """G3/AC-3.x: the deploy-cycle gains a confirm_deploy overlay label."""
+    assert "confirm_deploy" in N._LIVE_BRANCH_LABELS
+    assert "Confirm Deploy" in N._LIVE_BRANCH_LABELS["confirm_deploy"]
+    # confirm_deploy is a REAL dedicated status -> no base-alias suppression.
+    assert "confirm_deploy" not in N._LIVE_BRANCH_BASE
+
+
+# --------------------------------------------------------------------------- #
+# BR-EFF: effort rendered next to the model (TC-11, TC-12)
+# --------------------------------------------------------------------------- #
+@pytest.mark.parametrize("agent,label,effort", [
+    ("developer", "Разработка", "xhigh"),
+    ("tester", "Тестирование", "medium"),
+    ("deployer", "Внедрение", "medium"),
+    ("analyst", "Анализ", "high"),
+    ("architect", "Архитектура", "high"),
+    ("reviewer", "Код ревью", "high"),
+])
+def test_stage_line_shows_effort(agent, label, effort):
+    """TC-11/AC-E.2,AC-E.3: stage line shows '· model · effort' for each role."""
+    tid = _mk_task(stage="done")
+    _mk_run(tid, agent, "2026-06-04 09:00:00", "2026-06-04 09:10:00", effort=effort)
+    text = N.render_task_tracker(tid)
+    line = [ln for ln in text.splitlines() if ln.startswith(f"✅ {label}")][0]
+    assert line.rstrip().endswith(f"opus-4-8 · {effort}")
+
+
+def test_stage_line_omits_empty_effort():
+    """TC-12/AC-E.4: NULL effort -> suffix omitted, render does not crash."""
+    tid = _mk_task(stage="analysis")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00", effort=None)
+    text = N.render_task_tracker(tid)
+    line = [ln for ln in text.splitlines() if ln.startswith("✅ Анализ")][0]
+    # Ends at the model (no trailing effort segment).
+    assert line.rstrip().endswith("opus-4-8")
+
+
+# --------------------------------------------------------------------------- #
+# BR-G5: honest done-time (TC-13, TC-14, TC-15)
+# --------------------------------------------------------------------------- #
+def test_done_review_time_capped():
+    """TC-13/AC-5.1: a ~6h open brd_review window is NOT shown as ~6h."""
+    # 6h review window (10:00 -> 16:00) with default 2h cap.
+    tid = _mk_task(
+        stage="done",
+        brd_start="2026-06-04 10:00:00", brd_end="2026-06-04 16:00:00",
+        created="2026-06-04 09:00:00", updated="2026-06-04 16:30:00",
+    )
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:30:00", effort="high")
+    text = N.render_task_tracker(tid)
+    time_line = [ln for ln in text.splitlines() if ln.startswith("⏱")][0]
+    # Capped to ~2h (120м), marked with '~'; the raw 360m is NOT shown as твоё.
+    assert "твоё ~120м" in time_line
+    assert "твоё 360м" not in time_line
+
+
+def test_done_review_time_under_cap_uncapped():
+    """AC-5.1: a normal short review window is shown verbatim (no '~')."""
+    tid = _mk_task(
+        stage="done",
+        brd_start="2026-06-04 10:00:00", brd_end="2026-06-04 10:08:00",
+        created="2026-06-04 09:00:00", updated="2026-06-04 10:30:00",
+    )
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:30:00", effort="high")
+    text = N.render_task_tracker(tid)
+    time_line = [ln for ln in text.splitlines() if ln.startswith("⏱")][0]
+    assert "твоё 8м" in time_line
+    assert "~" not in time_line
+
+
+def test_done_time_line_labels_and_agent_sum():
+    """TC-14,TC-15/AC-5.2,AC-5.3: agents=Σ runs; wall labelled 'общее с ожиданием'."""
+    tid = _mk_task(
+        stage="done",
+        created="2026-06-04 09:00:00", updated="2026-06-04 11:00:00",  # wall 120m
+    )
+    # Two runs: 10m + 6m = 16m of agent time.
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00", effort="high")
+    _mk_run(tid, "deployer", "2026-06-04 10:50:00", "2026-06-04 10:56:00", effort="medium")
+    text = N.render_task_tracker(tid)
+    time_line = [ln for ln in text.splitlines() if ln.startswith("⏱")][0]
+    # agents = 16m (exact Σ), wall = 120m labelled as "общее с ожиданием".
+    assert "Агенты 16м" in time_line  # Агенты 16м
+    assert "общее с ожиданием 120м" in time_line  # общее с ожиданием 120м
+    # wall (120m) != agents (16m) -> not presented as a sum.
+    assert "Всего" not in time_line  # no old "Всего"
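
Reviewer note: the done-time assertions above (TC-13..TC-15) pin down three independent metrics and a capped review time. The sketch below is one way that line could be assembled. It is not the project's render code: `fmt_minutes`, `done_time_line`, and the `review_cap_s` parameter are hypothetical names, and the 7200-second default only mirrors the 2h cap that the changelog gives for `tracker_brd_review_cap_s`.

```python
from typing import Iterable


def fmt_minutes(seconds: float) -> str:
    """Render a duration as whole minutes with the card's 'м' suffix."""
    return f"{int(seconds // 60)}м"


def done_time_line(agent_seconds: Iterable[float], review_seconds: float,
                   wall_seconds: float, review_cap_s: float = 7200) -> str:
    """Hypothetical builder for the done-time line (illustration only)."""
    # Agent time is the plain sum of run durations; it is never derived
    # from wall time, so the two values are not presented as a sum.
    agents = sum(agent_seconds)
    # Review time is clamped to the cap; '~' marks that a cut-off happened.
    capped = review_seconds > review_cap_s
    review = min(review_seconds, review_cap_s)
    marker = "~" if capped else ""
    return (
        f"⏱️ Агенты {fmt_minutes(agents)}"
        f" · твоё {marker}{fmt_minutes(review)}"
        f" · общее с ожиданием {fmt_minutes(wall_seconds)}"
    )


if __name__ == "__main__":
    # Two runs of 10 and 6 minutes, a 6h review window, 7.5h wall time.
    print(done_time_line([600, 360], 6 * 3600, 27000))
```

Keeping the cap as a parameter rather than a module constant makes the 6h-window case from TC-13 easy to reproduce without touching environment variables.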