fix(notifications): tracker card — status-map completeness, rollback reflection, stage-metric summation (ORCH-091)
Three verified live-card defects in src/notifications.py (ORCH-067/087), all additive and indication-only (STAGE_TRANSITIONS / QG_CHECKS / check_* / transport / DB schema untouched; never-raise; revert = git revert):

- Defect 1 (D1): _STAGE_STATUS_LABEL covered 8 of 10 STAGE_TRANSITIONS keys; deploy-staging and cancelled (ORCH-090) fell back to the misleading "To Analyse". Added deploy-staging→"Deploying (staging)" and cancelled→"Cancelled"; replaced the runtime fallback for an UNMAPPED stage with a neutral capitalized label (_neutral_stage_label). created stays an explicit "To Analyse"; broken/None input degrades safely. Map completeness is asserted programmatically from STAGE_TRANSITIONS.keys() (single source of truth), not from a static list.
- Defect 2 (D2): the stage-row loop drew ✅ for any stage with a finished agent run regardless of position; after a rollback the card showed the absurd "✅ Внедрение + 🔄 Разработка" (Deployment done while Development is running). Added a read-only _pipeline_pos derived from the STAGE_TRANSITIONS order and a suppression gate (✅ only when current_pos >= _pipeline_pos(stage_key)); the deploy-staging→deploy normalization is applied ONLY to the current position; is_active_stage is untouched.
- Defect 3 (D3): _stage_line took only the LAST run (ORCH-069: developer, 3 runs, Σ $3.98 rendered as ~$0.00). It now aggregates ALL of the agent's runs with the same per-run formulas as the task totals → strict convergence with SUM(agent_runs) by task_id; model/effort/attempt come from the last run.

Tests: test_tracker_status_line.py (ORCH-091 TC-01..TC-03 + updated tc06); new test_tracker_rollback_metrics.py (TC-05..TC-08). Full suite green (1370).
Docs: CHANGELOG + internals.md (architecture README already updated by architect).

Refs: ORCH-091
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -1,4 +1,4 @@
|
|||||||
Work item: ORCH-088
|
Work item: ORCH-091
|
||||||
Repo: orchestrator
|
Repo: orchestrator
|
||||||
Branch: feature/ORCH-088-orch-88-10-20
|
Branch: feature/ORCH-091-bug-to-analyse-stage-deploy-st
|
||||||
Stage: development
|
Stage: development
|
||||||
@@ -3,6 +3,12 @@
|
|||||||
Format: [Keep a Changelog](https://keepachangelog.com/). One entry per meaningful PR/task.
|
Format: [Keep a Changelog](https://keepachangelog.com/). One entry per meaningful PR/task.
|
||||||
|
|
||||||
## [Unreleased]
|
## [Unreleased]
|
||||||
|
- **Tracker live card: status-map completeness, rollback reflection, per-attempt summation of stage metrics** (ORCH-091, `fix`): three verified rendering defects in the Telegram card (`src/notifications.py`, ORCH-067/087). **Additive, never-raise, no new pipeline behavior:** `STAGE_TRANSITIONS` / `QG_CHECKS` / `check_*` / notification transport / DB schema are **untouched** (exactly one module of the indication layer is affected); no kill-switch is needed (rendering degrades safely, rollback = `git revert`).
|
||||||
|
- **Defect 1: header stuck on "To Analyse" (FR-1/2/3, AC-1/2/3):** `_STAGE_STATUS_LABEL` covered 8 of the 10 `STAGE_TRANSITIONS` keys; `deploy-staging` and `cancelled` (ORCH-090) fell through to the default "To Analyse" (a false "first status" during the staging deploy). The map is extended: `deploy-staging → "Deploying (staging)"` (plain style of an active stage; the "(staging)" suffix avoids a collision with the prod-overlay `_LIVE_BRANCH_LABELS['deploying']` and with the pause label of `deploy`), `cancelled → "Cancelled"` (offline base for ORCH-090; it matches the overlay label, so there is no precedence conflict). The runtime fallback of `plane_status_label` for an **unmapped** (future/unknown) stage changed from "To Analyse" to a **neutral** capitalized label (`_neutral_stage_label`, `"deploy-staging" → "Deploy Staging"`); `created` remains an explicit key and keeps an honest "To Analyse"; broken/None input → safe default. Map completeness is guaranteed **programmatically** by a test that iterates `STAGE_TRANSITIONS.keys()` (single source of truth): a new stage without a curated label turns the test red; auto-generating labels inside the module itself is forbidden (the map stays curated and human-readable).
|
||||||
|
- **Defect 2: misleading picture after a rollback (FR-4, AC-4):** the render loop emitted a `✅` line for every stage whose agent had a finished run, **ignoring its position** relative to the current stage; after a rollback (`deploy-staging → development` ORCH-043, `review → development` on REQUEST_CHANGES) the card showed the absurd "✅ Внедрение … + 🔄 Разработка" (Deployment done while Development is running). A lightweight read-only helper `_pipeline_pos` was added, based on the **order of `STAGE_TRANSITIONS`** (not on `_TRACKER_STAGES`, which lacks `deploy-staging`/`cancelled` and is not authoritative about ordering); suppression gate: the `✅` line is drawn only if `current_pos >= _pipeline_pos(stage_key)`. The `deploy-staging → deploy` normalization applies **only** to computing the current position (the collapsed "Внедрение" row carries `stage_key="deploy"`); `is_active_stage` is **unchanged** (zero regression of the active render). Runs suppressed by a rollback still count toward the task totals (intended rollback semantics).
|
||||||
|
- **Defect 3: understated stage-line metrics (FR-5, AC-5):** `_stage_line` took the LAST run (`last_done`), losing earlier attempts (verified on ORCH-069: developer, 3 runs, Σ $3.98 → the card showed ~$0.00). `_stage_line` now aggregates **ALL** `agent_runs` of the stage's agent with the same per-run formulas as the totals block (`Σ cost_usd`, `Σ _input_total`, `Σ output_tokens`, `Σ _duration_seconds`); model/effort/"attempt N" come from the last run (`id ASC`). Each agent is bound to exactly one `_TRACKER_STAGES` row → strict convergence invariant: Σ(stage lines) ≡ task totals ≡ `SUM(agent_runs)` by `task_id`. The format of lines/totals and the effort suffix (ORCH-087) are byte-for-byte unchanged.
|
||||||
|
- **Compatibility/regression (NFR-2, AC-6):** In Review (brd-clock), Awaiting Deploy (`deploy`), Done, the live-overlay branches (Needs Input / Blocked / Rejected / Cancelled / Confirm Deploy / Deploying / Monitoring), the "Подтверждение BRD" (BRD approval) line, the format of lines/totals, and the effort suffix are unchanged; all existing card tests are green. Before editing code marked ORCH-067/087/090, their ADRs were read; the invariants (single-card, never-raise, separation of the offline core and the live overlay, terminal `cancelled`) are preserved.
|
||||||
|
- Tests: `tests/test_tracker_status_line.py` (ORCH-091 TC-01..TC-03: map completeness from `STAGE_TRANSITIONS`, the staging label, neutral fallback/never-raise; `test_tc06_*` updated for the neutral fallback), new `tests/test_tracker_rollback_metrics.py` (TC-05..TC-08: `✅` suppression on rollback plus anti-regression for forward progress and the `deploy-staging` row; metric summation for developer, 3 runs ≈ $3.98; convergence of totals with `SUM(agent_runs)`; never-raise on NULL timestamps/broken stage). Full regression `tests/ -q` is green (1370). ADR: `docs/work-items/ORCH-091/06-adr/ADR-001-tracker-status-rollback-metrics.md`. Rollback: `git revert` (docs/code-only, one module, no migrations/kill-switch).
|
||||||
- **Task cancellation: Plane status STOP (agent stop + full reset) + closing the relaunch hole** (ORCH-090, `feat`): a dedicated Plane status **STOP** is the single declarative mechanism for cancelling a task, replacing manual surgery on the DB/processes. It introduces a **new system terminal state `cancelled`** (stage `tasks.stage='cancelled'` + job outcome `jobs.status='cancelled'`), on par with `done`. **Additive, behind a kill-switch, never-raise, restart-safe:** `STAGE_TRANSITIONS` (edge exit gates) / `QG_CHECKS` / `check_*` / the semantics of existing statuses are **untouched** (`cancelled` is a terminal sink, not a new edge); enduro is not affected; with `stop_status_enabled=false` there is zero regression.
|
- **Task cancellation: Plane status STOP (agent stop + full reset) + closing the relaunch hole** (ORCH-090, `feat`): a dedicated Plane status **STOP** is the single declarative mechanism for cancelling a task, replacing manual surgery on the DB/processes. It introduces a **new system terminal state `cancelled`** (stage `tasks.stage='cancelled'` + job outcome `jobs.status='cancelled'`), on par with `done`. **Additive, behind a kill-switch, never-raise, restart-safe:** `STAGE_TRANSITIONS` (edge exit gates) / `QG_CHECKS` / `check_*` / the semantics of existing statuses are **untouched** (`cancelled` is a terminal sink, not a new edge); enduro is not affected; with `stop_status_enabled=false` there is zero regression.
|
||||||
- **Recognition (fail-closed):** a new logical key `stop` in `_PLANE_NAME_TO_KEY` (`"STOP" → "stop"`) is **intentionally absent** from `_DEFAULT_STATES` (modelled on `confirm_deploy`/ORCH-059) → a board without a STOP status resolves to `None` → the branch is not activated (no `KeyError`, no blind cancellation). `handle_issue_updated` routes `stop` → `handle_stop` → `stage_engine.cancel_task` (checked FIRST, before to_analyse/approved/rejected).
|
- **Recognition (fail-closed):** a new logical key `stop` in `_PLANE_NAME_TO_KEY` (`"STOP" → "stop"`) is **intentionally absent** from `_DEFAULT_STATES` (modelled on `confirm_deploy`/ORCH-059) → a board without a STOP status resolves to `None` → the branch is not activated (no `KeyError`, no blind cancellation). `handle_issue_updated` routes `stop` → `handle_stop` → `stage_engine.cancel_task` (checked FIRST, before to_analyse/approved/rejected).
|
||||||
- **Full reset (outside the critical window, AC-1..AC-4):** graceful SIGTERM of the active agent via the reusable cascade `launcher.stop_process` (extracted from `_watchdog`: SIGTERM → grace → SIGKILL) by `jobs.pid`; `db.cancel_jobs_for_task` (queued/running → terminal `cancelled`, never requeued anywhere, since `claim_next_job` takes only `queued`); `git_worktree.remove_worktree` + a new never-raise `src/gitea.py::delete_remote_branch` (deletes **only** the feature branch; `main`/`master` are refused by an explicit guard; no force-push); durable `stage='cancelled'` + `cancelled_at`; a **tombstone** of natural keys with the suffix `#cancelled-<id>`. Docs artifacts (`01..17`) are preserved.
|
- **Full reset (outside the critical window, AC-1..AC-4):** graceful SIGTERM of the active agent via the reusable cascade `launcher.stop_process` (extracted from `_watchdog`: SIGTERM → grace → SIGKILL) by `jobs.pid`; `db.cancel_jobs_for_task` (queued/running → terminal `cancelled`, never requeued anywhere, since `claim_next_job` takes only `queued`); `git_worktree.remove_worktree` + a new never-raise `src/gitea.py::delete_remote_branch` (deletes **only** the feature branch; `main`/`master` are refused by an explicit guard; no force-push); durable `stage='cancelled'` + `cancelled_at`; a **tombstone** of natural keys with the suffix `#cancelled-<id>`. Docs artifacts (`01..17`) are preserved.
|
||||||
|
|||||||
@@ -134,6 +134,8 @@ claude.exe --print --system-prompt --allowedTools Read,Write,Edit,Bash
|
|||||||
|
|
||||||
**Card text (both modes, ORCH-042):** the label `Подтверждение BRD` (previously "Ревью БРД"); after the approve gate is passed, the BRD line starts with ✅ (the waiting branch keeps ⏸️/⏳); Russian display labels for stages (`Анализ / Архитектура / Разработка / Код ревью / Тестирование / Внедрение`); the final line `📦 Внедрено` (was `deployed`). Only displayed strings change; stage keys and agent names (tied to `_STAGE_ACTIVE_AGENT`, `last_done`, the DB) are not touched.
|
**Card text (both modes, ORCH-042):** the label `Подтверждение BRD` (previously "Ревью БРД"); after the approve gate is passed, the BRD line starts with ✅ (the waiting branch keeps ⏸️/⏳); Russian display labels for stages (`Анализ / Архитектура / Разработка / Код ревью / Тестирование / Внедрение`); the final line `📦 Внедрено` (was `deployed`). Only displayed strings change; stage keys and agent names (tied to `_STAGE_ACTIVE_AGENT`, `last_done`, the DB) are not touched.
|
||||||
|
|
||||||
|
**Stage lines: rollback reflection + metric summation (ORCH-091).** The stage-line render loop (`render_task_tracker` → `_stage_line`) is fixed along two axes. (1) **Rollback (Defect 2):** a stage's `✅` line is drawn only if its pipeline position is `≤` the task's current position; the position comes from the order of `STAGE_TRANSITIONS` (read-only helper `_pipeline_pos`, never-raise; unknown stage → "far future" → ✅ is not over-suppressed), with the `deploy-staging → deploy` normalization applied ONLY in the suppression gate (the collapsed "Внедрение" row carries `stage_key="deploy"`). After a rollback (`deploy-staging → development`, `review → development`), stages LATER than the current one are no longer drawn as passed, so the absurd "✅ Внедрение + 🔄 Разработка" disappears; `is_active_stage` is untouched. (2) **Metrics (Defect 3):** `_stage_line` aggregates ALL `agent_runs` of the stage's agent (Σ cost / Σ tokens / Σ time, with the same per-run formulas as the task totals block) rather than the last run; each agent is bound to exactly one `_TRACKER_STAGES` row, so Σ(stage lines) ≡ totals ≡ `SUM(agent_runs)` by `task_id`; model/effort/"attempt N" come from the last run. Runs suppressed by a rollback still count toward the totals (intended rollback semantics).
|
||||||
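The suppression gate from the paragraph above can be sketched in isolation. This is a minimal illustration, not the module's code: `PIPELINE_ORDER` is a hand-written stand-in for `list(STAGE_TRANSITIONS.keys())` (the ten-stage order is inferred from the positions quoted in this change), and `shows_checkmark` is a hypothetical helper name.

```python
# Stand-in for list(STAGE_TRANSITIONS.keys()); the real order lives in src/stages.py.
PIPELINE_ORDER = [
    "created", "analysis", "architecture", "development", "review",
    "testing", "deploy-staging", "deploy", "done", "cancelled",
]


def pipeline_pos(stage) -> int:
    """Index of the stage in the pipeline; unknown stages map past the end."""
    try:
        return PIPELINE_ORDER.index(stage)
    except (ValueError, TypeError):
        return len(PIPELINE_ORDER)


def shows_checkmark(current_stage, stage_key, has_finished_run) -> bool:
    """Hypothetical helper: should the finished-stage line be drawn?"""
    # Normalize only the *current* stage: the collapsed deploy row is keyed "deploy".
    effective = "deploy" if current_stage == "deploy-staging" else current_stage
    return bool(has_finished_run) and pipeline_pos(effective) >= pipeline_pos(stage_key)


# After a rollback to development, a finished deployer run yields no checkmark.
print(shows_checkmark("development", "deploy", True))     # False
print(shows_checkmark("development", "analysis", True))   # True
print(shows_checkmark("deploy-staging", "deploy", True))  # True
```

Keeping the normalization on the current stage only means the row keys themselves never need to know about `deploy-staging`.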
|
|
||||||
**Plane status line and clickable number (ORCH-067, layer B: indication).** Below the header the card carries a `📍 <Plane status>` line following the ORCH-066 model. The source is two-layered, with a **never raises** contract:
|
**Plane status line and clickable number (ORCH-067, layer B: indication).** Below the header the card carries a `📍 <Plane status>` line following the ORCH-066 model. The source is two-layered, with a **never raises** contract:
|
||||||
- **Offline core** `plane_status_label(task_row)`: a pure function with NO network access: `stage → status` (`created→To Analyse`, `analysis→Analysis`, `architecture→Architecture`, `development→Development`, `review→Code-Review`, `testing→Testing`, `deploy-staging→Deploying (staging)` [ORCH-091], `deploy→⏸️ Awaiting Deploy`, `done→Done`, `cancelled→Cancelled` [ORCH-091]) + `⏸️ In Review` from the brd clock (`brd_review_started_at` set, `…_ended_at` empty). **ORCH-091:** the `_STAGE_STATUS_LABEL` map covers ALL `STAGE_TRANSITIONS` keys (completeness is enforced by a test, not a static list); an unknown/future stage → neutral fallback (the capitalized stage name), NOT "To Analyse" (which remains only the explicit label of `created` and the safe degradation for genuinely broken input).
|
- **Offline core** `plane_status_label(task_row)`: a pure function with NO network access: `stage → status` (`created→To Analyse`, `analysis→Analysis`, `architecture→Architecture`, `development→Development`, `review→Code-Review`, `testing→Testing`, `deploy-staging→Deploying (staging)` [ORCH-091], `deploy→⏸️ Awaiting Deploy`, `done→Done`, `cancelled→Cancelled` [ORCH-091]) + `⏸️ In Review` from the brd clock (`brd_review_started_at` set, `…_ended_at` empty). **ORCH-091:** the `_STAGE_STATUS_LABEL` map covers ALL `STAGE_TRANSITIONS` keys (completeness is enforced by a test, not a static list); an unknown/future stage → neutral fallback (the capitalized stage name), NOT "To Analyse" (which remains only the explicit label of `created` and the safe degradation for genuinely broken input).
|
||||||
- **Live overlay** `_live_plane_branch_override`: best-effort; it adds the branch statuses that cannot be told apart offline (Needs Input / Blocked / Rejected / Cancelled / Deploying / Monitoring after Deploy) by reading the live Plane status (`fetch_issue_state` with a short `tracker_live_status_timeout_s`, TTL cache `tracker_live_status_ttl_s`, kill-switch `tracker_live_status`). Any failure / disabled flag / missing data → the offline label; for `⏸️ In Review` (the brd clock is authoritative) the overlay is not consulted. Anti-false-positive: `deploying/monitoring` states that alias the base UUID on a project without a dedicated status (enduro) do not trigger an override.
|
- **Live overlay** `_live_plane_branch_override`: best-effort; it adds the branch statuses that cannot be told apart offline (Needs Input / Blocked / Rejected / Cancelled / Deploying / Monitoring after Deploy) by reading the live Plane status (`fetch_issue_state` with a short `tracker_live_status_timeout_s`, TTL cache `tracker_live_status_ttl_s`, kill-switch `tracker_live_status`). Any failure / disabled flag / missing data → the offline label; for `⏸️ In Review` (the brd clock is authoritative) the overlay is not consulted. Anti-false-positive: `deploying/monitoring` states that alias the base UUID on a project without a dedicated status (enduro) do not trigger an override.
|
||||||
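The offline-core lookup with its neutral fallback can be illustrated roughly as follows. This is a sketch under assumptions: the map is abbreviated to three entries, `offline_label` is a hypothetical name standing in for the stage lookup inside `plane_status_label`, and the brd-clock branch is left out.

```python
DEFAULT_STATUS_LABEL = "To Analyse"

# Abbreviated stand-in for the curated _STAGE_STATUS_LABEL map.
STAGE_STATUS_LABEL = {
    "created": "To Analyse",
    "deploy-staging": "Deploying (staging)",
    "cancelled": "Cancelled",
}


def neutral_stage_label(stage) -> str:
    """Capitalized stage name for an unmapped stage; safe default if empty."""
    try:
        text = str(stage).strip()
        if not text:
            return DEFAULT_STATUS_LABEL
        return text.replace("-", " ").title()
    except Exception:
        return DEFAULT_STATUS_LABEL


def offline_label(stage) -> str:
    """Curated label when mapped, neutral capitalized label otherwise."""
    return STAGE_STATUS_LABEL.get(stage) or neutral_stage_label(stage)


print(offline_label("deploy-staging"))  # curated label wins
print(offline_label("smoke-test"))      # hypothetical future stage, neutral label
```

The curated map stays the only source of human-meaningful labels; the fallback exists so that a stage nobody has labelled yet is at least named honestly.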
|
|||||||
@@ -254,6 +254,28 @@ _STAGE_ACTIVE_AGENT = {
|
|||||||
"deploy": "deployer",
|
"deploy": "deployer",
|
||||||
}
|
}
|
||||||
|
|
||||||
|
# ORCH-091 (D2): pipeline order is read (read-only) from the single source of
|
||||||
|
# truth src/stages.py::STAGE_TRANSITIONS — NOT from _TRACKER_STAGES (which lacks
|
||||||
|
# deploy-staging/cancelled and is not authoritative about ordering, NFR-3). Used
|
||||||
|
# to suppress the "✅ <stage>" line for a stage positioned AFTER the task's
|
||||||
|
# current stage (a rollback, e.g. deploy-staging -> development), which otherwise
|
||||||
|
# rendered the absurd "✅ Внедрение … + 🔄 Разработка".
|
||||||
|
from .stages import STAGE_TRANSITIONS # noqa: E402
|
||||||
|
|
||||||
|
_PIPELINE_ORDER = list(STAGE_TRANSITIONS.keys())
|
||||||
|
|
||||||
|
|
||||||
|
def _pipeline_pos(stage) -> int:
|
||||||
|
"""Index of ``stage`` in the pipeline order; unknown -> "far future".
|
||||||
|
|
||||||
|
Never raises. An unknown/broken stage maps past the end so it is never
|
||||||
|
spuriously suppressed (degrades to the pre-ORCH-091 behaviour: ✅ kept).
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
return _PIPELINE_ORDER.index(stage)
|
||||||
|
except (ValueError, TypeError):
|
||||||
|
return len(_PIPELINE_ORDER)
|
||||||
|
|
||||||
|
|
||||||
def _fmt_minutes(seconds) -> str:
|
def _fmt_minutes(seconds) -> str:
|
||||||
"""Render a duration in whole minutes: 0..59s -> '<1м', else '<n>м'."""
|
"""Render a duration in whole minutes: 0..59s -> '<1м', else '<n>м'."""
|
||||||
@@ -442,23 +464,42 @@ def render_task_tracker(task_id: int) -> str:
|
|||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
|
|
||||||
def _stage_line(label, run):
|
def _stage_line(label, stage_runs):
|
||||||
usage = {
|
# ORCH-091 (D3): aggregate ALL of the stage agent's runs (retries
|
||||||
"input_tokens": run["input_tokens"],
|
# included) with the SAME per-run formulas as the task totals block
|
||||||
"cache_read_tokens": run["cache_read_tokens"],
|
# (:388-404) -> the stage line converges with SUM(agent_runs) instead of
|
||||||
"cache_creation_tokens": run["cache_creation_tokens"],
|
# showing only the last run (which understated a multi-attempt stage:
|
||||||
}
|
# ORCH-069 developer Σ $3.98 rendered as ~$0.00). Each agent maps to
|
||||||
in_tok = fmt_tokens(_input_total(usage))
|
# exactly one _TRACKER_STAGES row, so Σ(stage lines) ≡ task totals.
|
||||||
out_tok = fmt_tokens(run["output_tokens"])
|
in_sum = 0
|
||||||
cost = fmt_cost(run["cost_usd"])
|
out_sum = 0
|
||||||
dur = _fmt_minutes(_duration_seconds(run["started_at"], run["finished_at"]))
|
cost_sum = 0.0
|
||||||
model = short_model_name(run["model"])
|
dur_sum = 0
|
||||||
|
for run in stage_runs:
|
||||||
|
usage = {
|
||||||
|
"input_tokens": run["input_tokens"],
|
||||||
|
"cache_read_tokens": run["cache_read_tokens"],
|
||||||
|
"cache_creation_tokens": run["cache_creation_tokens"],
|
||||||
|
}
|
||||||
|
in_sum += _input_total(usage)
|
||||||
|
out_sum += int(run["output_tokens"] or 0)
|
||||||
|
cost_sum += float(run["cost_usd"] or 0.0)
|
||||||
|
d = _duration_seconds(run["started_at"], run["finished_at"])
|
||||||
|
if d is not None:
|
||||||
|
dur_sum += d
|
||||||
|
in_tok = fmt_tokens(in_sum)
|
||||||
|
out_tok = fmt_tokens(out_sum)
|
||||||
|
cost = fmt_cost(cost_sum)
|
||||||
|
dur = _fmt_minutes(dur_sum)
|
||||||
|
# Model/effort/"попытка N" come from the LAST run (agent_runs are id ASC).
|
||||||
|
last = stage_runs[-1] if stage_runs else None
|
||||||
|
model = short_model_name(last["model"]) if last is not None else ""
|
||||||
model_suffix = f" \u00b7 {model}" if model else ""
|
model_suffix = f" \u00b7 {model}" if model else ""
|
||||||
# ORCH-087 (BR-EFF): render the resolved --effort next to the model
|
# ORCH-087 (BR-EFF): render the resolved --effort next to the model
|
||||||
# ("· opus-4-8 · xhigh"). Stamped at launch in agent_runs.effort; empty /
|
# ("· opus-4-8 · xhigh"). Stamped at launch in agent_runs.effort; empty /
|
||||||
# missing -> suffix omitted (like the model suffix). Historical rows with
|
# missing -> suffix omitted (like the model suffix). Historical rows with
|
||||||
# NULL effort fall back to the config-resolved effort for the agent.
|
# NULL effort fall back to the config-resolved effort for the agent.
|
||||||
effort = _run_effort(run)
|
effort = _run_effort(last) if last is not None else ""
|
||||||
effort_suffix = f" \u00b7 {effort}" if effort else ""
|
effort_suffix = f" \u00b7 {effort}" if effort else ""
|
||||||
return (
|
return (
|
||||||
f"\u2705 {label:<13} {dur} \u00b7 "
|
f"\u2705 {label:<13} {dur} \u00b7 "
|
||||||
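The per-stage aggregation in this hunk can be tried outside the renderer with a small stand-alone sketch. Several things here are assumptions rather than the module's code: `duration_seconds` and `input_total` are simplified stand-ins for `_duration_seconds` and `_input_total` (the latter is assumed to add fresh input and both cache counters), runs are plain dicts instead of DB rows, and the sample costs are invented so that they sum to the $3.98 quoted for ORCH-069.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def duration_seconds(started, finished):
    """Stand-in for _duration_seconds: None when either timestamp is missing."""
    if not started or not finished:
        return None
    delta = datetime.strptime(finished, TS_FORMAT) - datetime.strptime(started, TS_FORMAT)
    return int(delta.total_seconds())


def input_total(run) -> int:
    """Assumed shape of _input_total: fresh input plus both cache counters."""
    keys = ("input_tokens", "cache_read_tokens", "cache_creation_tokens")
    return sum(int(run.get(key) or 0) for key in keys)


def aggregate_stage_runs(runs):
    """Sum metrics over every run; take the model from the last run only."""
    totals = {"input": 0, "output": 0, "cost": 0.0, "seconds": 0, "model": ""}
    for run in runs:
        totals["input"] += input_total(run)
        totals["output"] += int(run.get("output_tokens") or 0)
        totals["cost"] += float(run.get("cost_usd") or 0.0)
        seconds = duration_seconds(run.get("started_at"), run.get("finished_at"))
        if seconds is not None:  # in-flight or NULL timestamps add nothing
            totals["seconds"] += seconds
    if runs:
        totals["model"] = runs[-1].get("model") or ""
    return totals


# Three developer attempts (invented sample data); the last one is still running.
runs = [
    {"input_tokens": 10, "cache_read_tokens": 0, "cache_creation_tokens": 0,
     "output_tokens": 5, "cost_usd": 1.50, "model": "model-a",
     "started_at": "2026-06-04 09:20:00", "finished_at": "2026-06-04 09:40:00"},
    {"input_tokens": 20, "cache_read_tokens": 5,
     "output_tokens": 5, "cost_usd": 2.00, "model": "model-a",
     "started_at": "2026-06-04 10:20:00", "finished_at": "2026-06-04 10:30:00"},
    {"input_tokens": 30, "cache_creation_tokens": 5,
     "output_tokens": None, "cost_usd": 0.48, "model": "model-b",
     "started_at": "2026-06-04 11:00:00", "finished_at": None},
]
totals = aggregate_stage_runs(runs)
print(round(totals["cost"], 2), totals["seconds"], totals["model"])
```

Taking only the last run would have reported the $0.48 attempt alone, which is the understatement this change removes.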
@@ -471,6 +512,14 @@ def render_task_tracker(task_id: int) -> str:
|
|||||||
brd_ended = task["brd_review_ended_at"]
|
brd_ended = task["brd_review_ended_at"]
|
||||||
review_seconds = _duration_seconds(brd_started, brd_ended)
|
review_seconds = _duration_seconds(brd_started, brd_ended)
|
||||||
|
|
||||||
|
# ORCH-091 (D2): the task's current position in the pipeline, used to suppress
|
||||||
|
# ✅-lines for stages POSITIONED AFTER it (a rollback). The deploy-staging ->
|
||||||
|
# deploy normalization is applied ONLY here (not to is_active_stage): the
|
||||||
|
# collapsed "Внедрение" row carries stage_key="deploy" (pos 7); on
|
||||||
|
# stage='deploy-staging' (pos 6) the row would otherwise be wrongly suppressed.
|
||||||
|
effective_stage = "deploy" if stage == "deploy-staging" else stage
|
||||||
|
current_pos = _pipeline_pos(effective_stage)
|
||||||
|
|
||||||
for stage_key, label, agent in _TRACKER_STAGES:
|
for stage_key, label, agent in _TRACKER_STAGES:
|
||||||
run = last_done.get(agent)
|
run = last_done.get(agent)
|
||||||
# The stage is "in progress" only when it is the task's current stage AND
|
# The stage is "in progress" only when it is the task's current stage AND
|
||||||
@@ -500,9 +549,14 @@ def render_task_tracker(task_id: int) -> str:
|
|||||||
lines.append(
|
lines.append(
|
||||||
f"\U0001f504 {label:<13} \u2026 \u00b7 \u0438\u0434\u0451\u0442"
|
f"\U0001f504 {label:<13} \u2026 \u00b7 \u0438\u0434\u0451\u0442"
|
||||||
)
|
)
|
||||||
elif run is not None:
|
elif run is not None and current_pos >= _pipeline_pos(stage_key):
|
||||||
lines.append(_stage_line(label, run))
|
# ORCH-091 (D2): show ✅ only for stages AT or BEFORE the current
|
||||||
# else: not started yet -> not shown.
|
# position. A finished run on a stage POSITIONED AFTER the current one
|
||||||
|
# (rollback, e.g. deploy-staging->development) is suppressed — its runs
|
||||||
|
# still count in the task totals (intended rollback semantics). Pass the
|
||||||
|
# FULL run list so the line aggregates all attempts (D3).
|
||||||
|
lines.append(_stage_line(label, agent_runs))
|
||||||
|
# else: not started yet, or rolled back past -> not shown.
|
||||||
|
|
||||||
# Insert the BRD review line right after Analysis.
|
# Insert the BRD review line right after Analysis.
|
||||||
if stage_key == "analysis" and brd_started:
|
if stage_key == "analysis" and brd_started:
|
||||||
@@ -944,8 +998,16 @@ _STAGE_STATUS_LABEL = {
|
|||||||
"development": "Development",
|
"development": "Development",
|
||||||
"review": "Code-Review",
|
"review": "Code-Review",
|
||||||
"testing": "Testing",
|
"testing": "Testing",
|
||||||
|
# ORCH-091 (D1): deploy-staging was missing -> the card froze on "To Analyse".
|
||||||
|
# Plain-style active label (like Analysis/Testing, no ⏸️ pause marker); the
|
||||||
|
# "(staging)" suffix keeps it distinct from the prod-overlay "Deploying"
|
||||||
|
# (_LIVE_BRANCH_LABELS['deploying']) and from the deploy stage's pause label.
|
||||||
|
"deploy-staging": "Deploying (staging)",
|
||||||
"deploy": "⏸️ Awaiting Deploy — ожидание Confirm Deploy",
|
"deploy": "⏸️ Awaiting Deploy — ожидание Confirm Deploy",
|
||||||
"done": "Done",
|
"done": "Done",
|
||||||
|
# ORCH-091 (D1): offline base for the ORCH-090 system-terminal. Matches the
|
||||||
|
# overlay label _LIVE_BRANCH_LABELS['cancelled'] -> no precedence conflict.
|
||||||
|
"cancelled": "Cancelled",
|
||||||
}
|
}
|
||||||
_DEFAULT_STATUS_LABEL = "To Analyse"
|
_DEFAULT_STATUS_LABEL = "To Analyse"
|
||||||
_IN_REVIEW_LABEL = (
|
_IN_REVIEW_LABEL = (
|
||||||
@@ -987,6 +1049,25 @@ def _row_get(row, key, default=None):
|
|||||||
return default
|
return default
|
||||||
|
|
||||||
|
|
||||||
|
def _neutral_stage_label(stage) -> str:
|
||||||
|
"""ORCH-091 (D1): neutral fallback for a stage NOT in _STAGE_STATUS_LABEL.
|
||||||
|
|
||||||
|
A genuinely unknown / future / broken stage gets a capitalized stage name
|
||||||
|
("deploy-staging" -> "Deploy Staging") instead of the misleading "To Analyse"
|
||||||
|
(which read as a false "first status"). Empty / unparseable -> the safe
|
||||||
|
_DEFAULT_STATUS_LABEL. Never raises. NOTE: the curated map stays the source of
|
||||||
|
human-meaningful labels; this is only the safety net for unmapped stages
|
||||||
|
(FR-3 / AC-3).
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
s = str(stage).strip()
|
||||||
|
if not s:
|
||||||
|
return _DEFAULT_STATUS_LABEL
|
||||||
|
return s.replace("-", " ").title()
|
||||||
|
except Exception:
|
||||||
|
return _DEFAULT_STATUS_LABEL
|
||||||
|
|
||||||
|
|
||||||
def plane_status_label(task_row) -> str:
|
def plane_status_label(task_row) -> str:
|
||||||
"""ORCH-067 (Р-1, layer 1): current Plane status label for the card header.
|
"""ORCH-067 (Р-1, layer 1): current Plane status label for the card header.
|
||||||
|
|
||||||
@@ -1006,7 +1087,13 @@ def plane_status_label(task_row) -> str:
|
|||||||
ended = _row_get(task_row, "brd_review_ended_at")
|
ended = _row_get(task_row, "brd_review_ended_at")
|
||||||
if started and not ended:
|
if started and not ended:
|
||||||
return _IN_REVIEW_LABEL
|
return _IN_REVIEW_LABEL
|
||||||
return _STAGE_STATUS_LABEL.get(stage, _DEFAULT_STATUS_LABEL)
|
# ORCH-091 (D1/FR-3): a mapped stage keeps its curated label; an UNMAPPED
|
||||||
|
# (future/unknown) stage degrades to a neutral capitalized label, NOT the
|
||||||
|
# misleading "To Analyse". 'created' stays an explicit key -> "To Analyse".
|
||||||
|
label = _STAGE_STATUS_LABEL.get(stage)
|
||||||
|
if label:
|
||||||
|
return label
|
||||||
|
return _neutral_stage_label(stage)
|
||||||
except Exception:
|
except Exception:
|
||||||
return _DEFAULT_STATUS_LABEL
|
return _DEFAULT_STATUS_LABEL
|
||||||
|
|
||||||
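The completeness guarantee for the status map (every `STAGE_TRANSITIONS` key must carry a curated label) could be checked along these lines. This is only a sketch: both dictionaries are hand-written stand-ins built from the labels listed in this change, the transition values are irrelevant placeholders, and `unlabeled_stages` is a hypothetical helper, not the test shipped in `tests/test_tracker_status_line.py`.

```python
# Stand-in for src/stages.py::STAGE_TRANSITIONS; only the keys matter here.
STAGE_TRANSITIONS = dict.fromkeys([
    "created", "analysis", "architecture", "development", "review",
    "testing", "deploy-staging", "deploy", "done", "cancelled",
])

# Stand-in for the curated _STAGE_STATUS_LABEL map (deploy label shortened).
STAGE_STATUS_LABEL = {
    "created": "To Analyse",
    "analysis": "Analysis",
    "architecture": "Architecture",
    "development": "Development",
    "review": "Code-Review",
    "testing": "Testing",
    "deploy-staging": "Deploying (staging)",
    "deploy": "⏸️ Awaiting Deploy",
    "done": "Done",
    "cancelled": "Cancelled",
}


def unlabeled_stages(transitions, labels):
    """Stages present in the pipeline but missing a non-empty curated label."""
    return [stage for stage in transitions if not labels.get(stage)]


# Iterating the pipeline keys (not a static list) makes a new stage fail loudly.
assert unlabeled_stages(STAGE_TRANSITIONS, STAGE_STATUS_LABEL) == []

with_new_stage = {**STAGE_TRANSITIONS, "smoke-test": None}  # hypothetical future stage
print(unlabeled_stages(with_new_stage, STAGE_STATUS_LABEL))
```

Deriving the expected keys from the pipeline definition keeps the check from drifting when a stage is added.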
|
|||||||
283
tests/test_tracker_rollback_metrics.py
Normal file
@@ -0,0 +1,283 @@
|
|||||||
|
"""ORCH-091 — Group 2 (D2/D3): rollback reflection + stage-metric summation.
|
||||||
|
|
||||||
|
Covers TC-05..TC-08 from 04-test-plan.yaml. The render path is pure DB (no
|
||||||
|
network); a temp SQLite holds tasks + agent_runs.
|
||||||
|
|
||||||
|
TC-05 / AC-4 — rollback deploy-staging->development: Development active (🔄),
|
||||||
|
Testing/Внедрение NOT shown ✅, Анализ/Архитектура stay ✅.
|
||||||
|
TC-06 / AC-5 — stage line sums ALL of an agent's runs (ORCH-069 developer
|
||||||
|
3 runs ≈ $3.98), not the last run.
|
||||||
|
TC-07 / AC-5 — task totals (💰/🔢/⏱) converge with SUM(agent_runs).
|
||||||
|
TC-08 / AC-7 — render_task_tracker never raises on broken/partial rows.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import tempfile
|
||||||
|
|
||||||
|
os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
|
||||||
|
os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
|
||||||
|
|
||||||
|
_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_rollback_metrics.db")
|
||||||
|
os.environ["ORCH_DB_PATH"] = _test_db
|
||||||
|
|
||||||
|
import pytest # noqa: E402
|
||||||
|
|
||||||
|
import src.db as db_module # noqa: E402
|
||||||
|
from src.db import init_db, get_db # noqa: E402
|
||||||
|
from src import notifications as N # noqa: E402
|
||||||
|
from src.usage import fmt_cost, fmt_tokens, _input_total # noqa: E402
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture(autouse=True)
|
||||||
|
def setup_db(monkeypatch):
|
||||||
|
monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
|
||||||
|
if os.path.exists(_test_db):
|
||||||
|
os.unlink(_test_db)
|
||||||
|
init_db()
|
||||||
|
# Render-only: keep the live overlay off (offline core under test).
|
||||||
|
monkeypatch.setattr(N._get_settings(), "tracker_live_status", False, raising=False)
|
||||||
|
yield
|
||||||
|
if os.path.exists(_test_db):
|
||||||
|
os.unlink(_test_db)
|
||||||
|
|
||||||
|
|
||||||
|
def _mk_task(stage="development", wid="ORCH-091", title="rollback/metrics",
|
||||||
|
created=None, updated=None):
|
||||||
|
conn = get_db()
|
||||||
|
cur = conn.execute(
|
||||||
|
"INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, title) "
|
||||||
|
"VALUES (?,?,?,?,?,?)",
|
||||||
|
("p1", wid, "orchestrator", "feature/ORCH-091-x", stage, title),
|
||||||
|
)
|
||||||
|
tid = cur.lastrowid
|
||||||
|
if created or updated:
|
||||||
|
conn.execute(
|
||||||
|
"UPDATE tasks SET created_at=COALESCE(?, created_at), "
|
||||||
|
"updated_at=COALESCE(?, updated_at) WHERE id=?",
|
||||||
|
(created, updated, tid),
|
||||||
|
)
|
||||||
|
conn.commit()
|
||||||
|
conn.close()
|
||||||
|
return tid
|
||||||
|
|
||||||
|
|
||||||
|
def _mk_run(tid, agent, started, finished, *, model="tokenator/claude-opus-4-8",
|
||||||
|
in_tok=10, out_tok=5, cache_read=0, cache_creation=0, cost=0.0,
|
||||||
|
effort=None, exit_code=0):
|
||||||
|
conn = get_db()
|
||||||
|
conn.execute(
|
||||||
|
"INSERT INTO agent_runs (task_id, agent, started_at, finished_at, "
|
||||||
|
"exit_code, input_tokens, output_tokens, cache_read_tokens, "
|
||||||
|
"cache_creation_tokens, cost_usd, model, effort) "
|
||||||
|
"VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
|
||||||
|
(tid, agent, started, finished, exit_code, in_tok, out_tok, cache_read,
|
||||||
|
cache_creation, cost, model, effort),
|
||||||
|
)
|
||||||
|
conn.commit()
|
||||||
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
|
def _stage_line(text, label):
|
||||||
|
"""The single '✅ <label> ...' line for a stage, or None."""
|
||||||
|
for ln in text.splitlines():
|
||||||
|
if ln.startswith(f"✅ {label}"):
|
||||||
|
return ln
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _has_active(text, label):
|
||||||
|
"""True if the '🔄 <label> ...' active line is present."""
|
||||||
|
return any(ln.startswith(f"🔄 {label}") for ln in text.splitlines())
|
||||||
|
|
||||||
|
|
||||||
|
# =========================================================================== #
# TC-05 / AC-4 — rollback reflection (deploy-staging -> development)
# =========================================================================== #
def test_tc05_rollback_suppresses_later_stage_checkmarks():
    """A task back on stage='development' after later stages ran: Development is
    active (🔄), and Тестирование/Внедрение/Код ревью are NOT shown as ✅, while
    earlier stages (Анализ/Архитектура) stay ✅."""
    tid = _mk_task(stage="development")
    # Earlier stages finished.
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00")
    _mk_run(tid, "architect", "2026-06-04 09:10:00", "2026-06-04 09:20:00")
    # First development pass finished, then later stages ran...
    _mk_run(tid, "developer", "2026-06-04 09:20:00", "2026-06-04 09:40:00", cost=1.0)
    _mk_run(tid, "reviewer", "2026-06-04 09:40:00", "2026-06-04 09:50:00")
    _mk_run(tid, "tester", "2026-06-04 09:50:00", "2026-06-04 10:00:00")
    _mk_run(tid, "deployer", "2026-06-04 10:00:00", "2026-06-04 10:10:00")
    # ...then a rollback re-launched developer -> in-flight run (finished_at NULL).
    _mk_run(tid, "developer", "2026-06-04 10:20:00", None, exit_code=None, cost=0.0)

    text = N.render_task_tracker(tid)

    # Development active, not a ✅.
    assert _has_active(text, "Разработка"), text
    # Later-than-current stages: no ✅ line (the rollback is honestly reflected).
    assert _stage_line(text, "Код ревью") is None, text
    assert _stage_line(text, "Тестирование") is None, text
    assert _stage_line(text, "Внедрение") is None, text
    # Earlier stages still ✅.
    assert _stage_line(text, "Анализ") is not None, text
    assert _stage_line(text, "Архитектура") is not None, text


def test_tc05_forward_progress_keeps_earlier_checkmarks():
    """Regression guard: normal forward progress (no rollback) still shows all
    earlier stages ✅ — the suppression gate only fires for stages AFTER current."""
    tid = _mk_task(stage="testing")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00")
    _mk_run(tid, "architect", "2026-06-04 09:10:00", "2026-06-04 09:20:00")
    _mk_run(tid, "developer", "2026-06-04 09:20:00", "2026-06-04 09:40:00")
    _mk_run(tid, "reviewer", "2026-06-04 09:40:00", "2026-06-04 09:50:00")
    # tester in-flight on the testing stage.
    _mk_run(tid, "tester", "2026-06-04 09:50:00", None, exit_code=None)

    text = N.render_task_tracker(tid)
    assert _stage_line(text, "Анализ") is not None
    assert _stage_line(text, "Архитектура") is not None
    assert _stage_line(text, "Разработка") is not None
    assert _stage_line(text, "Код ревью") is not None
    assert _has_active(text, "Тестирование")


def test_tc05_deploy_staging_keeps_deployer_row():
    """Normalization: on stage='deploy-staging' the collapsed 'Внедрение' row
    (stage_key='deploy') is NOT wrongly suppressed by the rollback gate."""
    tid = _mk_task(stage="deploy-staging")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00")
    _mk_run(tid, "architect", "2026-06-04 09:10:00", "2026-06-04 09:20:00")
    _mk_run(tid, "developer", "2026-06-04 09:20:00", "2026-06-04 09:40:00")
    _mk_run(tid, "reviewer", "2026-06-04 09:40:00", "2026-06-04 09:50:00")
    _mk_run(tid, "tester", "2026-06-04 09:50:00", "2026-06-04 10:00:00")
    # staging deploy finished (deployer agent, collapsed into Внедрение).
    _mk_run(tid, "deployer", "2026-06-04 10:00:00", "2026-06-04 10:10:00")

    text = N.render_task_tracker(tid)
    # Внедрение must NOT be suppressed (preserved pre-ORCH-091 behaviour).
    assert _stage_line(text, "Внедрение") is not None, text
    assert _stage_line(text, "Тестирование") is not None, text


# =========================================================================== #
# TC-06 / AC-5 — stage-metric summation over retries (ORCH-069 fixture)
# =========================================================================== #
def test_tc06_stage_line_sums_all_developer_runs():
    """developer with 3 runs (ORCH-069: Σ ≈ $3.98) -> the 'Разработка' line shows
    Σ cost / Σ tokens / Σ time, NOT the last run alone."""
    tid = _mk_task(stage="review")  # past development -> ✅ shown
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00")
    _mk_run(tid, "architect", "2026-06-04 09:10:00", "2026-06-04 09:20:00")
    # Three developer attempts: $1.50 + $2.00 + $0.48 = $3.98; 30m total.
    _mk_run(tid, "developer", "2026-06-04 09:20:00", "2026-06-04 09:30:00",
            cost=1.50, in_tok=100, out_tok=40, cache_read=10)
    _mk_run(tid, "developer", "2026-06-04 09:30:00", "2026-06-04 09:45:00",
            cost=2.00, in_tok=200, out_tok=60, cache_creation=20)
    _mk_run(tid, "developer", "2026-06-04 09:45:00", "2026-06-04 09:50:00",
            cost=0.48, in_tok=50, out_tok=10)

    text = N.render_task_tracker(tid)
    line = _stage_line(text, "Разработка")
    assert line is not None, text
    # Σ cost = $3.98 (not the last $0.48).
    assert fmt_cost(3.98) in line, line
    assert fmt_cost(0.48) not in line, line
    # Σ output tokens = 40+60+10 = 110.
    assert f"{fmt_tokens(110)}↑" in line, line
    # Σ input (input+cache_read+cache_creation): (100+10)+(200+20)+50 = 380.
    exp_in = _input_total({"input_tokens": 100, "cache_read_tokens": 10,
                           "cache_creation_tokens": 0}) \
        + _input_total({"input_tokens": 200, "cache_read_tokens": 0,
                        "cache_creation_tokens": 20}) \
        + _input_total({"input_tokens": 50, "cache_read_tokens": 0,
                        "cache_creation_tokens": 0})
    assert f"{fmt_tokens(exp_in)}↓" in line, line
    # Σ time = 10+15+5 = 30m.
    assert " 30м " in line, line


# =========================================================================== #
# TC-07 / AC-5 — task totals converge with SUM(agent_runs)
# =========================================================================== #
def test_tc07_totals_converge_with_sum_agent_runs():
    """The 💰 totals line equals SUM(agent_runs) over cost & tokens even with
    retries (the stage lines and the totals draw from the same row set)."""
    tid = _mk_task(stage="review")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            cost=0.20, in_tok=30, out_tok=10)
    _mk_run(tid, "architect", "2026-06-04 09:10:00", "2026-06-04 09:20:00",
            cost=0.30, in_tok=40, out_tok=12)
    _mk_run(tid, "developer", "2026-06-04 09:20:00", "2026-06-04 09:30:00",
            cost=1.50, in_tok=100, out_tok=40, cache_read=10)
    _mk_run(tid, "developer", "2026-06-04 09:30:00", "2026-06-04 09:45:00",
            cost=2.00, in_tok=200, out_tok=60, cache_creation=20)

    # Authoritative SUM straight from the DB.
    conn = get_db()
    rows = conn.execute(
        "SELECT input_tokens, output_tokens, cache_read_tokens, "
        "cache_creation_tokens, cost_usd FROM agent_runs WHERE task_id=?",
        (tid,),
    ).fetchall()
    conn.close()
    sum_cost = sum(float(r["cost_usd"] or 0) for r in rows)
    sum_out = sum(int(r["output_tokens"] or 0) for r in rows)
    sum_in = sum(_input_total({"input_tokens": r["input_tokens"],
                               "cache_read_tokens": r["cache_read_tokens"],
                               "cache_creation_tokens": r["cache_creation_tokens"]})
                 for r in rows)

    text = N.render_task_tracker(tid)
    totals = [ln for ln in text.splitlines() if ln.startswith("💰")][0]
    assert fmt_cost(sum_cost) in totals, totals
    assert f"{fmt_tokens(sum_in)}↓" in totals, totals
    assert f"{fmt_tokens(sum_out)}↑" in totals, totals


def test_tc07_sum_of_stage_lines_equals_totals_on_done():
    """On a done task with retries, Σ(stage-line costs) == totals cost: each agent
    maps to exactly one stage row, so no run is double-counted or dropped."""
    tid = _mk_task(stage="done")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00", cost=0.20)
    _mk_run(tid, "developer", "2026-06-04 09:20:00", "2026-06-04 09:30:00", cost=1.50)
    _mk_run(tid, "developer", "2026-06-04 09:30:00", "2026-06-04 09:45:00", cost=2.00)
    _mk_run(tid, "deployer", "2026-06-04 10:00:00", "2026-06-04 10:10:00", cost=0.30)

    text = N.render_task_tracker(tid)
    totals = [ln for ln in text.splitlines() if ln.startswith("💰")][0]
    # developer stage line = Σ $3.50 (not $2.00), totals = $4.00.
    dev_line = _stage_line(text, "Разработка")
    assert fmt_cost(3.50) in dev_line, dev_line
    assert fmt_cost(4.00) in totals, totals


# =========================================================================== #
# TC-08 / AC-7 — render_task_tracker never raises on broken/partial rows
# =========================================================================== #
def test_tc08_render_survives_null_timestamps_and_runs():
    """NULL timestamps / partial runs -> render returns a string, never raises."""
    tid = _mk_task(stage="development")
    # Run with NULL started/finished and NULL token columns.
    conn = get_db()
    conn.execute(
        "INSERT INTO agent_runs (task_id, agent, started_at, finished_at, "
        "exit_code, input_tokens, output_tokens, cost_usd, model) "
        "VALUES (?,?,?,?,?,?,?,?,?)",
        (tid, "developer", None, None, None, None, None, None, None),
    )
    conn.commit()
    conn.close()
    text = N.render_task_tracker(tid)  # must not raise
    assert isinstance(text, str) and text


def test_tc08_render_survives_bogus_stage():
    """A task sitting on a truly unknown stage still renders (never-raise)."""
    tid = _mk_task(stage="__bogus__")
    _mk_run(tid, "developer", "2026-06-04 09:00:00", "2026-06-04 09:10:00")
    text = N.render_task_tracker(tid)
    assert isinstance(text, str) and text
    # Unknown stage -> current_pos falls back to len(order) ("far future"), so
    # every real stage_key position is <= it -> ✅ kept (degrades to
    # pre-ORCH-091 behaviour, no spurious suppression).
    assert _stage_line(text, "Разработка") is not None, text
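The gate that the TC-05 and TC-08 tests above exercise can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in src/notifications.py: `ORDER`, `pipeline_pos` and `show_checkmark` are hypothetical names, the stage names "analysis" and "architecture" are guesses, and the real implementation derives the order from STAGE_TRANSITIONS.

```python
# Hypothetical pipeline order (assumption). The real code reads it from
# STAGE_TRANSITIONS; "analysis" and "architecture" are guessed stage names.
ORDER = ["created", "analysis", "architecture", "development",
         "review", "testing", "deploy", "done"]


def pipeline_pos(stage, order=ORDER):
    """Index of a stage in the pipeline; unknown stages sort after every real one."""
    try:
        return order.index(stage)
    except ValueError:
        return len(order)


def show_checkmark(current_stage, stage_key, has_finished_run, order=ORDER):
    """Draw a checkmark only for a finished stage that is not ahead of the current one."""
    # deploy-staging is collapsed into the deploy row; the normalization is
    # applied to the current position only, never to stage_key.
    current = "deploy" if current_stage == "deploy-staging" else current_stage
    return has_finished_run and pipeline_pos(current, order) >= pipeline_pos(stage_key, order)


# After a rollback to development, a finished testing run is suppressed,
# while a bogus current stage suppresses nothing.
print(show_checkmark("development", "testing", True))
print(show_checkmark("__bogus__", "development", True))
```

Sending unknown stages to `len(order)` is what keeps the never-raise path equivalent to the pre-ORCH-091 behaviour: the comparison can only suppress a row when the current stage has a known, earlier position.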
@@ -110,9 +110,12 @@ def test_tc06_stage_to_plane_status(stage, expected):
     assert N.plane_status_label({"stage": stage}) == expected
 
 
-def test_tc06_unknown_stage_degrades_to_default():
-    # Anything unknown -> the safe stage default (To Analyse), never an error.
-    assert N.plane_status_label({"stage": "weird-stage"}) == "To Analyse"
+def test_tc06_unknown_stage_degrades_to_neutral():
+    # ORCH-091 (AC-3): a genuinely unknown stage degrades to a NEUTRAL capitalized
+    # label, NOT the misleading "To Analyse". A broken row with no stage key falls
+    # back to 'created' -> "To Analyse" (the real first status), never an error.
+    assert N.plane_status_label({"stage": "weird-stage"}) == "Weird Stage"
+    assert N.plane_status_label({"stage": "weird-stage"}) != "To Analyse"
     assert N.plane_status_label({}) == "To Analyse"
 
 
@@ -214,3 +217,68 @@ def test_tc09c_plane_status_label_never_raises():
     # Garbage row (None / object without keys) -> safe default, no exception.
     assert N.plane_status_label(None) == "To Analyse"
     assert N.plane_status_label(object()) == "To Analyse"
+
+
+# =========================================================================== #
+# ORCH-091 — Group 1 (D1): completeness of the status map, staging label,
+# neutral fallback. plane_status_label is pure/offline -> assert directly.
+# =========================================================================== #
+from src.stages import STAGE_TRANSITIONS  # noqa: E402
+
+
+# --------------------------------------------------------------------------- #
+# ORCH-091 TC-01 / AC-1 — completeness of the map vs STAGE_TRANSITIONS
+# --------------------------------------------------------------------------- #
+@pytest.mark.parametrize("stage", [s for s in STAGE_TRANSITIONS if s != "created"])
+def test_orch091_tc01_every_stage_has_meaningful_label(stage):
+    """AC-1: for EVERY STAGE_TRANSITIONS key (bar 'created') plane_status_label
+    returns a non-empty label that is NOT the misleading 'To Analyse'. Completeness
+    is derived programmatically from STAGE_TRANSITIONS (the single source of truth),
+    NOT a hardcoded list — a new engine stage without a curated label fails here."""
+    label = N.plane_status_label({"stage": stage})
+    assert label, f"stage {stage!r} produced an empty label"
+    assert label != N._DEFAULT_STATUS_LABEL, (
+        f"stage {stage!r} still falls back to 'To Analyse'"
+    )
+    # The curated map must actually carry the key (not just a neutral autogen).
+    assert stage in N._STAGE_STATUS_LABEL, (
+        f"stage {stage!r} missing a curated label in _STAGE_STATUS_LABEL"
+    )
+
+
+def test_orch091_tc01_created_stays_to_analyse():
+    # 'created' keeps the meaningful real first status.
+    assert N.plane_status_label({"stage": "created"}) == "To Analyse"
+
+
+# --------------------------------------------------------------------------- #
+# ORCH-091 TC-02 / AC-2 — staging label is meaningful and distinct
+# --------------------------------------------------------------------------- #
+def test_orch091_tc02_deploy_staging_label():
+    """AC-2: stage='deploy-staging' -> a meaningful staging label, distinct from
+    'To Analyse' AND from the deploy stage's Awaiting-Deploy label."""
+    staging = N.plane_status_label({"stage": "deploy-staging"})
+    deploy = N.plane_status_label({"stage": "deploy"})
+    assert staging == "Deploying (staging)"
+    assert staging != "To Analyse"
+    assert staging != deploy
+    assert "staging" in staging.lower()
+
+
+# --------------------------------------------------------------------------- #
+# ORCH-091 TC-03 / AC-3 — neutral fallback for a truly unknown stage
+# --------------------------------------------------------------------------- #
+def test_orch091_tc03_unknown_stage_neutral_not_to_analyse():
+    """AC-3: a genuinely unknown stage -> neutral capitalized label (NOT
+    'To Analyse'); never raises on broken/None input."""
+    assert N.plane_status_label({"stage": "__bogus__"}) != "To Analyse"
+    assert N.plane_status_label({"stage": "__bogus__"})  # non-empty
+    # never-raise on broken input; None/missing-key degrade to the safe default.
+    assert N.plane_status_label(None) == "To Analyse"
+    assert N.plane_status_label({"stage": None}) == "To Analyse"
+    assert N.plane_status_label({"stage": ""}) == "To Analyse"
+
+
+def test_orch091_tc03_cancelled_offline_label():
+    # ORCH-090 terminal: offline base label, no longer 'To Analyse'.
+    assert N.plane_status_label({"stage": "cancelled"}) == "Cancelled"
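The label behaviour pinned down by the tests above (curated label first, neutral capitalized fallback for an unmapped stage, "To Analyse" only for a missing or broken stage) could be implemented along these lines. This is only a sketch under assumptions: `CURATED`, `neutral_stage_label` and `status_label` are illustrative stand-ins, the map is deliberately partial, and the actual `_neutral_stage_label` in src/notifications.py may normalize differently.

```python
DEFAULT_LABEL = "To Analyse"

# Partial, illustrative map (assumption); the real _STAGE_STATUS_LABEL covers
# every STAGE_TRANSITIONS key.
CURATED = {
    "created": "To Analyse",
    "deploy-staging": "Deploying (staging)",
    "cancelled": "Cancelled",
}


def neutral_stage_label(stage):
    """Turn 'weird-stage' into 'Weird Stage' without implying any real status."""
    words = str(stage).replace("-", " ").replace("_", " ").split()
    return " ".join(w.capitalize() for w in words)


def status_label(row):
    """Never raises: any broken input degrades to the safe default label."""
    try:
        stage = row.get("stage") or "created"  # missing/None/"" -> first status
        if stage in CURATED:
            return CURATED[stage]
        return neutral_stage_label(stage) or DEFAULT_LABEL
    except Exception:
        return DEFAULT_LABEL


print(status_label({"stage": "weird-stage"}))
print(status_label(None))
```

Keeping the whole lookup inside one `try` means a non-dict row and an unhashable stage value take the same exit as a missing key, which is the property the never-raise tests check.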