fix(qg): find 14-deploy-log.md in origin/main when absent in feature worktree

ET-013: the deployer writes 14-deploy-log.md and merges deploy artifacts into main via a separate PR, so the log lands in origin/main, not in the feature-branch worktree that check_deploy_status reads via _repo_path(repo, branch). Result: every successful deploy was falsely failed ("Deploy log not found") and rolled back deploy->development.

Fix: when the log is absent in the worktree, fall back to reading it from origin/main on the shared clone (git fetch origin main + git show origin/main:docs/work-items/<WI>/14-deploy-log.md). Lookup order: worktree -> origin/main -> not found. Fetch/show failures degrade to not found (never raise). Does not touch the merge-gate in gitea.py.

Tests:
- origin/main SUCCESS -> PASS (ET-013 case)
- origin/main FAILED -> FAILED
- absent everywhere -> not found
- fetch failure -> degrades, no exception
- worktree log short-circuits the main lookup
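
The lookup order in this commit message (worktree -> origin/main -> not found) might be sketched roughly as below. This is an illustration only, not the code from the commit: `read_deploy_log`, `_git`, and the injectable `git` parameter are hypothetical names, and the real `check_deploy_status` may be structured differently.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional

LOG_NAME = "14-deploy-log.md"


def _git(repo_dir: str, *args: str) -> str:
    """Run a git command inside repo_dir and return its stdout (raises on failure)."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        capture_output=True, text=True, check=True, timeout=60,
    )
    return result.stdout


def read_deploy_log(worktree: str, shared_clone: str, work_item: str,
                    git: Callable[..., str] = _git) -> Optional[str]:
    """Hypothetical helper: return the deploy log text, or None if not found.

    Lookup order: feature worktree -> origin/main on the shared clone -> None.
    """
    rel = f"docs/work-items/{work_item}/{LOG_NAME}"

    # 1) The worktree copy wins and short-circuits the origin/main lookup.
    local = Path(worktree) / rel
    if local.is_file():
        return local.read_text(encoding="utf-8")

    # 2) Fall back to origin/main; any fetch/show failure degrades to "not found".
    try:
        git(shared_clone, "fetch", "origin", "main")
        return git(shared_clone, "show", f"origin/main:{rel}")
    except Exception:
        return None
```

Passing the git runner as a parameter is a design choice made here for testability: a test can simulate a fetch failure without touching a real repository.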
Merge pull request 'fix(tracker): no duplicate Telegram messages on not-modified/transient edits' (#22) from fix/tracker-edit-not-modified into main
2026-06-04 13:35:35 +03:00 · 2026-06-04 13:22:46 +03:00 · 2026-06-04 13:20:40 +03:00 · 2026-06-04 11:46:21 +03:00 · 2026-06-04 11:42:46 +03:00 · 2026-06-04 11:21:50 +03:00
40 changed files with 8463 additions and 570 deletions
--- a/README.md
+++ b/README.md
@@ -39,6 +39,7 @@ created → analysis → architecture → development → review → testing →
 |--------|------|----------|
 | GET | `/health` | Health check |
 | GET | `/status` | Active tasks (stage != done) |
+| GET | `/queue` | Task queue (ORCH-1): counts by status + max_concurrency + the last 10 jobs |
 | POST | `/webhook/plane` | Plane webhook receiver |
 | POST | `/webhook/gitea` | Gitea webhook receiver |

@@ -52,8 +53,9 @@ src/
 ├── stages.py            # State machine (transitions, agents, QG)
 ├── notifications.py     # Notifications (logging)
 ├── plane_sync.py        # Status synchronization with the Plane API
+├── queue_worker.py      # ORCH-1: background queue worker (claim → launch_job)
 ├── agents/
-│   └── launcher.py      # AgentLauncher: launch, monitor, watchdog, auto-advance
+│   └── launcher.py      # AgentLauncher: launch/launch_job, monitor, watchdog, auto-advance
 ├── webhooks/
 │   ├── plane.py         # Plane webhook handler
 │   └── gitea.py         # Gitea webhook handler (push, PR, CI status)
@@ -107,6 +109,36 @@ uvicorn src.main:app --reload --port 8500
 | `ORCH_REPOS_DIR` | Repos dir (container) | `/repos` |
 | `ORCH_HOST_REPOS_DIR` | Repos dir (host) | `/home/slin/repos` |
 | `ORCH_DB_PATH` | SQLite path | `/app/data/orchestrator.db` |
+| `ORCH_MAX_CONCURRENCY` | Number of jobs the worker runs in parallel (ORCH-1) | `1` |
+| `ORCH_QUEUE_POLL_INTERVAL` | Worker queue poll interval, seconds (ORCH-1) | `2.0` |
+| `ORCH_PREFLIGHT_CACHE_TTL` | Preflight (CLI/net) cache TTL, seconds (ORCH-1 resilience) | `45` |
+| `ORCH_BACKOFF_BASE_SECONDS` | Exponential backoff base for transient failures (429) | `10` |
+| `ORCH_BACKOFF_MAX_SECONDS` | Backoff cap | `600` |
+| `ORCH_TRANSIENT_MAX_ATTEMPTS` | Retries for 429/unavailability | `5` |
+| `ORCH_BREAKER_THRESHOLD` | Consecutive transient failures before the breaker opens | `3` |
+| `ORCH_BREAKER_PAUSE_SECONDS` | Pause while the breaker is open | `300` |
+
+## Task queue (ORCH-1 / F-2b)
+
+Webhook handlers no longer spawn claude agents synchronously inside the uvicorn process.
+Instead they put a **job** into the persistent SQLite table `jobs`
+(`enqueue_job`, immediate response), and a background worker (`src/queue_worker.py`)
+picks up jobs within the `ORCH_MAX_CONCURRENCY` limit and launches the agent (`launch_job`,
+the same Popen logic as before).
+
+Benefits:
+- **Restart-safe.** On startup, jobs with status `running` are returned to `queued`
+  (queue-recovery in lifespan), so no work is lost.
+- **Concurrency limit.** The worker never exceeds `ORCH_MAX_CONCURRENCY`.
+- **Retries.** A failed job (exit≠0) is retried while `attempts < max_attempts`,
+  then marked `failed` with a Telegram notification.
+
+Job statuses: `queued → running → done | failed`. Observability is via `GET /queue`.
+
+**Resilience layer:** a cheap preflight (CLI/net, cached, no tokens) gates claim;
+429/overload is detected from the log (transient vs permanent); transient failures are retried with
+exp-backoff (`available_at`, Retry-After); a circuit breaker pauses the worker after N
+consecutive transient failures. Details: `docs/ORCH-1_JOB_QUEUE.md`.

 ## Multi-repo: project registry (ORCH-6)

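The job status transitions described in the README diff above (`queued → running → done | failed`, with a retry while `attempts < max_attempts`) can be illustrated with a small pure function. This is a sketch under assumptions: the `Job` dataclass and `next_status` are hypothetical names, not the project's code, and `attempts` is assumed to count the claim that just ran.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a row of the jobs table."""
    attempts: int          # assumed to be incremented each time the worker claims the job
    max_attempts: int = 2  # retry limit


def next_status(job: Job, exit_code: int) -> str:
    """Decide the job's next status after the agent process has exited."""
    if exit_code == 0:
        return "done"
    if job.attempts < job.max_attempts:
        return "queued"   # retry: the worker will claim it again
    return "failed"       # budget exhausted; a notification would be sent here


print(next_status(Job(attempts=1), exit_code=1))  # first failure is retried
print(next_status(Job(attempts=2), exit_code=1))  # second failure is final
```

Keeping the decision free of I/O makes the retry policy easy to unit-test separately from the database and the notifier.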
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -264,9 +264,71 @@ services:

 - ~~Shared `/repos` checkout (races between parallel tasks).~~ **RESOLVED (ORCH-2 / S-4):**
   git worktree per task/branch; see the "Isolation via git worktree" section below.
-- **In-process daemon threads.** Agents run in uvicorn daemon threads. When
-  uvicorn restarts, running agents are orphaned → caught by orphan-recovery (M-1).
-  The target architecture is a task queue (F-2b, tracked separately).
+- ~~In-process daemon threads (restart → orphans, lost work).~~ **RESOLVED (ORCH-1 / F-2b):**
+  persistent jobs queue + background worker; see the "Task queue (ORCH-1)" section below.
+  The monitor/watchdog daemon threads remain for a single running agent, but on
+  restart its job returns to `queued` (queue-recovery) and is picked up again.
+
+## Task queue (ORCH-1 / F-2b)
+
+Previously the webhook handler **synchronously** spawned `subprocess.Popen` + 2 daemon threads
+directly in the uvicorn process (8 call sites). A restart meant orphans and lost work,
+with no concurrency limit and no retries.
+
+### Flow
+
+```
+webhook (plane/gitea)                 background thread (queue_worker)
+        │                                        │
+  enqueue_job() ---> [ jobs table ] <--- claim_next_job()  (atomic queued->running)
+  (immediate           status=queued                 │
+   200 response)                                 launch_job(job)
+                                                       │
+                                          AgentLauncher._spawn (Popen claude)
+                                                       │
+                                          _monitor_agent (proc.wait, commit/push,
+                                                       │  advance stage)
+                                                       │
+                                          _finalize_job:
+                                            exit 0  -> mark_job done
+                                            exit !=0 & attempts<max -> requeue (queued)
+                                            exit !=0 & attempts>=max -> failed + Telegram
+```
+
+### The `jobs` table
+
+| Column | Purpose |
+|--------|------------|
+| `status` | `queued` → `running` → `done` \| `failed` |
+| `attempts` / `max_attempts` | attempt counter (incremented on claim) / retry limit (default 2) |
+| `run_id` | FK to `agent_runs.id` after start |
+| `task_content` | the task spec written to the agent's task file |
+| `error` | last error |
+
+`idx_jobs_status (status, id)`: fast FIFO selection of queued jobs.
+
+### Atomic claim
+
+`claim_next_job()` runs `SELECT queued ORDER BY id LIMIT 1` → `UPDATE ... WHERE id=? AND
+status='queued'` and checks `rowcount`. When two ticks race, only one UPDATE
+moves the row to `running` (rowcount==1); the loser takes the next job.
+
+### Queue-recovery (restart-safe)
+
+In the `main.py` lifespan, `requeue_running_jobs()` is called **after** M-1 orphan-recovery:
+jobs with status `running` (the worker died on restart) are returned to `queued`.
+Then the worker starts; on shutdown, `worker.stop()` is called (Event.set + join).
+
+### Config
+
+- `ORCH_MAX_CONCURRENCY` (default 1): limit on parallel jobs.
+- `ORCH_QUEUE_POLL_INTERVAL` (default 2.0): poll interval.
+
+Observability: `GET /queue` returns counts by status + the last 10 jobs.
+
+> Compatibility: `launcher.launch()` (direct synchronous launch, `job_id=None`)
+> is kept for backward compatibility. The queue uses `launch_job()`;
+> both share `_spawn()` (the B-2 Popen logic is unchanged).
 - **Gitea CI is not configured.** The development QG is now local (`check_tests_local`);
   Gitea CI statuses are not authoritative and do not block the pipeline.
 - **Docker inside the orchestrator container is UNAVAILABLE.** Deployment goes only through
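The atomic claim and queue-recovery steps from the ARCHITECTURE.md diff above could look roughly like this with the standard `sqlite3` module. It is a minimal sketch, not the project's `src/db.py`: the schema is reduced to three columns, the functions take an explicit connection, and only the function names are borrowed from the document.

```python
import sqlite3
from typing import Optional

# Reduced, illustrative schema: the real jobs table has more columns.
SCHEMA = """
CREATE TABLE jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    status TEXT NOT NULL DEFAULT 'queued',
    attempts INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX idx_jobs_status ON jobs (status, id);
"""


def claim_next_job(conn: sqlite3.Connection) -> Optional[int]:
    """Claim the oldest queued job; return its id, or None if nothing is queued."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE status='queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        cur = conn.execute(
            "UPDATE jobs SET status='running', attempts=attempts+1 "
            "WHERE id=? AND status='queued'",
            (row[0],),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]  # this caller won the race
        # rowcount == 0: another tick claimed the row first; try the next one.


def requeue_running_jobs(conn: sqlite3.Connection) -> int:
    """Restart recovery: return every 'running' job to 'queued'."""
    cur = conn.execute("UPDATE jobs SET status='queued' WHERE status='running'")
    conn.commit()
    return cur.rowcount
```

The `AND status='queued'` guard in the UPDATE is what makes the claim safe: the SELECT alone only proposes a candidate, and the row changes hands only if it is still queued when the UPDATE runs.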
--- a/docs/ORCH-1_JOB_QUEUE.md
+++ b/docs/ORCH-1_JOB_QUEUE.md
@@ -0,0 +1,127 @@
+# ORCH-1 (F-2b): Persistent Job Queue
+
+**Date:** 2026-06-02
+**Branch:** `feature/ORCH-1-job-queue`
+**Source:** AUDIT_2026-06-02 (B-2 / F-2b)
+
+## Problem
+
+Agents were launched **in-process**: `launcher.launch()` synchronously spawned
+`subprocess.Popen` + 2 daemon threads (`_watchdog`, `_monitor_agent`) directly in
+the uvicorn process, from **8 webhook call sites**. Consequences:
+
+- **Restart = disaster.** Daemon threads die, claude processes become orphans,
+  and work is lost (M-1 only marked `exit=-1` and called for a human).
+- **No concurrency limit:** N webhooks = N simultaneous claude processes.
+- **No retries:** a failed agent simply stays dead.
+
+## Solution
+
+A persistent task queue (SQLite table `jobs`) + a background worker:
+
+1. The webhook handler enqueues a job (`enqueue_job`) → immediate 200 response.
+2. The background worker (`src/queue_worker.py`, a separate daemon thread) picks up
+   jobs within `max_concurrency` (`claim_next_job`, atomic) and spawns the agent
+   (`launcher.launch_job`, the same Popen logic).
+3. On completion, `_monitor_agent` → `_finalize_job`:
+   - `exit 0` → `done`;
+   - `exit != 0` & `attempts < max_attempts` → requeue (`queued`);
+   - `exit != 0` & `attempts >= max_attempts` → `failed` + Telegram.
+
+## What changed
+
+| File | Change |
+|------|-----------|
+| `src/db.py` | `jobs` table + index; helpers `enqueue_job`, `claim_next_job` (atomic), `mark_job`, `count_running_jobs`, `requeue_running_jobs`, `get_job`, `job_status_counts`, `recent_jobs` |
+| `src/config.py` | `max_concurrency` (env `ORCH_MAX_CONCURRENCY`, default 1), `queue_poll_interval` (env `ORCH_QUEUE_POLL_INTERVAL`, default 2.0) |
+| `src/agents/launcher.py` | `launch()` → thin wrapper over `_spawn()`; new `launch_job(job)`; `_spawn()` (shared, `job_id` optional); monitor/watchdog accept `job_id`; new `_finalize_job()` (statuses + retries). 4 internal advance calls `self.launch` → `enqueue_job` |
+| `src/webhooks/plane.py` | 4 `launcher.launch` call sites → `enqueue_job` |
+| `src/webhooks/gitea.py` | 4 `launcher.launch` call sites → `enqueue_job` |
+| `src/queue_worker.py` | **NEW**: `QueueWorker` (drain loop + max_concurrency + graceful stop) |
+| `src/main.py` | lifespan: queue-recovery (`requeue_running_jobs`) after M-1, worker start/stop; new `GET /queue` |
+| `tests/test_queue.py` | **NEW**: 19 tests (lifecycle, claim atomicity, retries, requeue, observability, worker max_concurrency; Popen fully mocked) |
+
+## Claim atomicity
+
+```sql
+SELECT id FROM jobs WHERE status='queued' ORDER BY id LIMIT 1;
+UPDATE jobs SET status='running', attempts=attempts+1, started_at=datetime('now')
+  WHERE id=? AND status='queued';   -- rowcount==1 => claimed, ==0 => lost the race
+```
+
+Guarantee: a job is never handed out twice, even with concurrent worker ticks
+(verified by `test_concurrent_claims_no_duplicate`: 8 threads, 20 jobs).
+
+## Preserved fixes (NOT broken)
+
+- **B-1** task-file write (direct `open()` in the worktree): unchanged.
+- **B-2** Popen → log_fh (no PIPE), monitor reap: unchanged, only wrapped.
+- **M-1** orphan-recovery in `main.py`: kept; queue-recovery is added AFTER it.
+- **ORCH-2** worktree per task: unchanged.
+- **ORCH-6** project registry/filter: unchanged.
+
+## Acceptance
+
+| # | Check | Status |
+|---|----------|--------|
+| 1 | webhook enqueues a job (queued) | ✅ enqueue_job |
+| 2 | worker executes queued→running→done | ✅ worker + _finalize_job |
+| 3 | running ≤ max_concurrency | ✅ test_worker_respects_max_concurrency |
+| 4 | retry fail→queued→failed+notify | ✅ test_finalize_job_requeue_then_fail |
+| 5 | restart-safe (running→requeue) | ✅ requeue_running_jobs + lifespan |
+| 6 | M-1 not broken | ✅ kept in lifespan |
+| 7 | тесты (new green, 9 pre-existing) | ✅ 76 passed / 9 pre-existing |
+| 8 | `/queue` | ✅ counts + recent |
+
+## Tests
+
+```bash
+IMG=$(docker inspect orchestrator --format '{{.Config.Image}}')
+docker run --rm -v /home/slin/repos/orchestrator:/code -w /code \
+  --entrypoint python3 $IMG -m pytest tests/ -q
+# 110 passed, 9 failed (pre-existing test_webhooks 401/signature/TypeError)
+```
+
+---
+
+## Resilience layer (ADDENDUM: preflight + 429 + backoff + circuit breaker)
+
+Queue reliability against CLI unavailability and rate limits. These are two DIFFERENT classes
+of problems and are handled differently.
+
+### A. Cheap preflight (`src/preflight.py`): burns no tokens
+Before claim, the worker checks `os.path.exists(CLAUDE_BIN)` + `claude --version`
+(5 s timeout, spends NO tokens). The result is cached for `preflight_cache_ttl` (45 s).
+FAIL → the worker does NOT claim (the job stays `queued`) and waits. 🚫 NO prompt-ping.
+
+### B. 429: detected ON EXIT (`src/error_classifier.py`)
+A rate limit cannot be predicted, so failures are classified from the run log. `classify_log_file`
+reads the log tail (16KB) and looks for `429/rate limit/overloaded/quota/503/529/timeout/...`
+→ `transient` or `permanent`. It also extracts `Retry-After`.
+
+- **transient** (429/network) → backoff retry with a SEPARATE `transient_attempts` counter
+  (limit `transient_max_attempts=5`), which does not burn the code-fault budget.
+- **permanent** (code fault) → the ordinary `attempts < max_attempts` (2), then `failed`.
+
+### C. Backoff + `available_at`
+Columns `jobs.available_at TEXT` + `jobs.transient_attempts INTEGER` (migration via
+`_ensure_column`). `claim_next_job`: `WHERE status='queued' AND (available_at IS NULL
+OR available_at <= datetime('now'))`. On transient: `available_at = now +
+min(2^n * base, max)` (base=10 s, max=600 s); `Retry-After` is honoured (the larger value is used).
+
+### D. Circuit breaker (`CircuitBreaker` in queue_worker)
+N=3 consecutive transient failures → **open**: the worker pauses for `breaker_pause_seconds=300`, does NOT
+touch the CLI AT ALL, and sends a Telegram alert. After the pause → **half-open** (tries 1 job);
+recovered (exit 0) → **closed**; transient again → open again. The state lives in worker
+memory and is reflected in `/queue.resilience`.
+The launcher→breaker link goes through the `launcher.on_outcome` callback (no import cycle).
+
+### Config (config.py)
+`preflight_cache_ttl=45`, `backoff_base_seconds=10`, `backoff_max_seconds=600`,
+`transient_max_attempts=5`, `breaker_threshold=3`, `breaker_pause_seconds=300`.
+
+### Tests
+`tests/test_resilience.py`: 34 tests covering preflight (FAIL→queued, cache, force),
+classifier (transient/permanent/Retry-After), backoff (growth/cap/Retry-After,
+`available_at` gating), launcher transient/permanent finalize, breaker
+(open/half-open/closed/re-open, claim blocking).
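Sections C and D of the resilience addendum above can be made concrete with a short sketch. The backoff function follows the formula in the document (`min(2^n * base, max)`, with `Retry-After` taking precedence when larger). The breaker class is a simplified assumption: its method names (`record_transient`, `record_success`, `allow_claim`) and the injectable clock are hypothetical, and unlike the described half-open state it does not limit probing to exactly one job.

```python
import time
from typing import Callable, Optional


def backoff_seconds(n: int, retry_after: Optional[int] = None,
                    base: int = 10, cap: int = 600) -> int:
    """Exponential backoff min(2^n * base, cap), never sooner than Retry-After."""
    delay = min((2 ** max(n, 0)) * base, cap)
    if retry_after is not None and retry_after > 0:
        delay = max(delay, min(retry_after, cap))
    return int(delay)


class CircuitBreaker:
    """Illustrative in-memory breaker: closed -> open -> half-open -> closed."""

    def __init__(self, threshold: int = 3, pause_seconds: float = 300,
                 clock: Callable[[], float] = time.monotonic):
        self.threshold = threshold
        self.pause_seconds = pause_seconds
        self._clock = clock
        self._streak = 0          # consecutive transient failures
        self._opened_at = None    # clock value when the breaker last opened

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.pause_seconds:
            return "half-open"
        return "open"

    def allow_claim(self) -> bool:
        """The worker may claim jobs unless the breaker is open."""
        return self.state != "open"

    def record_transient(self) -> None:
        was_half_open = self.state == "half-open"
        self._streak += 1
        if was_half_open or self._streak >= self.threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed probe

    def record_success(self) -> None:
        self._streak = 0
        self._opened_at = None


print(backoff_seconds(1))                  # 20
print(backoff_seconds(1, retry_after=45))  # 45
print(backoff_seconds(10))                 # 600
```

Injecting the clock keeps the breaker testable without sleeping through the 300-second pause.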
--- a/src/agents/launcher.py
+++ b/src/agents/launcher.py
@@ -1,10 +1,12 @@
 import subprocess
 import os
+import json
 import logging
 import threading
 import signal
+import time
 from ..config import settings
-from ..db import get_db, get_task_by_repo_branch, update_task_stage
+from ..db import get_db, get_task_by_repo_branch, update_task_stage, enqueue_job
 from ..stages import get_next_stage, get_qg_for_stage, get_agent_for_stage
 from ..git_worktree import ensure_worktree, get_worktree_path
 from ..qg.checks import QG_CHECKS
@@ -14,6 +16,62 @@ from ..plane_sync import notify_stage_change as plane_notify_stage, add_comment
 logger = logging.getLogger("orchestrator.launcher")


+def prune_run_logs(runs_dir, keep_days=30, keep_max=500, active_paths=None):
+    """L-2: best-effort rotation of per-run logs (<runs_dir>/*.log).
+
+    A log file is removed if it is older than keep_days OR it is not within the
+    keep_max most-recent logs (whichever condition is met first). Only *.log
+    files directly inside runs_dir are considered; non-.log files and
+    subdirectories are never touched. Files whose path is in active_paths (the
+    currently running log) are always kept.
+
+    Returns the number of files removed. Never raises: any error is logged and
+    swallowed so log rotation can never bring the app down.
+    """
+    removed = 0
+    try:
+        active = set()
+        for ap in (active_paths or []):
+            try:
+                active.add(os.path.realpath(ap))
+            except Exception:
+                active.add(ap)
+
+        if not os.path.isdir(runs_dir):
+            return 0
+
+        logs = []
+        for name in os.listdir(runs_dir):
+            if not name.endswith(".log"):
+                continue
+            path = os.path.join(runs_dir, name)
+            if not os.path.isfile(path):
+                continue
+            if os.path.realpath(path) in active:
+                continue
+            try:
+                mtime = os.path.getmtime(path)
+            except OSError:
+                continue
+            logs.append((path, mtime))
+
+        logs.sort(key=lambda t: t[1], reverse=True)
+
+        cutoff = time.time() - keep_days * 86400
+        for idx, (path, mtime) in enumerate(logs):
+            too_old = mtime < cutoff
+            over_max = idx >= keep_max
+            if too_old or over_max:
+                try:
+                    os.remove(path)
+                    removed += 1
+                except OSError as e:
+                    logger.warning(f"prune_run_logs: failed to remove {path}: {e}")
+    except Exception as e:
+        logger.warning(f"prune_run_logs failed for {runs_dir}: {e}")
+    return removed
+
+
 class AgentLauncher:
    """Launch Claude CLI agents directly (binary mounted into container)."""

@@ -53,11 +111,17 @@ class AgentLauncher:
    }

    CLAUDE_BIN = "/opt/claude-code/bin/claude.exe"
-    AGENT_TIMEOUT = 1800  # 30 minutes
+    # ORCH-7 (M-2): timeout is now configurable. AGENT_TIMEOUT stays as a
+    # backward-compatible alias for the default; the actual value (and per-agent
+    # overrides) live in settings and are resolved via _resolve_timeout().
+    AGENT_TIMEOUT = settings.agent_timeout_seconds

    def launch(self, agent: str, repo: str, task_content: str = None, task_id: int = None) -> int:
        """
-        Launch a Claude CLI agent.
+        Launch a Claude CLI agent directly (legacy synchronous path).
+
+        Kept for backward compatibility (direct callers / existing tests). The
+        ORCH-1 job queue uses launch_job() instead, but both share _spawn().

        Args:
            agent: Agent role (analyst, architect, developer, reviewer, tester)
@@ -68,6 +132,31 @@ class AgentLauncher:
        Returns:
            agent_run_id from DB
        """
+        return self._spawn(agent, repo, task_content, task_id, job_id=None)
+
+    def launch_job(self, job: dict) -> int:
+        """ORCH-1: launch an agent for a claimed queue job.
+
+        Same spawn path as launch(), but threads job['id'] through so the monitor
+        can update the job's status (done / requeue / failed) and link jobs.run_id
+        to the agent_runs row. Returns the agent_run_id.
+        """
+        return self._spawn(
+            job["agent"],
+            job["repo"],
+            job.get("task_content"),
+            job.get("task_id"),
+            job_id=job["id"],
+        )
+
+    def _spawn(self, agent: str, repo: str, task_content: str = None,
+               task_id: int = None, job_id: int = None) -> int:
+        """Shared spawn implementation for launch() and launch_job().
+
+        When job_id is set, the monitor/watchdog drive the jobs table status
+        (ORCH-1). The claude-CLI Popen logic (B-2) and worktree/task-file logic
+        (B-1 / ORCH-2) are unchanged.
+        """
        config = self.AGENT_CONFIGS.get(agent)
        if not config:
            raise ValueError(f"Unknown agent: {agent}")
@@ -98,6 +187,14 @@ class AgentLauncher:
        run_id = cursor.lastrowid
        conn.commit()

+        # ORCH-1: link this job to the agent_runs row and stamp started_at.
+        if job_id is not None:
+            conn.execute(
+                "UPDATE jobs SET run_id = ?, started_at = datetime('now') WHERE id = ?",
+                (run_id, job_id),
+            )
+            conn.commit()
+
        # Prepare output log path
        output_path = f"/app/data/runs/{run_id}.log"
        os.makedirs(os.path.dirname(output_path), exist_ok=True)
@@ -112,9 +209,15 @@ class AgentLauncher:

        # No git fetch/checkout here: ensure_worktree() already put the worktree on
        # the right branch. The agent simply runs inside its isolated work_path.
+        # Feature 4 (token usage): --output-format json makes claude emit a single
+        # result JSON (with usage + total_cost_usd) at the end of stdout. The log
+        # still captures it; _monitor_agent parses the trailing JSON after the run
+        # to record per-agent tokens/cost. _monitor_agent's failure handling keys
+        # off the process exit_code (not stdout shape), so this is safe.
        cmd = (
            f'cd {work_path} && '
            f'{self.CLAUDE_BIN} --print '
+            f'--output-format json '
            f'{model_flag}'
            f'"$(cat {task_file})" '
            f'--system-prompt "$(cat {system_prompt})" '
@@ -154,6 +257,7 @@ class AgentLauncher:
        t = threading.Thread(
            target=self._watchdog,
            args=(proc.pid, run_id),
+            kwargs={"job_id": job_id, "agent": agent},
            daemon=True,
        )
        t.start()
@@ -163,6 +267,7 @@ class AgentLauncher:
        m = threading.Thread(
            target=self._monitor_agent,
            args=(proc, run_id, agent, repo, agent_branch, output_path, log_fh),
+            kwargs={"job_id": job_id},
            daemon=True,
        )
        m.start()
@@ -171,26 +276,102 @@ class AgentLauncher:
        notify_agent_started(run_id, agent, task_id)
        return run_id

-    def _watchdog(self, pid: int, run_id: int, timeout: int = None):
-        """Kill agent if it exceeds timeout."""
-        import time
+    @staticmethod
+    def _resolve_timeout(agent: str = None) -> int:
+        """ORCH-7 (M-2): resolve the wall-clock timeout for an agent.
+
+        Per-agent override from settings.agent_timeout_overrides_json (a JSON object
+        like {"reviewer": 3600}) wins; otherwise the global default
+        settings.agent_timeout_seconds is used. A malformed override JSON is ignored
+        (falls back to the default) and only logged, so a bad env never bricks runs.
+        """
+        default = settings.agent_timeout_seconds
+        raw = (settings.agent_timeout_overrides_json or "").strip()
+        if agent and raw:
+            try:
+                overrides = json.loads(raw)
+                if isinstance(overrides, dict) and agent in overrides:
+                    return int(overrides[agent])
+            except (ValueError, TypeError) as e:
+                logger.warning(f"Invalid agent_timeout_overrides_json, using default: {e}")
+        return default
+
+    def _watchdog(self, pid: int, run_id: int, timeout: int = None,
+                  job_id: int = None, agent: str = None):
+        """Kill agent if it exceeds its timeout.
+
+        ORCH-1: on a timeout-kill the monitor's proc.wait() returns the kill exit
+        code and drives the job retry/fail logic, so the watchdog itself only needs
+        to terminate the process and record the agent_runs exit. job_id is accepted
+        for symmetry.
+
+        ORCH-7 (M-2): graceful shutdown. Instead of an immediate SIGKILL (which cuts
+        claude off mid-write and leaves half-written artifacts), send SIGTERM first,
+        give the process up to settings.agent_kill_grace_seconds to flush and exit on
+        its own, and only SIGKILL if it is still alive after the grace window. If the
+        process exits during the grace window, SIGKILL is NOT sent.
+        ProcessLookupError is tolerated at every step (the process may already be
+        gone). The recorded exit_code stays -9 to match the existing retry/fail
+        contract regardless of which signal actually reaped it.
+        """
        if timeout is None:
-            timeout = self.AGENT_TIMEOUT
+            timeout = self._resolve_timeout(agent)
        time.sleep(timeout)
+
+        # Phase 1: SIGTERM (graceful). If the process is already gone, we're done.
+        try:
+            os.kill(pid, signal.SIGTERM)
+            logger.warning(
+                f"Agent run_id={run_id} exceeded {timeout}s timeout: sent SIGTERM "
+                f"(pid={pid}), grace={settings.agent_kill_grace_seconds}s"
+            )
+        except ProcessLookupError:
+            logger.info(f"Agent run_id={run_id} already exited before SIGTERM")
+            return  # nothing to record: the monitor's proc.wait() owns the exit
+
+        # Phase 2: poll for graceful exit within the grace window.
+        grace = settings.agent_kill_grace_seconds
+        poll_interval = 0.5
+        waited = 0.0
+        while waited < grace:
+            time.sleep(poll_interval)
+            waited += poll_interval
+            try:
+                os.kill(pid, 0)  # signal 0 = liveness probe, does not kill
+            except ProcessLookupError:
+                logger.info(
+                    f"Agent run_id={run_id} exited gracefully after SIGTERM "
+                    f"({waited:.1f}s); no SIGKILL needed"
+                )
+                self._record_kill(run_id)
+                return
+
+        # Phase 3: still alive -> hard SIGKILL.
        try:
            os.kill(pid, signal.SIGKILL)
-            logger.warning(f"Agent run_id={run_id} killed after {timeout}s timeout")
-            conn = get_db()
-            conn.execute(
-                "UPDATE agent_runs SET finished_at=datetime('now'), exit_code=-9 WHERE id=?",
-                (run_id,),
+            logger.warning(
+                f"Agent run_id={run_id} did not exit within {grace}s grace: sent SIGKILL"
            )
-            conn.commit()
-            conn.close()
        except ProcessLookupError:
-            pass  # Already finished
+            logger.info(f"Agent run_id={run_id} exited just before SIGKILL")
+        self._record_kill(run_id)

-    def _monitor_agent(self, proc, run_id, agent, repo, branch, output_path=None, log_fh=None):
+    @staticmethod
+    def _record_kill(run_id: int):
+        """Stamp the agent_runs row as timeout-killed (exit_code=-9).
+
+        ORCH-1: -9 is the existing kill-exit contract the monitor/retry logic keys
+        off, so we keep it stable whether the reap came from SIGTERM or SIGKILL.
+        """
+        conn = get_db()
+        conn.execute(
+            "UPDATE agent_runs SET finished_at=datetime('now'), exit_code=-9 WHERE id=?",
+            (run_id,),
+        )
+        conn.commit()
+        conn.close()
+
+    def _monitor_agent(self, proc, run_id, agent, repo, branch, output_path=None, log_fh=None, job_id=None):
        """Wait for agent to finish, commit+push results, update DB.

        B-2 fix: stdout already goes straight to the log file via Popen, so we just
@@ -225,6 +406,17 @@ class AgentLauncher:

        notify_agent_finished(run_id, agent, exit_code, task_id=_task_id, duration_s=_duration_s)

+        # Feature 4: parse token usage / cost from the (json) run log and record
+        # it on the agent_runs row. Never fatal — a garbled/missing JSON records
+        # NULLs and logs a warning so a broken run can't crash the monitor.
+        try:
+            from ..usage import parse_usage_from_log, record_usage
+            _usage = parse_usage_from_log(output_path) if output_path else None
+            record_usage(run_id, _usage)
+        except Exception as e:
+            logger.warning(f"run_id={run_id}: usage accounting failed: {e}")
+            _usage = None
+
        # Commit and push any changes — in the per-branch worktree (ORCH-2 / S-4),
        # NOT in the shared /repos/<repo>. The worktree is already on `branch`
        # (ensure_worktree did the checkout), so no checkout is needed here.
@@ -296,7 +488,8 @@ class AgentLauncher:
                set_issue_blocked(_wid)
                plane_add_comment(
                    _wid,
-                    "\u274c Deploy FAILED (smoke/healthcheck). Rolled back. Developer \u043d\u0443\u0436\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430."
+                    "\u274c Deploy FAILED (smoke/healthcheck). Rolled back. Developer \u043d\u0443\u0436\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430.",
+                    author="deployer",
                )
                from ..notifications import send_telegram
                send_telegram(f"\U0001f6a8 {_wid}: Deploy failed! Rolled back. Needs fix.")
@@ -314,12 +507,154 @@ class AgentLauncher:
                from ..notifications import send_telegram
                send_telegram(f"\u26a0\ufe0f {_wid}: Agent {agent} failed (exit_code={exit_code}). Check logs: /app/data/runs/{run_id}.log")

+        # Feature 4: post the per-agent usage comment under that agent's bot, and
+        # — for the deployer finishing the task — the per-task usage summary.
+        if exit_code == 0:
+            try:
+                self._post_usage_comments(run_id, agent, repo, branch, _usage)
+            except Exception as e:
+                logger.warning(f"run_id={run_id}: usage comment failed: {e}")
+
        # Auto-advance stage if agent finished successfully and QG passes
        if exit_code == 0:
            self._try_advance_stage(run_id, agent, repo, branch)

+        # ORCH-1: drive the job-queue status for queue-launched jobs only.
+        # (Legacy direct launch() has job_id=None and is unaffected.)
+        if job_id is not None:
+            self._finalize_job(job_id, agent, run_id, exit_code, output_path=output_path)
+
+    def _backoff_seconds(self, transient_attempts: int, retry_after: int = None) -> int:
+        """Exponential backoff for transient failures, honouring Retry-After.
+
+        backoff = min(2^transient_attempts * base, max). If the server sent a
+        Retry-After, use the larger of the two (never poll sooner than asked).
+        """
+        base = settings.backoff_base_seconds
+        cap = settings.backoff_max_seconds
+        backoff = min((2 ** max(transient_attempts, 0)) * base, cap)
+        if retry_after is not None and retry_after > 0:
+            backoff = max(backoff, min(retry_after, cap))
+        return int(backoff)
+
+    def _finalize_job(self, job_id: int, agent: str, run_id: int, exit_code, output_path=None):
+        """ORCH-1: update the jobs row after the agent process finished.
+
+        exit_code == 0  -> done (and resets the breaker streak via on_outcome).
+        exit_code != 0  -> classify the failure from the run log tail (token-free):
+          - TRANSIENT (429/overload/network): backoff-requeue with available_at in
+            the future + a SEPARATE transient_attempts budget
+            (settings.transient_max_attempts), honouring Retry-After. Reported to
+            the breaker so it opens after N consecutive transient failures.
+          - PERMANENT (code fault): ordinary attempts < max_attempts requeue,
+            otherwise 'failed' + Telegram.
+        """
+        from ..db import get_job, mark_job
+        from ..error_classifier import classify_log_file
+        try:
+            job = get_job(job_id)
+            if not job:
+                return
+            if exit_code == 0:
+                mark_job(job_id, "done", run_id=run_id)
+                logger.info(f"Job {job_id} ({agent}) done (run_id={run_id})")
+                self._record_outcome(transient=False, recovered=True)
+                return
+
+            # Classify the failure from the agent log tail (no token cost).
+            kind, retry_after = "permanent", None
+            log_path = output_path or f"/app/data/runs/{run_id}.log"
+            try:
+                kind, retry_after = classify_log_file(log_path)
+            except Exception:
+                pass  # log unreadable or unclassifiable: keep the "permanent" default
+
+            if kind == "transient":
+                self._finalize_transient(job_id, agent, run_id, exit_code, job, retry_after)
+            else:
+                self._finalize_permanent(job_id, agent, run_id, exit_code, job)
+        except Exception as e:
+            logger.error(f"Job {job_id}: _finalize_job error: {e}")
+
+    def _finalize_transient(self, job_id, agent, run_id, exit_code, job, retry_after):
+        """Transient (429/overload/net) failure -> backoff requeue or fail when budget out."""
+        from ..db import mark_job, mark_job_transient
+        tattempts = job.get("transient_attempts", 0)
+        tmax = settings.transient_max_attempts
+        err = (f"transient (429/overload) agent {agent} exit={exit_code} "
+               f"(run_id={run_id}); retry_after={retry_after}")
+        self._record_outcome(transient=True, recovered=False)
+        if tattempts < tmax:
+            backoff = self._backoff_seconds(tattempts + 1, retry_after)
+            mark_job_transient(job_id, backoff, error=err)
+            logger.warning(
+                f"Job {job_id} ({agent}) TRANSIENT fail (exit={exit_code}), "
+                f"backoff {backoff}s, transient_attempt {tattempts + 1}/{tmax}"
+            )
+        else:
+            mark_job(job_id, "failed", run_id=run_id, error=err)
+            logger.error(
+                f"Job {job_id} ({agent}) failed after {tattempts} transient attempts"
+            )
+            self._notify_failed(job_id, agent, job, run_id,
+                                f"transient (rate-limit) after {tattempts} attempts")
+
+    def _finalize_permanent(self, job_id, agent, run_id, exit_code, job):
+        """Permanent (code-fault) failure -> normal attempts<max requeue, then fail."""
+        from ..db import mark_job
+        attempts = job.get("attempts", 0)
+        max_attempts = job.get("max_attempts", 2)
+        err = f"agent {agent} exit_code={exit_code} (run_id={run_id})"
+        self._record_outcome(transient=False, recovered=False)
+        if attempts < max_attempts:
+            mark_job(job_id, "queued", run_id=run_id, error=err)
+            logger.warning(
+                f"Job {job_id} ({agent}) failed (exit={exit_code}), "
+                f"requeued (attempt {attempts}/{max_attempts})"
+            )
+        else:
+            mark_job(job_id, "failed", run_id=run_id, error=err)
+            logger.error(
+                f"Job {job_id} ({agent}) failed permanently after "
+                f"{attempts} attempts (exit={exit_code})"
+            )
+            self._notify_failed(job_id, agent, job, run_id,
+                                f"{attempts} attempts (exit={exit_code})")
+
+    def _notify_failed(self, job_id, agent, job, run_id, why):
+        try:
+            from ..notifications import send_telegram
+            send_telegram(
+                f"\U0001f6a8 Job {job_id} ({agent}, repo {job.get('repo')}) "
+                f"failed: {why}. Logs: /app/data/runs/{run_id}.log"
+            )
+        except Exception:
+            pass
+
+    def _record_outcome(self, transient: bool, recovered: bool):
+        """Forward the run outcome to the circuit breaker (if a worker is wired).
+
+        Decoupled via a settable callback (set by QueueWorker.start) so the launcher
+        does not hard-import the worker (avoids a cycle) and tests can run the
+        launcher standalone.
+        """
+        cb = getattr(self, "on_outcome", None)
+        if cb:
+            try:
+                cb(transient=transient, recovered=recovered)
+            except Exception:
+                pass
+
    def _try_advance_stage(self, run_id: int, agent: str, repo: str, branch: str):
-        """After agent finishes successfully, check QG and advance stage if possible."""
+        """After agent finishes successfully, advance the stage via the unified engine.
+
+        ORCH-4 / M-3: the 174-line body that used to live here moved into
+        src/stage_engine.advance_stage(). This is now a thin wrapper: it looks up
+        the task by (repo, branch) and delegates. `agent` is forwarded as
+        finished_agent so the analyst/reviewer/tester/architect rollback branches
+        still trigger exactly as before. The agent-selection bug (it used to call
+        get_agent_for_stage(next_stage)) is fixed inside the engine.
+        """
        try:
            conn = get_db()
            task_row = conn.execute(
@@ -331,178 +666,82 @@ class AgentLauncher:
                return

            task_id, current_stage, work_item_id = task_row
-            qg_name = get_qg_for_stage(current_stage)
-            next_stage = get_next_stage(current_stage)
-
-            if not next_stage:
-                return
-
-            # Run QG check if defined
-            if qg_name and qg_name in QG_CHECKS:
-                check_fn = QG_CHECKS[qg_name]
-                if qg_name in ("check_analysis_approved",):
-                    # Requires human approval - post request comment if analyst just finished
-                    if agent == "analyst" and qg_name == "check_analysis_approved" and work_item_id:
-                        files_check = QG_CHECKS.get("check_analysis_complete")
-                        if files_check:
-                            files_ok, _ = files_check(repo, work_item_id, branch)
-                            if files_ok:
-                                # Full artifacts ready -> In Review
-                                from ..plane_sync import set_issue_in_review
-                                set_issue_in_review(work_item_id)
-                                plane_add_comment(
-                                    work_item_id,
-                                    "\U0001f4cb BRD/\u0422\u0417/AC/TestPlan \u0433\u043e\u0442\u043e\u0432\u044b. "
-                                    "\u041f\u0440\u043e\u0448\u0443 review \u0438 \u0440\u0435\u0430\u043a\u0446\u0438\u044e :approved: \u0434\u043b\u044f \u043f\u0440\u043e\u0434\u0432\u0438\u0436\u0435\u043d\u0438\u044f \u0432 Architecture."
-                                )
-                                notify_approve_requested(task_id)
-                                logger.info(f"Task {task_id}: analyst finished, requested :approved: in Plane")
-                            else:
-                                # Check if questions file exists (in the task worktree)
-                                import os as _os
-                                questions_path = _os.path.join(
-                                    get_worktree_path(repo, branch),
-                                    f"docs/work-items/{work_item_id}/01-questions.md"
-                                )
-                                if _os.path.isfile(questions_path):
-                                    # Analyst has questions -> Needs Input
-                                    from ..plane_sync import set_issue_needs_input
-                                    set_issue_needs_input(work_item_id)
-                                    with open(questions_path, "r") as qf:
-                                        questions_text = qf.read()
-                                    plane_add_comment(
-                                        work_item_id,
-                                        f"\u2753 Analyst \u043d\u0443\u0436\u0434\u0430\u0435\u0442\u0441\u044f \u0432 \u0443\u0442\u043e\u0447\u043d\u0435\u043d\u0438\u0438:\n\n{questions_text}"
-                                    )
-                                    from ..notifications import send_telegram
-                                    send_telegram(
-                                        f"\u2753 {work_item_id}: Analyst \u0437\u0430\u0434\u0430\u0451\u0442 \u0432\u043e\u043f\u0440\u043e\u0441\u044b. \u041e\u0442\u0432\u0435\u0442\u044c \u0432 Plane."
-                                    )
-                                else:
-                                    # No artifacts and no questions
-                                    plane_add_comment(
-                                        work_item_id,
-                                        "\u26a0\ufe0f Analyst \u0437\u0430\u0432\u0435\u0440\u0448\u0438\u043b\u0441\u044f \u0431\u0435\u0437 \u0430\u0440\u0442\u0435\u0444\u0430\u043a\u0442\u043e\u0432 \u0438 \u0431\u0435\u0437 \u0432\u043e\u043f\u0440\u043e\u0441\u043e\u0432. \u041f\u0440\u043e\u0432\u0435\u0440\u044c\u0442\u0435 \u043b\u043e\u0433."
-                                    )
-                    return
-                elif qg_name in ("check_ci_green", "check_tests_local"):
-                    # (repo, branch) signature — already worktree-aware.
-                    passed, reason = check_fn(repo, branch)
-                elif qg_name == "check_tests_passed":
-                    # Artifact check — pass branch so it reads from the worktree.
-                    passed, reason = check_fn(repo, work_item_id or "", branch)
-                else:
-                    # Other artifact checks (check_architecture_done, etc.) — worktree-aware.
-                    passed, reason = check_fn(repo, work_item_id or "", branch)
-
-                if not passed:
-                    logger.info(f"Task {task_id}: QG '{qg_name}' not passed after {agent}: {reason}")
-                    # If reviewer says REQUEST_CHANGES, rollback to development
-                    if agent == "reviewer" and "REQUEST_CHANGES" in reason:
-                        update_task_stage(task_id, "development")
-                        notify_stage_change(task_id, current_stage, "development")
-                        plane_notify_stage(work_item_id, current_stage, "development")
-                        # Count retries
-                        conn2 = get_db()
-                        retry_count = conn2.execute(
-                            "SELECT COUNT(*) FROM agent_runs WHERE task_id=? AND agent='developer'",
-                            (task_id,)
-                        ).fetchone()[0]
-                        conn2.close()
-                        if retry_count < 3:
-                            task_desc = (
-                                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
-                                f"Stage: development\nNote: REQUEST_CHANGES from reviewer "
-                                f"(attempt {retry_count+1}/3). Fix findings in "
-                                f"docs/work-items/{work_item_id}/12-review.md"
-                            )
-                            new_run = self.launch("developer", repo, task_desc, task_id=task_id)
-                            logger.info(f"Task {task_id}: reviewer REQUEST_CHANGES, relaunched developer (run_id={new_run})")
-                        else:
-                            from ..notifications import send_telegram
-                            send_telegram(f"\u26a0\ufe0f {work_item_id}: Max developer retries (3) reached. Manual intervention needed.")
-                            logger.error(f"Task {task_id}: max retries reached")
-
-                    # Task 6: Tester FAIL -> rollback to development
-                    if agent == "tester" and qg_name == "check_tests_passed" and not passed:
-                        update_task_stage(task_id, "development")
-                        notify_stage_change(task_id, current_stage, "development")
-                        plane_notify_stage(work_item_id, current_stage, "development")
-                        from ..plane_sync import set_issue_in_progress
-                        set_issue_in_progress(work_item_id)
-                        plane_add_comment(
-                            work_item_id,
-                            f"\u274c \u0422\u0435\u0441\u0442\u044b \u043d\u0435 \u043f\u0440\u043e\u0448\u043b\u0438: {reason}. Developer \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430."
-                        )
-                        conn2 = get_db()
-                        retry_count = conn2.execute(
-                            "SELECT COUNT(*) FROM agent_runs WHERE task_id=? AND agent='developer'",
-                            (task_id,)
-                        ).fetchone()[0]
-                        conn2.close()
-                        if retry_count < 3:
-                            task_desc = (
-                                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
-                                f"Stage: development\nNote: Tests FAILED. "
-                                f"Fix failures described in docs/work-items/{work_item_id}/13-test-report.md"
-                            )
-                            new_run = self.launch("developer", repo, task_desc, task_id=task_id)
-                            logger.info(f"Task {task_id}: tester FAIL, relaunched developer (run_id={new_run})")
-                        else:
-                            from ..notifications import send_telegram
-                            from ..plane_sync import set_issue_blocked
-                            set_issue_blocked(work_item_id)
-                            send_telegram(f"\U0001f6a8 {work_item_id}: Tests still failing after 3 developer retries. Manual intervention needed.")
-
-                    # Task 8: Architect conflict -> rollback to analysis
-                    if agent == "architect" and qg_name == "check_architecture_done" and not passed:
-                        import os as _os
-                        conflict_path = _os.path.join(
-                            get_worktree_path(repo, branch),
-                            f"docs/work-items/{work_item_id}/10-conflict.md"
-                        )
-                        if _os.path.isfile(conflict_path):
-                            update_task_stage(task_id, "analysis")
-                            notify_stage_change(task_id, current_stage, "analysis")
-                            plane_notify_stage(work_item_id, current_stage, "analysis")
-                            from ..plane_sync import set_issue_in_progress
-                            set_issue_in_progress(work_item_id)
-                            with open(conflict_path, "r") as cf:
-                                conflict_text = cf.read()[:500]
-                            plane_add_comment(
-                                work_item_id,
-                                f"\u26a0\ufe0f Architect \u043d\u0430\u0448\u0451\u043b \u043a\u043e\u043d\u0444\u043b\u0438\u043a\u0442 \u0441 \u0422\u0417. \u0412\u043e\u0437\u0432\u0440\u0430\u0442 \u0432 Analysis.\n\n{conflict_text}"
-                            )
-                            task_desc = (
-                                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
-                                f"Stage: analysis\nNote: Architect conflict. Revise TRZ. "
-                                f"See docs/work-items/{work_item_id}/10-conflict.md"
-                            )
-                            new_run = self.launch("analyst", repo, task_desc, task_id=task_id)
-                            logger.info(f"Task {task_id}: architect conflict, relaunched analyst")
-                            return
-
-                    return
-            elif qg_name:
-                return
-
-            # Advance stage
-            update_task_stage(task_id, next_stage)
-            notify_stage_change(task_id, current_stage, next_stage)
-            plane_notify_stage(work_item_id, current_stage, next_stage)
-            logger.info(f"Task {task_id}: {current_stage} -> {next_stage} (auto-advance after {agent})")
-
-            # Launch next agent if defined
-            next_agent = get_agent_for_stage(next_stage)
-            if next_agent:
-                task_desc = f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\nStage: {next_stage}"
-                new_run_id = self.launch(next_agent, repo, task_desc, task_id=task_id)
-                logger.info(f"Task {task_id}: launched '{next_agent}' (run_id={new_run_id})")
-
+            from ..stage_engine import advance_stage
+            advance_stage(
+                task_id=task_id,
+                current_stage=current_stage,
+                repo=repo,
+                work_item_id=work_item_id,
+                branch=branch,
+                finished_agent=agent,
+            )
        except Exception as e:
            logger.error(f"Auto-advance failed for run_id={run_id}: {e}")


+    def _post_usage_comments(self, run_id, agent, repo, branch, usage):
+        """Feature 4: post the per-agent usage comment (and Deployer summary).
+
+        - Always (on success, with a work_item_id): a per-agent finish comment
+          with token/cost, authored by the finishing agent's Plane bot.
+        - When the deployer finishes: also a per-task summary (SUM over
+          agent_runs GROUP BY agent), authored by the deployer.
+        """
+        from ..usage import usage_comment, task_summary_comment
+        conn = get_db()
+        row = conn.execute(
+            "SELECT id, work_item_id FROM tasks WHERE repo=? AND branch=?",
+            (repo, branch),
+        ).fetchone()
+        conn.close()
+        if not row:
+            return
+        task_id, work_item_id = row[0], row[1]
+        if not work_item_id:
+            return
+        # Observability: every agent's finish comment links its artifact(s)
+        # (reviewer->12-review, tester->13-test-report, deployer->14-deploy-log,
+        # architect->ADR, developer->PR/branch). For the developer we resolve the
+        # open PR number so the link points straight at it.
+        pr_number = None
+        if agent == "developer":
+            pr_number = self._open_pr_number(repo, branch)
+        plane_add_comment(
+            work_item_id,
+            usage_comment(
+                agent,
+                usage,
+                repo=repo,
+                branch=branch,
+                work_item_id=work_item_id,
+                pr_number=pr_number,
+            ),
+            author=agent,
+        )
+        if agent == "deployer":
+            plane_add_comment(
+                work_item_id, task_summary_comment(task_id), author="deployer"
+            )
+
+    def _open_pr_number(self, repo: str, branch: str):
+        """Return the open PR number for `branch`, or None. Never raises."""
+        try:
+            import httpx
+            owner = settings.gitea_owner
+            headers = {"Authorization": f"token {settings.gitea_token}"}
+            resp = httpx.get(
+                f"{settings.gitea_url}/api/v1/repos/{owner}/{repo}/pulls",
+                params={"state": "open", "head": branch},
+                headers=headers, timeout=5,
+            )
+            if resp.status_code == 200:
+                prs = resp.json()
+                if prs:
+                    return prs[0].get("number")
+        except Exception:
+            pass
+        return None
+
    def _ensure_pr(self, repo: str, branch: str, run_id: int):
        import httpx
        owner = settings.gitea_owner
@@ -534,47 +773,6 @@ class AgentLauncher:
            logger.error(f"Failed to create PR for {branch}: {e}")
            return None

-    def _auto_merge_pr(self, repo: str, branch: str, task_id: int, work_item_id: str):
-        import httpx
-        owner = settings.gitea_owner
-        headers = {"Authorization": f"token {settings.gitea_token}"}
-        base_url = f"{settings.gitea_url}/api/v1"
-        try:
-            resp = httpx.get(
-                f"{base_url}/repos/{owner}/{repo}/pulls",
-                params={"state": "open", "head": branch},
-                headers=headers, timeout=10
-            )
-            resp.raise_for_status()
-            prs = resp.json()
-            if not prs:
-                pr_number = self._ensure_pr(repo, branch, 0)
-                if not pr_number:
-                    return False
-            else:
-                pr_number = prs[0]["number"]
-            resp = httpx.post(
-                f"{base_url}/repos/{owner}/{repo}/pulls/{pr_number}/merge",
-                json={"Do": "merge"},
-                headers=headers, timeout=30
-            )
-            if resp.status_code in (200, 204):
-                logger.info(f"PR #{pr_number} merged for {branch}")
-                update_task_stage(task_id, "done")
-                notify_stage_change(task_id, "deploy", "done")
-                plane_notify_stage(work_item_id, "deploy", "done")
-                from ..notifications import send_telegram
-                send_telegram(f"\u2705 {work_item_id}: PR #{pr_number} merged! deploy -> done. Task complete.")
-                return True
-            else:
-                logger.error(f"Merge failed for PR #{pr_number}: {resp.status_code} {resp.text}")
-                from ..notifications import send_telegram
-                send_telegram(f"\u26a0\ufe0f {work_item_id}: Auto-merge failed (HTTP {resp.status_code}). Manual merge needed.")
-                return False
-        except Exception as e:
-            logger.error(f"Auto-merge failed for {branch}: {e}")
-            return False
-
    def _write_task_file(self, repo: str, branch: str, task_file: str, content: str):
        """Write task file directly into the task's worktree.

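Reviewer note on the transient backoff in `_backoff_seconds` from this file's diff: the sketch below restates the same arithmetic as a standalone function so the resulting schedule is easy to check. The free-function name `backoff_seconds` and its keyword defaults are assumptions for illustration (they mirror the `backoff_base_seconds = 10` and `backoff_max_seconds = 600` defaults in `src/config.py`); it is not the launcher's actual entry point.

```python
def backoff_seconds(transient_attempts, retry_after=None, base=10, cap=600):
    """Exponential backoff capped at `cap`.

    A positive Retry-After wins when it is larger than the computed
    backoff, but it is itself clamped to `cap`.
    """
    # Negative attempt counts are treated as zero, as in the diff.
    backoff = min((2 ** max(transient_attempts, 0)) * base, cap)
    if retry_after is not None and retry_after > 0:
        backoff = max(backoff, min(retry_after, cap))
    return int(backoff)


if __name__ == "__main__":
    # Print the schedule for the first six transient attempts.
    for attempt in range(1, 7):
        print(attempt, backoff_seconds(attempt))
```

With these defaults the sixth attempt already reaches the 600 s ceiling, and a `Retry-After` larger than the ceiling is shortened to it, so the worker may poll sooner than the server asked in that one case.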
--- a/src/config.py
+++ b/src/config.py
@@ -9,8 +9,20 @@ class Settings(BaseSettings):
    plane_webhook_secret: str = ""
    plane_project_id: str = ""

+    # Per-agent Plane bot tokens (feat: per-agent comment authorship).
+    # When set, add_comment posts under the matching bot so Plane shows the
+    # real author (Analyst/Architect/...). Empty -> fallback to plane_api_token.
+    plane_bot_analyst: str = ""
+    plane_bot_architect: str = ""
+    plane_bot_developer: str = ""
+    plane_bot_reviewer: str = ""
+    plane_bot_tester: str = ""
+    plane_bot_deployer: str = ""
+    plane_bot_stream: str = ""
+
    # Gitea
    gitea_url: str = "http://localhost:3000"
+    gitea_public_url: str = ""  # external URL for clickable links in comments; falls back to gitea_url
    gitea_token: str = ""
    gitea_webhook_secret: str = ""
    gitea_owner: str = "admin"
@@ -30,6 +42,51 @@ class Settings(BaseSettings):
    # DB
    db_path: str = "/app/data/orchestrator.db"

+    # ORCH-1 (F-2b): persistent job queue / background worker.
+    # max_concurrency  -> max agent jobs running in parallel (env ORCH_MAX_CONCURRENCY)
+    # queue_poll_interval -> worker loop poll seconds (env ORCH_QUEUE_POLL_INTERVAL)
+    max_concurrency: int = 1
+    queue_poll_interval: float = 2.0
+
+    # ORCH-1b (resilience): preflight + 429/rate-limit + backoff + circuit breaker.
+    # preflight_cache_ttl  -> cache the cheap CLI/network preflight result (seconds);
+    #                         the worker does NOT re-run `claude --version` more often
+    #                         than this (env ORCH_PREFLIGHT_CACHE_TTL).
+    # backoff_base_seconds -> base for exponential transient backoff.
+    # backoff_max_seconds  -> ceiling for the transient backoff.
+    # transient_max_attempts -> retry budget for transient (429/overload/network)
+    #                         failures, separate from code-fault `attempts`.
+    # breaker_threshold    -> consecutive transient failures that OPEN the breaker.
+    # breaker_pause_seconds -> how long the breaker stays open before half-open.
+    preflight_cache_ttl: int = 45
+    backoff_base_seconds: int = 10
+    backoff_max_seconds: int = 600
+    transient_max_attempts: int = 5
+    breaker_threshold: int = 3
+    breaker_pause_seconds: int = 300
+
+    # ORCH-7 (M-2): agent timeout + graceful kill.
+    # agent_timeout_seconds   -> default per-agent wall-clock budget; the watchdog
+    #                            kills the run after this (env ORCH_AGENT_TIMEOUT_SECONDS).
+    # agent_kill_grace_seconds-> pause between SIGTERM and SIGKILL so claude can
+    #                            flush artifacts before the hard kill
+    #                            (env ORCH_AGENT_KILL_GRACE_SECONDS).
+    # agent_timeout_overrides_json -> optional per-agent override JSON object,
+    #                            e.g. {"reviewer": 3600, "architect": 2700}
+    #                            (env ORCH_AGENT_TIMEOUT_OVERRIDES_JSON).
+    agent_timeout_seconds: int = 1800
+    agent_kill_grace_seconds: int = 20
+    agent_timeout_overrides_json: str = ""
+
+    # L-2: run-log rotation. Old per-run logs in <data>/runs/*.log are pruned at
+    # app startup (best-effort). A *.log is removed if it is older than
+    # log_keep_days OR not within the log_keep_max most-recent logs (whichever
+    # hits first). Only *.log files are touched; the active run log is skipped.
+    #   log_keep_days -> max age in days (env ORCH_LOG_KEEP_DAYS).
+    #   log_keep_max  -> max number of newest logs to retain (env ORCH_LOG_KEEP_MAX).
+    log_keep_days: int = 30
+    log_keep_max: int = 500
+

    # Telegram notifications
    telegram_bot_token: str = ""
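Reviewer note on `agent_timeout_overrides_json`: the diff shown here does not include the code that consumes this setting, so the following is only a minimal sketch of how a per-agent timeout could be resolved from it. The helper name `resolve_agent_timeout` and its fallback rules (malformed JSON or a non-positive value falls back to the default) are assumptions, not the project's implementation.

```python
import json


def resolve_agent_timeout(agent, default_seconds=1800, overrides_json=""):
    """Hypothetical helper: per-agent timeout from a JSON object, else the default."""
    if not overrides_json.strip():
        return default_seconds
    try:
        overrides = json.loads(overrides_json)
    except json.JSONDecodeError:
        # Assumption: a malformed override string must not break agent launches.
        return default_seconds
    if not isinstance(overrides, dict):
        return default_seconds
    value = overrides.get(agent)
    # Accept only positive numbers; bool is an int subclass, so exclude it.
    if isinstance(value, (int, float)) and not isinstance(value, bool) and value > 0:
        return int(value)
    return default_seconds


if __name__ == "__main__":
    raw = '{"reviewer": 3600, "architect": 2700}'
    print(resolve_agent_timeout("reviewer", 1800, raw))
    print(resolve_agent_timeout("developer", 1800, raw))
```

Falling back to `agent_timeout_seconds` on any parse problem keeps a typo in the environment variable from disabling the watchdog entirely.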
--- a/src/db.py
+++ b/src/db.py
@@ -40,10 +40,87 @@ def init_db():
            exit_code INTEGER,
            output_path TEXT
        );
+        -- ORCH-1 (F-2b): persistent job queue. Webhook handlers enqueue a job and
+        -- return immediately; a background worker claims jobs (respecting
+        -- max_concurrency), spawns the claude agent, and updates the status.
+        -- Restart-safe: running jobs are requeued on startup (queue-recovery).
+        CREATE TABLE IF NOT EXISTS jobs (
+            id INTEGER PRIMARY KEY AUTOINCREMENT,
+            agent TEXT NOT NULL,
+            repo TEXT NOT NULL,
+            task_id INTEGER,                          -- FK tasks.id (nullable)
+            task_content TEXT,                        -- written to the agent task_file
+            status TEXT NOT NULL DEFAULT 'queued',    -- queued|running|done|failed
+            attempts INTEGER NOT NULL DEFAULT 0,
+            max_attempts INTEGER NOT NULL DEFAULT 2,
+            run_id INTEGER,                           -- agent_runs.id once started
+            error TEXT,                               -- last error message
+            transient_attempts INTEGER NOT NULL DEFAULT 0,  -- ORCH-1 resilience: 429/transient retries
+            available_at TEXT,                        -- ORCH-1 resilience: backoff gate (claim when <= now)
+            created_at TEXT DEFAULT (datetime('now')),
+            started_at TEXT,
+            finished_at TEXT
+        );
+        CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs(status, id);
    """)
+    # Lightweight migration: add resilience columns to a pre-existing jobs table
+    # (CREATE TABLE IF NOT EXISTS won't add columns to an already-created table).
+    _ensure_column(conn, "jobs", "transient_attempts", "INTEGER NOT NULL DEFAULT 0")
+    _ensure_column(conn, "jobs", "available_at", "TEXT")
+    # ORCH-5 (M-7): webhook delivery de-dup. Add events.delivery_id and a PARTIAL
+    # unique index. Partial (WHERE delivery_id IS NOT NULL) so pre-existing rows
+    # (which have NULL delivery_id) never collide with each other. Restart-safe:
+    # _ensure_column is a no-op once the column exists, and CREATE INDEX IF NOT
+    # EXISTS is a no-op once the index exists, so this is safe on the live prod DB.
+    _ensure_column(conn, "events", "delivery_id", "TEXT")
+    conn.execute(
+        "CREATE UNIQUE INDEX IF NOT EXISTS idx_events_delivery "
+        "ON events(delivery_id) WHERE delivery_id IS NOT NULL"
+    )
+    # Feature 4 (token usage): per-run token / cost accounting. Parsed from the
+    # claude --output-format json result by the launcher monitor. Idempotent
+    # ALTERs (no-op once the columns exist) so this is safe on the live prod DB.
+    _ensure_column(conn, "agent_runs", "input_tokens", "INTEGER")
+    _ensure_column(conn, "agent_runs", "output_tokens", "INTEGER")
+    _ensure_column(conn, "agent_runs", "cache_read_tokens", "INTEGER")
+    # Observability fix: also persist cache-CREATION input tokens. Claude CLI
+    # reports the real input split across input_tokens (fresh, ~tens) +
+    # cache_read_input_tokens (cache hit, millions) + cache_creation_input_tokens
+    # (writing new cache). Without this column the cache_creation slice is lost
+    # and the "X in" figure understates the true prompt size. Idempotent ALTER.
+    _ensure_column(conn, "agent_runs", "cache_creation_tokens", "INTEGER")
+    _ensure_column(conn, "agent_runs", "cost_usd", "REAL")
+    # Telegram live tracker (feat/telegram-live-tracker): persist the FULL model
+    # name (e.g. "tokenator/claude-opus-4-8") per agent_runs row so the tracker
+    # can render a short model tag per stage. Parsed from the run-log result JSON
+    # (modelUsage key) by the launcher monitor; NULL when unknown. Idempotent ALTER.
+    _ensure_column(conn, "agent_runs", "model", "TEXT")
+    # Telegram live tracker: one editable Telegram message per task. We store its
+    # message_id so each stage transition can editMessageText the same message
+    # instead of spamming a new one. Idempotent ALTER (safe on the live prod DB).
+    _ensure_column(conn, "tasks", "tracker_message_id", "INTEGER")
+    # Telegram live tracker: human-readable task title for the tracker header
+    # ("🛠️ ET-012 · <title>"). Populated from the Plane work-item name at task
+    # creation; falls back to the work_item_id when absent. Idempotent ALTER.
+    _ensure_column(conn, "tasks", "title", "TEXT")
+    # Telegram live tracker: "BRD review" is the only HUMAN gate time — the delta
+    # between "BRD ready / approve requested" and the analysis->architecture
+    # advance (human flipped Plane to Approved). Persisted on the task so the
+    # tracker can show "твоё время" ("your time") without recomputing from activity history.
+    _ensure_column(conn, "tasks", "brd_review_started_at", "TEXT")
+    _ensure_column(conn, "tasks", "brd_review_ended_at", "TEXT")
+    conn.commit()
    conn.close()


+def _ensure_column(conn, table: str, column: str, decl: str):
+    """Add a column to `table` if it does not already exist (idempotent migration)."""
+    cols = [r[1] for r in conn.execute(f"PRAGMA table_info({table})").fetchall()]
+    if column not in cols:
+        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
+        conn.commit()
+
+
 def get_task_by_plane_id(plane_id: str) -> dict | None:
    """Find task by Plane work item ID (checks plane_id and plane_issue_id)."""
    conn = get_db()
@@ -79,6 +156,71 @@ def update_task_stage(task_id: int, stage: str):
    conn.close()


+# ---------------------------------------------------------------------------
+# Telegram live tracker helpers (feat/telegram-live-tracker)
+# ---------------------------------------------------------------------------
+
+def get_tracker_message_id(task_id: int) -> int | None:
+    """Return the stored Telegram tracker message_id for a task, or None."""
+    conn = get_db()
+    try:
+        row = conn.execute(
+            "SELECT tracker_message_id FROM tasks WHERE id=?", (task_id,)
+        ).fetchone()
+    finally:
+        conn.close()
+    return row[0] if row and row[0] is not None else None
+
+
+def set_tracker_message_id(task_id: int, message_id: int) -> None:
+    """Persist the Telegram tracker message_id for a task (idempotent overwrite)."""
+    conn = get_db()
+    try:
+        conn.execute(
+            "UPDATE tasks SET tracker_message_id=? WHERE id=?",
+            (message_id, task_id),
+        )
+        conn.commit()
+    finally:
+        conn.close()
+
+
+def mark_brd_review_started(task_id: int) -> None:
+    """Stamp when BRD review (the human approve gate) started, if not already set.
+
+    Idempotent: only sets it the first time (a retried analyst run must not reset
+    the clock). The delta to brd_review_ended_at is the only "твоё время" ("your time").
+    """
+    conn = get_db()
+    try:
+        conn.execute(
+            "UPDATE tasks SET brd_review_started_at=datetime('now') "
+            "WHERE id=? AND brd_review_started_at IS NULL",
+            (task_id,),
+        )
+        conn.commit()
+    finally:
+        conn.close()
+
+
+def mark_brd_review_ended(task_id: int) -> None:
+    """Stamp when BRD review ended (analysis->architecture advance / Approved).
+
+    Idempotent: only sets it the first time and only if a start exists.
+    """
+    conn = get_db()
+    try:
+        conn.execute(
+            "UPDATE tasks SET brd_review_ended_at=datetime('now') "
+            "WHERE id=? AND brd_review_started_at IS NOT NULL "
+            "AND brd_review_ended_at IS NULL",
+            (task_id,),
+        )
+        conn.commit()
+    finally:
+        conn.close()
+
+
 def get_next_work_item_id(repo: str, prefix: str = "ET") -> str:
    """Generate next work item ID (e.g., ET-003 / ORCH-001).

@@ -105,3 +247,317 @@ def get_next_work_item_id(repo: str, prefix: str = "ET") -> str:
        next_num = 1

    return f"{prefix}-{next_num:03d}"
+
+
+def ensure_unique_work_item_id(work_item_id: str, repo: str) -> str:
+    """BUG 2a: guarantee work_item_id uniqueness per repo on top of the M-6 derive.
+
+    M-6 derives the work_item_id from the Plane sequence_id. That number can
+    collide (e.g. an issue was deleted and the sequence reused, or two issues
+    map to the same number) -> the SAME ET-NNN gets handed to two different
+    tasks, which then physically share a branch/worktree slug prefix and step on
+    each other (see ET-006: task 8 and task 25).
+
+    This is a guard LAYERED ON TOP of the M-6 derive (it does NOT replace it):
+    given the derived id, if that exact <PREFIX>-NNN already exists in the tasks
+    table for this repo, walk forward (ET-007, ET-008, ...) until a free number
+    is found and return that instead. If the derived id is free, it is returned
+    unchanged.
+    """
+    if not work_item_id or "-" not in work_item_id:
+        return work_item_id
+    prefix, num_str = work_item_id.rsplit("-", 1)
+    try:
+        num = int(num_str)
+    except ValueError:
+        return work_item_id
+    width = len(num_str)
+
+    conn = get_db()
+    try:
+        candidate = work_item_id
+        while conn.execute(
+            "SELECT 1 FROM tasks WHERE repo = ? AND work_item_id = ? LIMIT 1",
+            (repo, candidate),
+        ).fetchone() is not None:
+            num += 1
+            candidate = f"{prefix}-{num:0{width}d}"
+        return candidate
+    finally:
+        conn.close()
+
+
+# ---------------------------------------------------------------------------
+# ORCH-5 (M-7): idempotent webhook event logging
+# ---------------------------------------------------------------------------
+
+def insert_event_dedup(
+    source: str, event_type: str, payload: str, delivery_id: str
+) -> bool:
+    """Idempotently log a webhook event keyed by delivery_id.
+
+    Returns True if a NEW row was inserted (caller should dispatch the event) and
+    False if this delivery_id was already present (a duplicate delivery -> caller
+    must skip dispatch/enqueue). Uses INSERT OR IGNORE against the partial UNIQUE
+    index idx_events_delivery; rowcount==1 means the row was actually inserted.
+    """
+    conn = get_db()
+    try:
+        cur = conn.execute(
+            "INSERT OR IGNORE INTO events (source, event_type, payload, delivery_id) "
+            "VALUES (?, ?, ?, ?)",
+            (source, event_type, payload, delivery_id),
+        )
+        conn.commit()
+        return cur.rowcount == 1
+    finally:
+        conn.close()
+
+
+# ---------------------------------------------------------------------------
+# ORCH-1 (F-2b): job queue helpers
+# ---------------------------------------------------------------------------
+
+def enqueue_job(
+    agent: str,
+    repo: str,
+    task_content: str | None = None,
+    task_id: int | None = None,
+    max_attempts: int = 2,
+) -> int:
+    """Enqueue a new job (status='queued'). Returns the new job id.
+
+    This is what webhook handlers call instead of launching an agent in-process:
+    it is a fast DB INSERT that returns immediately. The background worker
+    (queue_worker) picks the job up later.
+    """
+    conn = get_db()
+    cursor = conn.execute(
+        "INSERT INTO jobs (agent, repo, task_id, task_content, max_attempts) "
+        "VALUES (?, ?, ?, ?, ?)",
+        (agent, repo, task_id, task_content, max_attempts),
+    )
+    job_id = cursor.lastrowid
+    conn.commit()
+    conn.close()
+    return job_id
+
+
+def claim_next_job() -> dict | None:
+    """Atomically claim the oldest queued job and mark it 'running'.
+
+    Atomicity: the UPDATE carries the `status='queued'` guard in its WHERE clause
+    and we check `rowcount`. If two worker ticks race for the same row, only the
+    first UPDATE flips it to 'running' (rowcount==1); the loser sees rowcount==0
+    and retries the SELECT. The SELECT only nominates a candidate; correctness
+    rests on the guarded UPDATE alone. Returns the claimed job dict, or None when
+    no queued job is currently eligible (empty queue or all gated by available_at).
+    """
+    conn = get_db()
+    try:
+        while True:
+            row = conn.execute(
+                "SELECT id FROM jobs WHERE status='queued' "
+                "AND (available_at IS NULL OR available_at <= datetime('now')) "
+                "ORDER BY id LIMIT 1"
+            ).fetchone()
+            if not row:
+                return None
+            job_id = row["id"]
+            cur = conn.execute(
+                "UPDATE jobs SET status='running', "
+                "attempts = attempts + 1, started_at = datetime('now') "
+                "WHERE id = ? AND status='queued'",
+                (job_id,),
+            )
+            conn.commit()
+            if cur.rowcount == 1:
+                claimed = conn.execute(
+                    "SELECT * FROM jobs WHERE id = ?", (job_id,)
+                ).fetchone()
+                return dict(claimed)
+            # Lost the race for this row; loop and try the next queued job.
+    finally:
+        conn.close()
+
+
+def mark_job_transient(job_id: int, available_at_sql_offset_seconds: int,
+                       error: str | None = None) -> None:
+    """ORCH-1 resilience: requeue a job after a *transient* failure (429/overload/net).
+
+    Increments `transient_attempts` (separate from the code-fault `attempts`),
+    sets status back to 'queued', and gates re-pickup via `available_at` =
+    now + backoff seconds. started_at/finished_at are cleared.
+    """
+    conn = get_db()
+    sets = [
+        "status='queued'",
+        "transient_attempts = transient_attempts + 1",
+        "available_at = datetime('now', ?)",
+        "started_at = NULL",
+        "finished_at = NULL",
+    ]
+    params: list = [f"+{int(available_at_sql_offset_seconds)} seconds"]
+    if error is not None:
+        sets.append("error = ?")
+        params.append(error)
+    params.append(job_id)
+    conn.execute(f"UPDATE jobs SET {', '.join(sets)} WHERE id = ?", params)
+    conn.commit()
+    conn.close()
+
+
+def mark_job(
+    job_id: int,
+    status: str,
+    run_id: int | None = None,
+    error: str | None = None,
+):
+    """Update a job's status (queued|running|done|failed).
+
+    - run_id (optional): link to the agent_runs row that executed this job.
+    - error (optional): last error message (for failed/retry).
+    - 'done'/'failed' also stamp finished_at.
+    - 'queued' (requeue for retry) clears started_at/finished_at so the next
+      claim treats it as fresh.
+    """
+    conn = get_db()
+    sets = ["status = ?"]
+    params: list = [status]
+    if run_id is not None:
+        sets.append("run_id = ?")
+        params.append(run_id)
+    if error is not None:
+        sets.append("error = ?")
+        params.append(error)
+    if status in ("done", "failed"):
+        sets.append("finished_at = datetime('now')")
+    elif status == "queued":
+        sets.append("started_at = NULL")
+        sets.append("finished_at = NULL")
+    params.append(job_id)
+    conn.execute(f"UPDATE jobs SET {', '.join(sets)} WHERE id = ?", params)
+    conn.commit()
+    conn.close()
+
+
+def has_active_job_for_task(task_id: int) -> bool:
+    """True if the task already has a queued or running job.
+
+    Used by the status-only verdict model (handle_status_start) to guard against
+    double-launching an agent when a duplicate In Progress webhook arrives or a
+    job is still in flight. The events de-dup absorbs identical webhook bodies;
+    this guards against distinct webhooks while a job is pending/running.
+    """
+    conn = get_db()
+    row = conn.execute(
+        "SELECT 1 FROM jobs WHERE task_id = ? AND status IN ('queued','running') LIMIT 1",
+        (task_id,),
+    ).fetchone()
+    conn.close()
+    return row is not None
+
+
+def count_running_jobs() -> int:
+    """Number of jobs currently in 'running' status (for max_concurrency)."""
+    conn = get_db()
+    n = conn.execute(
+        "SELECT COUNT(*) FROM jobs WHERE status='running'"
+    ).fetchone()[0]
+    conn.close()
+    return int(n)
+
+
+def requeue_running_jobs() -> int:
+    """Queue-recovery: on startup, any job left 'running' belongs to a worker that
+    died on restart -> put it back to 'queued'. `attempts` is not reset here, and
+    claim_next_job increments it again on the next pickup, so the interrupted run
+    stays counted in `attempts`. Returns the number of requeued jobs.
+    """
+    conn = get_db()
+    cur = conn.execute(
+        "UPDATE jobs SET status='queued', started_at = NULL "
+        "WHERE status='running'"
+    )
+    conn.commit()
+    n = cur.rowcount
+    conn.close()
+    return int(n)
+
+
+def get_job(job_id: int) -> dict | None:
+    """Fetch a single job by id."""
+    conn = get_db()
+    row = conn.execute("SELECT * FROM jobs WHERE id = ?", (job_id,)).fetchone()
+    conn.close()
+    return dict(row) if row else None
+
+
+def job_status_counts() -> dict:
+    """Return counts grouped by status (for /queue observability)."""
+    conn = get_db()
+    rows = conn.execute(
+        "SELECT status, COUNT(*) AS n FROM jobs GROUP BY status"
+    ).fetchall()
+    conn.close()
+    counts = {"queued": 0, "running": 0, "done": 0, "failed": 0}
+    for r in rows:
+        counts[r["status"]] = r["n"]
+    return counts
+
+
+def recent_jobs(limit: int = 10) -> list[dict]:
+    """Return the most recent jobs (for /queue observability)."""
+    conn = get_db()
+    rows = conn.execute(
+        "SELECT * FROM jobs ORDER BY id DESC LIMIT ?", (limit,)
+    ).fetchall()
+    conn.close()
+    return [dict(r) for r in rows]
+
+
+# ---------------------------------------------------------------------------
+# ORCH-1b (resilience): transient backoff helpers
+# ---------------------------------------------------------------------------
+
+def requeue_job_transient(job_id: int, delay_seconds: float, error: str | None = None):
+    """ORCH-1b: requeue a job after a TRANSIENT (429/overload/network) failure.
+
+    Unlike a code-fault requeue, this:
+      - increments `transient_attempts` (a separate budget from code-fault attempts)
+      - sets `available_at = now + delay_seconds` so claim_next_job won't pick it
+        up until the backoff window elapses
+      - sets status back to 'queued' and clears started_at/finished_at
+
+    delay_seconds is computed by the caller (exp backoff, capped, Retry-After).
+    """
+    conn = get_db()
+    conn.execute(
+        "UPDATE jobs SET status='queued', "
+        "transient_attempts = transient_attempts + 1, "
+        "available_at = datetime('now', ? || ' seconds'), "
+        "started_at = NULL, finished_at = NULL, "
+        "error = COALESCE(?, error) "
+        "WHERE id = ?",
+        (f"+{int(round(delay_seconds))}", error, job_id),
+    )
+    conn.commit()
+    conn.close()
+
+
+def compute_backoff(transient_attempts: int, retry_after: float | None = None) -> float:
+    """ORCH-1b: exponential backoff (seconds) for a transient failure.
+
+    delay = min(2**transient_attempts * base, max). If the server sent a
+    Retry-After hint we honour it as a floor (use the larger of the two), but the
+    result is still capped at max, so a hint above the cap is cut down to the cap.
+
+    `transient_attempts` is the count AFTER this failure (i.e. how many transient
+    failures have occurred), so the first backoff uses 2**1.
+    """
+    base = getattr(settings, "backoff_base_seconds", 10)
+    cap = getattr(settings, "backoff_max_seconds", 600)
+    exp = min((2 ** max(transient_attempts, 0)) * base, cap)
+    if retry_after is not None and retry_after > 0:
+        return float(min(max(exp, retry_after), cap))
+    return float(exp)
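
Reviewer note on `compute_backoff`: the rule can be exercised in isolation. This is a minimal sketch, not the PR's code: `backoff_delay` is a hypothetical name, and `base`/`cap` are passed as parameters (assumed defaults 10 and 600, the same fallbacks the diff uses) instead of being read from `settings`.

```python
from typing import Optional


def backoff_delay(
    transient_attempts: int,
    retry_after: Optional[float] = None,
    base: float = 10,   # assumed default, mirrors backoff_base_seconds fallback
    cap: float = 600,   # assumed default, mirrors backoff_max_seconds fallback
) -> float:
    """Exponential backoff with a Retry-After floor, capped at `cap`."""
    # Exponential part: 2**n * base, never above the cap.
    exp = min((2 ** max(transient_attempts, 0)) * base, cap)
    if retry_after is not None and retry_after > 0:
        # The server hint acts as a floor, but the cap still wins.
        return float(min(max(exp, retry_after), cap))
    return float(exp)


first = backoff_delay(1)                      # first transient failure
capped = backoff_delay(6)                     # 2**6 * 10 exceeds the cap
hinted = backoff_delay(1, retry_after=45)     # hint larger than the exponential part
over_cap = backoff_delay(1, retry_after=900)  # hint larger than the cap
```

Note the last case: a Retry-After above the cap is shortened to the cap, so the job can become eligible before the server's requested delay has passed.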
--- a/src/error_classifier.py
+++ b/src/error_classifier.py
@@ -0,0 +1,87 @@
+"""ORCH-1 resilience: classify an agent failure as transient vs permanent.
+
+Rate limits / overload / network blips cannot be reliably predicted in advance,
+so we classify *after the run* by scanning the agent's combined stdout/stderr log
+(B-2 sends both to /app/data/runs/<run_id>.log).
+
+- transient -> 429 / rate limit / overloaded / network / quota-exhausted etc.
+              => backoff + transient retry (separate counter, larger budget).
+- permanent -> a genuine code fault / agent error
+              => normal attempts < max_attempts, then 'failed'.
+
+Also extracts a Retry-After hint (seconds) when the server provided one.
+"""
+import re
+
+# Case-insensitive substrings/patterns that signal a transient/rate-limit issue.
+_TRANSIENT_PATTERNS = [
+    r"\b429\b",
+    r"rate[\s_-]*limit",
+    r"rate_limit_error",
+    r"overloaded",
+    r"overloaded_error",
+    r"too many requests",
+    r"quota",
+    r"insufficient[_\s-]*quota",
+    r"retry[\s-]*after",
+    r"service unavailable",
+    r"\b503\b",
+    r"\b529\b",
+    r"timed out",
+    r"timeout",
+    r"connection (reset|refused|error|aborted)",
+    r"temporarily unavailable",
+    r"econnreset",
+    r"etimedout",
+]
+
+_TRANSIENT_RE = re.compile("|".join(_TRANSIENT_PATTERNS), re.IGNORECASE)
+
+# Retry-After: header style ("Retry-After: 30") or JSON ("retry_after": 30) or
+# "retry after 30 seconds". Returns the integer seconds.
+_RETRY_AFTER_RE = re.compile(
+    r"retry[\s_-]*after[\"']?\s*[:=]?\s*[\"']?\s*(\d+)",
+    re.IGNORECASE,
+)
+
+
+def classify_text(text: str) -> str:
+    """Return 'transient' or 'permanent' for a chunk of log/stderr text."""
+    if not text:
+        return "permanent"
+    return "transient" if _TRANSIENT_RE.search(text) else "permanent"
+
+
+def parse_retry_after(text: str) -> int | None:
+    """Return Retry-After seconds if present in the text, else None."""
+    if not text:
+        return None
+    m = _RETRY_AFTER_RE.search(text)
+    if m:
+        try:
+            return int(m.group(1))
+        except (TypeError, ValueError):
+            return None
+    return None
+
+
+def classify_log_file(path: str, tail_bytes: int = 16384) -> tuple[str, int | None]:
+    """Classify the tail of a log file.
+
+    Reads the last `tail_bytes` of the log (rate-limit messages appear near the
+    end) and returns (classification, retry_after_seconds_or_None).
+    On any read error, treats it as 'permanent' (no special backoff).
+    """
+    if not path:
+        return "permanent", None
+    try:
+        with open(path, "rb") as f:
+            try:
+                f.seek(-tail_bytes, 2)
+            except OSError:
+                f.seek(0)
+            data = f.read()
+        text = data.decode("utf-8", errors="replace")
+    except Exception:
+        return "permanent", None
+    return classify_text(text), parse_retry_after(text)
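
Reviewer note on the classifier: a quick way to sanity-check the pattern approach is to run a few representative log lines through it. The sketch below is an assumption-laden reduction, not the module itself: it uses only a subset of the patterns, and `classify` / `retry_after_seconds` are hypothetical stand-ins for `classify_text` / `parse_retry_after`.

```python
import re
from typing import Optional

# Reduced pattern set for illustration only (the real list is longer).
TRANSIENT_RE = re.compile(
    r"\b429\b|rate[\s_-]*limit|overloaded|timed out|connection (reset|refused)",
    re.IGNORECASE,
)

# Same shape as the Retry-After pattern in the diff: header, JSON, or prose.
RETRY_AFTER_RE = re.compile(
    r"retry[\s_-]*after[\"']?\s*[:=]?\s*[\"']?\s*(\d+)",
    re.IGNORECASE,
)


def classify(text: str) -> str:
    """Return 'transient' if any pattern matches, else 'permanent'."""
    if not text:
        return "permanent"
    return "transient" if TRANSIENT_RE.search(text) else "permanent"


def retry_after_seconds(text: str) -> Optional[int]:
    """Return the Retry-After value in seconds, or None if absent."""
    m = RETRY_AFTER_RE.search(text or "")
    return int(m.group(1)) if m else None


rate_limited = classify("HTTP 429 Too Many Requests")
code_fault = classify("NameError: name 'x' is not defined")
header_hint = retry_after_seconds("Retry-After: 30")
json_hint = retry_after_seconds('{"retry_after": 12}')
```

Because classification is a substring search over the log tail, any agent output that merely mentions a matching word is also treated as transient; tests for the full pattern list would be a useful place to pin that behaviour down.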
--- a/src/main.py
+++ b/src/main.py
@@ -51,7 +51,41 @@ async def lifespan(app: FastAPI):
        except Exception:
            pass
        log.warning(f"Recovered {len(orphan_rows)} orphaned agent runs")
-    yield
+
+    # ORCH-1 (F-2b): queue-recovery. Any job left in 'running' status belongs to a
+    # worker that died on the previous restart -> put it back to 'queued' so the
+    # worker re-picks it up (restart-safe, no lost work). Runs AFTER M-1.
+    from .db import requeue_running_jobs
+    requeued = requeue_running_jobs()
+    if requeued:
+        log.warning(f"Queue-recovery: requeued {requeued} running job(s) after restart")
+
+    # L-2: rotate old per-run logs at startup (best-effort; never fatal).
+    try:
+        import os as _os
+        from .config import settings as _settings
+        from .agents.launcher import prune_run_logs
+        _runs_dir = _os.path.join(_os.path.dirname(_settings.db_path), "runs")
+        _removed = prune_run_logs(
+            _runs_dir,
+            keep_days=_settings.log_keep_days,
+            keep_max=_settings.log_keep_max,
+        )
+        if _removed:
+            log.info(f"Log rotation: pruned {_removed} old run log(s) from {_runs_dir}")
+    except Exception as e:
+        log.warning(f"Log rotation skipped: {e}")
+
+    # Start the background job-queue worker (ORCH-1).
+    from .queue_worker import worker
+    worker.start()
+
+    try:
+        yield
+    finally:
+        # Graceful shutdown of the worker (running agents keep going; their jobs
+        # are requeued on next start via queue-recovery if the process dies).
+        worker.stop()


 app = FastAPI(title="Multi-Agent Orchestrator", lifespan=lifespan)
@@ -73,3 +107,17 @@ async def status():
    ).fetchall()
    conn.close()
    return {"active_tasks": [dict(t) for t in tasks]}
+
+
+@app.get("/queue")
+async def queue():
+    """ORCH-1: job-queue observability — status counts + recent jobs."""
+    from .db import job_status_counts, recent_jobs
+    from .queue_worker import worker
+    return {
+        "counts": job_status_counts(),
+        "max_concurrency": worker.max_concurrency,
+        "poll_interval": worker.poll_interval,
+        "resilience": worker.status(),
+        "recent": recent_jobs(10),
+    }
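
Reviewer note on queue-recovery: the lifespan hook above requeues jobs left in 'running', and the next claim increments `attempts` again. The sketch below reproduces that claim, crash, recover, reclaim cycle against an in-memory SQLite database. It is a simplified illustration under assumptions: the `jobs` table has only the three columns needed here, and `claim` / `recover` are hypothetical reductions of `claim_next_job` / `requeue_running_jobs`.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the jobs table (assumed minimal schema)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, "
        "status TEXT NOT NULL DEFAULT 'queued', "
        "attempts INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def claim(conn: sqlite3.Connection):
    """Claim the oldest queued job via a guarded UPDATE; return its id or None."""
    row = conn.execute(
        "SELECT id FROM jobs WHERE status='queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    cur = conn.execute(
        "UPDATE jobs SET status='running', attempts = attempts + 1 "
        "WHERE id = ? AND status='queued'",
        (row["id"],),
    )
    conn.commit()
    # rowcount == 1 means this caller won the row.
    return row["id"] if cur.rowcount == 1 else None


def recover(conn: sqlite3.Connection) -> int:
    """Startup recovery: put every 'running' job back to 'queued'."""
    cur = conn.execute("UPDATE jobs SET status='queued' WHERE status='running'")
    conn.commit()
    return cur.rowcount


conn = make_db()
conn.execute("INSERT INTO jobs DEFAULT VALUES")
conn.commit()

first = claim(conn)       # job is now 'running'
second = claim(conn)      # nothing left to claim
requeued = recover(conn)  # simulated restart
again = claim(conn)       # same job is picked up again
attempts = conn.execute(
    "SELECT attempts FROM jobs WHERE id = ?", (again,)
).fetchone()["attempts"]
```

In this sketch a single restart already consumes two attempts for the job, which is worth keeping in mind given the default `max_attempts` of 2 in `enqueue_job`.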
--- a/src/notifications.py
+++ b/src/notifications.py
@@ -1,6 +1,24 @@
-"""Notifications and logging for orchestrator events."""
+"""Notifications and logging for orchestrator events.

+feat/telegram-live-tracker (Variant B+): instead of ~15 separate Telegram
+messages per task (agent start / finish / stage transition / QG-pending / tech
+noise), the orchestrator now maintains ONE live tracker message per task that is
+edited in place (editMessageText) on every stage transition. Only events that
+NEED Slava's attention are sent as SEPARATE, notifying messages:
+
+  * approve-gate  (notify_approve_requested)  — BRD/TZ/AC ready, flip to Approved
+  * deploy failed / rolled back               — send_telegram from launcher/engine
+  * agent failed (exit_code != 0)             — send_telegram from launcher
+  * task error    (notify_error)
+
+The tracker itself is edited SILENTLY (disable_notification: true). Stage-change,
+agent-start, agent-finish and QG-pending no longer emit their own messages — they
+just refresh the tracker (or are log-only).
+"""
+
+import html
 import logging
+
 import httpx

 logger = logging.getLogger("orchestrator")
@@ -17,25 +35,115 @@ def _get_settings():
    return _settings


-def send_telegram(text: str):
-    """Send notification to Telegram. Fire-and-forget, never raises."""
+# --------------------------------------------------------------------------- #
+# Low-level Telegram primitives
+# --------------------------------------------------------------------------- #
+
+def send_telegram(text: str, disable_notification: bool = False):
+    """Send a notification to Telegram. Fire-and-forget, never raises.
+
+    Returns the Telegram message_id on success, else None (so callers that want
+    to track the message — the tracker — can store it; legacy callers ignore it).
+    """
    s = _get_settings()
    if not s.telegram_bot_token or not s.telegram_chat_id:
-        return
+        return None
    try:
        url = f"https://api.telegram.org/bot{s.telegram_bot_token}/sendMessage"
-        httpx.post(
+        resp = httpx.post(
            url,
            json={
                "chat_id": s.telegram_chat_id,
                "text": text,
                "parse_mode": "HTML",
-                "disable_notification": False,
+                "disable_notification": disable_notification,
            },
            timeout=5,
        )
+        data = resp.json()
+        if data.get("ok"):
+            return data["result"]["message_id"]
    except Exception:
        pass  # Never crash orchestrator due to notification failure
+    return None
+
+
+# edit_telegram outcome codes -> let update_task_tracker decide what to do:
+#   "ok"           edit applied -> nothing else to do
+#   "not_modified" Telegram says text is identical (400 "message is not
+#                  modified" / "exactly the same") -> success, NO new message
+#   "gone"         original message can't be edited (deleted / too old /
+#                  invalid id) -> caller must fall back to a NEW message
+#   "failed"       transient failure (network / timeout / 5xx / unknown 400)
+#                  -> caller must NOT send a new message (avoid duplicates)
+EDIT_OK = "ok"
+EDIT_NOT_MODIFIED = "not_modified"
+EDIT_GONE = "gone"
+EDIT_FAILED = "failed"
+
+# Telegram error descriptions that mean the message is permanently un-editable
+# (it is gone / orphaned) -> fall back to a fresh message.
+_GONE_MARKERS = (
+    "message to edit not found",
+    "message can't be edited",
+    "message_id_invalid",
+)
+# Telegram "nothing changed" -> treat as success, never a duplicate.
+_NOT_MODIFIED_MARKERS = (
+    "message is not modified",
+    "exactly the same",
+)
+
+
+def edit_telegram(message_id: int, text: str) -> str:
+    """Edit an existing Telegram message. Never raises.
+
+    Returns a distinguishable outcome (see EDIT_* constants) so the caller can
+    tell apart "all good" / "nothing changed" / "message gone" / "transient
+    failure" and only fall back to a NEW message when the original is truly gone.
+    """
+    s = _get_settings()
+    if not s.telegram_bot_token or not s.telegram_chat_id:
+        return EDIT_FAILED
+    try:
+        url = f"https://api.telegram.org/bot{s.telegram_bot_token}/editMessageText"
+        resp = httpx.post(
+            url,
+            json={
+                "chat_id": s.telegram_chat_id,
+                "message_id": message_id,
+                "text": text,
+                "parse_mode": "HTML",
+            },
+            timeout=5,
+        )
+        data = resp.json()
+        if data.get("ok"):
+            return EDIT_OK
+        # ok:false -> inspect the description to classify the 400.
+        desc = str(data.get("description") or "").lower()
+        if any(m in desc for m in _NOT_MODIFIED_MARKERS):
+            # Text is identical between transitions (e.g. repeat review cycle
+            # renders the same line). Nothing to do, NOT a duplicate.
+            logger.debug(
+                f"edit_telegram(mid={message_id}): not modified, skipping"
+            )
+            return EDIT_NOT_MODIFIED
+        if any(m in desc for m in _GONE_MARKERS):
+            logger.warning(
+                f"edit_telegram(mid={message_id}): message gone ({desc!r}), "
+                f"will fall back to a new message"
+            )
+            return EDIT_GONE
+        # Unknown 400 / other non-ok -> transient/unknown, do NOT duplicate.
+        logger.warning(
+            f"edit_telegram(mid={message_id}): edit failed ({desc!r})"
+        )
+        return EDIT_FAILED
+    except Exception as e:
+        # Network / timeout / 5xx -> transient, do NOT duplicate.
+        logger.warning(f"edit_telegram(mid={message_id}): transient error: {e}")
+        return EDIT_FAILED


 def _get_work_item_id(task_id: int) -> str:
@@ -50,26 +158,355 @@ def _get_work_item_id(task_id: int) -> str:
        return f"task-{task_id}"


+# --------------------------------------------------------------------------- #
+# Live task tracker
+# --------------------------------------------------------------------------- #
+
+# Pipeline stages shown in the tracker, in order, with their display label and
+# the agent whose agent_runs rows describe that stage's work. "Ревью БРД" is NOT
+# an agent stage — it is the human approve gate rendered between Analysis and
+# Architecture from the task's brd_review_* timestamps.
+_TRACKER_STAGES = [
+    ("analysis", "Analysis", "analyst"),
+    ("architecture", "Architecture", "architect"),
+    ("development", "Development", "developer"),
+    ("review", "Review", "reviewer"),
+    ("testing", "Testing", "tester"),
+    ("deploy", "Deploy", "deployer"),
+]
+
+_BRD_LABEL = "\u0420\u0435\u0432\u044c\u044e \u0411\u0420\u0414"  # "Ревью БРД" (BRD review gate label)
+
+# Map a pipeline stage -> the agent that is RUNNING while the task sits in it.
+# (development is entered after architecture finishes, etc.) Used to render the
+# "🔄 <Stage> … идёт" ("in progress") line for the currently-active stage.
+_STAGE_ACTIVE_AGENT = {
+    "analysis": "analyst",
+    "architecture": "architect",
+    "development": "developer",
+    "review": "reviewer",
+    "testing": "tester",
+    "deploy": "deployer",
+}
+
+
+def _fmt_minutes(seconds) -> str:
+    """Render a duration in whole minutes: <=0s -> '0м', 1..59s -> '<1м', else '<n>м'."""
+    try:
+        seconds = int(seconds or 0)
+    except (TypeError, ValueError):
+        seconds = 0
+    if seconds <= 0:
+        return "0м"
+    if seconds < 60:
+        return "<1м"
+    return f"{seconds // 60}\u043c"
+
+
+def _parse_sql_ts(ts):
+    """Parse a SQLite 'YYYY-MM-DD HH:MM:SS' UTC timestamp -> aware datetime/None."""
+    if not ts:
+        return None
+    from datetime import datetime, timezone
+    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S"):
+        try:
+            return datetime.strptime(str(ts)[:19], fmt).replace(tzinfo=timezone.utc)
+        except (ValueError, TypeError):
+            continue
+    return None
+
+
+def _duration_seconds(started, finished):
+    """Seconds between two SQL timestamps; None if either is missing/unparseable."""
+    a = _parse_sql_ts(started)
+    b = _parse_sql_ts(finished)
+    if a is None or b is None:
+        return None
+    return max(int((b - a).total_seconds()), 0)
+
+
+def render_task_tracker(task_id: int) -> str:
+    """Build the full live-tracker text for a task from the DB (stateless render).
+
+    Pulls the task header (work_item_id, title, stage), every agent_runs row, and
+    the BRD-review timestamps, then renders:
+      - one '✅ <Stage> <dur> · <in>↓/<out>↑ · <cost> · <model>' line per finished
+        stage (latest run per stage),
+      - the '⏸️ Ревью БРД <dur> · твоё время[ ⏳]' line between Analysis/Architecture,
+      - a '🔄 <Stage> … идёт' line for the active (in-progress) stage,
+      - the '💰 <in>↓ / <out>↑ · <cost>' totals,
+      - on done: '⏱️ Всего .. · агенты .. · твоё ..' and a '🔗 PR / 📦' line.
+
+    Never raises (returns a minimal fallback string on error).
+    """
+    from .db import get_db
+    from .usage import fmt_tokens, fmt_cost, _input_total, short_model_name
+
+    try:
+        conn = get_db()
+        task = conn.execute(
+            "SELECT id, work_item_id, title, stage, created_at, updated_at, "
+            "brd_review_started_at, brd_review_ended_at "
+            "FROM tasks WHERE id=?",
+            (task_id,),
+        ).fetchone()
+        if not task:
+            conn.close()
+            return f"task-{task_id}"
+        runs = conn.execute(
+            "SELECT agent, started_at, finished_at, exit_code, input_tokens, "
+            "output_tokens, cache_read_tokens, cache_creation_tokens, cost_usd, model "
+            "FROM agent_runs WHERE task_id=? ORDER BY id ASC",
+            (task_id,),
+        ).fetchall()
+        conn.close()
+    except Exception as e:
+        logger.warning(f"render_task_tracker({task_id}) DB error: {e}")
+        return f"task-{task_id}"
+
+    work_item_id = task["work_item_id"] or f"task-{task_id}"
+    title = task["title"] or work_item_id
+    stage = task["stage"] or "created"
+    done = stage == "done"
+
+    # Latest completed run per agent (a stage may have multiple runs on retry;
+    # we show the most recent FINISHED run with exit_code 0 or NULL on its line).
+    last_done = {}
+    agent_runs_by_agent = {}
+    for r in runs:
+        agent_runs_by_agent.setdefault(r["agent"], []).append(r)
+        if r["finished_at"] and (r["exit_code"] == 0 or r["exit_code"] is None):
+            last_done[r["agent"]] = r
+
+    # Totals across ALL runs (every input/output token + cost counts).
+    total_in = 0
+    total_out = 0
+    total_cost = 0.0
+    agent_seconds = 0
+    for r in runs:
+        usage = {
+            "input_tokens": r["input_tokens"],
+            "cache_read_tokens": r["cache_read_tokens"],
+            "cache_creation_tokens": r["cache_creation_tokens"],
+        }
+        total_in += _input_total(usage)
+        total_out += int(r["output_tokens"] or 0)
+        total_cost += float(r["cost_usd"] or 0.0)
+        d = _duration_seconds(r["started_at"], r["finished_at"])
+        if d is not None:
+            agent_seconds += d
+
+    esc_title = html.escape(title)
+    header = (
+        f"\U0001f389 {html.escape(work_item_id)} \u00b7 {esc_title} \u2014 \u0413\u041e\u0422\u041e\u0412\u041e"
+        if done
+        else f"\U0001f6e0\ufe0f {html.escape(work_item_id)} \u00b7 {esc_title}"
+    )
+    bar = "\u2501" * 22
+    lines = [header, bar]
+
+    def _stage_line(label, run):
+        usage = {
+            "input_tokens": run["input_tokens"],
+            "cache_read_tokens": run["cache_read_tokens"],
+            "cache_creation_tokens": run["cache_creation_tokens"],
+        }
+        in_tok = fmt_tokens(_input_total(usage))
+        out_tok = fmt_tokens(run["output_tokens"])
+        cost = fmt_cost(run["cost_usd"])
+        dur = _fmt_minutes(_duration_seconds(run["started_at"], run["finished_at"]))
+        model = short_model_name(run["model"])
+        model_suffix = f" \u00b7 {model}" if model else ""
+        return (
+            f"\u2705 {label:<13} {dur} \u00b7 "
+            f"{in_tok}\u2193/{out_tok}\u2191 \u00b7 {cost}{model_suffix}"
+        )
+
+    # BRD review line: between Analysis and Architecture, only once
+    # brd_review_started_at is stamped (the gate is live). Time = human review delta.
+    brd_started = task["brd_review_started_at"]
+    brd_ended = task["brd_review_ended_at"]
+    review_seconds = _duration_seconds(brd_started, brd_ended)
+
+    for stage_key, label, agent in _TRACKER_STAGES:
+        run = last_done.get(agent)
+        # The stage is "in progress" only when it is the task's current stage AND
+        # there is an unfinished run for its agent (the agent is actually still
+        # working). A finished run with no in-flight run -> show the \u2705 result,
+        # even if the task still sits in that stage (just-finished snapshot).
+        agent_runs = agent_runs_by_agent.get(agent, [])
+        has_inflight = any(ar["finished_at"] is None for ar in agent_runs)
+        is_active_stage = (
+            _STAGE_ACTIVE_AGENT.get(stage) == agent
+            and stage == stage_key
+            and (has_inflight or run is None)
+        )
+        if is_active_stage:
+            # Live "\U0001f504 ... \u0438\u0434\u0451\u0442" line. Count how many times THIS stage's
+            # agent has run for this task; a 2nd+ run means we're re-doing the
+            # stage (e.g. review->development->review), so show "\u043f\u043e\u043f\u044b\u0442\u043a\u0430 N"
+            # to make the text change between cycles and to honestly show Slava
+            # the stage is being re-worked.
+            attempt = len(agent_runs)
+            if attempt >= 2:
+                lines.append(
+                    f"\U0001f504 {label} \u00b7 \u043f\u043e\u043f\u044b\u0442\u043a\u0430 {attempt} "
+                    f"\u2026 \u0438\u0434\u0451\u0442"
+                )
+            else:
+                lines.append(
+                    f"\U0001f504 {label:<13} \u2026   \u00b7 \u0438\u0434\u0451\u0442"
+                )
+        elif run is not None:
+            lines.append(_stage_line(label, run))
+        # else: not started yet -> not shown.
+
+        # Insert the BRD review line right after Analysis.
+        if stage_key == "analysis" and brd_started:
+            brd_label = f"{_BRD_LABEL:<13}"
+            if review_seconds is not None:
+                dur = _fmt_minutes(review_seconds)
+                lines.append(
+                    f"\u23f8\ufe0f {brd_label} {dur} \u00b7 \u0442\u0432\u043e\u0451 \u0432\u0440\u0435\u043c\u044f"
+                )
+            else:
+                # Still waiting on the human (ended not stamped yet).
+                from datetime import datetime, timezone
+                start_dt = _parse_sql_ts(brd_started)
+                waited = None
+                if start_dt is not None:
+                    waited = int(
+                        (datetime.now(timezone.utc) - start_dt).total_seconds()
+                    )
+                dur = _fmt_minutes(waited) if waited is not None else "\u2026"
+                lines.append(
+                    f"\u23f8\ufe0f {brd_label} {dur} \u00b7 \u0442\u0432\u043e\u0451 \u0432\u0440\u0435\u043c\u044f \u23f3"
+                )
+
+    lines.append(bar)
+    lines.append(
+        f"\U0001f4b0 {fmt_tokens(total_in)}\u2193 / {fmt_tokens(total_out)}\u2191 \u00b7 "
+        f"{fmt_cost(total_cost)}"
+    )
+
+    if done:
+        wall = _duration_seconds(task["created_at"], task["updated_at"])
+        wall_str = _fmt_minutes(wall) if wall is not None else "?"
+        review_str = _fmt_minutes(review_seconds) if review_seconds else "0м"
+        lines.append(
+            f"\u23f1\ufe0f \u0412\u0441\u0435\u0433\u043e {wall_str} \u00b7 "
+            f"\u0430\u0433\u0435\u043d\u0442\u044b {_fmt_minutes(agent_seconds)} \u00b7 "
+            f"\u0442\u0432\u043e\u0451 {review_str}"
+        )
+        link = _done_link(task_id, task["work_item_id"])
+        if link:
+            lines.append(link)
+
+    return "\n".join(lines)
+
+
+def _done_link(task_id: int, work_item_id) -> str | None:
+    """Build the final '🔗 PR #n · 📦 deployed' line. Never raises -> None."""
+    try:
+        from .config import settings
+        from .db import get_db
+        conn = get_db()
+        row = conn.execute(
+            "SELECT repo, branch FROM tasks WHERE id=?", (task_id,)
+        ).fetchone()
+        conn.close()
+        if not row:
+            return None
+        repo, branch = row["repo"], row["branch"]
+        pr_part = None
+        try:
+            owner = settings.gitea_owner
+            headers = {"Authorization": f"token {settings.gitea_token}"}
+            resp = httpx.get(
+                f"{settings.gitea_url}/api/v1/repos/{owner}/{repo}/pulls",
+                params={"state": "all", "head": branch},
+                headers=headers, timeout=5,
+            )
+            if resp.status_code == 200:
+                prs = resp.json()
+                if prs:
+                    pr_part = f"\U0001f517 PR #{prs[0].get('number')}"
+        except Exception:
+            pr_part = None
+        parts = []
+        if pr_part:
+            parts.append(pr_part)
+        parts.append("\U0001f4e6 deployed")
+        return " \u00b7 ".join(parts)
+    except Exception:
+        return None
+
+
+def update_task_tracker(task_id: int):
+    """Render + push the live tracker for a task. Never raises.
+
+    First call (no stored tracker_message_id): sendMessage (silent) and store the
+    returned message_id. Subsequent calls: editMessageText the stored message.
+    A NEW message is sent ONLY when the original is truly gone (deleted / too old
+    / invalid id). On "not modified" (text unchanged) or transient failures
+    (network / timeout / 5xx / unknown 400) we do NOT send a new message — that
+    is exactly what produced duplicate trackers and orphaned (lagging) messages.
+    The tracker is always sent with disable_notification so it never pings —
+    only the dedicated alert helpers ping.
+    """
+    try:
+        from .db import get_tracker_message_id, set_tracker_message_id
+        text = render_task_tracker(task_id)
+        mid = get_tracker_message_id(task_id)
+        if mid is not None:
+            result = edit_telegram(mid, text)
+            if result in (EDIT_OK, EDIT_NOT_MODIFIED):
+                # Edited in place (or nothing to change) -> done, no duplicate.
+                return
+            if result == EDIT_FAILED:
+                # Transient -> don't duplicate; tracker redraws next transition.
+                logger.debug(
+                    f"update_task_tracker({task_id}): edit failed transiently, "
+                    f"keeping message {mid}"
+                )
+                return
+            # result == EDIT_GONE -> the stored message is gone; fall through
+            # to send a fresh one and re-point tracker_message_id at it.
+        new_mid = send_telegram(text, disable_notification=True)
+        if new_mid is not None:
+            set_tracker_message_id(task_id, new_mid)
+    except Exception as e:
+        logger.warning(f"update_task_tracker({task_id}) failed: {e}")
+
+
+# --------------------------------------------------------------------------- #
+# Stage / agent lifecycle notifications  (now tracker-only, no separate message)
+# --------------------------------------------------------------------------- #
+
 def notify_stage_change(task_id: int, old_stage: str, new_stage: str, agent: str = None):
-    """Log and notify stage transition."""
+    """Log a stage transition and refresh the live tracker (no separate message)."""
    work_item_id = _get_work_item_id(task_id)
    msg = f"\U0001f504 {work_item_id}: {old_stage} \u2192 {new_stage}"
    if agent:
        msg += f" (\u0437\u0430\u043f\u0443\u0449\u0435\u043d {agent})"
    logger.info(msg)
-    send_telegram(msg)
+    update_task_tracker(task_id)


 def notify_agent_started(run_id: int, agent: str, task_id: int):
-    """Notify agent launch."""
+    """Log an agent launch and refresh the tracker (no separate message)."""
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\U0001f680 {work_item_id}: {agent} \u0437\u0430\u043f\u0443\u0449\u0435\u043d (run_id={run_id})"
-    logger.info(msg)
-    send_telegram(msg)
+    logger.info(f"\U0001f680 {work_item_id}: {agent} \u0437\u0430\u043f\u0443\u0449\u0435\u043d (run_id={run_id})")
+    if task_id:
+        update_task_tracker(task_id)


 def notify_agent_finished(run_id: int, agent: str, exit_code: int, task_id: int = None, duration_s: int = None):
-    """Notify agent completion."""
+    """Log agent completion and refresh the tracker (no separate message).
+
+    The agent-FAILED alert (exit_code != 0) is still sent separately by the
+    launcher via send_telegram; this helper itself only logs + refreshes.
+    """
    work_item_id = _get_work_item_id(task_id) if task_id else "?"
    if exit_code == 0:
        dur = f" ({duration_s // 60} \u043c\u0438\u043d)" if duration_s else ""
@@ -79,47 +516,66 @@ def notify_agent_finished(run_id: int, agent: str, exit_code: int, task_id: int
    else:
        msg = f"\u274c {work_item_id}: {agent} \u0443\u043f\u0430\u043b (exit_code={exit_code})"
    logger.info(msg)
-    send_telegram(msg)
+    if task_id:
+        update_task_tracker(task_id)


 def notify_qg_result(task_id: int, check: str, passed: bool, reason: str = None):
-    """Notify QG check result."""
+    """Log a QG check result (NO separate Telegram message: QG-pending is noise).
+
+    Kept for callers; QG outcomes are log-only now and reflected by the tracker
+    through the resulting stage transition.
+    """
    work_item_id = _get_work_item_id(task_id)
    if passed:
-        msg = f"\u2705 {work_item_id}: QG {check} \u2014 passed"
+        logger.info(f"\u2705 {work_item_id}: QG {check} \u2014 passed")
    else:
-        msg = f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}"
-    logger.info(msg)
-    send_telegram(msg)
+        logger.warning(f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}")


 def notify_qg_failure(task_id: int, stage: str, check: str, reason: str):
-    """Log and notify QG check failure."""
+    """Log a QG check failure (log-only).
+
+    QG-pending / QG-failed are NOT pinged as separate messages anymore (they are
+    not actionable for Slava). Real rollbacks/deploy-fails are alerted by their
+    own dedicated send_telegram calls in the engine/launcher.
+    """
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}"
-    logger.warning(msg)
-    send_telegram(msg)
+    logger.warning(f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}")


 def notify_approve_requested(task_id: int):
-    """Notify that analyst requests :approved:."""
+    """ALERT (separate, notifying): BRD/TZ/AC ready -> flip Plane to Approved.
+
+    Also starts the BRD-review clock and refreshes the tracker so the
+    '⏸️ Ревью БРД · твоё время ⏳' line appears.
+    """
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\U0001f4cb {work_item_id}: BRD/\u0422\u0417/AC \u0433\u043e\u0442\u043e\u0432\u044b. \u0416\u0434\u0443 :approved: \u0432 Plane"
+    try:
+        from .db import mark_brd_review_started
+        mark_brd_review_started(task_id)
+    except Exception as e:
+        logger.warning(f"notify_approve_requested: brd clock start failed: {e}")
+    msg = (
+        f"\U0001f4cb {work_item_id}: BRD/\u0422\u0417/AC \u0433\u043e\u0442\u043e\u0432\u044b. "
+        f"\u041f\u0435\u0440\u0435\u0432\u0435\u0434\u0438\u0442\u0435 \u0437\u0430\u0434\u0430\u0447\u0443 \u0432 \u0441\u0442\u0430\u0442\u0443\u0441 Approved "
+        f"\u0432 Plane \u0434\u043b\u044f \u043f\u0440\u043e\u0434\u043e\u043b\u0436\u0435\u043d\u0438\u044f."
+    )
    logger.info(msg)
-    send_telegram(msg)
+    update_task_tracker(task_id)
+    send_telegram(msg)  # separate, notifying


 def notify_done(task_id: int):
-    """Notify task completion."""
+    """Task completion: refresh the tracker to its final ГОТОВО form (no separate ping)."""
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\U0001f389 {work_item_id}: \u0437\u0430\u0434\u0430\u0447\u0430 \u0437\u0430\u0432\u0435\u0440\u0448\u0435\u043d\u0430!"
-    logger.info(msg)
-    send_telegram(msg)
+    logger.info(f"\U0001f389 {work_item_id}: \u0437\u0430\u0434\u0430\u0447\u0430 \u0437\u0430\u0432\u0435\u0440\u0448\u0435\u043d\u0430!")
+    update_task_tracker(task_id)


 def notify_error(task_id: int, error: str):
-    """Log and notify error for a task."""
+    """ALERT (separate, notifying): task error."""
    work_item_id = _get_work_item_id(task_id) if task_id else "system"
    msg = f"\U0001f534 {work_item_id}: ERROR \u2014 {error}"
    logger.error(msg)
-    send_telegram(msg)
+    send_telegram(msg)  # separate, notifying
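Reviewer note (not part of the patch): the edit-or-send branching in `update_task_tracker` above can be read as a small pure decision function. This is a minimal sketch under stated assumptions: the patch only shows the names `EDIT_OK`, `EDIT_NOT_MODIFIED`, `EDIT_FAILED` and `EDIT_GONE`, so the string values below are placeholders, and `next_action` is a hypothetical helper that does not exist in the repository.

```python
# Placeholder values: the patch references these names but does not show their values.
EDIT_OK = "ok"
EDIT_NOT_MODIFIED = "not_modified"
EDIT_FAILED = "failed"
EDIT_GONE = "gone"


def next_action(stored_message_id, edit_result=None):
    """Return "send" when a fresh tracker message is needed, else "keep".

    Hypothetical helper mirroring the branching in update_task_tracker:
    a new message is sent only when nothing is stored yet, or when the
    stored message is truly gone.
    """
    if stored_message_id is None:
        return "send"  # first call: there is nothing to edit yet
    if edit_result in (EDIT_OK, EDIT_NOT_MODIFIED):
        return "keep"  # edited in place, or the text did not change
    if edit_result == EDIT_FAILED:
        return "keep"  # transient failure: redraw on the next transition
    return "send"  # EDIT_GONE: re-point the tracker at a new message
```

Keeping the decision separate from the Telegram calls would make the "no duplicate on not-modified or transient failure" rule testable without network access.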
--- a/src/plane_sync.py
+++ b/src/plane_sync.py
@@ -6,9 +6,53 @@ from .config import settings

 logger = logging.getLogger("orchestrator.plane_sync")

+# L-3: emoji literals used in Plane comment bodies, named for readability.
+# Message text stays byte-for-byte identical to the previous output.
+EMOJI_STAGE = "\U0001F504"      # stage transition
+EMOJI_QG_FAIL = "\u26A0\uFE0F"   # quality-gate failure
+EMOJI_DONE = "\u2705"           # task completed
+
 PLANE_BASE = f"{settings.plane_api_url}/api/v1"
 PLANE_HEADERS = {"X-API-Key": settings.plane_api_token}
 WORKSPACE = settings.plane_workspace_slug
+
+# feat(plane): per-agent comment authorship.
+# Map an agent role -> its dedicated Plane bot token (read from config / env).
+# When the token is present, add_comment() POSTs under that bot so Plane shows
+# the real author. Empty/unknown role -> fallback to the shared orchestrator
+# token (PLANE_HEADERS), so commenting stays autonomous.
+PLANE_BOT_TOKENS = {
+    "analyst": settings.plane_bot_analyst,
+    "architect": settings.plane_bot_architect,
+    "developer": settings.plane_bot_developer,
+    "reviewer": settings.plane_bot_reviewer,
+    "tester": settings.plane_bot_tester,
+    "deployer": settings.plane_bot_deployer,
+    "stream": settings.plane_bot_stream,
+}
+
+# Map a pipeline stage -> the agent role that owns work in that stage. Used to
+# pick an author for rollback/stage notifications targeting a specific stage.
+STAGE_AUTHORS = {
+    "analysis": "analyst",
+    "architecture": "architect",
+    "development": "developer",
+    "review": "reviewer",
+    "testing": "tester",
+    "deploy": "deployer",
+}
+
+
+def _headers_for(author: str | None) -> dict:
+    """Return X-API-Key headers for the given agent role.
+
+    Falls back to the shared orchestrator token (PLANE_HEADERS /
+    settings.plane_api_token) when the role is None, unknown, or its bot token
+    is not configured. This keeps comment posting autonomous: a comment is
+    always written, just attributed to the orchestrator if no bot is set.
+    """
+    tok = PLANE_BOT_TOKENS.get(author) if author else None
+    return {"X-API-Key": tok} if tok else PLANE_HEADERS
 PROJECT_ID = settings.plane_project_id or "7a79f0a9-5278-49cd-9007-9a338f238f9c"


@@ -40,7 +84,12 @@ def _resolve_project_id(work_item_id: str = None, project_id: str = None) -> str
            logger.debug(f"_resolve_project_id fallback for {work_item_id}: {e}")
    return PROJECT_ID

-# Plane state IDs
+# Plane state IDs.
+# TODO(ORCH-10): these UUIDs are PER-PROJECT. The 6 stage-visibility / verdict
+# statuses below were created only in the enduro project (7a79f0a9-...). One
+# project is in prod today, so a single global dict is acceptable. When more
+# projects are onboarded these must be resolved per project (see ORCH-10 in
+# BACKLOG.md / the ORCH-6 project registry) — do NOT hardcode globally then.
 PLANE_STATES = {
    "backlog": "113b24f6-cce8-4be9-9a22-a359b9cf0122",
    "todo": "2c7d3df3-9eb9-419b-92b7-d7d560bcdd10",
@@ -50,21 +99,140 @@ PLANE_STATES = {
    "blocked": "6c4543f9-ac47-4ef7-ae0f-070020dc9920",
    "done": "381a2833-3c4e-4be5-bd0f-be84cb946ad8",
    "cancelled": "b1cae7f9-961d-4889-a179-f3acea697d17",
+    # Feature 3 (stage visibility) — per-stage statuses on the board.
+    "architecture": "3020bbb7-6122-4663-930c-0315ba8dfa3d",
+    "development": "9920609b-f140-4e46-ab95-89acda8412c8",
+    "review": "ba0d802c-5218-41d4-ab43-978b0ea123ed",
+    "testing": "7855d807-b1bf-42ef-8dae-6cde0df92d02",
+    # Feature 2 (verdict statuses) — Approved / Rejected.
+    "approved": "a519a341-dada-4a91-8910-7604f82b79c5",
+    "rejected": "ba958f3c-5db5-461d-8f82-89425e413b97",
 }

-# Map orchestrator stages to Plane states
+# Feature 3: map an orchestrator stage -> the Plane status to show on the board
+# when the pipeline ENTERS that stage. analysis stays driven by the existing
+# in_progress/in_review/needs_input logic (no dedicated status). deploy keeps
+# in_progress until done. Needs Input / In Review / Blocked remain higher
+# priority and are set explicitly elsewhere — do NOT override them from here.
+STAGE_VISIBILITY_STATE = {
+    "architecture": "architecture",
+    "development": "development",
+    "review": "review",
+    "testing": "testing",
+}
+
+# Map orchestrator stages to Plane states (used by update_issue_state /
+# notify_stage_change). Feature 3: architecture/development/review/testing now
+# point at their dedicated board statuses so the task physically moves across
+# columns. analysis -> in_progress, deploy -> in_progress, done -> done.
 STAGE_TO_STATE = {
    "created": PLANE_STATES["todo"],
    "analysis": PLANE_STATES["in_progress"],
-    "architecture": PLANE_STATES["in_progress"],
-    "development": PLANE_STATES["in_progress"],
-    "review": PLANE_STATES["in_progress"],
-    "testing": PLANE_STATES["in_progress"],
+    "architecture": PLANE_STATES["architecture"],
+    "development": PLANE_STATES["development"],
+    "review": PLANE_STATES["review"],
+    "testing": PLANE_STATES["testing"],
    "deploy": PLANE_STATES["in_progress"],
    "done": PLANE_STATES["done"],
 }


+def fetch_issue_sequence_id(issue_id: str, project_id: str) -> int | None:
+    """M-6: GET the Plane issue by UUID and return its sequence_id (the
+    authoritative per-project number), or None if unavailable.
+
+    Returns None on network error, non-2xx, or a missing field - never raises,
+    so the webhook handler can fall back to DB increment and stay autonomous.
+    """
+    url = f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{project_id}/issues/{issue_id}/"
+    try:
+        resp = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
+        resp.raise_for_status()
+        seq = resp.json().get("sequence_id")
+        return int(seq) if seq is not None else None
+    except Exception as e:
+        logger.warning(f"fetch_issue_sequence_id failed for {issue_id}: {e}")
+        return None
+
+
+import re as _re
+
+
+def _strip_html(html: str) -> str:
+    """Crude HTML -> text: drop tags and collapse whitespace. Good enough to
+    feed QG-0's length check when Plane only gives us description_html."""
+    if not html:
+        return ""
+    text = _re.sub(r"<[^>]+>", " ", html)
+    return _re.sub(r"\s+", " ", text).strip()
+
+
+def fetch_issue_description(issue_id: str, project_id: str) -> str:
+    """BUG 1: GET the Plane issue by UUID and return its description text.
+
+    Plane's ``issue.updated`` webhook (e.g. a status change) only carries the
+    CHANGED fields, so ``description``/``description_stripped`` are usually
+    absent there. start_pipeline calls this to pull the full description from the
+    issue detail endpoint so QG-0 does not blow up on an empty payload field.
+
+    Reuses the exact GET issue detail endpoint / shared token already used by
+    ``fetch_issue_sequence_id`` (same URL, same PLANE_HEADERS). Prefers
+    ``description_stripped``; falls back to stripping ``description_html``.
+
+    Returns "" on network error, non-2xx, or a missing field - never raises, so
+    a Plane outage degrades to the honest "empty description" QG-0 path instead
+    of crashing the webhook.
+    """
+    url = f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{project_id}/issues/{issue_id}/"
+    try:
+        resp = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
+        resp.raise_for_status()
+        body = resp.json()
+        desc = body.get("description_stripped")
+        if desc and desc.strip():
+            return desc
+        return _strip_html(body.get("description_html") or "")
+    except Exception as e:
+        logger.warning(f"fetch_issue_description failed for {issue_id}: {e}")
+        return ""
+
+
+def fetch_issue_fields(issue_id: str, project_id: str) -> tuple[str, str]:
+    """BUG B: GET the Plane issue by UUID ONCE and return (name, description).
+
+    Plane's ``issue.updated`` webhook (e.g. a status change) only carries the
+    CHANGED fields, so BOTH ``name`` and ``description`` are usually absent in
+    the payload. start_pipeline needs the real title (for the branch slug) and
+    the real description (for the analyst .task.md). To avoid issuing two
+    separate issue-detail GETs (one for name, one for description), this single
+    request returns both.
+
+    Reuses the exact GET issue detail endpoint / shared token already used by
+    ``fetch_issue_sequence_id`` / ``fetch_issue_description``. For the
+    description it applies the same logic as ``fetch_issue_description``
+    (prefer ``description_stripped``, fall back to stripping
+    ``description_html``).
+
+    Returns ("", "") on network error, non-2xx, or missing body - never raises,
+    so a Plane outage degrades gracefully (caller keeps its payload fallbacks).
+    """
+    url = f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{project_id}/issues/{issue_id}/"
+    try:
+        resp = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
+        resp.raise_for_status()
+        body = resp.json()
+        name = (body.get("name") or "").strip()
+        desc = body.get("description_stripped")
+        if desc and desc.strip():
+            description = desc
+        else:
+            description = _strip_html(body.get("description_html") or "")
+        return name, description
+    except Exception as e:
+        logger.warning(f"fetch_issue_fields failed for {issue_id}: {e}")
+        return "", ""
+
+
 def find_issue_id(work_item_id: str, project_id: str = None) -> str | None:
    """Find Plane issue UUID by work_item_id (e.g. 'ET-002')."""
    project_id = _resolve_project_id(work_item_id, project_id)
@@ -89,25 +257,26 @@ def find_issue_id(work_item_id: str, project_id: str = None) -> str | None:
        resp.raise_for_status()
        data = resp.json()
        results = data.get("results", data if isinstance(data, list) else [])
+        # M-6: match by sequence_id directly (the authoritative per-project
+        # number), parsed from the work_item_id suffix - no hardcoded prefix.
+        try:
+            target_num = int(work_item_id.rsplit("-", 1)[1])
+        except (IndexError, ValueError):
+            target_num = None
        for issue in results:
-            seq = issue.get("sequence_id")
-            identifier = f"ET-{seq:03d}" if seq else ""
-            if identifier == work_item_id or work_item_id in issue.get("name", ""):
+            if target_num is not None and issue.get("sequence_id") == target_num:
                return issue["id"]
-        # Fallback: get all issues and match by sequence_id number
-        if work_item_id.startswith("ET-"):
-            try:
-                target_num = int(work_item_id.split("-")[1])
-            except (IndexError, ValueError):
-                target_num = None
-            if target_num:
-                resp2 = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
-                resp2.raise_for_status()
-                data2 = resp2.json()
-                results2 = data2.get("results", data2 if isinstance(data2, list) else [])
-                for issue in results2:
-                    if issue.get("sequence_id") == target_num:
-                        return issue["id"]
+            if work_item_id in issue.get("name", ""):
+                return issue["id"]
+        # Fallback: get all issues and match by sequence_id number (any prefix)
+        if target_num is not None:
+            resp2 = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
+            resp2.raise_for_status()
+            data2 = resp2.json()
+            results2 = data2.get("results", data2 if isinstance(data2, list) else [])
+            for issue in results2:
+                if issue.get("sequence_id") == target_num:
+                    return issue["id"]
    except Exception as e:
        logger.error(f"Failed to find issue for {work_item_id}: {e}")
    return None
@@ -134,8 +303,14 @@ def update_issue_state(work_item_id: str, stage: str, project_id: str = None):
        logger.error(f"Failed to update Plane state for {work_item_id}: {e}")


-def add_comment(work_item_id: str, text: str, project_id: str = None):
-    """Add a comment to Plane issue."""
+def add_comment(work_item_id: str, text: str, project_id: str = None, author: str = None):
+    """Add a comment to a Plane issue.
+
+    feat(plane): when ``author`` (an agent role) maps to a configured bot
+    token, the comment is POSTed under that bot so Plane shows the real author.
+    Otherwise it falls back to the shared orchestrator token (see
+    ``_headers_for``). GET/PATCH calls elsewhere keep using PLANE_HEADERS.
+    """
    project_id = _resolve_project_id(work_item_id, project_id)
    issue_id = find_issue_id(work_item_id, project_id)
    if not issue_id:
@@ -145,9 +320,9 @@ def add_comment(work_item_id: str, text: str, project_id: str = None):
    url = f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{project_id}/issues/{issue_id}/comments/"
    html = f"<p>{text}</p>"
    try:
-        resp = httpx.post(url, headers=PLANE_HEADERS, json={"comment_html": html}, timeout=10)
+        resp = httpx.post(url, headers=_headers_for(author), json={"comment_html": html}, timeout=10)
        resp.raise_for_status()
-        logger.info(f"Plane: comment added to {work_item_id}")
+        logger.info(f"Plane: comment added to {work_item_id} (author={author or 'orchestrator'})")
    except Exception as e:
        logger.error(f"Failed to add comment to {work_item_id}: {e}")

@@ -168,11 +343,37 @@ def set_issue_blocked(work_item_id: str, project_id: str = None):
    _set_issue_state_direct(work_item_id, PLANE_STATES["blocked"], project_id)


+def set_issue_done(work_item_id: str, project_id: str = None):
+    """Observability fix: force the issue into the TERMINAL Done state.
+
+    Used by the deploy->done success path so a completed task always reaches the
+    terminal Plane state (it used to stick on In Progress because the merge
+    webhook bypassed the stage engine). Uses the existing PLANE_STATES['done']
+    UUID — the mapping itself is NOT changed.
+    """
+    _set_issue_state_direct(work_item_id, PLANE_STATES["done"], project_id)
+
+
 def set_issue_in_progress(work_item_id: str, project_id: str = None):
    """Set issue to 'In Progress' state — agent working."""
    _set_issue_state_direct(work_item_id, PLANE_STATES["in_progress"], project_id)


+def set_issue_stage_state(work_item_id: str, stage: str, project_id: str = None):
+    """Feature 3: move the issue to the board status for a pipeline stage.
+
+    Only the visible-stage statuses (architecture/development/review/testing)
+    are driven here — stages without a dedicated status (analysis/deploy) are a
+    no-op so the existing in_progress/in_review/needs_input logic stays in
+    charge. By design this does NOT touch Needs Input / In Review / Blocked,
+    which are higher priority and set explicitly by their own helpers.
+    """
+    state_key = STAGE_VISIBILITY_STATE.get(stage)
+    if not state_key:
+        return
+    _set_issue_state_direct(work_item_id, PLANE_STATES[state_key], project_id)
+
+
 def _set_issue_state_direct(work_item_id: str, state_id: str, project_id: str = None):
    """Set issue state directly by state_id."""
    project_id = _resolve_project_id(work_item_id, project_id)
@@ -194,7 +395,7 @@ def notify_stage_change(work_item_id: str, old_stage: str, new_stage: str, agent
    project_id = _resolve_project_id(work_item_id, project_id)
    update_issue_state(work_item_id, new_stage, project_id)

-    msg = f"🔄 Stage: {old_stage} → {new_stage}"
+    msg = f"{EMOJI_STAGE} Stage: {old_stage} → {new_stage}"
    if agent:
        msg += f" (launching {agent})"

@@ -227,16 +428,29 @@ def notify_stage_change(work_item_id: str, old_stage: str, new_stage: str, agent
    except Exception:
        pass

-    add_comment(work_item_id, msg, project_id)
+    # Stage transition is the orchestrator's own voice -> attribute to stream.
+    add_comment(work_item_id, msg, project_id, author="stream")


 def notify_qg_failure(work_item_id: str, stage: str, check: str, reason: str, project_id: str = None):
    """Notify Plane about QG failure."""
-    add_comment(work_item_id, f"⚠️ QG failed at {stage}: {check} — {reason}", project_id)
+    # QG failure belongs to the agent that owns the failing stage.
+    add_comment(
+        work_item_id,
+        f"{EMOJI_QG_FAIL} QG failed at {stage}: {check} — {reason}",
+        project_id,
+        author=STAGE_AUTHORS.get(stage, "stream"),
+    )


 def notify_done(work_item_id: str, project_id: str = None):
    """Mark issue as Done in Plane."""
    project_id = _resolve_project_id(work_item_id, project_id)
    update_issue_state(work_item_id, "done", project_id)
-    add_comment(work_item_id, "✅ Task completed! PR merged and deployed.", project_id)
+    # Deploy finished the task -> attribute the completion comment to Deployer.
+    add_comment(
+        work_item_id,
+        f"{EMOJI_DONE} Task completed! PR merged and deployed.",
+        project_id,
+        author="deployer",
+    )
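Reviewer note (not part of the patch): two small rules in the `plane_sync.py` changes above are easy to check in isolation. The sketch below restates the prefix-agnostic suffix parsing from `find_issue_id` and the token fallback from `_headers_for`. The function names `parse_sequence_number` and `headers_for`, and the explicit `bot_tokens` / `shared_headers` parameters, are assumptions made so the example runs without the project's settings.

```python
def parse_sequence_number(work_item_id):
    """Return the numeric suffix of an ID such as 'ET-013', or None if malformed."""
    try:
        # rsplit on the last dash, so any project prefix is accepted.
        return int(work_item_id.rsplit("-", 1)[1])
    except (IndexError, ValueError):
        return None


def headers_for(author, bot_tokens, shared_headers):
    """Pick the per-agent bot token when configured, else the shared headers."""
    tok = bot_tokens.get(author) if author else None
    # An empty or missing token falls back, so a comment is always posted.
    return {"X-API-Key": tok} if tok else shared_headers
```

For instance, `parse_sequence_number("ORCH-1")` yields `1`, and an ID without a dash yields `None`, so the caller can skip the sequence match instead of raising.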
--- a/src/preflight.py
+++ b/src/preflight.py
@@ -0,0 +1,106 @@
+"""ORCH-1 resilience: cheap preflight check (CLI / network available?).
+
+Goal: before the worker claims a job, confirm the claude CLI binary and runtime
+are reachable WITHOUT spending any tokens. We only do local/cheap checks:
+
+  1. os.path.exists(CLAUDE_BIN)          -- instant
+  2. `claude --version` (timeout 5s)     -- spawns CLI, does NOT call the API
+
+The result is cached for `preflight_cache_ttl` seconds so we do not re-run
+`claude --version` on every worker tick.
+
+🚫 We deliberately do NOT do a prompt ping (ping->pong) — that would burn the
+rate limit and add latency. Preflight is local-only.
+"""
+import os
+import time
+import logging
+import subprocess
+
+from .config import settings
+
+logger = logging.getLogger("orchestrator.preflight")
+
+_VERSION_TIMEOUT = 5
+
+
+class _PreflightCache:
+    def __init__(self):
+        self.ts: float = 0.0
+        self.ok: bool = False
+        self.reason: str = "not checked yet"
+
+
+_cache = _PreflightCache()
+
+
+def _claude_bin() -> str:
+    """Resolve the claude binary preflight should check.
+
+    Must match the binary the launcher actually spawns. The launcher hardcodes
+    AgentLauncher.CLAUDE_BIN for the real Popen, so we prefer that; we only fall
+    back to settings.claude_bin / a default if it is somehow unset. (Note: the
+    container's ORCH_CLAUDE_BIN may point elsewhere; preflight follows the path
+    that is genuinely executed, not the unused env override.)
+    """
+    try:
+        from .agents.launcher import AgentLauncher
+        launcher_bin = getattr(AgentLauncher, "CLAUDE_BIN", None)
+        if launcher_bin and os.path.exists(launcher_bin):
+            return launcher_bin
+        # Launcher path not present -> fall back to configured/default.
+        return launcher_bin or getattr(settings, "claude_bin", None) or "/opt/claude-code/bin/claude.exe"
+    except Exception:
+        return getattr(settings, "claude_bin", None) or "/opt/claude-code/bin/claude.exe"
+
+
+def _run_version(bin_path: str) -> tuple[bool, str]:
+    """`claude --version` — proves the CLI runs without touching the API."""
+    try:
+        r = subprocess.run(
+            [bin_path, "--version"],
+            capture_output=True,
+            text=True,
+            timeout=_VERSION_TIMEOUT,
+        )
+        if r.returncode == 0:
+            return True, (r.stdout or r.stderr or "").strip()[:120] or "ok"
+        return False, f"--version exit {r.returncode}: {(r.stderr or r.stdout).strip()[:120]}"
+    except subprocess.TimeoutExpired:
+        return False, f"--version timed out after {_VERSION_TIMEOUT}s"
+    except FileNotFoundError:
+        return False, "claude binary not found (FileNotFoundError)"
+    except Exception as e:  # pragma: no cover - defensive
+        return False, f"--version error: {e}"
+
+
+def _compute() -> tuple[bool, str]:
+    bin_path = _claude_bin()
+    if not os.path.exists(bin_path):
+        return False, f"CLAUDE_BIN not found: {bin_path}"
+    return _run_version(bin_path)
+
+
+def check(force: bool = False) -> tuple[bool, str]:
+    """Return (ok, reason). Cached for preflight_cache_ttl seconds.
+
+    force=True bypasses the cache (used by the breaker half-open probe / tests).
+    """
+    now = time.time()
+    ttl = settings.preflight_cache_ttl
+    if not force and _cache.ts > 0 and (now - _cache.ts) < ttl:
+        return _cache.ok, _cache.reason
+    ok, reason = _compute()
+    _cache.ts = now
+    _cache.ok = ok
+    _cache.reason = reason
+    if not ok:
+        logger.warning(f"Preflight FAIL: {reason}")
+    return ok, reason
+
+
+def reset_cache() -> None:
+    """Invalidate the cache (tests / forced recheck)."""
+    _cache.ts = 0.0
+    _cache.ok = False
+    _cache.reason = "reset"
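
Reviewer note: the TTL caching in `check()` above can be exercised in isolation with a sketch like the following. This is not the module's code; `CachedCheck`, `probe`, and the injectable `clock` are hypothetical names introduced here so the expiry logic can be tested without sleeping.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CachedCheck:
    """Caches the (ok, reason) result of a probe for `ttl` seconds."""

    probe: Callable[[], tuple[bool, str]]   # hypothetical stand-in for _compute()
    ttl: float
    clock: Callable[[], float] = time.time  # injectable clock (assumption, for tests)
    _ts: float = 0.0
    _ok: bool = False
    _reason: str = "not checked"

    def check(self, force: bool = False) -> tuple[bool, str]:
        now = self.clock()
        # Serve the cached verdict while it is still fresh, unless forced.
        if not force and self._ts > 0 and (now - self._ts) < self.ttl:
            return self._ok, self._reason
        self._ok, self._reason = self.probe()
        self._ts = now
        return self._ok, self._reason
```

With a fake clock, two calls inside the TTL window run the probe once, while a call after expiry or with `force=True` runs it again.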
--- a/src/qg/checks.py
+++ b/src/qg/checks.py
@@ -2,6 +2,7 @@

 import os
 import logging
+import subprocess
 import httpx
 from ..config import settings

@@ -249,9 +250,17 @@ def check_reviewer_verdict(repo: str, work_item_id: str, branch: str | None = No

 def check_tests_local(repo: str, branch: str) -> tuple[bool, str]:
    """
+    DEPRECATED: replaced by check_ci_green on the development stage (CI is now
+    configured). Kept for backward-compat; not wired to any stage.
+
    S-1 fix: run the project test suite locally and judge by exit code, instead of
    depending on Gitea CI (which is not configured -> always false).

+    BUG 5 fix: invoke pytest directly instead of make test. make is not installed
+    in the orchestrator container, so the previous ["make", "test"] call raised
+    FileNotFoundError. This reproduces the Makefile test target 1:1
+    (cd src/api && python -m pytest ../../tests/ -v).
+
    ORCH-2 / S-4: tests run inside the per-branch worktree (ensure_worktree), so this
    is safe for concurrent active tasks — no shared /repos checkout race.
    """
@@ -259,7 +268,8 @@ def check_tests_local(repo: str, branch: str) -> tuple[bool, str]:
    try:
        repo_path = ensure_worktree(repo, branch)
        r = subprocess.run(
-            ["make", "test"], cwd=repo_path,
+            ["python", "-m", "pytest", "../../tests/", "-v"],
+            cwd=os.path.join(repo_path, "src", "api"),
            capture_output=True, text=True, timeout=600,
        )
        if r.returncode == 0:
@@ -272,6 +282,100 @@ def check_tests_local(repo: str, branch: str) -> tuple[bool, str]:
        return False, f"Local test run error: {e}"


+def _parse_deploy_status(content: str) -> tuple[bool, str]:
+    """Parse a 14-deploy-log.md body and map its `deploy_status:` frontmatter to a
+    quality-gate verdict. Reads ONLY the machine-readable YAML field, never prose.
+
+      deploy_status: SUCCESS -> (True,  "Deploy status: SUCCESS")
+      deploy_status: FAILED  -> (False, "Deploy status: FAILED")
+      missing field / no frontmatter / bad YAML -> (False, <reason>)
+    """
+    import yaml
+    status = None
+    if content.startswith("---"):
+        parts = content.split("---", 2)
+        if len(parts) >= 3:
+            try:
+                fm = yaml.safe_load(parts[1]) or {}
+            except yaml.YAMLError as e:
+                return False, f"Invalid YAML frontmatter in deploy log: {e}"
+            status = str(fm.get("deploy_status", "")).upper().strip()
+    if status == "SUCCESS":
+        return True, "Deploy status: SUCCESS"
+    if status == "FAILED":
+        return False, "Deploy status: FAILED"
+    return False, f"No machine-readable deploy_status in frontmatter (got: {status!r})"
+
+
+def _deploy_log_from_main(repo: str, work_item_id: str) -> str | None:
+    """Best-effort read of 14-deploy-log.md from origin/main on the shared clone.
+
+    The deployer writes 14-deploy-log.md and merges the deploy artifacts into main
+    via a separate PR (see ET-013), so the file lands in origin/main, NOT in the
+    feature branch worktree the gate normally reads. This recovers it from main.
+
+    Degrades gracefully: any git failure (no clone, network/fetch error, file
+    absent in main) returns None instead of raising, so the caller falls back to
+    the plain "not found" verdict. Never raises.
+    """
+    repo_clone = os.path.join(settings.repos_dir, repo)
+    if not os.path.isdir(os.path.join(repo_clone, ".git")):
+        return None
+    rel = f"docs/work-items/{work_item_id}/14-deploy-log.md"
+    try:
+        # Refresh origin/main so we see freshly-merged deploy artifacts.
+        subprocess.run(
+            ["git", "-C", repo_clone, "fetch", "origin", "main"],
+            check=False, capture_output=True, timeout=30,
+        )
+        show = subprocess.run(
+            ["git", "-C", repo_clone, "show", f"origin/main:{rel}"],
+            check=False, capture_output=True, text=True, timeout=15,
+        )
+    except (subprocess.SubprocessError, OSError) as e:
+        logger.warning("deploy-log origin/main lookup failed for %s/%s: %s", repo, work_item_id, e)
+        return None
+    if show.returncode != 0:
+        return None
+    return show.stdout
+
+
+def check_deploy_status(repo: str, work_item_id: str, branch: str | None = None) -> tuple[bool, str]:
+    """
+    BUG 8 fix: gate the deploy -> done transition on the deployer's machine-readable
+    verdict in 14-deploy-log.md frontmatter, NOT on the LLM process exit code
+    (which is always 0 on a successful agent session even when the deploy failed).
+
+    Mirrors check_reviewer_verdict (S-5): reads ONLY `deploy_status:` from YAML
+    frontmatter. Returns:
+      (True, ...)  -> deploy_status: SUCCESS
+      (False, ...) -> deploy_status: FAILED, missing field, or no frontmatter
+
+    ET-013 path-sync fix: the deployer writes 14-deploy-log.md and merges the deploy
+    artifacts into main via a SEPARATE PR, so the log lands in origin/main, not in
+    the feature-branch worktree this gate reads via _repo_path(repo, branch). If the
+    file is absent in the worktree we fall back to reading it from origin/main on the
+    shared clone. Lookup order: worktree -> origin/main -> not found.
+    """
+    repo_path = _repo_path(repo, branch)
+    log_path = os.path.join(repo_path, f"docs/work-items/{work_item_id}/14-deploy-log.md")
+
+    if os.path.isfile(log_path):
+        try:
+            with open(log_path, "r", encoding="utf-8") as f:
+                content = f.read()
+        except OSError as e:
+            return False, f"Error reading deploy log: {e}"
+        return _parse_deploy_status(content)
+
+    # Not in the feature worktree — the deployer may have merged it into main.
+    main_content = _deploy_log_from_main(repo, work_item_id)
+    if main_content is not None:
+        return _parse_deploy_status(main_content)
+
+    return False, "Deploy log not found (14-deploy-log.md)"
+
+
 # Registry for dynamic lookup by name
 QG_CHECKS = {
    "check_analysis_approved": check_analysis_approved,
@@ -282,4 +386,5 @@ QG_CHECKS = {
    "check_tests_passed": check_tests_passed,
    "check_reviewer_verdict": check_reviewer_verdict,
    "check_tests_local": check_tests_local,
+    "check_deploy_status": check_deploy_status,
 }
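
Reviewer note: the lookup order described for `check_deploy_status` (worktree -> origin/main -> not found) can be sketched without git or PyYAML as below. This is a simplified illustration, not the code in this PR: `read_deploy_status` scans `key: value` lines instead of calling `yaml.safe_load`, and `resolve_verdict` with its two reader callables is a hypothetical helper standing in for the file read and the `git show` fallback.

```python
from typing import Callable, Optional


def read_deploy_status(content: str) -> Optional[str]:
    """Return the upper-cased deploy_status from the frontmatter, or None.

    Simplification (assumption): plain line scan instead of yaml.safe_load.
    """
    if not content.startswith("---"):
        return None
    parts = content.split("---", 2)
    if len(parts) < 3:
        return None
    for line in parts[1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "deploy_status":
            return value.strip().strip("'\"").upper() or None
    return None


def resolve_verdict(
    read_worktree: Callable[[], Optional[str]],
    read_main: Callable[[], Optional[str]],
) -> tuple[bool, str]:
    """Try the worktree first, then origin/main; the first log found decides."""
    for reader in (read_worktree, read_main):
        content = reader()
        if content is None:
            continue
        status = read_deploy_status(content)
        if status == "SUCCESS":
            return True, "Deploy status: SUCCESS"
        if status == "FAILED":
            return False, "Deploy status: FAILED"
        return False, f"No machine-readable deploy_status (got: {status!r})"
    return False, "Deploy log not found (14-deploy-log.md)"
```

Because the readers are called lazily, a log present in the worktree short-circuits the origin/main lookup, which mirrors the "worktree log short-circuits main lookup" test case named in the commit message.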
--- a/src/queue_worker.py
+++ b/src/queue_worker.py
@@ -0,0 +1,246 @@
+"""ORCH-1 (F-2b): background job-queue worker with resilience layer.
+
+A single background thread polls the `jobs` table and spawns agents:
+
+    while running:
+        if breaker.open and not cooled_down: sleep; continue   # don't touch CLI
+        if not preflight.ok: sleep; continue                   # CLI/net down -> wait
+        while count_running_jobs() < max_concurrency:
+            job = claim_next_job()              # atomic queued -> running (available_at-gated)
+            if not job: break
+            launcher.launch_job(job)            # spawns claude (Popen) + monitor thread
+        sleep(poll_interval)
+
+Resilience (addendum):
+  A. Preflight — cheap local CLI/net check (cached, no tokens) gates claiming.
+  B/C. The launcher classifies failures (transient vs permanent) and applies
+       backoff via available_at; the worker only needs to honour available_at
+       (claim_next_job does) and react to transient outcomes via the breaker.
+  D. Circuit breaker — N consecutive transient failures -> open (pause M minutes,
+       no CLI calls, Telegram alert) -> half-open (probe one job) -> closed.
+
+Design: plain daemon thread + threading.Event (the launcher already manages its
+own monitor/watchdog threads + blocking Popen).
+"""
+import time
+import logging
+import threading
+
+from .config import settings
+from .db import claim_next_job, count_running_jobs
+from .agents.launcher import launcher
+from . import preflight
+
+logger = logging.getLogger("orchestrator.queue_worker")
+
+
+class CircuitBreaker:
+    """Trips after `threshold` consecutive transient failures.
+
+    States: closed -> (threshold transient) -> open -> (after pause) half-open
+            -> (recovered) closed | (transient again) open.
+    Thread-safe enough for our single-worker + monitor-thread callbacks (a lock
+    guards the counters).
+    """
+
+    def __init__(self, threshold: int | None = None, pause_seconds: int | None = None):
+        self.threshold = threshold if threshold is not None else settings.breaker_threshold
+        self.pause_seconds = (
+            pause_seconds if pause_seconds is not None else settings.breaker_pause_seconds
+        )
+        self._lock = threading.Lock()
+        self.state = "closed"          # closed | open | half-open
+        self.consecutive_transient = 0
+        self.opened_at = 0.0
+        self._notify = None            # optional callable(message) for alerts
+
+    def set_notifier(self, fn):
+        self._notify = fn
+
+    def record_transient(self):
+        with self._lock:
+            self.consecutive_transient += 1
+            if self.state == "half-open":
+                # Probe failed -> re-open.
+                self._open("circuit re-opened: probe job hit transient again")
+            elif self.consecutive_transient >= self.threshold and self.state == "closed":
+                self._open(
+                    f"circuit OPEN: {self.consecutive_transient} consecutive "
+                    f"transient failures; pausing {self.pause_seconds}s (no CLI calls)"
+                )
+
+    def record_recovered(self):
+        with self._lock:
+            self.consecutive_transient = 0
+            if self.state in ("half-open", "open"):
+                self.state = "closed"
+                logger.info("Circuit CLOSED: recovered")
+
+    def record_permanent(self):
+        # A clean permanent (code-fault) failure breaks the transient streak.
+        with self._lock:
+            self.consecutive_transient = 0
+
+    def _open(self, msg: str):
+        self.state = "open"
+        self.opened_at = time.time()
+        logger.warning(msg)
+        if self._notify:
+            try:
+                self._notify(f"\U0001f534 {msg}")
+            except Exception:
+                pass
+
+    def allow_claim(self) -> bool:
+        """Return True if the worker may attempt to claim/launch a job now.
+
+        - closed    -> yes.
+        - open      -> no until pause elapsed; then transition to half-open (yes, one probe).
+        - half-open -> yes (the single probe).
+        """
+        with self._lock:
+            if self.state == "closed":
+                return True
+            if self.state == "open":
+                if (time.time() - self.opened_at) >= self.pause_seconds:
+                    self.state = "half-open"
+                    logger.info("Circuit HALF-OPEN: probing one job")
+                    return True
+                return False
+            # half-open: allow the probe.
+            return True
+
+    def snapshot(self) -> dict:
+        with self._lock:
+            remaining = 0
+            if self.state == "open":
+                remaining = max(0, int(self.pause_seconds - (time.time() - self.opened_at)))
+            return {
+                "state": self.state,
+                "consecutive_transient": self.consecutive_transient,
+                "pause_remaining_s": remaining,
+            }
+
+
+class QueueWorker:
+    """Background worker that drains the persistent job queue (with resilience)."""
+
+    def __init__(self, max_concurrency: int | None = None, poll_interval: float | None = None,
+                 breaker: CircuitBreaker | None = None):
+        self.max_concurrency = (
+            max_concurrency if max_concurrency is not None else settings.max_concurrency
+        )
+        self.poll_interval = (
+            poll_interval if poll_interval is not None else settings.queue_poll_interval
+        )
+        self.breaker = breaker or CircuitBreaker()
+        self.last_preflight_ok = True
+        self.last_preflight_reason = "not checked"
+        self._stop = threading.Event()
+        self._thread: threading.Thread | None = None
+
+    # --- circuit breaker outcome callback wired into the launcher ----------
+    def _on_outcome(self, transient: bool, recovered: bool):
+        if recovered:
+            self.breaker.record_recovered()
+        elif transient:
+            self.breaker.record_transient()
+        else:
+            self.breaker.record_permanent()
+
+    def _drain_once(self):
+        """Claim and launch jobs until concurrency is full or the queue is empty.
+
+        Gated by the circuit breaker and preflight: if the breaker is open (and
+        not yet cooled down) or preflight fails, we do NOT claim — jobs stay
+        queued and no CLI/tokens are touched.
+        """
+        if not self.breaker.allow_claim():
+            return
+        ok, reason = preflight.check()
+        self.last_preflight_ok = ok
+        self.last_preflight_reason = reason
+        if not ok:
+            logger.info(f"Preflight not ok ({reason}) -> not claiming jobs this tick")
+            return
+
+        # In half-open we only probe a single job, regardless of max_concurrency.
+        half_open = self.breaker.snapshot()["state"] == "half-open"
+        launched = 0
+        while not self._stop.is_set():
+            if half_open and launched >= 1:
+                return
+            if count_running_jobs() >= self.max_concurrency:
+                return
+            job = claim_next_job()
+            if not job:
+                return
+            launched += 1
+            try:
+                run_id = launcher.launch_job(job)
+                logger.info(
+                    f"Worker launched job {job['id']} ({job['agent']}, "
+                    f"repo {job['repo']}) -> run_id={run_id}"
+                )
+            except Exception as e:
+                # Launch itself failed (e.g. repo missing): requeue while attempts
+                # remain, else mark failed, so the job never wedges as 'running'.
+                logger.error(f"Worker failed to launch job {job['id']}: {e}")
+                try:
+                    from .db import get_job, mark_job
+
+                    j = get_job(job["id"])
+                    attempts = j.get("attempts", 0) if j else 0
+                    max_attempts = j.get("max_attempts", 2) if j else 2
+                    if attempts < max_attempts:
+                        mark_job(job["id"], "queued", error=f"launch error: {e}")
+                    else:
+                        mark_job(job["id"], "failed", error=f"launch error: {e}")
+                except Exception:
+                    pass
+
+    def _run(self):
+        logger.info(
+            f"Queue worker started (max_concurrency={self.max_concurrency}, "
+            f"poll_interval={self.poll_interval}s, breaker_threshold={self.breaker.threshold})"
+        )
+        while not self._stop.is_set():
+            try:
+                self._drain_once()
+            except Exception as e:
+                logger.error(f"Queue worker loop error: {e}")
+            self._stop.wait(self.poll_interval)
+        logger.info("Queue worker stopped")
+
+    def start(self):
+        if self._thread and self._thread.is_alive():
+            return
+        # Wire breaker alerting + launcher outcome callback.
+        try:
+            from .notifications import send_telegram
+            self.breaker.set_notifier(send_telegram)
+        except Exception:
+            pass
+        launcher.on_outcome = self._on_outcome
+        self._stop.clear()
+        self._thread = threading.Thread(
+            target=self._run, name="queue-worker", daemon=True
+        )
+        self._thread.start()
+
+    def stop(self, timeout: float = 5.0):
+        self._stop.set()
+        if self._thread:
+            self._thread.join(timeout=timeout)
+
+    def status(self) -> dict:
+        """Resilience snapshot for /queue."""
+        return {
+            "breaker": self.breaker.snapshot(),
+            "preflight_ok": self.last_preflight_ok,
+            "preflight_reason": self.last_preflight_reason,
+        }
+
+
+# Module-level singleton used by the FastAPI lifespan.
+worker = QueueWorker()
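
Reviewer note: the breaker state machine in `CircuitBreaker` (closed -> open -> half-open -> closed or open) can be checked with a stripped-down sketch like this one. It is an illustration under stated assumptions, not the class from this PR: `MiniBreaker` is a hypothetical name, the lock and the Telegram notifier are omitted, and the `clock` parameter is added so the pause can be simulated.

```python
import time
from typing import Callable


class MiniBreaker:
    """Single-threaded sketch of the closed / open / half-open transitions."""

    def __init__(self, threshold: int, pause_seconds: float,
                 clock: Callable[[], float] = time.time):
        self.threshold = threshold
        self.pause_seconds = pause_seconds
        self.clock = clock  # injectable clock (assumption, for tests)
        self.state = "closed"
        self.consecutive_transient = 0
        self.opened_at = 0.0

    def record_transient(self) -> None:
        self.consecutive_transient += 1
        # A failed probe re-opens at once; a closed breaker opens at the threshold.
        if self.state == "half-open" or (
            self.state == "closed" and self.consecutive_transient >= self.threshold
        ):
            self.state = "open"
            self.opened_at = self.clock()

    def record_recovered(self) -> None:
        self.consecutive_transient = 0
        self.state = "closed"

    def allow_claim(self) -> bool:
        if self.state == "open":
            if self.clock() - self.opened_at < self.pause_seconds:
                return False
            self.state = "half-open"  # pause elapsed: allow one probe
        return True
```

Injecting the clock keeps the pause testable without real waiting; the production class reads `time.time()` directly.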
--- a/src/stage_engine.py
+++ b/src/stage_engine.py
@@ -0,0 +1,546 @@
+"""Unified stage engine (ORCH-4 / M-3).
+
+Single source of truth for "an agent finished / a human approved -> run the
+stage's quality gate and either advance the pipeline or roll it back".
+
+Before ORCH-4 this logic was duplicated in two places that had silently
+diverged:
+  - src/agents/launcher.py::_try_advance_stage (sync, rich business logic:
+    analyst approved-flow, reviewer REQUEST_CHANGES rollback+retry, tester FAIL
+    rollback+retry, architect conflict rollback) — but it picked the next agent
+    with get_agent_for_stage(next_stage), which is WRONG.
+  - src/webhooks/plane.py::_try_advance_stage (async, leaner, but it had the
+    check_review_approved PR-by-branch dispatch and used the CORRECT
+    get_agent_for_stage(current_stage)).
+
+This module merges both into one sync `advance_stage(...)`. launcher calls it
+directly; the plane webhook calls it through asyncio.to_thread so there is
+exactly one implementation.
+
+Agent-selection bug fix (ORCH-4):
+  stages.py defines `agent` as "the agent to launch when advancing FROM this
+  stage". So when advancing current -> next, the correct agent to launch is
+  get_agent_for_stage(current_stage). launcher's old next_stage lookup skipped a
+  stage (e.g. analysis->architecture launched 'developer' instead of
+  'architect'). plane and gitea already used current_stage; we unify on that.
+"""
+
+import logging
+import os
+from dataclasses import dataclass, field
+
+from .db import get_db, update_task_stage, enqueue_job
+from .stages import get_next_stage, get_qg_for_stage, get_agent_for_stage
+from .git_worktree import get_worktree_path
+from .qg.checks import QG_CHECKS
+from .notifications import (
+    notify_stage_change,
+    notify_qg_failure,
+    notify_approve_requested,
+    send_telegram,
+)
+from .plane_sync import (
+    notify_stage_change as plane_notify_stage,
+    notify_qg_failure as plane_notify_qg,
+    add_comment as plane_add_comment,
+    set_issue_in_review,
+    set_issue_needs_input,
+    set_issue_in_progress,
+    set_issue_blocked,
+    set_issue_done,
+)
+from .config import settings
+
+logger = logging.getLogger("orchestrator.stage_engine")
+
+MAX_DEVELOPER_RETRIES = 3
+
+
+@dataclass
+class AdvanceResult:
+    """Outcome of an advance_stage() call (mostly for tests/observability)."""
+
+    advanced: bool = False
+    from_stage: str | None = None
+    to_stage: str | None = None
+    enqueued_agent: str | None = None
+    enqueued_job_id: int | None = None
+    qg_name: str | None = None
+    qg_passed: bool | None = None
+    qg_reason: str | None = None
+    rolled_back_to: str | None = None
+    alerted: bool = False
+    note: str | None = None
+    notes: list = field(default_factory=list)
+
+
+def _run_qg(qg_name: str, repo: str, work_item_id: str, branch: str):
+    """Dispatch a quality-gate check to the right signature and run it.
+
+    Signatures (unified from launcher + plane):
+      - check_ci_green / check_tests_local      -> (repo, branch)
+      - check_review_approved                   -> (repo, pr_number) [PR found by branch]
+      - everything else (artifact checks)       -> (repo, work_item_id, branch)
+
+    Returns (passed: bool, reason: str).
+    """
+    check_fn = QG_CHECKS.get(qg_name)
+    if not check_fn:
+        logger.error(f"QG function '{qg_name}' not found in registry")
+        return False, f"Unknown QG: {qg_name}"
+
+    if qg_name in ("check_ci_green", "check_tests_local"):
+        # (repo, branch) — already worktree-aware.
+        return check_fn(repo, branch)
+
+    if qg_name == "check_review_approved":
+        # Special case kept from plane: find the open PR for this branch via
+        # Gitea, then check it; fall back to a file-based review marker.
+        return _check_review_approved_by_branch(check_fn, repo, work_item_id, branch)
+
+    # All other artifact checks: (repo, work_item_id, branch). Pass branch so the
+    # check reads from the task worktree (ORCH-2 / S-4).
+    return check_fn(repo, work_item_id or "", branch)
+
+
+def _check_review_approved_by_branch(check_fn, repo: str, work_item_id: str, branch: str):
+    """check_review_approved dispatch preserved from plane._try_advance_stage.
+
+    Finds the open PR whose head ref == branch via the Gitea API and runs
+    check_review_approved(repo, pr_number). If no open PR exists, falls back to a
+    file-based review marker (12-review.md / 09-review.md) like the original.
+    """
+    import httpx as _httpx
+
+    owner = settings.gitea_owner
+    url = f"{settings.gitea_url}/api/v1/repos/{owner}/{repo}/pulls?state=open&limit=50"
+    headers = {"Authorization": f"token {settings.gitea_token}"}
+    try:
+        resp = _httpx.get(url, headers=headers, timeout=10)
+        prs = resp.json()
+        pr_number = None
+        for pr in prs:
+            if pr.get("head", {}).get("ref") == branch:
+                pr_number = pr["number"]
+                break
+        if pr_number:
+            return check_fn(repo, pr_number)
+        # No open PR but a review file may exist — check file-based.
+        wt = get_worktree_path(repo, branch)
+        if not os.path.isdir(wt):
+            wt = os.path.join(settings.repos_dir, repo)
+        review_path = os.path.join(wt, f"docs/work-items/{work_item_id}/12-review.md")
+        review_path2 = os.path.join(wt, f"docs/work-items/{work_item_id}/09-review.md")
+        if os.path.isfile(review_path) or os.path.isfile(review_path2):
+            return True, "Review file exists (file-based approval)"
+        return False, "No open PR found and no review file"
+    except Exception as e:
+        return False, f"Error finding PR: {e}"
+
+
+def _developer_retry_count(task_id: int) -> int:
+    """How many developer runs have already happened for this task."""
+    conn = get_db()
+    n = conn.execute(
+        "SELECT COUNT(*) FROM agent_runs WHERE task_id=? AND agent='developer'",
+        (task_id,),
+    ).fetchone()[0]
+    conn.close()
+    return n
+
+
+def advance_stage(
+    task_id: int,
+    current_stage: str,
+    repo: str,
+    work_item_id: str,
+    branch: str,
+    finished_agent: str | None = None,
+) -> AdvanceResult:
+    """Run the current stage's quality gate and advance / roll back the pipeline.
+
+    This is the single merged implementation (ORCH-4 / M-3). It is synchronous;
+    the async plane webhook calls it via asyncio.to_thread.
+
+    Args:
+      task_id:        tasks.id
+      current_stage:  the stage the task is currently in
+      repo:           repository name
+      work_item_id:   Plane work item id (may be "" / None)
+      branch:         feature branch
+      finished_agent: the agent that just finished (launcher path). Drives the
+                      approved/REQUEST_CHANGES/tester/architect branches. In the
+                      plane webhook path it is None, so those agent-specific
+                      branches simply do not trigger (matches old plane behavior).
+
+    Returns AdvanceResult describing what happened.
+    """
+    result = AdvanceResult(from_stage=current_stage)
+    agent = finished_agent
+    try:
+        qg_name = get_qg_for_stage(current_stage)
+        next_stage = get_next_stage(current_stage)
+        result.qg_name = qg_name
+        result.to_stage = next_stage
+
+        if not next_stage:
+            logger.info(f"Task {task_id}: already at terminal stage '{current_stage}'")
+            result.note = "terminal"
+            return result
+
+        # --- Quality gate ----------------------------------------------------
+        if qg_name and qg_name in QG_CHECKS:
+            # Human-approval gate: split by path.
+            if qg_name == "check_analysis_approved":
+                # Launcher path (analyst just finished): set In Review + ask for
+                # the Approved status. This gate never advances on its own -- a
+                # human Approved verdict does that.
+                if agent == "analyst":
+                    _handle_analysis_approved_flow(
+                        task_id, current_stage, repo, work_item_id, branch, agent, result
+                    )
+                    return result
+                # Webhook Approved-verdict path (agent is None): the human flipped
+                # the Plane status to Approved, which IS the approval. The gate is
+                # satisfied -- do NOT re-run check_analysis_approved (it looks for
+                # an :approved: *comment* and would block on a status-only
+                # approval). Mark it passed and fall through to the Advance block.
+                result.qg_name = qg_name
+                result.qg_passed = True
+                result.qg_reason = "approved-via-status"
+            else:
+                passed, reason = _run_qg(qg_name, repo, work_item_id, branch)
+                result.qg_passed = passed
+                result.qg_reason = reason
+
+                if not passed:
+                    logger.info(
+                        f"Task {task_id}: QG '{qg_name}' not passed after {agent}: {reason}"
+                    )
+                    # Behaviour parity:
+                    #  - webhook path (finished_agent is None): emit the generic
+                    #    QG-failure notification, exactly like the old plane handler.
+                    #  - launcher path (finished_agent set): NO generic notification;
+                    #    the rollback branches below own their own messaging, exactly
+                    #    like the old launcher handler.
+                    if agent is None:
+                        notify_qg_failure(task_id, current_stage, qg_name, reason)
+                        plane_notify_qg(work_item_id, current_stage, qg_name, reason)
+
+                    _handle_qg_failure_rollbacks(
+                        task_id, current_stage, repo, work_item_id, branch,
+                        agent, qg_name, reason, result,
+                    )
+                    return result
+
+        elif qg_name:
+            # QG name set but not registered — do not advance (launcher behavior).
+            result.note = f"qg '{qg_name}' not in registry"
+            return result
+
+        # --- Advance ---------------------------------------------------------
+        update_task_stage(task_id, next_stage)
+        # Telegram live tracker: the analysis->architecture advance is the human
+        # Approved gate clearing -> stamp the END of "Ревью БРД" (BRD review, the
+        # only human-spent time). Idempotent: only the first stamp counts.
+        if current_stage == "analysis" and next_stage == "architecture":
+            try:
+                from .db import mark_brd_review_ended
+                mark_brd_review_ended(task_id)
+            except Exception as e:
+                logger.warning(f"Task {task_id}: brd review end stamp failed: {e}")
+        notify_stage_change(task_id, current_stage, next_stage)
+        plane_notify_stage(work_item_id, current_stage, next_stage)
+        result.advanced = True
+        logger.info(
+            f"Task {task_id}: {current_stage} -> {next_stage} "
+            f"(auto-advance after {agent})"
+        )
+
+        # --- Terminal sync: deploy -> done must reach Plane's Done -----------
+        # When the deployer's check_deploy_status passes we advance to the
+        # terminal 'done' stage. Previously a merged-PR webhook completed the
+        # task out-of-band and Plane stuck on In Progress. Now done flows through
+        # here, so explicitly drive the Plane issue into the terminal Done state
+        # (PLANE_STATES['done'] — mapping unchanged) in addition to the
+        # stage-change comment above.
+        if next_stage == "done" and work_item_id:
+            try:
+                set_issue_done(work_item_id)
+                logger.info(
+                    f"Task {task_id}: deploy->done, Plane state forced to Done"
+                )
+            except Exception as e:
+                logger.error(f"Task {task_id}: failed to set Plane Done: {e}")
+
+        # --- Launch the next agent (ORCH-4 fix: current_stage, not next) -----
+        next_agent = get_agent_for_stage(current_stage)
+        if next_agent:
+            task_desc = (
+                f"Work item: {work_item_id}\nRepo: {repo}\n"
+                f"Branch: {branch}\nStage: {next_stage}"
+            )
+            new_job_id = enqueue_job(next_agent, repo, task_desc, task_id=task_id)
+            result.enqueued_agent = next_agent
+            result.enqueued_job_id = new_job_id
+            logger.info(
+                f"Task {task_id}: enqueued '{next_agent}' (job_id={new_job_id})"
+            )
+
+        return result
+
+    except Exception as e:
+        logger.error(f"advance_stage failed for task_id={task_id}: {e}")
+        result.note = f"error: {e}"
+        return result
+
+
+def _build_analyst_ready_comment(repo: str, work_item_id: str, branch: str) -> str:
+    """BUG C: HTML comment posted when analyst artifacts are ready.
+
+    Status-only model (PR #12): approval is the **Approved** status, NOT a
+    ``:approved:`` comment and NOT moving back to In Progress. The comment asks
+    the stakeholder to flip the status and links the documents the analyst
+    actually produced.
+
+    Links point at the Gitea web view:
+      {gitea_url}/{owner}/{repo}/src/branch/{branch}/docs/work-items/{wid}/<file>
+    Only files that REALLY exist in the worktree are listed (no invented docs).
+    """
+    text = (
+        "\u2705 BRD/\u0422\u0417/AC \u0433\u043e\u0442\u043e\u0432\u044b. "
+        "\u0414\u043b\u044f \u043f\u0440\u043e\u0434\u0432\u0438\u0436\u0435\u043d\u0438\u044f "
+        "\u043f\u0435\u0440\u0435\u0432\u0435\u0434\u0438\u0442\u0435 \u0437\u0430\u0434\u0430\u0447\u0443 "
+        "\u0432 \u0441\u0442\u0430\u0442\u0443\u0441 Approved. "
+        "\u0414\u043b\u044f \u043e\u0442\u043a\u043b\u043e\u043d\u0435\u043d\u0438\u044f \u2014 "
+        "\u043d\u0430\u043f\u0438\u0448\u0438\u0442\u0435 \u043f\u0440\u0438\u0447\u0438\u043d\u0443 "
+        "\u043a\u043e\u043c\u043c\u0435\u043d\u0442\u043e\u043c \u0438 \u043f\u0435\u0440\u0435\u0432\u0435\u0434\u0438\u0442\u0435 "
+        "\u0432 Rejected."
+    )
+
+    # Candidate analyst artifacts (label -> filename). Only existing ones linked.
+    candidates = [
+        ("Business request", "00-business-request.md"),
+        ("BRD", "01-brd.md"),
+        ("\u0422\u0417 (TRZ)", "02-trz.md"),
+        ("Acceptance Criteria", "03-acceptance-criteria.md"),
+        ("Test Plan", "04-test-plan.yaml"),
+        ("UI Test Cases", "04b-ui-test-cases.md"),
+    ]
+    rel_dir = f"docs/work-items/{work_item_id}"
+    try:
+        wt_dir = os.path.join(get_worktree_path(repo, branch), rel_dir)
+    except Exception:
+        wt_dir = None
+
+    owner = getattr(settings, "gitea_owner", "admin")
+    base = (getattr(settings, "gitea_public_url", "") or settings.gitea_url).rstrip("/")
+    links = []
+    for label, fname in candidates:
+        if wt_dir and not os.path.isfile(os.path.join(wt_dir, fname)):
+            continue
+        href = f"{base}/{owner}/{repo}/src/branch/{branch}/{rel_dir}/{fname}"
+        links.append(f'<li><a href="{href}">{label}</a></li>')
+
+    if links:
+        text += "<br><b>\u0414\u043e\u043a\u0443\u043c\u0435\u043d\u0442\u044b:</b><ul>" + "".join(links) + "</ul>"
+    return text
+
+
+def _handle_analysis_approved_flow(
+    task_id, current_stage, repo, work_item_id, branch, agent, result: AdvanceResult
+):
+    """Analyst approved-flow (launcher only).
+
+    Only triggers when the analyst just finished (agent == 'analyst') in the
+    launcher path. Decides between: artifacts ready -> In Review + request
+    :approved:; questions file -> Needs Input; otherwise a warning comment.
+    This gate never advances on its own (human approval does that via the plane
+    webhook), matching the original launcher behavior.
+    """
+    result.qg_name = "check_analysis_approved"
+    result.note = "analysis-approval-gate"
+    if not (agent == "analyst" and work_item_id):
+        return
+
+    files_check = QG_CHECKS.get("check_analysis_complete")
+    if not files_check:
+        return
+
+    files_ok, _ = files_check(repo, work_item_id, branch)
+    if files_ok:
+        # Full artifacts ready -> In Review, ask for the Approved STATUS (BUG C).
+        set_issue_in_review(work_item_id)
+        plane_add_comment(
+            work_item_id,
+            _build_analyst_ready_comment(repo, work_item_id, branch),
+            author="analyst",
+        )
+        notify_approve_requested(task_id)
+        result.note = "analysis-in-review"
+        logger.info(
+            f"Task {task_id}: analyst finished, requested Approved status in Plane"
+        )
+        return
+
+    questions_path = os.path.join(
+        get_worktree_path(repo, branch),
+        f"docs/work-items/{work_item_id}/01-questions.md",
+    )
+    if os.path.isfile(questions_path):
+        set_issue_needs_input(work_item_id)
+        with open(questions_path, "r", encoding="utf-8") as qf:
+            questions_text = qf.read()
+        plane_add_comment(
+            work_item_id,
+            f"\u2753 Analyst \u043d\u0443\u0436\u0434\u0430\u0435\u0442\u0441\u044f \u0432 \u0443\u0442\u043e\u0447\u043d\u0435\u043d\u0438\u0438:\n\n{questions_text}",
+            author="analyst",
+        )
+        send_telegram(
+            f"\u2753 {work_item_id}: Analyst \u0437\u0430\u0434\u0430\u0451\u0442 \u0432\u043e\u043f\u0440\u043e\u0441\u044b. \u041e\u0442\u0432\u0435\u0442\u044c \u0432 Plane."
+        )
+        result.note = "analysis-needs-input"
+        return
+
+    # No artifacts and no questions.
+    plane_add_comment(
+        work_item_id,
+        "\u26a0\ufe0f Analyst \u0437\u0430\u0432\u0435\u0440\u0448\u0438\u043b\u0441\u044f \u0431\u0435\u0437 \u0430\u0440\u0442\u0435\u0444\u0430\u043a\u0442\u043e\u0432 \u0438 \u0431\u0435\u0437 \u0432\u043e\u043f\u0440\u043e\u0441\u043e\u0432. \u041f\u0440\u043e\u0432\u0435\u0440\u044c\u0442\u0435 \u043b\u043e\u0433.",
+        author="analyst",
+    )
+    result.note = "analysis-empty"
+
+
+def _handle_qg_failure_rollbacks(
+    task_id, current_stage, repo, work_item_id, branch,
+    agent, qg_name, reason, result: AdvanceResult,
+):
+    """All rollback/retry branches from the original launcher, preserved verbatim.
+
+    Only fire on the launcher path (finished_agent is set). The webhook path
+    passes finished_agent=None, so none of these agent-specific branches trigger
+    — that matches the old plane behavior (it just reported the QG failure).
+    """
+    # Reviewer REQUEST_CHANGES -> rollback to development + retry (max 3).
+    if agent == "reviewer" and "REQUEST_CHANGES" in (reason or ""):
+        update_task_stage(task_id, "development")
+        notify_stage_change(task_id, current_stage, "development")
+        plane_notify_stage(work_item_id, current_stage, "development")
+        result.rolled_back_to = "development"
+        retry_count = _developer_retry_count(task_id)
+        if retry_count < MAX_DEVELOPER_RETRIES:
+            task_desc = (
+                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
+                f"Stage: development\nNote: REQUEST_CHANGES from reviewer "
+                f"(attempt {retry_count+1}/{MAX_DEVELOPER_RETRIES}). Fix findings in "
+                f"docs/work-items/{work_item_id}/12-review.md"
+            )
+            new_job = enqueue_job("developer", repo, task_desc, task_id=task_id)
+            result.enqueued_agent = "developer"
+            result.enqueued_job_id = new_job
+            logger.info(
+                f"Task {task_id}: reviewer REQUEST_CHANGES, enqueued developer "
+                f"(job_id={new_job})"
+            )
+        else:
+            send_telegram(
+                f"\u26a0\ufe0f {work_item_id}: Max developer retries ({MAX_DEVELOPER_RETRIES}) reached. "
+                f"Manual intervention needed."
+            )
+            result.alerted = True
+            logger.error(f"Task {task_id}: max retries reached")
+
+    # Tester check_tests_passed FAIL -> rollback to development + retry (max 3).
+    if agent == "tester" and qg_name == "check_tests_passed":
+        update_task_stage(task_id, "development")
+        notify_stage_change(task_id, current_stage, "development")
+        plane_notify_stage(work_item_id, current_stage, "development")
+        result.rolled_back_to = "development"
+        set_issue_in_progress(work_item_id)
+        plane_add_comment(
+            work_item_id,
+            f"\u274c \u0422\u0435\u0441\u0442\u044b \u043d\u0435 \u043f\u0440\u043e\u0448\u043b\u0438: {reason}. "
+            f"Developer \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430.",
+            author="tester",
+        )
+        retry_count = _developer_retry_count(task_id)
+        if retry_count < MAX_DEVELOPER_RETRIES:
+            task_desc = (
+                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
+                f"Stage: development\nNote: Tests FAILED. "
+                f"Fix failures described in docs/work-items/{work_item_id}/13-test-report.md"
+            )
+            new_job = enqueue_job("developer", repo, task_desc, task_id=task_id)
+            result.enqueued_agent = "developer"
+            result.enqueued_job_id = new_job
+            logger.info(
+                f"Task {task_id}: tester FAIL, enqueued developer (job_id={new_job})"
+            )
+        else:
+            set_issue_blocked(work_item_id)
+            send_telegram(
+                f"\U0001f6a8 {work_item_id}: Tests still failing after {MAX_DEVELOPER_RETRIES} developer "
+                f"retries. Manual intervention needed."
+            )
+            result.alerted = True
+
+    # Architect conflict (10-conflict.md exists) -> rollback to analysis.
+    if agent == "architect" and qg_name == "check_architecture_done":
+        conflict_path = os.path.join(
+            get_worktree_path(repo, branch),
+            f"docs/work-items/{work_item_id}/10-conflict.md",
+        )
+        if os.path.isfile(conflict_path):
+            update_task_stage(task_id, "analysis")
+            notify_stage_change(task_id, current_stage, "analysis")
+            plane_notify_stage(work_item_id, current_stage, "analysis")
+            result.rolled_back_to = "analysis"
+            set_issue_in_progress(work_item_id)
+            with open(conflict_path, "r", encoding="utf-8") as cf:
+                conflict_text = cf.read()[:500]
+            plane_add_comment(
+                work_item_id,
+                f"\u26a0\ufe0f Architect \u043d\u0430\u0448\u0451\u043b \u043a\u043e\u043d\u0444\u043b\u0438\u043a\u0442 \u0441 \u0422\u0417. "
+                f"\u0412\u043e\u0437\u0432\u0440\u0430\u0442 \u0432 Analysis.\n\n{conflict_text}",
+                author="architect",
+            )
+            task_desc = (
+                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
+                f"Stage: analysis\nNote: Architect conflict. Revise TRZ. "
+                f"See docs/work-items/{work_item_id}/10-conflict.md"
+            )
+            new_job = enqueue_job("analyst", repo, task_desc, task_id=task_id)
+            result.enqueued_agent = "analyst"
+            result.enqueued_job_id = new_job
+            logger.info(
+                f"Task {task_id}: architect conflict, enqueued analyst "
+                f"(job_id={new_job})"
+            )
+
+    # BUG 8: deployer verdict FAILED -> roll deploy back to development.
+    # The launcher's exit_code-based guard (launcher.py:475) never fires because
+    # the LLM process exit code is always 0; this gate fires on the machine-readable
+    # deploy_status verdict in 14-deploy-log.md instead. Mirrors the launcher block
+    # (rollback + set_issue_blocked + notify) but is driven by the VERDICT.
+    if agent == "deployer" and qg_name == "check_deploy_status":
+        update_task_stage(task_id, "development")
+        notify_stage_change(task_id, current_stage, "development")
+        plane_notify_stage(work_item_id, current_stage, "development")
+        result.rolled_back_to = "development"
+        set_issue_blocked(work_item_id)
+        notify_qg_failure(task_id, "deploy", "check_deploy_status", reason)
+        plane_add_comment(
+            work_item_id,
+            f"\u274c Deploy FAILED ({reason}). Rolled back to development. "
+            f"Developer \u043d\u0443\u0436\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430.",
+            author="deployer",
+        )
+        send_telegram(
+            f"\U0001f6a8 {work_item_id}: Deploy FAILED ({reason}). "
+            f"Rolled back to development. Needs fix."
+        )
+        result.alerted = True
+        logger.error(
+            f"Task {task_id}: deployer verdict FAILED, rolled back deploy -> "
+            f"development ({reason})"
+        )
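Reviewer note: the reviewer and tester branches above share one shape: roll back to development, then either enqueue another developer run or raise an alert once the cap is hit. A minimal sketch of that decision, assuming the cap is 3 (taken from the "max 3" comments; the real `MAX_DEVELOPER_RETRIES` value is not shown in this diff) and using a hypothetical helper name `decide_retry`:

```python
# Assumed value: the diff's comments say "max 3", the constant itself is not shown.
MAX_DEVELOPER_RETRIES = 3


def decide_retry(retry_count: int, max_retries: int = MAX_DEVELOPER_RETRIES):
    """Hypothetical helper: pick the follow-up action after a rollback.

    Returns ("enqueue", note) while retries remain, otherwise ("alert", note).
    """
    if retry_count < max_retries:
        # The attempt number is 1-based; the cap comes from the constant,
        # so the message cannot drift from the actual limit.
        return ("enqueue", f"attempt {retry_count + 1}/{max_retries}")
    return ("alert", f"max developer retries ({max_retries}) reached")
```

Deriving the message text from the constant keeps the note sent to the developer consistent with the limit that is actually enforced.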
--- a/src/stages.py
+++ b/src/stages.py
@@ -5,7 +5,7 @@ Stages:

 Each stage defines:
  - next: the stage to advance to
-  - agent: the agent to launch when entering the NEXT stage
+  - agent: the agent to launch when advancing FROM this stage, i.e. the one that works on `next` (NOT the `agent` field of the next stage's entry)
  - qg: the quality gate check required to leave this stage
 """

@@ -13,10 +13,10 @@ STAGE_TRANSITIONS = {
    "created": {"next": "analysis", "agent": "analyst", "qg": None},
    "analysis": {"next": "architecture", "agent": "architect", "qg": "check_analysis_approved"},
    "architecture": {"next": "development", "agent": "developer", "qg": "check_architecture_done"},
-    "development": {"next": "review", "agent": "reviewer", "qg": "check_tests_local"},
+    "development": {"next": "review", "agent": "reviewer", "qg": "check_ci_green"},
    "review": {"next": "testing", "agent": "tester", "qg": "check_reviewer_verdict"},
    "testing": {"next": "deploy", "agent": "deployer", "qg": "check_tests_passed"},
-    "deploy": {"next": "done", "agent": None, "qg": None},
+    "deploy": {"next": "done", "agent": None, "qg": "check_deploy_status"},
    "done": {"next": None, "agent": None, "qg": None},
 }
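Reviewer note: a small sketch of how the table reads after this change. The dictionary is copied from the hunk above; `next_step` is a hypothetical helper for illustration only and is not the project's `get_next_stage` / `get_agent_for_stage`, whose bodies are not part of this diff.

```python
# Post-change table, copied from the hunk above.
STAGE_TRANSITIONS = {
    "created": {"next": "analysis", "agent": "analyst", "qg": None},
    "analysis": {"next": "architecture", "agent": "architect", "qg": "check_analysis_approved"},
    "architecture": {"next": "development", "agent": "developer", "qg": "check_architecture_done"},
    "development": {"next": "review", "agent": "reviewer", "qg": "check_ci_green"},
    "review": {"next": "testing", "agent": "tester", "qg": "check_reviewer_verdict"},
    "testing": {"next": "deploy", "agent": "deployer", "qg": "check_tests_passed"},
    "deploy": {"next": "done", "agent": None, "qg": "check_deploy_status"},
    "done": {"next": None, "agent": None, "qg": None},
}


def next_step(stage: str):
    """Hypothetical helper: (next stage, agent to launch, gate to pass) for `stage`."""
    entry = STAGE_TRANSITIONS[stage]  # KeyError for an unknown stage is intentional
    return entry["next"], entry["agent"], entry["qg"]
```

Leaving `development` now requires `check_ci_green`, and leaving `deploy` requires `check_deploy_status` even though no agent is launched for `done`.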

--- a/src/usage.py
+++ b/src/usage.py
@@ -0,0 +1,464 @@
+"""Feature 4: token / cost accounting for agent runs.
+
+claude --output-format json emits a single result JSON object at the end of the
+run log with fields:
+  total_cost_usd
+  usage.input_tokens / output_tokens / cache_read_input_tokens /
+       cache_creation_input_tokens
+  modelUsage, num_turns, duration_ms
+
+This module parses that JSON out of a (text-or-json) run log, records the usage
+on the agent_runs row, formats a Plane comment for the finishing agent, and
+builds the per-task summary the Deployer posts on deploy/done.
+
+Everything here is defensive: a missing/garbled JSON never raises \u2014 we record
+NULL/0 and log a warning so a broken agent run can't crash the monitor.
+"""
+
+import json
+import logging
+
+from .db import get_db
+
+logger = logging.getLogger("orchestrator.usage")
+
+
+def parse_usage_from_text(text: str) -> dict | None:
+    """Extract the claude result-JSON usage from a run log's text.
+
+    The log may contain plain text before/after the JSON; with
+    --output-format json the JSON is the final object. We scan for the LAST
+    top-level '{' ... '}' that parses and carries usage/total_cost_usd.
+
+    Returns a normalised dict
+      {input_tokens, output_tokens, cache_read_tokens, cache_creation_tokens,
+       cost_usd}
+    (ints / float, missing fields -> 0 / 0.0), or None if no usable JSON found.
+    """
+    if not text:
+        return None
+
+    candidate = _extract_last_json_object(text)
+    if candidate is None:
+        return None
+
+    usage = candidate.get("usage") or {}
+    if not isinstance(usage, dict):
+        usage = {}
+
+    cost = candidate.get("total_cost_usd")
+    if cost is None:
+        cost = candidate.get("cost_usd")
+
+    # If there is neither a usage block nor a cost, this isn't a result object.
+    if not usage and cost is None:
+        return None
+
+    def _int(v):
+        try:
+            return int(v)
+        except (TypeError, ValueError):
+            return 0
+
+    def _float(v):
+        try:
+            return float(v)
+        except (TypeError, ValueError):
+            return 0.0
+
+    return {
+        "input_tokens": _int(usage.get("input_tokens")),
+        "output_tokens": _int(usage.get("output_tokens")),
+        "cache_read_tokens": _int(
+            usage.get("cache_read_input_tokens", usage.get("cache_read_tokens"))
+        ),
+        # The cache-CREATION slice (writing new cache entries) is part of the
+        # REAL input and used to be dropped on the floor. Persist it so the
+        # "X in" figure reflects the full prompt size, not just fresh tokens.
+        "cache_creation_tokens": _int(
+            usage.get("cache_creation_input_tokens", usage.get("cache_creation_tokens"))
+        ),
+        "cost_usd": _float(cost),
+        # Telegram live tracker: the model the run actually used. claude
+        # --output-format json reports it under modelUsage (a dict keyed by the
+        # full model id) and/or a top-level "model" field. We keep the FULL name
+        # here; short_model_name() trims it for the tracker. None when unknown.
+        "model": _extract_model(candidate),
+    }
+
+
+def _extract_model(candidate: dict) -> str | None:
+    """Best-effort: pull the model id out of a claude result JSON object.
+
+    Prefers modelUsage (a dict keyed by full model ids, e.g.
+    {"claude-opus-4-8": {...}}) and returns the key with the most output
+    tokens; falls back to a top-level "model" string. Never raises -> None.
+    """
+    try:
+        mu = candidate.get("modelUsage")
+        if isinstance(mu, dict) and mu:
+            def _out(v):
+                try:
+                    return int((v or {}).get("outputTokens", 0))
+                except (TypeError, ValueError, AttributeError):
+                    return 0
+            best = max(mu.items(), key=lambda kv: _out(kv[1]))
+            if best and best[0]:
+                return str(best[0])
+        model = candidate.get("model")
+        if isinstance(model, str) and model:
+            return model
+    except Exception:
+        pass
+    return None
+
+
+def short_model_name(full: str | None) -> str:
+    """Trim a full model id to a short tag for the tracker.
+
+    'tokenator/claude-opus-4-8'  -> 'opus-4-8'
+    'vibecode/claude-sonnet-4.6' -> 'sonnet-4.6'
+    'claude-opus-4-8'            -> 'opus-4-8'
+    Returns '' when full is falsy so callers can omit the ' · <model>' suffix.
+    """
+    if not full:
+        return ""
+    name = str(full).strip()
+    # Drop any provider prefix up to and including the last '/'.
+    if "/" in name:
+        name = name.rsplit("/", 1)[-1]
+    # Drop a leading 'claude-' marketing prefix.
+    if name.startswith("claude-"):
+        name = name[len("claude-"):]
+    return name
+
+
+def _extract_last_json_object(text: str) -> dict | None:
+    """Return the last balanced top-level JSON object in `text` that parses.
+
+    Scans from the end for '}' and walks back to the matching '{' using a depth
+    counter (string-aware), trying json.loads on each candidate. Robust to log
+    lines or text emitted before the JSON.
+    """
+    # Fast path: the whole stripped text is the JSON.
+    stripped = text.strip()
+    try:
+        obj = json.loads(stripped)
+        if isinstance(obj, dict):
+            return obj
+    except (ValueError, TypeError):
+        pass
+
+    # Otherwise find the last balanced { ... } block.
+    end = len(text)
+    while True:
+        close = text.rfind("}", 0, end)
+        if close == -1:
+            return None
+        depth = 0
+        in_str = False
+        start = None
+        for i in range(close, -1, -1):
+            ch = text[i]
+            if ch == '"':
+                # Walking backwards, a quote is escaped when an odd number of
+                # backslashes immediately precedes it in the text.
+                j = i - 1
+                while j >= 0 and text[j] == "\\":
+                    j -= 1
+                if (i - 1 - j) % 2 == 0:
+                    in_str = not in_str
+                continue
+            if in_str:
+                continue
+            if ch == "}":
+                depth += 1
+            elif ch == "{":
+                depth -= 1
+                if depth == 0:
+                    start = i
+                    break
+        if start is not None:
+            blob = text[start:close + 1]
+            try:
+                obj = json.loads(blob)
+                if isinstance(obj, dict):
+                    return obj
+            except (ValueError, TypeError):
+                pass
+        end = close  # keep scanning earlier in the text
+
+
+def parse_usage_from_log(path: str) -> dict | None:
+    """Read a run log file and parse usage from it. Never raises."""
+    try:
+        with open(path, "r", encoding="utf-8", errors="replace") as f:
+            return parse_usage_from_text(f.read())
+    except OSError as e:
+        logger.warning(f"parse_usage_from_log: cannot read {path}: {e}")
+        return None
+
+
+def record_usage(run_id: int, usage: dict | None):
+    """Write parsed usage onto the agent_runs row. NULLs if usage is None."""
+    if usage is None:
+        logger.warning(f"run_id={run_id}: no usage JSON parsed, recording NULLs")
+        usage = {}
+    conn = get_db()
+    try:
+        conn.execute(
+            "UPDATE agent_runs SET input_tokens=?, output_tokens=?, "
+            "cache_read_tokens=?, cache_creation_tokens=?, cost_usd=?, "
+            "model=COALESCE(?, model) WHERE id=?",
+            (
+                usage.get("input_tokens"),
+                usage.get("output_tokens"),
+                usage.get("cache_read_tokens"),
+                usage.get("cache_creation_tokens"),
+                usage.get("cost_usd"),
+                usage.get("model"),
+                run_id,
+            ),
+        )
+        conn.commit()
+    finally:
+        conn.close()
+
+
+def fmt_tokens(n) -> str:
+    """Format a token count compactly: 1234 -> '1.2k', 2_500_000 -> '2.5M'."""
+    try:
+        n = int(n or 0)
+    except (TypeError, ValueError):
+        n = 0
+    if n >= 1_000_000:
+        return f"{n / 1_000_000:.1f}M"
+    if n >= 1_000:
+        return f"{n / 1_000:.1f}k"
+    return str(n)
+
+
+def fmt_cost(c) -> str:
+    """Format USD cost with 2 decimals: '$0.21'."""
+    try:
+        c = float(c or 0.0)
+    except (TypeError, ValueError):
+        c = 0.0
+    return f"${c:.2f}"
+
+
+# Pretty agent names for comments (mirrors STAGE_AUTHORS roles).
+AGENT_DISPLAY = {
+    "analyst": "Analyst",
+    "architect": "Architect",
+    "developer": "Developer",
+    "reviewer": "Reviewer",
+    "tester": "Tester",
+    "deployer": "Deployer",
+}
+
+
+def _input_total(usage: dict) -> int:
+    """FULL input = fresh input + cache-read + cache-creation tokens."""
+    def _i(k):
+        try:
+            return int(usage.get(k) or 0)
+        except (TypeError, ValueError):
+            return 0
+    return _i("input_tokens") + _i("cache_read_tokens") + _i("cache_creation_tokens")
+
+
+def _cached_total(usage: dict) -> int:
+    """Cached portion of the input = cache-read + cache-creation tokens."""
+    def _i(k):
+        try:
+            return int(usage.get(k) or 0)
+        except (TypeError, ValueError):
+            return 0
+    return _i("cache_read_tokens") + _i("cache_creation_tokens")
+
+
+def fmt_in(usage: dict) -> str:
+    """Render the input figure as full total with a cached breakdown.
+
+    '8.5M in (8.4M cached)' when there is a cache; '45.2k in' when cached==0.
+    """
+    total = _input_total(usage)
+    cached = _cached_total(usage)
+    if cached > 0:
+        return f"{fmt_tokens(total)} in ({fmt_tokens(cached)} cached)"
+    return f"{fmt_tokens(total)} in"
+
+
+def usage_comment(
+    agent: str,
+    usage: dict | None,
+    repo: str | None = None,
+    branch: str | None = None,
+    work_item_id: str | None = None,
+    pr_number=None,
+) -> str:
+    """Build the per-agent finish comment, e.g.
+    '\U0001f4bb Developer \u0433\u043e\u0442\u043e\u0432 \u00b7 8.5M in (8.4M cached) / 45.8k out \u00b7 $7.29'.
+
+    When repo/branch/work_item_id are supplied, the agent's artifact link(s) are
+    appended (BUG: only analyst used to link its docs). Missing artifacts are
+    silently skipped — link building never raises.
+    """
+    usage = usage or {}
+    name = AGENT_DISPLAY.get(agent, agent.capitalize())
+    icon = AGENT_ICON.get(agent, "\u2705")
+    line = (
+        f"{icon} {name} \u0433\u043e\u0442\u043e\u0432 \u00b7 "
+        f"{fmt_in(usage)} / "
+        f"{fmt_tokens(usage.get('output_tokens'))} out \u00b7 "
+        f"{fmt_cost(usage.get('cost_usd'))}"
+    )
+    links = artifact_links(agent, repo, branch, work_item_id, pr_number)
+    if links:
+        line += "\n" + "\n".join(links)
+    return line
+
+
+# Per-agent artifact file under docs/work-items/{wid}/ (architect/developer use
+# special handling for ADR dirs / PR links, see artifact_links()).
+AGENT_ARTIFACT = {
+    "reviewer": ("Review", "12-review.md"),
+    "tester": ("Test report", "13-test-report.md"),
+    "deployer": ("Deploy log", "14-deploy-log.md"),
+}
+
+
+def artifact_links(
+    agent: str,
+    repo: str | None,
+    branch: str | None,
+    work_item_id: str | None,
+    pr_number=None,
+) -> list[str]:
+    """Markdown link(s) to the finishing agent's artifact(s) in Gitea.
+
+    Uses gitea_public_url (falls back to gitea_url) for clickable links, mirroring
+    the analyst doc links. Returns [] (never raises) when there is nothing to
+    link or the required context is missing. analyst is intentionally NOT handled
+    here — its richer doc list lives in stage_engine._build_analyst_ready_comment.
+    """
+    try:
+        from .config import settings
+        owner = getattr(settings, "gitea_owner", "admin")
+        base = (
+            getattr(settings, "gitea_public_url", "") or getattr(settings, "gitea_url", "")
+        ).rstrip("/")
+        if not base or not repo:
+            return []
+        links: list[str] = []
+
+        if agent == "developer":
+            if branch:
+                links.append(
+                    f"\U0001f4c2 [Branch {branch}]({base}/{owner}/{repo}/src/branch/{branch})"
+                )
+            if pr_number:
+                links.append(
+                    f"\U0001f517 [PR #{pr_number}]({base}/{owner}/{repo}/pulls/{pr_number})"
+                )
+            return links
+
+        if agent == "architect":
+            if branch and work_item_id:
+                adr_dir = (
+                    f"{base}/{owner}/{repo}/src/branch/{branch}/"
+                    f"docs/work-items/{work_item_id}/06-adr"
+                )
+                links.append(f"\U0001f4d0 [ADR]({adr_dir})")
+            return links
+
+        spec = AGENT_ARTIFACT.get(agent)
+        if spec and branch and work_item_id:
+            label, fname = spec
+            href = (
+                f"{base}/{owner}/{repo}/src/branch/{branch}/"
+                f"docs/work-items/{work_item_id}/{fname}"
+            )
+            links.append(f"\U0001f4c4 [{label}]({href})")
+        return links
+    except Exception:
+        return []
+
+
+AGENT_ICON = {
+    "analyst": "\U0001f50d",
+    "architect": "\U0001f4d0",
+    "developer": "\U0001f4bb",
+    "reviewer": "\U0001f50e",
+    "tester": "\U0001f9ea",
+    "deployer": "\U0001f680",
+}
+
+
+def task_usage_summary(task_id: int) -> dict:
+    """Aggregate agent_runs usage for a task.
+
+    total_in counts the FULL input (input + cache_read + cache_creation), and
+    total_cached counts the cached portion (cache_read + cache_creation).
+    COALESCE(...,0) keeps pre-existing rows (NULL cache_creation) from breaking.
+
+    Returns {total_in, total_cached, total_out, total_cost,
+             per_agent: [(agent, in, cached, out, cost), ...]}.
+    """
+    conn = get_db()
+    try:
+        rows = conn.execute(
+            "SELECT agent, "
+            "COALESCE(SUM(input_tokens),0) "
+            "  + COALESCE(SUM(cache_read_tokens),0) "
+            "  + COALESCE(SUM(cache_creation_tokens),0), "
+            "COALESCE(SUM(cache_read_tokens),0) "
+            "  + COALESCE(SUM(cache_creation_tokens),0), "
+            "COALESCE(SUM(output_tokens),0), "
+            "COALESCE(SUM(cost_usd),0.0) "
+            "FROM agent_runs WHERE task_id=? GROUP BY agent ORDER BY agent",
+            (task_id,),
+        ).fetchall()
+    finally:
+        conn.close()
+    per_agent = [(r[0], int(r[1]), int(r[2]), int(r[3]), float(r[4])) for r in rows]
+    total_in = sum(r[1] for r in per_agent)
+    total_cached = sum(r[2] for r in per_agent)
+    total_out = sum(r[3] for r in per_agent)
+    total_cost = sum(r[4] for r in per_agent)
+    return {
+        "total_in": total_in,
+        "total_cached": total_cached,
+        "total_out": total_out,
+        "total_cost": total_cost,
+        "per_agent": per_agent,
+    }
+
+
+def task_summary_comment(task_id: int) -> str:
+    """Build the Deployer end-of-task summary comment (Feature 4, variant B)."""
+    s = task_usage_summary(task_id)
+    cached = s.get("total_cached", 0)
+    head_in = (
+        f"{fmt_tokens(s['total_in'])} \u0432\u0445\u043e\u0434 ({fmt_tokens(cached)} cached)"
+        if cached > 0
+        else f"{fmt_tokens(s['total_in'])} \u0432\u0445\u043e\u0434"
+    )
+    lines = [
+        f"\U0001f4ca \u0418\u0442\u043e\u0433\u043e \u043f\u043e \u0437\u0430\u0434\u0430\u0447\u0435: "
+        f"{head_in} / "
+        f"{fmt_tokens(s['total_out'])} \u0432\u044b\u0445\u043e\u0434 \u00b7 "
+        f"{fmt_cost(s['total_cost'])}"
+    ]
+    for agent, ti, tc, to, cost in s["per_agent"]:
+        name = AGENT_DISPLAY.get(agent, agent.capitalize())
+        in_str = (
+            f"{fmt_tokens(ti)} in ({fmt_tokens(tc)} cached)"
+            if tc > 0
+            else f"{fmt_tokens(ti)} in"
+        )
+        lines.append(
+            f"\u2022 {name}: {in_str} / {fmt_tokens(to)} out \u00b7 {fmt_cost(cost)}"
+        )
+    return "\n".join(lines)
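Reviewer note: `_extract_last_json_object` walks the log backwards with a hand-written depth counter. One possible alternative, shown here only as a sketch and not as what this PR implements, is a forward scan with the standard library's `json.JSONDecoder.raw_decode`, which handles strings and escapes itself. The function name `last_json_object` is hypothetical.

```python
import json


def last_json_object(text: str):
    """Return the last top-level JSON object found in `text`, or None.

    Tries to decode at every '{'. After a successful decode the scan jumps
    past the decoded object, so nested objects are not reported separately.
    """
    decoder = json.JSONDecoder()
    found = None
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            found = obj
        pos = text.find("{", end)
    return found


log = 'starting run\n{"step": 1}\nnoise {oops}\n{"total_cost_usd": 0.21}\ndone'
result = last_json_object(log)
```

The trade-off is that a forward scan reads the whole log, while the backward scan in the PR can stop early when the result object sits at the end, which is the expected case.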
--- a/src/webhooks/_dedup.py
+++ b/src/webhooks/_dedup.py
@@ -0,0 +1,52 @@
+"""ORCH-5 (M-7): webhook delivery de-duplication helper.
+
+Webhook providers (Gitea/Plane) retry deliveries on timeout, network reset, or
+manual replay. Without idempotency a retried delivery re-enters the pipeline and
+spawns a duplicate run (the ET-009 incident class: parallel conveyors on one
+repo). This module computes a stable per-delivery id so the webhook handlers can
+INSERT-OR-IGNORE into events and skip the dispatch on a repeat.
+
+delivery_id format: ``f"{source}:{raw_or_hash}"`` where source prefixes
+gitea/plane so their id-spaces never collide. ``raw`` is the provider's native
+delivery header (a GUID) when present; otherwise we fall back to a sha256 of the
+body (a retried identical body yields the same hash).
+"""
+
+import hashlib
+
+
+def _sha256_hex(*parts: str) -> str:
+    h = hashlib.sha256()
+    for p in parts:
+        h.update(p.encode("utf-8", "replace"))
+    return h.hexdigest()
+
+
+def gitea_delivery_id(headers, event_type: str, body: bytes) -> str:
+    """Compute the delivery_id for a Gitea webhook.
+
+    Prefers the ``X-Gitea-Delivery`` header (a per-delivery GUID). Falls back to
+    sha256(source + event_type + body) so a retried identical body still maps to
+    one id even if Gitea omitted the header.
+    """
+    raw = (headers.get("X-Gitea-Delivery") or "").strip()
+    if not raw:
+        raw = _sha256_hex("gitea", event_type or "", body.decode("utf-8", "replace"))
+    return f"gitea:{raw}"
+
+
+def plane_delivery_id(headers, body: bytes) -> str:
+    """Compute the delivery_id for a Plane webhook.
+
+    Plane does not reliably send a delivery header, so we try a couple of common
+    names and otherwise fall back to sha256("plane" + body): a retried identical
+    body yields the same id.
+    """
+    raw = (
+        headers.get("X-Plane-Delivery")
+        or headers.get("X-Hook-Delivery")
+        or ""
+    ).strip()
+    if not raw:
+        raw = _sha256_hex("plane", body.decode("utf-8", "replace"))
+    return f"plane:{raw}"
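Reviewer note: the module docstring says handlers INSERT-OR-IGNORE the delivery id into `events` and skip dispatch on a repeat. The body of `insert_event_dedup` is not part of this section, so the following is only a sketch of that idea under an assumed schema (a `UNIQUE` constraint on `delivery_id`); `make_db` and `insert_event_once` are hypothetical names.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Assumed schema: the real events table is not shown in this diff."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        " id INTEGER PRIMARY KEY,"
        " source TEXT, event_type TEXT, payload TEXT,"
        " delivery_id TEXT UNIQUE)"
    )
    return conn


def insert_event_once(conn, source, event_type, payload, delivery_id) -> bool:
    """Insert the event; return True only if this delivery_id was new."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO events (source, event_type, payload, delivery_id)"
        " VALUES (?, ?, ?, ?)",
        (source, event_type, payload, delivery_id),
    )
    conn.commit()
    # rowcount is 1 when the row was written and 0 when the UNIQUE
    # constraint made SQLite ignore the insert.
    return cur.rowcount == 1
```

A handler would call this once per delivery and return early with a "duplicate" status when it yields `False`, which is the behaviour the gitea.py hunk below describes.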
--- a/src/webhooks/gitea.py
+++ b/src/webhooks/gitea.py
@@ -10,7 +10,14 @@ import httpx
 from fastapi import APIRouter, Request, HTTPException

 from ..config import settings
-from ..db import get_db, get_task_by_repo_branch, update_task_stage
+from ..db import (
+    get_db,
+    get_task_by_repo_branch,
+    update_task_stage,
+    enqueue_job,
+    insert_event_dedup,
+)
+from ._dedup import gitea_delivery_id
 from ..stages import get_next_stage, get_agent_for_stage
 from ..qg.checks import check_ci_green, check_review_approved
 from ..notifications import notify_stage_change, notify_qg_failure, notify_error
@@ -51,15 +58,17 @@ async def gitea_webhook(request: Request):

    payload = json.loads(body)

-    # Log event
-    conn = get_db()
+    # ORCH-5 (M-7): idempotent logging. Compute a stable delivery_id (X-Gitea-Delivery
+    # GUID, or sha256 fallback) and INSERT OR IGNORE. A repeated delivery (Gitea retry
+    # / manual replay) returns inserted=False -> log + return {"status":"duplicate"}
+    # WITHOUT re-dispatching, so the pipeline is not re-triggered (ET-009 class).
+    # Runs AFTER HMAC verification above.
    event_type = request.headers.get("X-Gitea-Event", "unknown")
-    conn.execute(
-        "INSERT INTO events (source, event_type, payload) VALUES (?, ?, ?)",
-        ("gitea", event_type, body.decode()),
-    )
-    conn.commit()
-    conn.close()
+    delivery_id = gitea_delivery_id(request.headers, event_type, body)
+    inserted = insert_event_dedup("gitea", event_type, body.decode(), delivery_id)
+    if not inserted:
+        logger.info(f"Gitea webhook duplicate delivery_id={delivery_id}, skipping dispatch")
+        return {"status": "duplicate"}

    if event_type == "push":
        await handle_push(payload)
@@ -123,8 +132,8 @@ async def handle_push(payload: dict):
            if agent:
                try:
                    task_desc = f"Work item: {work_item_id}\nRepo: {repo_name}\nBranch: {branch}\nStage: {next_stage}"
-                    run_id = launcher.launch(agent, repo_name, task_desc, task_id=task_id)
-                    logger.info(f"Task {task_id}: push triggered {current_stage} → {next_stage}, launched '{agent}' (run_id={run_id})")
+                    job_id = enqueue_job(agent, repo_name, task_desc, task_id=task_id)
+                    logger.info(f"Task {task_id}: push triggered {current_stage} → {next_stage}, enqueued '{agent}' (job_id={job_id})")
                except Exception as e:
                    notify_error(task_id, f"Failed to launch agent '{agent}': {e}")

@@ -200,19 +209,38 @@ async def handle_ci_status(payload: dict):
            if agent:
                try:
                    task_desc = f"Work item: {work_item_id}\nRepo: {repo_name}\nBranch: {branch}\nStage: {next_stage}"
-                    run_id = launcher.launch(agent, repo_name, task_desc, task_id=task_id)
-                    logger.info(f"Task {task_id}: CI green → {next_stage}, launched '{agent}' (run_id={run_id})")
+                    job_id = enqueue_job(agent, repo_name, task_desc, task_id=task_id)
+                    logger.info(f"Task {task_id}: CI green → {next_stage}, enqueued '{agent}' (job_id={job_id})")
                except Exception as e:
                    notify_error(task_id, f"Failed to launch agent '{agent}': {e}")
        else:
            notify_qg_failure(task_id, current_stage, "check_ci_green", reason)

-    elif state == "failure":
-        # S-1: Gitea CI is NOT the authoritative gate anymore (the orchestrator runs
-        # tests locally via check_tests_local). Gitea CI is often unconfigured, so a
-        # "failure"/empty status here is not actionable. Log only, do not alert.
-        logger.debug(f"Task {task_id}: Gitea CI state='failure' on branch '{branch}' "
-                     f"(non-authoritative, suppressed — local tests are the gate)")
+    elif state == "failure" and current_stage == "development":
+        # CI is the authoritative gate for development -> review.
+        # On red CI: notify, then bounce the task back to the developer (capped retries),
+        # symmetric to the review REQUEST_CHANGES path.
+        notify_qg_failure(task_id, current_stage, "check_ci_green", f"Gitea CI failed on branch '{branch}'")
+        conn = get_db()
+        retry_count = conn.execute(
+            "SELECT COUNT(*) as cnt FROM agent_runs WHERE task_id = ? AND agent = 'developer'",
+            (task_id,),
+        ).fetchone()["cnt"]
+        conn.close()
+        if retry_count < MAX_DEV_RETRIES:
+            # task already on 'development' — no stage change needed, just relaunch developer
+            try:
+                task_desc = (
+                    f"Work item: {work_item_id}\nRepo: {repo_name}\nBranch: {branch}\n"
+                    f"Stage: development\nNote: CI failed, fix and re-push (attempt {retry_count + 1}/{MAX_DEV_RETRIES})"
+                )
+                job_id = enqueue_job("developer", repo_name, task_desc, task_id=task_id)
+                logger.info(f"Task {task_id}: CI failed, enqueued developer (attempt {retry_count + 1}, job_id={job_id})")
+            except Exception as e:
+                notify_error(task_id, f"Failed to relaunch developer after CI failure: {e}")
+        else:
+            notify_error(task_id, f"Max developer retries ({MAX_DEV_RETRIES}) reached after CI failure, escalating")
+            logger.error(f"Task {task_id}: max retries reached after CI failure, needs manual intervention")


 async def handle_pr(payload: dict):
@@ -272,8 +300,8 @@ async def handle_pr(payload: dict):
                if agent:
                    try:
                        task_desc = f"Work item: {work_item_id}\nRepo: {repo_name}\nBranch: {head_branch}\nStage: {next_stage}"
-                        run_id = launcher.launch(agent, repo_name, task_desc, task_id=task_id)
-                        logger.info(f"Task {task_id}: PR approved → {next_stage}, launched '{agent}' (run_id={run_id})")
+                        job_id = enqueue_job(agent, repo_name, task_desc, task_id=task_id)
+                        logger.info(f"Task {task_id}: PR approved → {next_stage}, enqueued '{agent}' (job_id={job_id})")
                    except Exception as e:
                        notify_error(task_id, f"Failed to launch agent '{agent}': {e}")
            else:
@@ -297,8 +325,8 @@ async def handle_pr(payload: dict):
                        f"Work item: {work_item_id}\nRepo: {repo_name}\nBranch: {head_branch}\n"
                        f"Stage: development\nNote: Changes requested in review (attempt {retry_count + 1}/{MAX_DEV_RETRIES})"
                    )
-                    run_id = launcher.launch("developer", repo_name, task_desc, task_id=task_id)
-                    logger.info(f"Task {task_id}: changes requested, relaunching developer (attempt {retry_count + 1})")
+                    job_id = enqueue_job("developer", repo_name, task_desc, task_id=task_id)
+                    logger.info(f"Task {task_id}: changes requested, enqueued developer (attempt {retry_count + 1}, job_id={job_id})")
                except Exception as e:
                    notify_error(task_id, f"Failed to relaunch developer: {e}")
            else:
@@ -306,6 +334,20 @@ async def handle_pr(payload: dict):
                logger.error(f"Task {task_id}: max retries reached, needs manual intervention")

    elif action == "closed" and pr.get("merged", False):
+        # BUG 8 (second door): at the deploy stage `done` is gated by the
+        # deployer's verdict (check_deploy_status via advance_stage), NOT by the
+        # fact that the PR was merged. The deployer merges the PR at the START of
+        # its run, so a merged webhook arrives ~30s later while the deployer is
+        # still working — blindly setting done here would fake-complete the task
+        # and discard a later deploy_status: FAILED verdict. advance_stage will
+        # drive deploy→done (and Plane→Done) when the deployer job finishes.
+        # For every OTHER stage the merge-driven done behaviour is preserved.
+        if current_stage == "deploy":
+            logger.info(
+                f"Task {task_id}: PR merged at deploy stage — done gated by "
+                f"deployer verdict (check_deploy_status), ignoring merge-driven done."
+            )
+            return
        update_task_stage(task_id, "done")
        notify_stage_change(task_id, current_stage, "done")
        logger.info(f"Task {task_id}: PR merged, stage → done")
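
Review note on the ORCH-5 dedup path used by the Gitea handler above: the diff calls `gitea_delivery_id(...)` and `insert_event_dedup(...)` but does not show their bodies. The sketch below is one way such an idempotent insert could work, assuming the `events` table has a `delivery_id` column with a UNIQUE constraint. The names `make_events_table`, `fallback_delivery_id` and `insert_event_once` are hypothetical stand-ins, not the helpers in `src/db.py` or `src/webhooks/_dedup.py`, and the hash layout is an assumption.

```python
import hashlib
import sqlite3


def make_events_table(conn: sqlite3.Connection) -> None:
    # Assumed schema: the UNIQUE constraint on delivery_id is what makes
    # INSERT OR IGNORE skip a repeated delivery.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY, source TEXT, event_type TEXT, "
        "payload TEXT, delivery_id TEXT UNIQUE)"
    )


def fallback_delivery_id(source: str, body: bytes) -> str:
    # Hypothetical fallback when no delivery header is present:
    # an identical retried body maps to the same id.
    return hashlib.sha256(source.encode() + body).hexdigest()


def insert_event_once(conn, source, event_type, payload, delivery_id) -> bool:
    # Returns True when the row was inserted, False when the
    # delivery_id was already recorded (duplicate delivery).
    cur = conn.execute(
        "INSERT OR IGNORE INTO events (source, event_type, payload, delivery_id) "
        "VALUES (?, ?, ?, ?)",
        (source, event_type, payload, delivery_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

With this shape the handler can return `{"status": "duplicate"}` as soon as the insert reports `False`, so the dedup check and the event log share one statement and there is no window between "check" and "insert".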
--- a/src/webhooks/plane.py
+++ b/src/webhooks/plane.py
@@ -13,8 +13,12 @@ from ..db import (
    get_db,
    get_task_by_plane_id,
    get_next_work_item_id,
+    ensure_unique_work_item_id,
    update_task_stage,
+    enqueue_job,
+    insert_event_dedup,
 )
+from ._dedup import plane_delivery_id
 from ..stages import get_next_stage, get_agent_for_stage, get_qg_for_stage, get_previous_stage
 from ..qg.checks import QG_CHECKS
 from ..notifications import notify_stage_change, notify_qg_failure, notify_error
@@ -60,14 +64,18 @@ async def plane_webhook(request: Request):

    payload = json.loads(body)

-    # Log event
-    conn = get_db()
-    conn.execute(
-        "INSERT INTO events (source, event_type, payload) VALUES (?, ?, ?)",
-        ("plane", payload.get("event", "unknown"), body.decode()),
-    )
-    conn.commit()
-    conn.close()
+    # ORCH-5 (M-7): idempotent logging. Plane rarely sends a delivery header, so the
+    # delivery_id falls back to sha256("plane" + body) (a retried identical body maps
+    # to one id). INSERT OR IGNORE; a duplicate returns inserted=False -> log + return
+    # {"status":"duplicate"} WITHOUT dispatching. Runs AFTER HMAC and BEFORE the ORCH-6
+    # project filter, so a repeat does no extra work; the FIRST delivery of an unknown
+    # project still falls through to the filter below and returns {"status":"ignored"}.
+    event_type = payload.get("event", "unknown")
+    delivery_id = plane_delivery_id(request.headers, body)
+    inserted = insert_event_dedup("plane", event_type, body.decode(), delivery_id)
+    if not inserted:
+        logger.info(f"Plane webhook duplicate delivery_id={delivery_id}, skipping dispatch")
+        return {"status": "duplicate"}

    event = payload.get("event")
    action = payload.get("action", "")
@@ -85,38 +93,264 @@ async def plane_webhook(request: Request):
        return {"status": "ignored", "reason": "unknown project"}

    if (event == "work_item.created") or (event == "issue" and action == "created"):
+        # Feature 1: creation NO LONGER starts the pipeline. Slava keeps the
+        # backlog until he moves an issue to In Progress. We only run a soft
+        # QG-0 sanity log here (no branch, no analyst, no task row).
        await handle_work_item_created(data, project_id)
+    elif (event == "work_item.updated") or (event == "issue" and action == "updated"):
+        # Status-only verdict model: status changes drive the pipeline.
+        #   Backlog/Todo/Triage -> In Progress : START pipeline, or relaunch the
+        #                                        stage agent if returned from
+        #                                        Needs Input.
+        #   -> Approved                         : advance to the next stage.
+        #   -> Rejected                         : rollback (reason from latest comment).
+        await handle_issue_updated(data, project_id)
    elif (event == "comment.created") or (event == "issue_comment" and action == "created"):
        await handle_comment(data, project_id)

    return {"status": "accepted"}


-async def handle_work_item_created(data: dict, project_id: str = ""):
+def _state_id(data: dict) -> str:
+    """Extract the new Plane state UUID from an 'issue updated' payload.
+
+    Real payload (verified from prod events): data.state is
+    {id, name, color, group}. Some payloads carry state as a bare UUID string.
    """
-    New work item created in Plane.
-    QG-0: validate title, description, priority.
-    If valid: create branch, init docs, launch analyst.
-    If invalid: comment with what's missing, set Blocked.
+    state = data.get("state")
+    if isinstance(state, dict):
+        return state.get("id", "") or ""
+    if isinstance(state, str):
+        return state
+    return ""
+
+
+async def handle_issue_updated(data: dict, project_id: str = ""):
+    """Feature 1 & 2: react to a Plane issue status change.
+
+    Routes the NEW state UUID (data.state.id) to:
+      - in_progress  : start the pipeline if this issue has no task yet; if a
+        task already exists and the stage agent is idle (returned from Needs
+        Input), relaunch the stage agent so it reads Slava's fresh comments.
+      - approved     : advance to the next stage.
+      - rejected     : rollback to the previous stage (reason from latest comment).
+    Any other status (Needs Input, In Review, Blocked, Done, board stages, etc.)
+    is ignored here — those are statuses the orchestrator itself sets.
+    """
+    from ..plane_sync import PLANE_STATES
+
+    plane_id = str(data.get("id") or "")
+    new_state = _state_id(data)
+    if not plane_id or not new_state:
+        logger.info("issue updated without id/state, ignoring")
+        return
+
+    if new_state == PLANE_STATES["in_progress"]:
+        await handle_status_start(data, project_id)
+    elif new_state == PLANE_STATES["approved"]:
+        await handle_verdict(data, project_id, approved=True)
+    elif new_state == PLANE_STATES["rejected"]:
+        await handle_verdict(data, project_id, approved=False)
+    else:
+        logger.info(f"issue {plane_id} updated to state {new_state[:8]}..., no pipeline action")
+
+
+async def handle_status_start(data: dict, project_id: str = ""):
+    """An issue moved into In Progress.
+
+    Two cases under the status-only verdict model:
+
+      1. No task yet for this plane_id  -> START the pipeline (start_pipeline).
+
+      2. A task already exists          -> this is Slava returning the issue from
+         Needs Input to In Progress after answering the analyst's questions. We
+         must RELAUNCH the current stage's agent so it reads the fresh comments
+         from Plane (the answer-to-questions flow used to live in handle_comment;
+         it is now status-driven).
+
+    KEY FORK — telling "answer to questions" apart from a plain duplicate In
+    Progress webhook (the dedup-protection case):
+
+      The tasks table stores no Plane status, and the issue.updated payload only
+      carries the NEW state (In Progress), so we cannot read the previous status
+      from here. Instead we use the only reliable local signal: whether the
+      stage's agent is currently in flight.
+
+      - The orchestrator sets In Progress itself while an agent runs. When the
+        agent FINISHES it leaves the issue in Needs Input or In Review and has
+        NO queued/running job. So: an existing task with NO active job means the
+        agent is idle / waiting -> a return to In Progress is a genuine relaunch
+        request -> enqueue the stage agent.
+      - If a queued/running job already exists for the task, the agent is busy
+        (or a duplicate webhook arrived) -> SKIP (no double launch). The events
+        de-dup at the top of plane_webhook already absorbs identical webhook
+        bodies; this job guard additionally covers distinct webhooks fired while
+        a job is still pending/running.
+    """
+    from ..db import has_active_job_for_task
+
+    plane_id = str(data.get("id") or "")
+    existing = get_task_by_plane_id(plane_id)
+
+    if not existing:
+        logger.info(f"Status->In Progress for {plane_id}: starting pipeline")
+        await start_pipeline(data, project_id)
+        return
+
+    task_id = existing["id"]
+    current_stage = existing["stage"]
+    repo = existing["repo"]
+    work_item_id = existing.get("work_item_id", "")
+    branch = existing.get("branch", "")
+
+    # Duplicate / busy guard: a job is already pending or running for this task.
+    if has_active_job_for_task(task_id):
+        logger.info(
+            f"Status->In Progress for {plane_id}: task {task_id} already has an "
+            f"active job (stage={current_stage}), not relaunching"
+        )
+        return
+
+    # Agent is idle -> Slava answered questions and returned the issue to In
+    # Progress. Relaunch the current stage's agent to read the fresh comments.
+    from ..plane_sync import STAGE_AUTHORS, add_comment as _add_comment
+    stage_agent = STAGE_AUTHORS.get(current_stage)
+    if not stage_agent:
+        logger.info(
+            f"Status->In Progress for {plane_id}: no agent for stage "
+            f"'{current_stage}', not relaunching"
+        )
+        return
+
+    task_desc = (
+        f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
+        f"Stage: {current_stage}\nNote: Stakeholder returned the issue to In "
+        f"Progress (answered your questions). Read the latest comments in Plane "
+        f"and revise your artifacts."
+    )
+    job_id = enqueue_job(stage_agent, repo, task_desc, task_id=task_id)
+    logger.info(
+        f"Task {task_id}: returned to In Progress (Needs Input answered), "
+        f"relaunched {stage_agent} for stage {current_stage} (job_id={job_id})"
+    )
+    try:
+        _add_comment(
+            work_item_id,
+            "\U0001f504 \u0410\u0433\u0435\u043d\u0442 \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d \u0441 \u043e\u0442\u0432\u0435\u0442\u0430\u043c\u0438 \u0441\u0442\u0435\u0439\u043a\u0445\u043e\u043b\u0434\u0435\u0440\u0430.",
+            author=stage_agent,
+        )
+    except Exception as e:
+        logger.error(f"Failed to post relaunch comment for {work_item_id}: {e}")
+
+
+async def handle_verdict(data: dict, project_id: str, approved: bool):
+    """Status-only verdict: a Plane status change drives advance / rollback.
+
+    Approved status -> _try_advance_stage. We do NOT touch the issue status here:
+    _try_advance_stage -> advance_stage -> plane_notify_stage already PATCHes the
+    issue to the NEXT stage's status. The old set_issue_in_progress call reset
+    the status to In Progress first, which made the board flicker In Progress
+    before the next stage (part of bug 3); it is removed.
+
+    Rejected status -> rollback to the previous stage. The reason is pulled from
+    the issue's latest comment (Slava writes the reason in a comment before/with
+    flipping the status to Rejected).
+    """
+    plane_id = str(data.get("id") or "")
+    task = get_task_by_plane_id(plane_id)
+    if not task:
+        logger.warning(f"Verdict status for {plane_id} but no task found, ignoring")
+        return
+
+    task_id = task["id"]
+    current_stage = task["stage"]
+    repo = task["repo"]
+    work_item_id = task.get("work_item_id", "")
+    branch = task.get("branch", "")
+
+    if approved:
+        # NOTE: no set_issue_in_progress here — _try_advance_stage sets the next
+        # stage's status itself (advance_stage -> plane_notify_stage).
+        logger.info(f"Task {task_id}: Approved status -> advance from {current_stage}")
+        await _try_advance_stage(task_id, current_stage, repo, work_item_id, branch)
+        return
+
+    # Rejected: pull the rejection reason from the issue's latest comment.
+    issue_id = task.get("plane_issue_id") or task.get("plane_id") or plane_id
+    reason = _latest_comment_reason(issue_id, repo, project_id)
+    await _rollback_stage(
+        task_id, current_stage, repo, work_item_id, branch, reason
+    )
+
+
+def _latest_comment_reason(issue_id: str, repo: str, project_id: str = "") -> str:
+    """Fetch the issue's most recent comment text (HTML stripped) as the reject
+    reason. Slava writes the reason in a comment before/with flipping the status
+    to Rejected.
+
+    Returns a fixed fallback when there is no comment / the API call fails.
+    """
+    from ..plane_sync import (
+        PLANE_BASE,
+        PLANE_HEADERS,
+        WORKSPACE,
+        PROJECT_ID as _DEFAULT_PROJECT_ID,
+    )
+    fallback = "Rejected via status, no reason comment"
+    if not issue_id:
+        return fallback
+    _proj = get_project_by_repo(repo)
+    pid = _proj.plane_project_id if _proj else (project_id or _DEFAULT_PROJECT_ID)
+    url = (
+        f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{pid}/issues/"
+        f"{issue_id}/comments/"
+    )
+    try:
+        resp = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
+        if resp.status_code != 200:
+            logger.warning(
+                f"reject-reason: GET comments for {issue_id} returned "
+                f"{resp.status_code}"
+            )
+            return fallback
+        payload = resp.json()
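+        # A dict without "results" has no comment list; treat it as empty so we
+        # return the fallback instead of iterating over the dict's keys.
+        comments = payload.get("results", []) if isinstance(payload, dict) else payload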
+        if not comments:
+            return fallback
+        latest = max(comments, key=lambda c: c.get("created_at", "") or "")
+        raw = (
+            latest.get("comment_stripped")
+            or latest.get("comment_html")
+            or latest.get("comment")
+            or ""
+        )
+        text = re.sub(r"<[^>]+>", "", raw).strip()
+        return text[:300] if text else fallback
+    except Exception as e:
+        logger.error(f"reject-reason: failed to fetch comments for {issue_id}: {e}")
+        return fallback
+
+
+async def handle_work_item_created(data: dict, project_id: str = ""):
+    """Feature 1: creation does NOT start the pipeline anymore.
+
+    The pipeline is started when Slava moves the issue into In Progress
+    (handle_status_start -> start_pipeline). On creation we only run a SOFT QG-0
+    sanity check and log the result — NO branch, NO docs, NO analyst, NO task row
+    — so the issue can sit in the backlog until Slava is ready.
    """
    plane_id = data.get("id", "")
    name = data.get("name", "untitled")
    description = data.get("description_stripped", data.get("description", ""))
-    priority = data.get("priority", {})
-    priority_name = priority if isinstance(priority, str) else priority.get("name", "")
+    errors = _qg0_errors(name, description)
+    if errors:
+        logger.info(f"work_item.created {plane_id}: soft QG-0 warnings: {errors}")
+    else:
+        logger.info(f"work_item.created {plane_id} ('{name}'): in backlog, awaiting In Progress")

-    # ORCH-6: resolve repo / prefix / Plane project from the registry instead of
-    # the single hardcoded default_repo.
-    if not project_id:
-        project_id = data.get("project") or data.get("project_id") or ""
-    proj = get_project_by_plane_id(project_id)
-    if not proj:
-        logger.warning(f"handle_work_item_created: unknown project '{project_id}', ignoring {plane_id}")
-        return
-    repo = proj.repo
-    plane_project_id = proj.plane_project_id

-    # QG-0 validation
+def _qg0_errors(name: str, description: str) -> list:
+    """QG-0 validation: returns a list of human-readable problems (empty = OK)."""
    errors = []
    if not name or len(name) < 5:
        errors.append("Title \u0441\u043b\u0438\u0448\u043a\u043e\u043c \u043a\u043e\u0440\u043e\u0442\u043a\u0438\u0439 (\u043d\u0443\u0436\u043d\u043e >= 5 \u0441\u0438\u043c\u0432\u043e\u043b\u043e\u0432)")
@@ -125,6 +359,66 @@ async def handle_work_item_created(data: dict, project_id: str = ""):
    if not description or len(description.strip()) < 20:
        errors.append("Description \u0441\u043b\u0438\u0448\u043a\u043e\u043c \u043a\u043e\u0440\u043e\u0442\u043a\u0438\u0439 (\u043d\u0443\u0436\u043d\u043e >= 20 \u0441\u0438\u043c\u0432\u043e\u043b\u043e\u0432)")

+    return errors
+
+
+async def start_pipeline(data: dict, project_id: str = ""):
+    """Feature 1: start the pipeline for an issue (moved to In Progress).
+
+    This is the body extracted from the old handle_work_item_created: resolve the
+    project, run QG-0 (hard — blocks on failure), create the work item id +
+    branch + initial docs, insert the task row, and enqueue the analyst.
+
+    Callers (handle_status_start) already guarantee no existing task for this
+    plane_id, so this never duplicates.
+    """
+    plane_id = data.get("id", "")
+    name = data.get("name", "untitled")
+    description = data.get("description_stripped", data.get("description", ""))
+
+    # ORCH-6: resolve repo / prefix / Plane project from the registry instead of
+    # the single hardcoded default_repo.
+    if not project_id:
+        project_id = data.get("project") or data.get("project_id") or ""
+    proj = get_project_by_plane_id(project_id)
+    if not proj:
+        logger.warning(f"start_pipeline: unknown project '{project_id}', ignoring {plane_id}")
+        return
+    repo = proj.repo
+    plane_project_id = proj.plane_project_id
+
+    # BUG 1 + BUG B: Plane's issue.updated webhook (status change -> In Progress)
+    # sends only the CHANGED fields, so BOTH description / description_stripped
+    # AND name are usually empty here even though the issue HAS them. Pull the
+    # full title + description from the Plane issue detail API in a SINGLE GET
+    # (fetch_issue_fields: same endpoint + shared token already used by
+    # fetch_issue_sequence_id) before QG-0 and before the branch slug is built.
+    # If the API is also empty, QG-0 legitimately fails (truly empty ticket) and
+    # name falls back to "untitled".
+    name_missing = (not name) or name.strip().lower() == "untitled" or len(name.strip()) < 3
+    desc_missing = (not description) or len(description.strip()) < 20
+    if name_missing or desc_missing:
+        from ..plane_sync import fetch_issue_fields
+        fetched_name, fetched_desc = fetch_issue_fields(plane_id, plane_project_id)
+        # description may be None when the payload carries a null field; guard before strip().
+        if desc_missing and fetched_desc and len(fetched_desc.strip()) >= len((description or "").strip()):
+            description = fetched_desc
+            logger.info(
+                f"start_pipeline: pulled description from Plane API for {plane_id} "
+                f"({len(description.strip())} chars)"
+            )
+        if name_missing and fetched_name and len(fetched_name.strip()) >= 3:
+            name = fetched_name
+            logger.info(
+                f"start_pipeline: pulled name from Plane API for {plane_id} "
+                f"('{name}')"
+            )
+    # BUG B fallback: if name is still empty/blank after the API pull, keep the
+    # legacy "untitled" so the slug/branch build never crashes on an empty name.
+    if not name or not name.strip():
+        name = "untitled"
+
+    # QG-0 validation (hard gate on pipeline start)
+    errors = _qg0_errors(name, description)
    if errors:
        # QG-0 failed
        error_text = "\u26a0\ufe0f QG-0 failed:\n" + "\n".join(f"\u2022 {e}" for e in errors)
@@ -147,18 +441,62 @@ async def handle_work_item_created(data: dict, project_id: str = ""):
        logger.info(f"QG-0 failed for {plane_id}: {errors}")
        return

-    # Generate work item ID
-    work_item_id = get_next_work_item_id(repo, proj.work_item_prefix)
+    # Generate work item ID.
+    # M-6: source of truth for the number is the Plane sequence_id. Fetch it by
+    # issue UUID; if Plane is unavailable, fall back to the DB increment so a
+    # Plane outage never blocks task creation (autonomy > exact numbering).
+    from ..plane_sync import fetch_issue_sequence_id
+    seq = fetch_issue_sequence_id(plane_id, plane_project_id)
+    if seq is not None:
+        work_item_id = f"{proj.work_item_prefix}-{seq:03d}"
+    else:
+        work_item_id = get_next_work_item_id(repo, proj.work_item_prefix)
+        logger.warning(
+            f"Plane sequence_id unavailable for {plane_id}, "
+            f"fell back to DB increment: {work_item_id}"
+        )
+
+    # BUG 2a: uniqueness-guard LAYERED ON TOP of the M-6 derive above (the derive
+    # itself is untouched). If the derived ET-NNN is already taken by another
+    # task in this repo (collision -> two tasks would share branch/worktree, see
+    # ET-006), bump to the next free number.
+    _derived = work_item_id
+    work_item_id = ensure_unique_work_item_id(work_item_id, repo)
+    if work_item_id != _derived:
+        logger.warning(
+            f"work_item_id collision: derived {_derived} already in use for "
+            f"{repo}; reassigned {plane_id} -> {work_item_id}"
+        )

    # Create slug from name
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")[:30]
    branch = f"feature/{work_item_id}-{slug}"

+    # BUG 2b (defense-in-depth): the worktree/path is keyed by BRANCH
+    # (git_worktree.get_worktree_path) and tasks are reverse-resolved by
+    # (repo, branch). With 2a the work_item_id is unique, so the branch prefix is
+    # too; two issues with the same title under different ids share a slug
+    # (harmless), but an identical branch could, worse, already exist.
+    # Guard physically: if this exact branch is already owned by another task in
+    # this repo, disambiguate with the (now unique) work_item_id so two tasks can
+    # never share a worktree.
+    _conn_b = get_db()
+    _branch_taken = _conn_b.execute(
+        "SELECT 1 FROM tasks WHERE repo = ? AND branch = ? LIMIT 1", (repo, branch)
+    ).fetchone()
+    _conn_b.close()
+    if _branch_taken is not None:
+        branch = f"feature/{work_item_id}-{plane_id[:8]}"
+        logger.warning(
+            f"branch collision for {repo}; disambiguated to unique branch {branch}"
+        )
+
    # Insert task into DB
    conn = get_db()
    conn.execute(
-        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) VALUES (?, ?, ?, ?, ?, ?)",
-        (plane_id, work_item_id, repo, branch, "analysis", plane_id),
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id, title) "
+        "VALUES (?, ?, ?, ?, ?, ?, ?)",
+        (plane_id, work_item_id, repo, branch, "analysis", plane_id, name),
    )
    conn.commit()
    conn.close()
@@ -185,213 +523,133 @@ async def handle_work_item_created(data: dict, project_id: str = ""):
        task_row = get_db().execute("SELECT id FROM tasks WHERE work_item_id=?", (work_item_id,)).fetchone()
        if task_row:
            task_id = task_row[0]
-            task_desc = f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\nStage: analysis\nTitle: {name}"
-            run_id = launcher.launch("analyst", repo, task_desc, task_id=task_id)
-            logger.info(f"Task {task_id}: launched analyst (run_id={run_id})")
+            task_desc = (
+                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
+                f"Stage: analysis\nTitle: {name}\n\nDescription:\n{description}"
+            )
+            job_id = enqueue_job("analyst", repo, task_desc, task_id=task_id)
+            logger.info(f"Task {task_id}: enqueued analyst (job_id={job_id})")
            # Post start comment to Plane
            from ..plane_sync import add_comment as _add_comment
-            _add_comment(work_item_id, "\U0001f50d Analyst \u0437\u0430\u043f\u0443\u0449\u0435\u043d. BRD/\u0422\u0417/AC/TestPlan \u0432 \u0440\u0430\u0431\u043e\u0442\u0435 (\u043e\u0436\u0438\u0434\u0430\u0439\u0442\u0435 8-15 \u043c\u0438\u043d).")
+            _add_comment(work_item_id, "\U0001f50d Analyst \u0437\u0430\u043f\u0443\u0449\u0435\u043d. BRD/\u0422\u0417/AC/TestPlan \u0432 \u0440\u0430\u0431\u043e\u0442\u0435 (\u043e\u0436\u0438\u0434\u0430\u0439\u0442\u0435 8-15 \u043c\u0438\u043d).", author="analyst")
    except Exception as e:
        logger.error(f"Failed to launch analyst for {work_item_id}: {e}")


 async def handle_comment(data: dict, project_id: str = ""):
+    """Status-only verdict model: comments NEVER drive the pipeline.
+
+    The whole comment-based control mechanism (``:approved:`` / ``:rejected:``
+    and the analysis answer-to-questions flow) was removed. It caused bug 3
+    (echo self-hit): the analyst posts its own "waiting for approval" comment,
+    handle_comment catches its own comment and reverts In Review -> In Progress.
+
+    Comments are now logged only — no status change, no enqueue, no side effect.
+    The pipeline is driven solely by status changes (handle_issue_updated):
+      - Approved  -> advance
+      - Rejected  -> rollback (reason pulled from the latest comment)
+      - In Progress (returned from Needs Input) -> relaunch the stage agent
    """
-    Handle comment event — check for :approved: or :rejected:.
-    Advance or rollback stage accordingly.
+    plane_id = str(
+        data.get("work_item_id") or data.get("issue_id") or data.get("issue") or ""
+    )
+    logger.info(
+        f"comment.created for {plane_id}: logged only, no pipeline action "
+        f"(status-only verdict model)"
+    )
+
+
+async def _rollback_stage(
+    task_id: int, current_stage: str, repo: str, work_item_id: str, branch: str,
+    reason: str,
+):
+    """Rollback triggered by a status change to Rejected.
+
+      - at analysis: relaunch the analyst with the rejection reason;
+      - otherwise: roll back to the previous stage and relaunch its agent
+        (via the existing rollback notify + an enqueue of the prev-stage agent).
    """
-    comment_body = data.get("comment_stripped", data.get("comment", data.get("body", data.get("comment_html", ""))))
-    plane_id = str(data.get("work_item_id") or data.get("issue_id") or data.get("issue") or "")
-
-    if not plane_id:
-        logger.warning("Comment event without work_item_id, skipping")
-        return
-
-    task = get_task_by_plane_id(plane_id)
-    if not task:
-        logger.warning(f"No task found for plane_id={plane_id}")
-        return
-
-    task_id = task["id"]
-    current_stage = task["stage"]
-    repo = task["repo"]
-    work_item_id = task.get("work_item_id", "")
-    branch = task.get("branch", "")
-
-    if ":rejected:" in comment_body:
-        # Extract reason (text after :rejected:)
-        reason = comment_body.split(":rejected:", 1)[-1].strip()[:300]
-
-        if current_stage == "analysis":
-            # Already in analysis — just relaunch analyst with rejection reason
-            from ..plane_sync import set_issue_in_progress
-            set_issue_in_progress(work_item_id)
-            task_desc = (
-                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
-                f"Stage: analysis\nNote: Stakeholder REJECTED your artifacts. "
-                f"Reason: {reason}\nRevise and improve."
-            )
-            new_run = launcher.launch("analyst", repo, task_desc, task_id=task_id)
-            from ..plane_sync import add_comment as _plane_comment
-            _plane_comment(work_item_id, f"\U0001f504 Analyst \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d. \u041f\u0440\u0438\u0447\u0438\u043d\u0430 \u043e\u0442\u043a\u043b\u043e\u043d\u0435\u043d\u0438\u044f: {reason}")
-            logger.info(f"Task {task_id}: rejected at analysis, relaunched analyst")
-        else:
-            # Rollback to previous stage
-            prev_stage = get_previous_stage(current_stage)
-            if prev_stage:
-                update_task_stage(task_id, prev_stage)
-                from ..plane_sync import set_issue_in_progress
-                set_issue_in_progress(work_item_id)
-                notify_stage_change(task_id, current_stage, prev_stage)
-                plane_notify_stage(work_item_id, current_stage, prev_stage)
-                from ..plane_sync import add_comment as _plane_comment
-                _plane_comment(work_item_id, f"\U0001f504 \u041e\u0442\u043a\u0430\u0442: {current_stage} \u2192 {prev_stage}. \u041f\u0440\u0438\u0447\u0438\u043d\u0430: {reason}")
-                logger.info(f"Task {task_id}: rejected, rolled back {current_stage} \u2192 {prev_stage}")
-        return
-
-    if ":approved:" in comment_body:
+    if current_stage == "analysis":
+        # Already in analysis: re-enqueue the analyst with the rejection reason.
        from ..plane_sync import set_issue_in_progress
        set_issue_in_progress(work_item_id)
-        # Try to advance stage
-        await _try_advance_stage(task_id, current_stage, repo, work_item_id, branch)
+        task_desc = (
+            f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
+            f"Stage: analysis\nNote: Stakeholder REJECTED your artifacts. "
+            f"Reason: {reason}\nRevise and improve."
+        )
+        new_job = enqueue_job("analyst", repo, task_desc, task_id=task_id)
+        from ..plane_sync import add_comment as _plane_comment
+        _plane_comment(work_item_id, f"\U0001f504 Analyst \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d. \u041f\u0440\u0438\u0447\u0438\u043d\u0430 \u043e\u0442\u043a\u043b\u043e\u043d\u0435\u043d\u0438\u044f: {reason}", author="analyst")
+        logger.info(f"Task {task_id}: rejected at analysis, enqueued analyst (job_id={new_job})")
        return

-    # Task 3: If neither :approved: nor :rejected: — check if this is an answer to questions
-    if current_stage == "analysis":
-        from ..plane_sync import PLANE_STATES, set_issue_in_progress
-        issue_id = task.get("plane_issue_id") or task.get("plane_id")
-        if not issue_id:
-            issue_id = plane_id
-        if issue_id:
-            from ..plane_sync import PLANE_BASE, PLANE_HEADERS, WORKSPACE
-            from ..plane_sync import PROJECT_ID as _DEFAULT_PROJECT_ID
-            # ORCH-6: route to this task's own Plane project (resolved from repo).
-            _proj = get_project_by_repo(repo)
-            _pid = _proj.plane_project_id if _proj else (project_id or _DEFAULT_PROJECT_ID)
-            import httpx as _httpx
-            try:
-                _resp = _httpx.get(
-                    f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{_pid}/issues/{issue_id}/",
-                    headers=PLANE_HEADERS, timeout=10
-                )
-                if _resp.status_code == 200:
-                    issue_data = _resp.json()
-                    if issue_data.get("state") == PLANE_STATES["needs_input"]:
-                        # Task 11: Check analyst retry count (max 3 question rounds)
-                        conn3 = get_db()
-                        analyst_runs = conn3.execute(
-                            "SELECT COUNT(*) FROM agent_runs WHERE task_id=? AND agent='analyst'",
-                            (task_id,)
-                        ).fetchone()[0]
-                        conn3.close()
-
-                        if analyst_runs >= 4:  # initial + 3 retries
-                            from ..plane_sync import set_issue_blocked, add_comment as _pc
-                            set_issue_blocked(work_item_id)
-                            _pc(
-                                work_item_id,
-                                "\U0001f6a8 3 \u0440\u0430\u0443\u043d\u0434\u0430 \u0443\u0442\u043e\u0447\u043d\u0435\u043d\u0438\u0439 \u0438\u0441\u0447\u0435\u0440\u043f\u0430\u043d\u044b. Analyst \u043d\u0435 \u043c\u043e\u0436\u0435\u0442 \u0441\u0444\u043e\u0440\u043c\u0438\u0440\u043e\u0432\u0430\u0442\u044c \u0422\u0417. "
-                                "\u0422\u0440\u0435\u0431\u0443\u0435\u0442\u0441\u044f \u0431\u043e\u043b\u0435\u0435 \u0434\u0435\u0442\u0430\u043b\u044c\u043d\u043e\u0435 \u043e\u043f\u0438\u0441\u0430\u043d\u0438\u0435 \u0438\u043b\u0438 \u0432\u0441\u0442\u0440\u0435\u0447\u0430."
-                            )
-                            from ..notifications import send_telegram
-                            send_telegram(f"\U0001f6a8 {work_item_id}: 3 \u0440\u0430\u0443\u043d\u0434\u0430 \u0432\u043e\u043f\u0440\u043e\u0441\u043e\u0432 analyst'\u0430 \u0438\u0441\u0447\u0435\u0440\u043f\u0430\u043d\u044b. \u041d\u0443\u0436\u043d\u0430 \u043f\u043e\u043c\u043e\u0449\u044c.")
-                            return
-
-                        # This is an answer to analyst's questions — relaunch
-                        set_issue_in_progress(work_item_id)
-                        task_desc = (
-                            f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
-                            f"Stage: analysis\nNote: Stakeholder answered your questions. "
-                            f"Read the latest comment in Plane and revise your artifacts.\n"
-                            f"Answer: {comment_body[:500]}"
-                        )
-                        new_run = launcher.launch("analyst", repo, task_desc, task_id=task_id)
-                        from ..plane_sync import add_comment as _pc2
-                        _pc2(work_item_id, "\U0001f504 Analyst \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d \u0441 \u043e\u0442\u0432\u0435\u0442\u0430\u043c\u0438 \u0441\u0442\u0435\u0439\u043a\u0445\u043e\u043b\u0434\u0435\u0440\u0430.")
-                        logger.info(f"Task {task_id}: stakeholder answered questions, relaunched analyst (run_id={new_run})")
-                        return
-            except Exception as e:
-                logger.error(f"Failed to check issue state: {e}")
+    # Rollback to previous stage
+    prev_stage = get_previous_stage(current_stage)
+    if not prev_stage:
+        logger.info(f"Task {task_id}: rejected at {current_stage} but no previous stage")
+        return
+    update_task_stage(task_id, prev_stage)
+    notify_stage_change(task_id, current_stage, prev_stage)
+    # Feature 3: plane_notify_stage moves the board to the prev stage's status.
+    plane_notify_stage(work_item_id, current_stage, prev_stage)
+    # Then put it back to In Progress so the relaunched agent is clearly working.
+    from ..plane_sync import set_issue_in_progress
+    set_issue_in_progress(work_item_id)
+    from ..plane_sync import add_comment as _plane_comment, STAGE_AUTHORS
+    _plane_comment(
+        work_item_id,
+        f"\U0001f504 \u041e\u0442\u043a\u0430\u0442: {current_stage} \u2192 {prev_stage}. \u041f\u0440\u0438\u0447\u0438\u043d\u0430: {reason}",
+        author=STAGE_AUTHORS.get(prev_stage, "stream"),
+    )
+    # Relaunch the previous stage's agent so the rollback actually re-runs work.
+    # STAGE_AUTHORS maps a stage directly to the role that OWNS work in it
+    # (analysis->analyst, architecture->architect, ...), which is exactly the
+    # agent we must re-run on a rollback into prev_stage.
+    from ..plane_sync import STAGE_AUTHORS as _STAGE_AUTHORS
+    prev_agent = _STAGE_AUTHORS.get(prev_stage)
+    if prev_agent:
+        task_desc = (
+            f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
+            f"Stage: {prev_stage}\nNote: Stakeholder REJECTED. Reason: {reason}\n"
+            f"Revise and improve."
+        )
+        new_job = enqueue_job(prev_agent, repo, task_desc, task_id=task_id)
+        logger.info(
+            f"Task {task_id}: rejected, rolled back {current_stage} \u2192 {prev_stage}, "
+            f"enqueued {prev_agent} (job_id={new_job})"
+        )
+    else:
+        logger.info(f"Task {task_id}: rejected, rolled back {current_stage} \u2192 {prev_stage}")


 async def _try_advance_stage(
    task_id: int, current_stage: str, repo: str, work_item_id: str, branch: str
 ):
-    """Run QG check for current stage and advance if passed."""
-    qg_name = get_qg_for_stage(current_stage)
-    next_stage = get_next_stage(current_stage)
+    """Thin async wrapper over the unified stage engine (ORCH-4 / M-3).

-    if not next_stage:
-        logger.info(f"Task {task_id}: already at terminal stage '{current_stage}'")
-        return
+    The QG dispatch (including the check_review_approved PR-by-branch logic) and
+    the advance/launch logic now live in src/stage_engine.advance_stage(), which
+    is synchronous. We run it off the event loop via asyncio.to_thread so there
+    is exactly one implementation shared with the launcher.

-    # Run QG check if one is required
-    if qg_name:
-        qg_func = QG_CHECKS.get(qg_name)
-        if not qg_func:
-            logger.error(f"QG function '{qg_name}' not found in registry")
-            return
+    finished_agent is None on this webhook path (a human Approved status change,
+    not a finished agent), so the agent-specific rollback branches inside the
+    engine intentionally do not trigger — the webhook path only runs the QG and
+    either advances or reports the failure.
+    """
+    import asyncio
+    from ..stage_engine import advance_stage

-        # Determine args based on QG function
-        if qg_name in ("check_analysis_approved", "check_analysis_complete", "check_architecture_done", "check_tests_passed", "check_reviewer_verdict"):
-            # ORCH-2 / S-4: pass branch so artifacts are read from the task worktree.
-            passed, reason = qg_func(repo, work_item_id, branch)
-        elif qg_name in ("check_ci_green", "check_tests_local"):
-            passed, reason = qg_func(repo, branch)
-        elif qg_name == "check_review_approved":
-            # Find PR number by branch via Gitea API
-            import httpx as _httpx
-            from ..config import settings as _s
-            _owner = _s.gitea_owner
-            _url = f"{_s.gitea_url}/api/v1/repos/{_owner}/{repo}/pulls?state=open&limit=50"
-            _headers = {"Authorization": f"token {_s.gitea_token}"}
-            try:
-                _resp = _httpx.get(_url, headers=_headers, timeout=10)
-                _prs = _resp.json()
-                _pr_number = None
-                for _pr in _prs:
-                    if _pr.get("head", {}).get("ref") == branch:
-                        _pr_number = _pr["number"]
-                        break
-                if _pr_number:
-                    passed, reason = qg_func(repo, _pr_number)
-                else:
-                    # No open PR but review file exists — check file-based
-                    import os
-                    from ..git_worktree import get_worktree_path as _gwp
-                    _wt = _gwp(repo, branch) if os.path.isdir(_gwp(repo, branch)) else os.path.join(_s.repos_dir, repo)
-                    _review_path = os.path.join(_wt, f"docs/work-items/{work_item_id}/12-review.md")
-                    _review_path2 = os.path.join(_wt, f"docs/work-items/{work_item_id}/09-review.md")
-                    if os.path.isfile(_review_path) or os.path.isfile(_review_path2):
-                        passed, reason = True, "Review file exists (file-based approval)"
-                    else:
-                        passed, reason = False, "No open PR found and no review file"
-            except Exception as _e:
-                passed, reason = False, f"Error finding PR: {_e}"
-        else:
-            passed, reason = False, f"Unknown QG: {qg_name}"
-
-        if not passed:
-            notify_qg_failure(task_id, current_stage, qg_name, reason)
-            plane_notify_qg(work_item_id, current_stage, qg_name, reason)
-            return
-
-    # Advance stage
-    update_task_stage(task_id, next_stage)
-    notify_stage_change(task_id, current_stage, next_stage)
-    plane_notify_stage(work_item_id, current_stage, next_stage)
-
-    # Launch agent associated with the current stage's transition
-    agent = get_agent_for_stage(current_stage)
-    if agent:
-        try:
-            task_desc = f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\nStage: {next_stage}"
-            run_id = launcher.launch(agent, repo, task_desc, task_id=task_id)
-            plane_notify_stage(work_item_id, current_stage, next_stage, agent)
-            logger.info(f"Task {task_id}: launched agent '{agent}', run_id={run_id}")
-        except Exception as e:
-            notify_error(task_id, f"Failed to launch agent '{agent}': {e}")
-            logger.error(f"Agent launch failed: {e}")
+    await asyncio.to_thread(
+        advance_stage,
+        task_id,
+        current_stage,
+        repo,
+        work_item_id,
+        branch,
+        None,
+    )


 async def _create_gitea_branch(repo: str, branch: str):
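Note on the `_try_advance_stage` hunk above: the new wrapper hands a synchronous engine call to `asyncio.to_thread` so the event loop is not blocked while the quality gate runs. A minimal sketch of that pattern follows. The names `advance_stage_stub` and `try_advance` are hypothetical stand-ins, and the stub's arguments are an assumption, not the real signature of `advance_stage`.

```python
import asyncio
import threading


def advance_stage_stub(task_id, current_stage, finished_agent=None):
    """Hypothetical synchronous engine call (assumption, not the real API)."""
    return {
        "task_id": task_id,
        "stage": current_stage,
        "finished_agent": finished_agent,
        # Record which thread actually executed the blocking work.
        "thread": threading.current_thread().name,
    }


async def try_advance(task_id, current_stage):
    # Positional arguments are forwarded to the callable by asyncio.to_thread.
    # finished_agent=None mirrors the webhook path described in the docstring.
    return await asyncio.to_thread(advance_stage_stub, task_id, current_stage, None)


caller_thread = threading.current_thread().name
result = asyncio.run(try_advance(1, "analysis"))
```

Because the blocking call runs in a worker thread, other coroutines on the same loop can keep handling webhooks while it executes. `asyncio.to_thread` requires Python 3.9 or newer.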
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -0,0 +1,40 @@
+"""Global pytest fixtures.
+
+test(conftest): mute Telegram in ALL tests to stop prod leakage.
+
+Background: a pytest run on prod was sending REAL Telegram messages to Slava,
+because some tests (e.g. test_webhook_dedup advancing a stage) reach
+notify_stage_change -> send_telegram, which reads the live .env
+telegram_bot_token/chat_id and actually POSTs to Telegram.
+
+This autouse fixture stubs send_telegram to a no-op for every test:
+
+  - "src.notifications.send_telegram" is the SOURCE. All the notify_* helpers in
+    notifications.py call the module-global send_telegram, and every other module
+    that does a *local* `from .notifications import send_telegram` inside a
+    function resolves it live at call time -> covered by patching the source.
+
+  - "src.stage_engine.send_telegram" is patched too, because stage_engine binds
+    send_telegram as a MODULE-LEVEL name (from .notifications import send_telegram
+    at import), so a patch of the source alone would not intercept its 3 direct
+    calls. webhooks/plane and launcher import it locally inside functions, so the
+    source patch already covers them; they are patched defensively with
+    raising=False anyway in case that ever changes.
+
+raising=False is used so that a module that doesn't (yet) expose the name never breaks setup.
+"""
+
+import pytest
+
+
+@pytest.fixture(autouse=True)
+def _no_telegram(monkeypatch):
+    _noop = lambda *a, **k: None  # noqa: E731
+    # Source of truth (covers notifications.notify_* and all local re-imports).
+    monkeypatch.setattr("src.notifications.send_telegram", _noop, raising=False)
+    # Module-level binding in stage_engine (and defensive coverage elsewhere).
+    monkeypatch.setattr("src.stage_engine.send_telegram", _noop, raising=False)
+    monkeypatch.setattr("src.webhooks.plane.send_telegram", _noop, raising=False)
+    monkeypatch.setattr("src.agents.launcher.send_telegram", _noop, raising=False)
+    monkeypatch.setattr("src.queue_worker.send_telegram", _noop, raising=False)
+    yield
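Note on the conftest fixture above: the docstring's point about module-level bindings can be reproduced without pytest. The sketch below builds two throwaway modules with `types.ModuleType`. The module objects and helper functions are hypothetical stand-ins for `src.notifications` and `src.stage_engine`, not the project's code.

```python
import types

# Hypothetical stand-in for the source module.
notifications = types.ModuleType("notifications")
notifications.send_telegram = lambda text: f"SENT {text}"

# Hypothetical stand-in for a module that does a module-level
# `from .notifications import send_telegram`: it copies the reference once.
stage_engine = types.ModuleType("stage_engine")
stage_engine.send_telegram = notifications.send_telegram


def notify_with_local_import(text):
    # Equivalent of a function-local import: the name is resolved at call time.
    return notifications.send_telegram(text)


def notify_with_module_binding(text):
    # Uses the reference that was bound when the module was imported.
    return stage_engine.send_telegram(text)


# Patch only the source, as a single monkeypatch.setattr on it would.
notifications.send_telegram = lambda text: None

local_result = notify_with_local_import("hi")
bound_before_patch = notify_with_module_binding("hi")

# Patching the second binding as well closes the gap.
stage_engine.send_telegram = lambda text: None
bound_after_patch = notify_with_module_binding("hi")
```

After the first patch, the call that resolves the name late is muted, while the module-level binding still reaches the original function. That is why the fixture patches `src.stage_engine.send_telegram` separately.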
--- a/tests/test_analyst_comment.py
+++ b/tests/test_analyst_comment.py
@@ -0,0 +1,74 @@
+"""BUG C: analyst "artifacts ready" comment under the status-only model.
+
+The comment must ask for the **Approved** status (not the obsolete
+":approved:" reaction, not moving back to "In Progress") and link only the
+docs that actually exist in the worktree.
+"""
+
+import os
+import tempfile
+
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+
+def test_analyst_comment_asks_approved_with_links(monkeypatch, tmp_path):
+    from src import stage_engine as SE
+
+    # Worktree with only SOME of the candidate docs present.
+    wt = tmp_path / "wt"
+    docs = wt / "docs" / "work-items" / "ET-011"
+    docs.mkdir(parents=True)
+    for fname in ("00-business-request.md", "01-brd.md", "02-trz.md",
+                  "03-acceptance-criteria.md", "04-test-plan.yaml"):
+        (docs / fname).write_text("x")
+    # 04b-ui-test-cases.md intentionally absent -> must NOT be linked
+
+    monkeypatch.setattr(SE, "get_worktree_path", lambda repo, branch: str(wt))
+    # public URL set -> links must be built from it (not gitea_url)
+    monkeypatch.setattr(SE.settings, "gitea_url", "http://localhost:3000")
+    monkeypatch.setattr(SE.settings, "gitea_public_url", "https://git.mva154.duckdns.org")
+    monkeypatch.setattr(SE.settings, "gitea_owner", "admin")
+
+    html = SE._build_analyst_ready_comment(
+        "enduro-trails", "ET-011", "feature/ET-011-gpx-upload-feature"
+    )
+
+    # text asks for the Approved STATUS, not the obsolete mechanisms
+    assert "Approved" in html
+    assert ":approved:" not in html
+    assert "In Progress" not in html
+    assert "Rejected" in html
+    # clickable links to docs that ACTUALLY exist
+    assert "<a href=" in html
+    base = ("https://git.mva154.duckdns.org/admin/enduro-trails/src/branch/"
+            "feature/ET-011-gpx-upload-feature/docs/work-items/ET-011/")
+    assert base + "01-brd.md" in html
+    assert base + "04-test-plan.yaml" in html
+    # the missing file is NOT invented
+    assert "04b-ui-test-cases.md" not in html
+    # internal git url must NOT appear in clickable links
+    assert "localhost:3000" not in html
+
+
+def test_analyst_comment_falls_back_to_gitea_url(monkeypatch, tmp_path):
+    """When gitea_public_url is empty, links fall back to gitea_url."""
+    from src import stage_engine as SE
+
+    wt = tmp_path / "wt"
+    docs = wt / "docs" / "work-items" / "ET-011"
+    docs.mkdir(parents=True)
+    (docs / "01-brd.md").write_text("x")
+
+    monkeypatch.setattr(SE, "get_worktree_path", lambda repo, branch: str(wt))
+    monkeypatch.setattr(SE.settings, "gitea_url", "http://localhost:3000")
+    monkeypatch.setattr(SE.settings, "gitea_public_url", "")
+    monkeypatch.setattr(SE.settings, "gitea_owner", "admin")
+
+    html = SE._build_analyst_ready_comment(
+        "enduro-trails", "ET-011", "feature/ET-011-gpx-upload-feature"
+    )
+
+    base = ("http://localhost:3000/admin/enduro-trails/src/branch/"
+            "feature/ET-011-gpx-upload-feature/docs/work-items/ET-011/")
+    assert base + "01-brd.md" in html
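Note on the two tests above: they pin down two behaviours, namely that only files present in the worktree are linked and that the public URL wins over the internal one when it is set. A minimal sketch of such a helper is below. `build_doc_links` and its parameters are hypothetical, and this is not the body of `_build_analyst_ready_comment`.

```python
import os


def build_doc_links(worktree, repo, work_item_id, branch, owner,
                    internal_url, public_url, candidates):
    """Return <a> tags for candidate docs that exist (hypothetical helper)."""
    # Prefer the public URL. An empty string falls back to the internal one.
    base_url = (public_url or internal_url).rstrip("/")
    docs_dir = os.path.join(worktree, "docs", "work-items", work_item_id)
    prefix = (
        f"{base_url}/{owner}/{repo}/src/branch/{branch}"
        f"/docs/work-items/{work_item_id}/"
    )
    links = []
    for name in candidates:
        # Skip candidates that are not on disk so no link is invented.
        if os.path.isfile(os.path.join(docs_dir, name)):
            links.append(f'<a href="{prefix}{name}">{name}</a>')
    return links
```

Checking the filesystem per candidate keeps the comment honest when an optional artifact such as `04b-ui-test-cases.md` has not been produced.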
--- a/tests/test_launcher.py
+++ b/tests/test_launcher.py
@@ -7,6 +7,7 @@ Covers the audit-2026-06-02 fixes:
         the YAML frontmatter only (no fragile substring matching).
 """
 import os
+import signal
 import tempfile

 import pytest
@@ -20,6 +21,7 @@ os.environ["ORCH_PLANE_API_TOKEN"] = "test-token"

 from src.agents.launcher import AgentLauncher
 from src.qg.checks import check_reviewer_verdict
+from src.config import settings


 # ---------------------------------------------------------------------------
@@ -138,3 +140,141 @@ class TestCheckReviewerVerdict:
        passed, reason = check_reviewer_verdict("enduro-trails", "ET-999")
        assert passed is False
        assert "not found" in reason.lower()
+
+
+# ---------------------------------------------------------------------------
+# ORCH-7 (M-4): dead code removed
+# ---------------------------------------------------------------------------
+class TestDeadCodeRemoved:
+    """M-4: _auto_merge_pr was never called (merge is the deployer's job) and is
+    removed. _ensure_pr (used by the auto-advance path) must stay."""
+
+    def test_auto_merge_pr_is_gone(self):
+        assert not hasattr(AgentLauncher, "_auto_merge_pr")
+
+    def test_ensure_pr_still_present(self):
+        assert hasattr(AgentLauncher, "_ensure_pr")
+
+
+# ---------------------------------------------------------------------------
+# ORCH-7 (M-2): configurable timeout + per-agent override
+# ---------------------------------------------------------------------------
+class TestResolveTimeout:
+    """M-2: _resolve_timeout honours a per-agent JSON override, else the default."""
+
+    def test_default_when_no_override(self, monkeypatch):
+        monkeypatch.setattr(settings, "agent_timeout_seconds", 1800)
+        monkeypatch.setattr(settings, "agent_timeout_overrides_json", "")
+        assert AgentLauncher._resolve_timeout("developer") == 1800
+        assert AgentLauncher._resolve_timeout(None) == 1800
+
+    def test_override_for_specific_agent(self, monkeypatch):
+        monkeypatch.setattr(settings, "agent_timeout_seconds", 1800)
+        monkeypatch.setattr(
+            settings, "agent_timeout_overrides_json", '{"reviewer": 3600, "architect": 2700}'
+        )
+        assert AgentLauncher._resolve_timeout("reviewer") == 3600
+        assert AgentLauncher._resolve_timeout("architect") == 2700
+        # an agent not in the override map falls back to the default
+        assert AgentLauncher._resolve_timeout("developer") == 1800
+
+    def test_malformed_override_falls_back_to_default(self, monkeypatch):
+        monkeypatch.setattr(settings, "agent_timeout_seconds", 1800)
+        monkeypatch.setattr(settings, "agent_timeout_overrides_json", "{not-json")
+        # must not raise, must return the default
+        assert AgentLauncher._resolve_timeout("reviewer") == 1800
+
+
+class TestWatchdogGracefulKill:
+    """M-2: SIGTERM -> grace -> SIGKILL ordering, with graceful-exit short-circuit
+    and ProcessLookupError tolerance. The OS process is fully faked: we record the
+    signals sent and decide liveness from a script, so no real process is touched."""
+
+    def _patch_db(self, monkeypatch):
+        """Stub get_db so _record_kill does not need a real DB."""
+        class _Conn:
+            def execute(self, *a, **k):
+                return self
+            def commit(self):
+                pass
+            def close(self):
+                pass
+        monkeypatch.setattr("src.agents.launcher.get_db", lambda: _Conn())
+
+    def test_sigterm_then_sigkill_after_grace(self, monkeypatch):
+        """Process stays alive through the whole grace window -> SIGTERM then SIGKILL."""
+        self._patch_db(monkeypatch)
+        monkeypatch.setattr(settings, "agent_kill_grace_seconds", 1)
+        monkeypatch.setattr("src.agents.launcher.time.sleep", lambda s: None)
+
+        sent = []
+
+        def fake_kill(pid, sig):
+            sent.append(sig)
+            # signal 0 (liveness probe) -> always alive; never raise
+            return None
+
+        monkeypatch.setattr("src.agents.launcher.os.kill", fake_kill)
+
+        launcher = AgentLauncher()
+        launcher._watchdog(pid=4242, run_id=1, timeout=0, agent="developer")
+
+        assert signal.SIGTERM in sent
+        assert signal.SIGKILL in sent
+        # SIGTERM must come before SIGKILL
+        assert sent.index(signal.SIGTERM) < sent.index(signal.SIGKILL)
+
+    def test_graceful_exit_in_grace_skips_sigkill(self, monkeypatch):
+        """Process dies during the grace window -> SIGKILL is NOT sent."""
+        self._patch_db(monkeypatch)
+        monkeypatch.setattr(settings, "agent_kill_grace_seconds", 5)
+        monkeypatch.setattr("src.agents.launcher.time.sleep", lambda s: None)
+
+        sent = []
+        state = {"alive": True, "probes": 0}
+
+        def fake_kill(pid, sig):
+            if sig == 0:
+                state["probes"] += 1
+                # die on the 2nd liveness probe (within grace)
+                if state["probes"] >= 2:
+                    raise ProcessLookupError
+                return None
+            sent.append(sig)
+            return None
+
+        monkeypatch.setattr("src.agents.launcher.os.kill", fake_kill)
+
+        launcher = AgentLauncher()
+        launcher._watchdog(pid=4242, run_id=2, timeout=0, agent="developer")
+
+        assert signal.SIGTERM in sent
+        assert signal.SIGKILL not in sent
+
+    def test_already_dead_before_sigterm(self, monkeypatch):
+        """Process already gone at SIGTERM -> ProcessLookupError tolerated, no SIGKILL,
+        and _record_kill is NOT called (the monitor's proc.wait owns the exit)."""
+        self._patch_db(monkeypatch)
+        monkeypatch.setattr("src.agents.launcher.time.sleep", lambda s: None)
+
+        sent = []
+
+        def fake_kill(pid, sig):
+            if sig == signal.SIGTERM:
+                raise ProcessLookupError
+            sent.append(sig)
+            return None
+
+        recorded = {"called": False}
+        monkeypatch.setattr(
+            AgentLauncher, "_record_kill",
+            staticmethod(lambda rid: recorded.__setitem__("called", True)),
+        )
+        monkeypatch.setattr("src.agents.launcher.os.kill", fake_kill)
+
+        launcher = AgentLauncher()
+        # must not raise
+        launcher._watchdog(pid=4242, run_id=3, timeout=0, agent="developer")
+
+        assert signal.SIGKILL not in sent
+        assert recorded["called"] is False
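Note on `TestWatchdogGracefulKill` above: the three tests describe a SIGTERM, grace window, SIGKILL sequence with two early exits. The sketch below is one way to implement that ordering, assuming a POSIX platform. `terminate_with_grace` and its injectable `kill` and `sleep` parameters are hypothetical, and the real `_watchdog` may differ.

```python
import os
import signal
import time


def terminate_with_grace(pid, grace_seconds, kill=os.kill, sleep=time.sleep, poll=1.0):
    """Send SIGTERM, wait up to grace_seconds, then SIGKILL. Returns signals sent."""
    sent = []
    try:
        kill(pid, signal.SIGTERM)
        sent.append(signal.SIGTERM)
    except ProcessLookupError:
        return sent  # already gone before SIGTERM: nothing more to do

    waited = 0.0
    while waited < grace_seconds:
        sleep(poll)
        waited += poll
        try:
            kill(pid, 0)  # signal 0 is a liveness probe; it delivers nothing
        except ProcessLookupError:
            return sent  # exited gracefully inside the grace window

    try:
        kill(pid, signal.SIGKILL)
        sent.append(signal.SIGKILL)
    except ProcessLookupError:
        pass  # exited between the last probe and SIGKILL
    return sent


# Demo with a fake kill that reports the process as always alive.
always_alive = terminate_with_grace(
    4242, grace_seconds=1, kill=lambda pid, sig: None, sleep=lambda s: None
)
```

Injecting `kill` and `sleep` serves the same purpose as the `monkeypatch.setattr` calls in the tests: no real process is signalled and no real time passes.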
--- a/tests/test_log_rotation.py
+++ b/tests/test_log_rotation.py
@@ -0,0 +1,92 @@
+"""L-2: tests for prune_run_logs (run-log rotation).
+
+Verifies that old / surplus *.log files are removed while fresh logs, non-.log
+files, the active log, and subdirectories are left intact. The function is
+best-effort and must never raise.
+"""
+import os
+import time
+
+from src.agents.launcher import prune_run_logs
+
+
+def _touch(path, age_days=0):
+    with open(path, "w") as f:
+        f.write("x")
+    mtime = time.time() - age_days * 86400
+    os.utime(path, (mtime, mtime))
+    return path
+
+
+def test_old_logs_removed_fresh_kept(tmp_path):
+    runs = tmp_path
+    fresh = _touch(str(runs / "1.log"), age_days=1)
+    old = _touch(str(runs / "2.log"), age_days=40)
+
+    removed = prune_run_logs(str(runs), keep_days=30, keep_max=500)
+
+    assert removed == 1
+    assert os.path.exists(fresh)
+    assert not os.path.exists(old)
+
+
+def test_non_log_files_untouched(tmp_path):
+    runs = tmp_path
+    old_log = _touch(str(runs / "stale.log"), age_days=99)
+    keep_txt = _touch(str(runs / "notes.txt"), age_days=99)
+    keep_db = _touch(str(runs / "orchestrator.db"), age_days=99)
+
+    prune_run_logs(str(runs), keep_days=30, keep_max=500)
+
+    assert not os.path.exists(old_log)
+    assert os.path.exists(keep_txt)
+    assert os.path.exists(keep_db)
+
+
+def test_keep_max_retains_newest(tmp_path):
+    runs = tmp_path
+    # 5 logs, all recent (within keep_days), increasing age 0..4 days.
+    paths = []
+    for i in range(5):
+        paths.append(_touch(str(runs / f"{i}.log"), age_days=i))
+
+    removed = prune_run_logs(str(runs), keep_days=365, keep_max=2)
+
+    # Only the 2 newest (age 0, 1) survive.
+    assert removed == 3
+    assert os.path.exists(paths[0])
+    assert os.path.exists(paths[1])
+    for p in paths[2:]:
+        assert not os.path.exists(p)
+
+
+def test_active_log_never_removed(tmp_path):
+    runs = tmp_path
+    active = _touch(str(runs / "active.log"), age_days=99)
+    other = _touch(str(runs / "other.log"), age_days=99)
+
+    removed = prune_run_logs(
+        str(runs), keep_days=30, keep_max=500, active_paths=[active]
+    )
+
+    assert removed == 1
+    assert os.path.exists(active)
+    assert not os.path.exists(other)
+
+
+def test_subdirs_untouched(tmp_path):
+    runs = tmp_path
+    sub = runs / "sub.log"
+    sub.mkdir()  # a directory that happens to end in .log
+    old_log = _touch(str(runs / "old.log"), age_days=99)
+
+    prune_run_logs(str(runs), keep_days=30, keep_max=500)
+
+    assert sub.is_dir()
+    assert not os.path.exists(old_log)
+
+
+def test_missing_dir_is_noop(tmp_path):
+    missing = tmp_path / "does-not-exist"
+    # Must not raise.
+    assert prune_run_logs(str(missing)) == 0
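Note on the rotation tests above: they imply a retention rule that combines an age cutoff with a cap on the number of files, while skipping non-`.log` files, directories, and active logs. A minimal sketch consistent with those tests follows. `prune_logs_sketch` is hypothetical, and excluding active logs from the `keep_max` count is an assumption the tests do not settle.

```python
import os
import time


def prune_logs_sketch(directory, keep_days=30, keep_max=500, active_paths=()):
    """Best-effort removal of old or surplus *.log files. Returns the count removed."""
    try:
        active = {os.path.abspath(p) for p in active_paths}
        candidates = []
        for name in os.listdir(directory):
            path = os.path.abspath(os.path.join(directory, name))
            # Only regular *.log files that are not currently in use qualify.
            if not name.endswith(".log") or not os.path.isfile(path) or path in active:
                continue
            candidates.append((os.path.getmtime(path), path))
    except OSError:
        return 0  # missing or unreadable directory: treat as a no-op

    candidates.sort(reverse=True)  # newest first
    cutoff = time.time() - keep_days * 86400
    removed = 0
    for index, (mtime, path) in enumerate(candidates):
        # Remove a file if it is too old or falls outside the newest keep_max.
        if mtime < cutoff or index >= keep_max:
            try:
                os.remove(path)
                removed += 1
            except OSError:
                pass  # best-effort: a failed delete must not raise
    return removed
```

Sorting newest first lets a single pass apply both rules: the index check enforces `keep_max` and the mtime check enforces `keep_days`.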
--- a/tests/test_m6_sequence.py
+++ b/tests/test_m6_sequence.py
@@ -0,0 +1,187 @@
+"""M-6: work_item_id derived from Plane sequence_id (source of truth = Plane).
+
+Covers:
+  * fetch_issue_sequence_id returns int on a valid Plane response (mocked httpx);
+  * returns None on network error / missing field WITHOUT raising;
+  * handle_work_item_created uses prefix-NNN when seq is available, and falls
+    back to get_next_work_item_id when seq is None (Plane down => autonomy);
+  * find_issue_id no longer hardcodes 'ET-' and matches an arbitrary prefix
+    (e.g. ORCH-005) by sequence_id.
+"""
+
+import os
+import tempfile
+
+import pytest
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_m6.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
+os.environ.setdefault("ORCH_GITEA_WEBHOOK_SECRET", "")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+from unittest.mock import patch, AsyncMock, MagicMock  # noqa: E402
+
+from fastapi.testclient import TestClient  # noqa: E402
+
+from src.main import app  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import projects as P  # noqa: E402
+from src.projects import reload_projects  # noqa: E402
+import src.plane_sync as plane_sync  # noqa: E402
+
+ORCH_PLANE_ID = "8da6aa25-a60e-44d6-a1e2-d8ae59aa7d6a"
+ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
+
+client = TestClient(app)
+
+
+@pytest.fixture(autouse=True)
+def setup(monkeypatch):
+    monkeypatch.setattr(P.settings, "db_path", _test_db)
+    import src.db as _db
+    monkeypatch.setattr(_db.settings, "db_path", _test_db)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+
+    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
+
+    registry_json = (
+        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
+        f' "work_item_prefix": "ET", "name": "enduro-trails"}},'
+        f' {{"plane_project_id": "{ORCH_PLANE_ID}", "repo": "orchestrator",'
+        f' "work_item_prefix": "ORCH", "name": "orchestrator"}}]'
+    )
+    monkeypatch.setattr(P.settings, "projects_json", registry_json)
+    reload_projects()
+
+    yield
+
+    reload_projects()
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+def _mock_resp(json_body, status=200):
+    m = MagicMock()
+    m.json.return_value = json_body
+    m.raise_for_status.return_value = None
+    if status >= 400:
+        def _raise():
+            raise RuntimeError(f"HTTP {status}")
+        m.raise_for_status.side_effect = _raise
+    return m
+
+
+# ---------------------------------------------------------------------------
+# fetch_issue_sequence_id
+# ---------------------------------------------------------------------------
+
+def test_fetch_sequence_id_returns_int():
+    with patch.object(plane_sync.httpx, "get", return_value=_mock_resp({"sequence_id": 42})):
+        seq = plane_sync.fetch_issue_sequence_id("issue-uuid", "proj-uuid")
+    assert seq == 42
+    assert isinstance(seq, int)
+
+
+def test_fetch_sequence_id_network_error_returns_none():
+    with patch.object(plane_sync.httpx, "get", side_effect=RuntimeError("connection refused")):
+        seq = plane_sync.fetch_issue_sequence_id("issue-uuid", "proj-uuid")
+    assert seq is None  # must not raise
+
+
+def test_fetch_sequence_id_missing_field_returns_none():
+    with patch.object(plane_sync.httpx, "get", return_value=_mock_resp({"error": "not found"})):
+        seq = plane_sync.fetch_issue_sequence_id("missing-uuid", "proj-uuid")
+    assert seq is None
+
+
+# ---------------------------------------------------------------------------
+# handle_work_item_created: seq available -> prefix-NNN
+# ---------------------------------------------------------------------------
+
+# Feature 1: pipeline starts on a status change to In Progress, not on creation.
+_IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
+
+
+def _post(plane_id, plane_project_id=ORCH_PLANE_ID, name="A valid work item title"):
+    return client.post(
+        "/webhook/plane",
+        json={
+            "event": "issue",
+            "action": "updated",
+            "data": {
+                "id": plane_id,
+                "name": name,
+                "description_stripped": "This is a sufficiently long description.",
+                "project": plane_project_id,
+                "state": {"id": _IN_PROGRESS, "name": "In Progress", "group": "started"},
+            },
+        },
+    )
+
+
+@patch("src.webhooks.plane.launcher")
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=7)
+def test_created_uses_plane_sequence_id(mock_fetch, mock_branch, mock_docs, mock_launcher):
+    mock_launcher.launch.return_value = 1
+    resp = _post("seq-issue")
+    assert resp.status_code == 200
+    conn = get_db()
+    task = conn.execute("SELECT work_item_id FROM tasks WHERE plane_id='seq-issue'").fetchone()
+    conn.close()
+    assert task is not None
+    assert task["work_item_id"] == "ORCH-007"
+    mock_fetch.assert_called_once()
+
+
+@patch("src.webhooks.plane.launcher")
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=None)
+@patch("src.webhooks.plane.get_next_work_item_id", return_value="ORCH-099")
+def test_created_falls_back_to_db_when_plane_down(
+    mock_next, mock_fetch, mock_branch, mock_docs, mock_launcher
+):
+    """Plane unavailable (seq=None) => fall back to DB increment; task still created."""
+    mock_launcher.launch.return_value = 1
+    resp = _post("fallback-issue")
+    assert resp.status_code == 200
+    conn = get_db()
+    task = conn.execute("SELECT work_item_id FROM tasks WHERE plane_id='fallback-issue'").fetchone()
+    conn.close()
+    assert task is not None  # autonomy: Plane down does not block creation
+    assert task["work_item_id"] == "ORCH-099"
+    mock_next.assert_called_once()
+
+
+# ---------------------------------------------------------------------------
+# find_issue_id: no hardcoded ET- prefix, matches arbitrary prefix by seq
+# ---------------------------------------------------------------------------
+
+def test_find_issue_id_matches_arbitrary_prefix_by_sequence():
+    """ORCH-005 must resolve via the issue whose sequence_id == 5 (no ET- assumption)."""
+    issues = {"results": [
+        {"id": "uuid-a", "sequence_id": 3, "name": "something"},
+        {"id": "uuid-b", "sequence_id": 5, "name": "ORCH-005: target"},
+        {"id": "uuid-c", "sequence_id": 9, "name": "other"},
+    ]}
+    # No DB row for this work_item_id => goes to the Plane API search branch.
+    with patch.object(plane_sync.httpx, "get", return_value=_mock_resp(issues)):
+        found = plane_sync.find_issue_id("ORCH-005", project_id="proj-uuid")
+    assert found == "uuid-b"
+
+
+def test_find_issue_id_matches_et_prefix_too():
+    """Backward compat: ET-002 still resolves by sequence_id == 2."""
+    issues = {"results": [
+        {"id": "uuid-x", "sequence_id": 2, "name": "ET item"},
+        {"id": "uuid-y", "sequence_id": 7, "name": "other"},
+    ]}
+    with patch.object(plane_sync.httpx, "get", return_value=_mock_resp(issues)):
+        found = plane_sync.find_issue_id("ET-002", project_id="proj-uuid")
+    assert found == "uuid-x"
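Reviewer note: the two find_issue_id tests above pin down prefix-agnostic matching, where only the numeric suffix of the work item id is compared with Plane's sequence_id. A minimal sketch of that rule follows, assuming the id always has the shape PREFIX-NNN. The helper names sequence_from_work_item_id and match_issue_by_sequence are hypothetical and are not taken from plane_sync; the real find_issue_id also consults the DB first and calls the Plane API, which this sketch leaves out.

```python
import re
from typing import Iterable, Mapping, Optional

# Hypothetical pattern: any alphanumeric prefix, a dash, then the sequence number.
_WORK_ITEM_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]*-(\d+)$")


def sequence_from_work_item_id(work_item_id: str) -> Optional[int]:
    """Return the numeric suffix of 'ORCH-005' / 'ET-002', or None if malformed."""
    m = _WORK_ITEM_RE.match(work_item_id)
    return int(m.group(1)) if m else None


def match_issue_by_sequence(
    work_item_id: str, issues: Iterable[Mapping[str, object]]
) -> Optional[str]:
    """Pick the issue whose sequence_id equals the id's numeric suffix.

    The prefix is deliberately ignored, so ET-, ORCH- and any other
    project prefix resolve the same way.
    """
    seq = sequence_from_work_item_id(work_item_id)
    if seq is None:
        return None
    for issue in issues:
        if issue.get("sequence_id") == seq:
            return str(issue.get("id"))
    return None
```

Ignoring the prefix keeps the lookup independent of the project registry; the trade-off is that the caller must already have scoped the issue list to the right Plane project.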
--- a/tests/test_pipeline_start_bugs.py
+++ b/tests/test_pipeline_start_bugs.py
@@ -0,0 +1,213 @@
+"""Tests for the two pipeline-start bugs surfaced by the ET-006 live run.
+
+BUG 1: issue.updated (status -> In Progress) ships a payload WITHOUT the
+       description, so start_pipeline must pull it from the Plane issue API
+       before QG-0 runs (otherwise QG-0 wrongly blocks the issue).
+
+BUG 2a: M-6 derives work_item_id from the Plane sequence_id, which can collide.
+        ensure_unique_work_item_id() must hand out the next FREE id instead of
+        reusing one that is already in the tasks table.
+
+BUG 2b: two tasks with an (artificially) identical work_item_id must not share a
+        branch/worktree.
+
+launcher / Gitea / Plane network are mocked. Real FastAPI endpoint via
+TestClient for the BUG 1 end-to-end path.
+"""
+
+import os
+import tempfile
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_pipeline_bugs.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+import pytest  # noqa: E402
+from unittest.mock import patch, AsyncMock  # noqa: E402
+from fastapi.testclient import TestClient  # noqa: E402
+
+from src.main import app  # noqa: E402
+from src.db import init_db, get_db, ensure_unique_work_item_id  # noqa: E402
+from src import projects as P  # noqa: E402
+from src.projects import reload_projects  # noqa: E402
+from src.git_worktree import get_worktree_path  # noqa: E402
+
+ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
+IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
+BACKLOG = "113b24f6-cce8-4be9-9a22-a359b9cf0122"
+
+client = TestClient(app)
+
+
+@pytest.fixture(autouse=True)
+def setup(monkeypatch):
+    monkeypatch.setattr(P.settings, "db_path", _test_db)
+    import src.db as _db
+    monkeypatch.setattr(_db.settings, "db_path", _test_db)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
+    registry_json = (
+        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
+        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
+    )
+    monkeypatch.setattr(P.settings, "projects_json", registry_json)
+    reload_projects()
+    yield
+    reload_projects()
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+def _insert_task(work_item_id, branch, plane_id="x"):
+    conn = get_db()
+    conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
+        "VALUES (?, ?, ?, ?, ?, ?)",
+        (plane_id, work_item_id, "enduro-trails", branch, "analysis", plane_id),
+    )
+    conn.commit()
+    conn.close()
+
+
+def _count(plane_id):
+    conn = get_db()
+    n = conn.execute("SELECT COUNT(*) FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()[0]
+    conn.close()
+    return n
+
+
+def _task(plane_id):
+    conn = get_db()
+    row = conn.execute("SELECT * FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()
+    conn.close()
+    return row
+
+
+# --------------------------------------------------------------------------- #
+# BUG 1
+# --------------------------------------------------------------------------- #
+def _to_in_progress_no_desc(plane_id="bug1"):
+    """issue.updated payload WITHOUT description (only changed fields)."""
+    return client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": plane_id, "name": "A valid backlog item title",
+            # NO description / description_stripped here, exactly like Plane sends
+            # on a status change.
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
+    })
+
+
+@patch("src.webhooks.plane.enqueue_job", return_value=1)
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=42)
+@patch("src.plane_sync.fetch_issue_fields",
+       return_value=("A valid backlog item title",
+                     "This is a sufficiently long description fetched from Plane API."))
+def test_status_start_fetches_description(
+    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
+):
+    """BUG 1: empty description in payload -> start_pipeline pulls it from the
+    Plane API (single fetch_issue_fields GET) -> QG-0 passes -> task created +
+    analyst enqueued (NOT blocked)."""
+    resp = _to_in_progress_no_desc("bug1")
+    assert resp.status_code == 200
+    # name + description were pulled from the API in one call
+    mock_fields.assert_called_once()
+    # QG-0 passed -> task created and analyst enqueued (issue NOT blocked)
+    assert _count("bug1") == 1
+    assert _task("bug1")["stage"] == "analysis"
+    mock_enqueue.assert_called_once()
+    assert mock_enqueue.call_args.args[0] == "analyst"
+
+
+@patch("src.webhooks.plane.enqueue_job", return_value=1)
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=42)
+@patch("src.plane_sync.fetch_issue_fields", return_value=("", ""))
+def test_status_start_empty_api_still_blocks(
+    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
+):
+    """BUG 1 negative path: if the API also returns empty, QG-0 legitimately
+    fails -> NO task is created (truly empty ticket)."""
+    resp = _to_in_progress_no_desc("bug1-empty")
+    assert resp.status_code == 200
+    mock_fields.assert_called_once()
+    assert _count("bug1-empty") == 0
+    mock_enqueue.assert_not_called()
+
+
+# --------------------------------------------------------------------------- #
+# BUG 2a
+# --------------------------------------------------------------------------- #
+def test_work_item_id_uniqueness():
+    """BUG 2a: if ET-006 is already in tasks, the guard returns the next free
+    id (ET-007), not ET-006 again."""
+    _insert_task("ET-006", "feature/ET-006-gpx-upload", plane_id="old")
+    assert ensure_unique_work_item_id("ET-006", "enduro-trails") == "ET-007"
+
+    # ET-006 AND ET-007 taken -> next free is ET-008.
+    _insert_task("ET-007", "feature/ET-007-something", plane_id="old2")
+    assert ensure_unique_work_item_id("ET-006", "enduro-trails") == "ET-008"
+
+    # A free id is returned unchanged.
+    assert ensure_unique_work_item_id("ET-099", "enduro-trails") == "ET-099"
+
+    # Per-repo isolation: a different repo with the same id is not a collision.
+    assert ensure_unique_work_item_id("ET-006", "other-repo") == "ET-006"
+
+
+@patch("src.webhooks.plane.enqueue_job", return_value=1)
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=6)
+@patch("src.plane_sync.fetch_issue_fields",
+       return_value=("Popup enduro trails feature",
+                     "A sufficiently long description for QG-0 to pass cleanly."))
+def test_collision_reassigns_in_start_pipeline(
+    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
+):
+    """BUG 2a end-to-end: ET-006 already exists -> a new In Progress issue whose
+    Plane sequence_id is also 6 must NOT reuse ET-006."""
+    _insert_task("ET-006", "feature/ET-006-gpx-upload", plane_id="task8")
+    resp = client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "task25", "name": "Popup enduro trails feature",
+            "description_stripped": "A sufficiently long description for QG-0.",
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
+    })
+    assert resp.status_code == 200
+    new_id = _task("task25")["work_item_id"]
+    assert new_id != "ET-006"
+    assert new_id == "ET-007"
+
+
+# --------------------------------------------------------------------------- #
+# BUG 2b
+# --------------------------------------------------------------------------- #
+def test_worktree_per_task():
+    """BUG 2b: two tasks must not resolve to the same worktree path. With the
+    uniqueness guard the branches differ, so the worktree paths differ too."""
+    _insert_task("ET-006", "feature/ET-006-gpx-upload", plane_id="task8")
+    # The second task gets a unique id via the guard...
+    new_id = ensure_unique_work_item_id("ET-006", "enduro-trails")
+    assert new_id == "ET-007"
+    branch_a = "feature/ET-006-gpx-upload"
+    branch_b = f"feature/{new_id}-popup-enduro-trails"
+
+    wt_a = get_worktree_path("enduro-trails", branch_a)
+    wt_b = get_worktree_path("enduro-trails", branch_b)
+    assert wt_a != wt_b, "two tasks must not share a worktree path"
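Reviewer note: the BUG 2a tests above describe a "next free id" guard. The sketch below reproduces that behaviour in memory, assuming ids look like PREFIX-NNN and that the caller passes the set of ids already taken for one repo. The function name next_free_work_item_id is hypothetical; the real ensure_unique_work_item_id reads the tasks table and may differ in detail.

```python
from typing import AbstractSet


def next_free_work_item_id(candidate: str, taken: AbstractSet[str]) -> str:
    """Return candidate if it is free, else the next free id with the same prefix.

    `taken` holds the ids already used in ONE repo, which is how the
    per-repo isolation from the tests is modelled here.
    """
    prefix, _, number = candidate.rpartition("-")
    width = len(number)          # keep the zero padding, e.g. 006 -> 007
    n = int(number)
    while f"{prefix}-{n:0{width}d}" in taken:
        n += 1
    return f"{prefix}-{n:0{width}d}"
```

Because the branch name embeds the work item id, handing out a fresh id is also what keeps two tasks from resolving to the same worktree path (BUG 2b).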
--- a/tests/test_plane_author.py
+++ b/tests/test_plane_author.py
@@ -0,0 +1,99 @@
+"""Tests for per-agent Plane comment authorship (feat: per-agent bot author).
+
+Covers:
+  * _headers_for: role -> bot token; None/unknown/empty token -> shared fallback.
+  * add_comment: author is propagated into the POST headers; no author keeps
+    backward-compatible behaviour (shared orchestrator token).
+
+GET/PATCH calls are intentionally NOT covered here: they stay on the shared
+token by design and are unchanged by this feature.
+"""
+
+import os
+
+# Set env defaults before importing app modules (same convention as the other
+# suites) so config/settings load cleanly without a real .env.
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "shared-token")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+
+from unittest.mock import patch, MagicMock  # noqa: E402
+
+from src import plane_sync  # noqa: E402
+
+
+# --------------------------------------------------------------------------- #
+# _headers_for
+# --------------------------------------------------------------------------- #
+def test_headers_for_known_role_uses_bot_token():
+    """A known role with a configured token -> that bot's X-API-Key."""
+    with patch.dict(plane_sync.PLANE_BOT_TOKENS, {"analyst": "analyst-tok"}, clear=False):
+        assert plane_sync._headers_for("analyst") == {"X-API-Key": "analyst-tok"}
+
+
+def test_headers_for_none_falls_back_to_shared():
+    """author=None -> shared orchestrator headers."""
+    assert plane_sync._headers_for(None) is plane_sync.PLANE_HEADERS
+
+
+def test_headers_for_unknown_role_falls_back_to_shared():
+    """Unknown role -> shared orchestrator headers."""
+    assert plane_sync._headers_for("nope") is plane_sync.PLANE_HEADERS
+
+
+def test_headers_for_empty_token_falls_back_to_shared():
+    """Known role but empty/unconfigured token -> shared orchestrator headers."""
+    with patch.dict(plane_sync.PLANE_BOT_TOKENS, {"tester": ""}, clear=False):
+        assert plane_sync._headers_for("tester") is plane_sync.PLANE_HEADERS
+
+
+def test_headers_for_empty_string_author_falls_back_to_shared():
+    """author='' -> shared orchestrator headers."""
+    assert plane_sync._headers_for("") is plane_sync.PLANE_HEADERS
+
+
+# --------------------------------------------------------------------------- #
+# add_comment
+# --------------------------------------------------------------------------- #
+def _mock_post_ok():
+    resp = MagicMock()
+    resp.raise_for_status.return_value = None
+    return resp
+
+
+def test_add_comment_with_author_posts_with_bot_headers():
+    """add_comment(author='developer') -> httpx.post called with the developer
+    bot's X-API-Key header."""
+    with patch.object(plane_sync, "find_issue_id", return_value="issue-uuid"), \
+         patch.object(plane_sync, "_resolve_project_id", return_value="proj-uuid"), \
+         patch.dict(plane_sync.PLANE_BOT_TOKENS, {"developer": "dev-tok"}, clear=False), \
+         patch.object(plane_sync.httpx, "post", return_value=_mock_post_ok()) as mock_post:
+        plane_sync.add_comment("ET-001", "hello", author="developer")
+
+    assert mock_post.called
+    _, kwargs = mock_post.call_args
+    assert kwargs["headers"] == {"X-API-Key": "dev-tok"}
+
+
+def test_add_comment_without_author_uses_shared_token():
+    """add_comment without author -> shared orchestrator headers (backward
+    compatible)."""
+    with patch.object(plane_sync, "find_issue_id", return_value="issue-uuid"), \
+         patch.object(plane_sync, "_resolve_project_id", return_value="proj-uuid"), \
+         patch.object(plane_sync.httpx, "post", return_value=_mock_post_ok()) as mock_post:
+        plane_sync.add_comment("ET-001", "hello")
+
+    assert mock_post.called
+    _, kwargs = mock_post.call_args
+    assert kwargs["headers"] is plane_sync.PLANE_HEADERS
+
+
+def test_add_comment_unknown_author_uses_shared_token():
+    """add_comment with an unknown role -> shared orchestrator headers."""
+    with patch.object(plane_sync, "find_issue_id", return_value="issue-uuid"), \
+         patch.object(plane_sync, "_resolve_project_id", return_value="proj-uuid"), \
+         patch.object(plane_sync.httpx, "post", return_value=_mock_post_ok()) as mock_post:
+        plane_sync.add_comment("ET-001", "hello", author="ghost")
+
+    assert mock_post.called
+    _, kwargs = mock_post.call_args
+    assert kwargs["headers"] is plane_sync.PLANE_HEADERS
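Reviewer note: the _headers_for tests above assert identity (is PLANE_HEADERS) on every fallback path, so the shared dict must be returned as-is rather than copied. A minimal sketch of a resolver with that property follows. The token values and the public name headers_for are placeholders for illustration; only the X-API-Key header name and the fallback cases come from the tests.

```python
from typing import Dict, Optional

# Placeholder values; the real module loads these from settings.
PLANE_HEADERS: Dict[str, str] = {"X-API-Key": "shared-token"}
PLANE_BOT_TOKENS: Dict[str, str] = {"analyst": "analyst-tok", "tester": ""}


def headers_for(author: Optional[str]) -> Dict[str, str]:
    """Return the bot's headers for a known role with a non-empty token.

    None, an empty string, an unknown role, or an empty token all fall
    back to the shared orchestrator headers (the same object, not a copy).
    """
    token = PLANE_BOT_TOKENS.get(author or "", "")
    if not token:
        return PLANE_HEADERS
    return {"X-API-Key": token}
```

Returning the shared object keeps the no-author path byte-for-byte identical to the behaviour before the feature, which is what the backward-compatibility tests check.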
--- a/tests/test_plane_webhook.py
+++ b/tests/test_plane_webhook.py
@@ -73,16 +73,24 @@ def setup(monkeypatch):
        os.unlink(_test_db)


+# Feature 1: the pipeline now starts on a status change to In Progress (not on
+# creation). _post_created drives that status-change event so these ORCH-6
+# routing tests still exercise task creation through the new trigger.
+_IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
+
+
 def _post_created(plane_project_id, plane_id="wi-1", name="A valid work item title"):
    return client.post(
        "/webhook/plane",
        json={
-            "event": "work_item.created",
+            "event": "issue",
+            "action": "updated",
            "data": {
                "id": plane_id,
                "name": name,
                "description_stripped": "This is a sufficiently long description.",
                "project": plane_project_id,
+                "state": {"id": _IN_PROGRESS, "name": "In Progress", "group": "started"},
            },
        },
    )
--- a/tests/test_qg.py
+++ b/tests/test_qg.py
@@ -17,7 +17,10 @@ from src.qg.checks import (
    check_ci_green,
    check_review_approved,
    check_tests_passed,
+    check_tests_local,
+    check_deploy_status,
 )
+from src.stages import get_qg_for_stage


@pytest.fixture(autouse=True)
@@ -186,3 +189,175 @@ class TestCheckTestsPassed:
        passed, reason = check_tests_passed("enduro-trails", "ET-001")
        assert passed is False
        assert "not found" in reason.lower()
+
+
+class TestCheckDeployStatus:
+    """BUG 8: deploy -> done must be gated on the deployer's machine-readable
+    deploy_status verdict in 14-deploy-log.md frontmatter, NOT the LLM exit code
+    (always 0). Mirrors check_reviewer_verdict (reads ONLY the frontmatter field)."""
+
+    def _write_log(self, repo_dir, content):
+        wi_dir = repo_dir / "docs" / "work-items" / "ET-011"
+        wi_dir.mkdir(parents=True)
+        (wi_dir / "14-deploy-log.md").write_text(content)
+
+    def test_success_verdict_passes(self, setup_work_item_dir):
+        self._write_log(
+            setup_work_item_dir,
+            "---\ndeploy_status: SUCCESS\nversion: v0.0.3\n---\n\nDeployed OK.\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-011")
+        assert passed is True
+        assert "SUCCESS" in reason
+
+    def test_failed_verdict_fails(self, setup_work_item_dir):
+        self._write_log(
+            setup_work_item_dir,
+            "---\ndeploy_status: FAILED\nversion: v0.0.3\n---\n\npermission denied.\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-011")
+        assert passed is False
+        assert "FAILED" in reason
+
+    def test_no_file_fails(self, setup_work_item_dir):
+        passed, reason = check_deploy_status("enduro-trails", "ET-011")
+        assert passed is False
+        assert "not found" in reason.lower()
+
+    def test_no_field_fails(self, setup_work_item_dir):
+        # Frontmatter present but no deploy_status field -> must NOT pass.
+        self._write_log(
+            setup_work_item_dir,
+            "---\nversion: v0.0.3\n---\n\nStatus: FAILED (prose only).\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-011")
+        assert passed is False
+
+    def test_prose_only_no_frontmatter_fails(self, setup_work_item_dir):
+        # Prose mentioning SUCCESS but no machine-readable frontmatter -> fail.
+        self._write_log(
+            setup_work_item_dir,
+            "# Deploy log\n\nStatus: SUCCESS (prose, not frontmatter).\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-011")
+        assert passed is False
+
+    # --- ET-013 path-sync fix: log written to origin/main via separate PR ---
+
+    def test_origin_main_success_passes_when_absent_in_worktree(self, monkeypatch):
+        # Deployer merged 14-deploy-log.md into main via a separate PR; it is NOT
+        # in the feature worktree. Gate must recover it from origin/main -> PASS.
+        # (This is the exact ET-013 regression.)
+        monkeypatch.setattr(
+            "src.qg.checks._deploy_log_from_main",
+            lambda repo, wi: "---\ndeploy_status: SUCCESS\nversion: v0.0.5\n---\n\nLive.\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is True
+        assert "SUCCESS" in reason
+
+    def test_origin_main_failed_fails(self, monkeypatch):
+        # A genuine FAILED log in main must still fail.
+        monkeypatch.setattr(
+            "src.qg.checks._deploy_log_from_main",
+            lambda repo, wi: "---\ndeploy_status: FAILED\nversion: v0.0.5\n---\n\nboom.\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is False
+        assert "FAILED" in reason
+
+    def test_absent_everywhere_fails(self, monkeypatch):
+        # Not in worktree and origin/main lookup yields nothing -> not found.
+        monkeypatch.setattr(
+            "src.qg.checks._deploy_log_from_main", lambda repo, wi: None
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is False
+        assert "not found" in reason.lower()
+
+    @patch("src.qg.checks.subprocess.run")
+    @patch("src.qg.checks.os.path.isdir", return_value=True)
+    def test_fetch_failure_degrades_no_exception(self, mock_isdir, mock_run):
+        # git fetch/show raising (e.g. network) must degrade to "not found",
+        # never propagate an exception out of the gate.
+        import subprocess as _sp
+        mock_run.side_effect = _sp.TimeoutExpired(cmd="git", timeout=30)
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is False
+        assert "not found" in reason.lower()
+
+    def test_worktree_log_short_circuits_main_lookup(self, setup_work_item_dir, monkeypatch):
+        # If the log IS present in the worktree, origin/main must NOT be consulted.
+        self._write_log(
+            setup_work_item_dir,
+            "---\ndeploy_status: SUCCESS\nversion: v0.0.3\n---\n\nDeployed OK.\n",
+        )
+        called = {"n": 0}
+        def _boom(repo, wi):
+            called["n"] += 1
+            return None
+        monkeypatch.setattr("src.qg.checks._deploy_log_from_main", _boom)
+        passed, reason = check_deploy_status("enduro-trails", "ET-011")
+        assert passed is True
+        assert called["n"] == 0
+
+    def test_deploy_stage_qg_is_check_deploy_status(self):
+        assert get_qg_for_stage("deploy") == "check_deploy_status"
+
+    def test_registered_in_qg_checks(self):
+        from src.qg.checks import QG_CHECKS
+        assert QG_CHECKS.get("check_deploy_status") is check_deploy_status
+
+
+class TestDevelopmentStageQG:
+    """BUG 6: development stage QG is now check_ci_green (CI is the authoritative
+    gate), not the deprecated check_tests_local."""
+
+    def test_development_qg_is_check_ci_green(self):
+        assert get_qg_for_stage("development") == "check_ci_green"
+
+    def test_check_tests_local_is_deprecated_and_unwired(self):
+        # Kept in the registry for backward-compat, but not wired to any stage.
+        from src.qg.checks import QG_CHECKS
+        from src.stages import STAGE_TRANSITIONS
+        assert "check_tests_local" in QG_CHECKS
+        wired = {t.get("qg") for t in STAGE_TRANSITIONS.values()}
+        assert "check_tests_local" not in wired
+
+
+class TestCheckTestsLocal:
+    """BUG 5: check_tests_local must run pytest directly (not make, which is
+    not installed in the orchestrator container)."""
+
+    @patch("src.qg.checks.ensure_worktree")
+    @patch("subprocess.run")
+    def test_passes_on_returncode_zero(self, mock_run, mock_wt, tmp_path):
+        mock_wt.return_value = str(tmp_path)
+        mock_run.return_value = MagicMock(returncode=0, stdout="ok", stderr="")
+        passed, reason = check_tests_local("enduro-trails", "feature/ET-001-x")
+        assert passed is True
+        assert reason == "Local tests passed"
+
+    @patch("src.qg.checks.ensure_worktree")
+    @patch("subprocess.run")
+    def test_fails_on_nonzero_returncode(self, mock_run, mock_wt, tmp_path):
+        mock_wt.return_value = str(tmp_path)
+        mock_run.return_value = MagicMock(returncode=1, stdout="boom", stderr="trace")
+        passed, reason = check_tests_local("enduro-trails", "feature/ET-001-x")
+        assert passed is False
+        assert "Local tests failed" in reason
+
+    @patch("src.qg.checks.ensure_worktree")
+    @patch("subprocess.run")
+    def test_invokes_pytest_not_make(self, mock_run, mock_wt, tmp_path):
+        """The subprocess call must be pytest, from src/api, against ../../tests/."""
+        mock_wt.return_value = str(tmp_path)
+        mock_run.return_value = MagicMock(returncode=0, stdout="", stderr="")
+        check_tests_local("enduro-trails", "feature/ET-001-x")
+        args, kwargs = mock_run.call_args
+        cmd = args[0]
+        assert "make" not in cmd
+        assert cmd[:3] == ["python", "-m", "pytest"]
+        assert "../../tests/" in cmd
+        assert kwargs["cwd"] == os.path.join(str(tmp_path), "src", "api")
+
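Reviewer note: TestCheckDeployStatus above fixes two rules: the verdict is read only from the deploy_status frontmatter field, and the lookup order is worktree, then origin/main, then "not found", with loader failures degrading instead of raising. The sketch below models both with an injected loader, assuming a simple key: value frontmatter delimited by --- lines. The names read_deploy_status and check_deploy, and the exact reason strings, are hypothetical; the real check_deploy_status shells out to git and is not reproduced here.

```python
from typing import Callable, Optional, Tuple


def read_deploy_status(text: str) -> Optional[str]:
    """Return the deploy_status value from a closed frontmatter block, else None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # prose only, no frontmatter
    try:
        end = next(i for i, ln in enumerate(lines[1:], start=1) if ln.strip() == "---")
    except StopIteration:
        return None  # frontmatter never closed
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep and key.strip() == "deploy_status":
            return value.strip()
    return None


def check_deploy(
    worktree_log: Optional[str],
    load_from_main: Callable[[], Optional[str]],
) -> Tuple[bool, str]:
    """Lookup order: worktree -> origin/main -> not found. Never raises."""
    text = worktree_log
    if text is None:
        try:
            text = load_from_main()  # stands in for git fetch + git show
        except Exception:
            text = None              # fetch/show failure degrades to "not found"
    if text is None:
        return False, "Deploy log not found"
    status = read_deploy_status(text)
    if status == "SUCCESS":
        return True, "deploy_status: SUCCESS"
    return False, f"deploy_status: {status or 'missing'}"
```

Passing the loader as a callable mirrors how the tests monkeypatch _deploy_log_from_main: the origin/main path is only exercised when the worktree copy is absent.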
--- a/tests/test_queue.py
+++ b/tests/test_queue.py
@@ -0,0 +1,304 @@
+"""Tests for ORCH-1 (F-2b) persistent job queue.
+
+Covers:
+  - enqueue_job -> claim_next_job -> mark_job lifecycle
+  - claim_next_job atomicity (no double-dispatch of the same job)
+  - retry: fail -> requeue while attempts < max_attempts, then failed
+  - requeue_running_jobs (queue-recovery)
+  - count_running_jobs / job_status_counts / recent_jobs
+  - QueueWorker respects max_concurrency (Popen / launch fully mocked)
+
+The real claude/Popen is NEVER spawned: launcher.launch_job is mocked in worker
+tests, and the launcher finalize logic is exercised directly via mark_job.
+"""
+import os
+import tempfile
+
+import pytest
+
+# Override env before importing app modules (same convention as test_qg.py).
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_queue.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ["ORCH_REPOS_DIR"] = tempfile.gettempdir()
+os.environ["ORCH_GITEA_TOKEN"] = "test-token"
+os.environ["ORCH_PLANE_API_TOKEN"] = "test-token"
+
+import src.db as db
+from src.db import (
+    init_db,
+    enqueue_job,
+    claim_next_job,
+    mark_job,
+    count_running_jobs,
+    requeue_running_jobs,
+    get_job,
+    job_status_counts,
+    recent_jobs,
+)
+
+
+@pytest.fixture(autouse=True)
+def fresh_db(tmp_path, monkeypatch):
+    """Point the DB at a fresh per-test sqlite file and init the schema."""
+    dbfile = tmp_path / "queue.db"
+    monkeypatch.setattr(db.settings, "db_path", str(dbfile))
+    init_db()
+    yield
+
+
+# ---------------------------------------------------------------------------
+# enqueue / claim / mark lifecycle
+# ---------------------------------------------------------------------------
+class TestLifecycle:
+    def test_enqueue_creates_queued_job(self):
+        jid = enqueue_job("analyst", "enduro-trails", "task body", task_id=7)
+        job = get_job(jid)
+        assert job["status"] == "queued"
+        assert job["agent"] == "analyst"
+        assert job["repo"] == "enduro-trails"
+        assert job["task_content"] == "task body"
+        assert job["task_id"] == 7
+        assert job["attempts"] == 0
+        assert job["max_attempts"] == 2
+
+    def test_claim_marks_running_and_increments_attempts(self):
+        jid = enqueue_job("developer", "repo")
+        claimed = claim_next_job()
+        assert claimed is not None
+        assert claimed["id"] == jid
+        assert claimed["status"] == "running"
+        assert claimed["attempts"] == 1
+        assert count_running_jobs() == 1
+
+    def test_claim_empty_queue_returns_none(self):
+        assert claim_next_job() is None
+
+    def test_claim_is_fifo(self):
+        a = enqueue_job("analyst", "r")
+        b = enqueue_job("developer", "r")
+        assert claim_next_job()["id"] == a
+        assert claim_next_job()["id"] == b
+
+    def test_mark_done(self):
+        jid = enqueue_job("tester", "r")
+        claim_next_job()
+        mark_job(jid, "done", run_id=42)
+        job = get_job(jid)
+        assert job["status"] == "done"
+        assert job["run_id"] == 42
+        assert job["finished_at"] is not None
+        assert count_running_jobs() == 0
+
+    def test_mark_failed_records_error(self):
+        jid = enqueue_job("tester", "r")
+        claim_next_job()
+        mark_job(jid, "failed", run_id=9, error="boom")
+        job = get_job(jid)
+        assert job["status"] == "failed"
+        assert job["error"] == "boom"
+        assert job["finished_at"] is not None
+
+
+# ---------------------------------------------------------------------------
+# claim atomicity — no double dispatch
+# ---------------------------------------------------------------------------
+class TestClaimAtomicity:
+    def test_single_job_claimed_once(self):
+        jid = enqueue_job("analyst", "r")
+        first = claim_next_job()
+        second = claim_next_job()
+        assert first["id"] == jid
+        assert second is None  # already running, not re-dispatched
+
+    def test_concurrent_claims_no_duplicate(self):
+        """Many enqueued jobs claimed from parallel threads -> each claimed once."""
+        import threading
+
+        n = 20
+        for _ in range(n):
+            enqueue_job("developer", "r")
+
+        claimed_ids = []
+        lock = threading.Lock()
+
+        def grab():
+            while True:
+                job = claim_next_job()
+                if job is None:
+                    return
+                with lock:
+                    claimed_ids.append(job["id"])
+
+        threads = [threading.Thread(target=grab) for _ in range(8)]
+        for t in threads:
+            t.start()
+        for t in threads:
+            t.join()
+
+        assert len(claimed_ids) == n
+        assert len(set(claimed_ids)) == n  # no id claimed twice
+        assert count_running_jobs() == n
+
+
+# ---------------------------------------------------------------------------
+# retry semantics (mirrors launcher._finalize_job logic)
+# ---------------------------------------------------------------------------
+class TestRetry:
+    def test_fail_requeues_while_under_max(self):
+        jid = enqueue_job("developer", "r", max_attempts=2)
+        job = claim_next_job()              # attempts=1
+        assert job["attempts"] == 1
+        # attempts(1) < max(2) -> requeue
+        mark_job(jid, "queued", error="exit 1")
+        j = get_job(jid)
+        assert j["status"] == "queued"
+        assert j["error"] == "exit 1"
+        assert j["started_at"] is None      # requeue clears started_at
+
+    def test_fail_fails_when_max_reached(self):
+        jid = enqueue_job("developer", "r", max_attempts=2)
+        claim_next_job()                    # attempts=1 -> requeue
+        mark_job(jid, "queued")
+        job2 = claim_next_job()             # attempts=2
+        assert job2["attempts"] == 2
+        # attempts(2) >= max(2) -> failed
+        mark_job(jid, "failed", error="exit 1")
+        assert get_job(jid)["status"] == "failed"
+
+    def test_finalize_job_done(self):
+        """launcher._finalize_job marks done on exit_code 0 (no Popen needed)."""
+        from src.agents.launcher import AgentLauncher
+        jid = enqueue_job("analyst", "r")
+        claim_next_job()
+        AgentLauncher()._finalize_job(jid, "analyst", run_id=5, exit_code=0)
+        assert get_job(jid)["status"] == "done"
+
+    def test_finalize_job_requeue_then_fail(self, monkeypatch):
+        from src.agents.launcher import AgentLauncher
+        # Silence telegram side-effect.
+        monkeypatch.setattr("src.notifications.send_telegram", lambda *a, **k: None)
+        lr = AgentLauncher()
+        jid = enqueue_job("developer", "r", max_attempts=2)
+
+        claim_next_job()                    # attempts=1
+        lr._finalize_job(jid, "developer", run_id=1, exit_code=2)
+        assert get_job(jid)["status"] == "queued"  # 1 < 2 -> requeue
+
+        claim_next_job()                    # attempts=2
+        lr._finalize_job(jid, "developer", run_id=2, exit_code=2)
+        assert get_job(jid)["status"] == "failed"  # 2 >= 2 -> failed
+
+
+# ---------------------------------------------------------------------------
+# queue-recovery
+# ---------------------------------------------------------------------------
+class TestRequeueRunning:
+    def test_requeue_running_jobs(self):
+        a = enqueue_job("analyst", "r")
+        b = enqueue_job("developer", "r")
+        claim_next_job()  # a -> running
+        claim_next_job()  # b -> running
+        assert count_running_jobs() == 2
+        n = requeue_running_jobs()
+        assert n == 2
+        assert count_running_jobs() == 0
+        assert get_job(a)["status"] == "queued"
+        assert get_job(b)["status"] == "queued"
+
+    def test_requeue_preserves_attempts(self):
+        jid = enqueue_job("analyst", "r")
+        claim_next_job()  # attempts=1
+        requeue_running_jobs()
+        assert get_job(jid)["attempts"] == 1  # not reset
+
+
+# ---------------------------------------------------------------------------
+# observability helpers
+# ---------------------------------------------------------------------------
+class TestObservability:
+    def test_status_counts(self):
+        enqueue_job("analyst", "r")        # enqueued first -> claimed (FIFO)
+        enqueue_job("developer", "r")      # stays queued
+        claim_next_job()
+        counts = job_status_counts()
+        assert counts["running"] == 1
+        assert counts["queued"] == 1
+        assert counts["done"] == 0
+        assert counts["failed"] == 0
+
+    def test_recent_jobs_desc(self):
+        ids = [enqueue_job("analyst", "r") for _ in range(3)]
+        recent = recent_jobs(10)
+        assert [r["id"] for r in recent] == sorted(ids, reverse=True)
+
+
+# ---------------------------------------------------------------------------
+# QueueWorker max_concurrency (launch_job fully mocked — no real Popen)
+# ---------------------------------------------------------------------------
+class TestWorkerConcurrency:
+    @pytest.fixture(autouse=True)
+    def _ok_preflight(self, monkeypatch):
+        # ORCH-1 resilience: the worker gates claims behind preflight; in tests there
+        # is no claude binary, so stub preflight OK to exercise pure queue/concurrency.
+        monkeypatch.setattr("src.queue_worker.preflight.check", lambda *a, **k: (True, "ok"))
+
+    def test_worker_respects_max_concurrency(self, monkeypatch):
+        from src.queue_worker import QueueWorker
+
+        launched = []
+
+        def fake_launch_job(job):
+            # Simulate a long-running agent: the job stays 'running' (we do NOT
+            # mark it done), so the slot remains occupied.
+            launched.append(job["id"])
+            return 100 + job["id"]
+
+        monkeypatch.setattr("src.queue_worker.launcher.launch_job", fake_launch_job)
+
+        for _ in range(5):
+            enqueue_job("developer", "r")
+
+        w = QueueWorker(max_concurrency=2, poll_interval=0.01)
+        w._drain_once()
+
+        # Only max_concurrency jobs may be launched / running at once.
+        assert len(launched) == 2
+        assert count_running_jobs() == 2
+
+    def test_worker_drains_as_slots_free(self, monkeypatch):
+        from src.queue_worker import QueueWorker
+
+        def fake_launch_job(job):
+            # Immediately complete the job so the slot frees for the next claim.
+            mark_job(job["id"], "done", run_id=job["id"])
+            return job["id"]
+
+        monkeypatch.setattr("src.queue_worker.launcher.launch_job", fake_launch_job)
+
+        for _ in range(4):
+            enqueue_job("analyst", "r")
+
+        w = QueueWorker(max_concurrency=1, poll_interval=0.01)
+        w._drain_once()
+
+        # With instant completion and concurrency 1, one drain pass empties the queue.
+        assert job_status_counts()["done"] == 4
+        assert count_running_jobs() == 0
+
+    def test_worker_launch_failure_does_not_wedge_slot(self, monkeypatch):
+        from src.queue_worker import QueueWorker
+
+        def boom(job):
+            raise RuntimeError("repo missing")
+
+        monkeypatch.setattr("src.queue_worker.launcher.launch_job", boom)
+        monkeypatch.setattr("src.notifications.send_telegram", lambda *a, **k: None)
+
+        enqueue_job("developer", "r", max_attempts=1)
+        w = QueueWorker(max_concurrency=1, poll_interval=0.01)
+        w._drain_once()
+
+        # attempts=1 >= max_attempts=1 -> failed, not stuck running.
+        assert count_running_jobs() == 0
+        counts = job_status_counts()
+        assert counts["failed"] == 1
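The claim-atomicity tests above expect that a queued job is handed out exactly once, even when several threads call `claim_next_job()` at the same time. The real implementation lives in `src/db.py` and is not shown in this diff; the following is only a minimal sketch of one way to get that guarantee with sqlite, assuming a simplified `jobs` table. The helper names `connect`, `enqueue`, and `claim_next` are hypothetical.

```python
import sqlite3

# Hypothetical, simplified schema; the real jobs table has more columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    agent TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'queued',
    attempts INTEGER NOT NULL DEFAULT 0
)
"""


def connect(path):
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are opened and closed explicitly below.
    conn = sqlite3.connect(path, timeout=30, isolation_level=None)
    conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    return conn


def enqueue(conn, agent):
    return conn.execute("INSERT INTO jobs (agent) VALUES (?)", (agent,)).lastrowid


def claim_next(conn):
    """Claim the oldest queued job, or return None if nothing is queued."""
    # BEGIN IMMEDIATE takes the write lock up front, so the SELECT and the
    # UPDATE cannot interleave with another claimer's SELECT/UPDATE pair.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            conn.execute("COMMIT")
            return None
        conn.execute(
            "UPDATE jobs SET status = 'running', attempts = attempts + 1 WHERE id = ?",
            (row["id"],),
        )
        job = dict(conn.execute("SELECT * FROM jobs WHERE id = ?", (row["id"],)).fetchone())
        conn.execute("COMMIT")
        return job
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

With a file-backed database, each thread would open its own connection via `connect(path)`; the write lock taken by `BEGIN IMMEDIATE` then serializes concurrent claimers.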
--- a/tests/test_resilience.py
+++ b/tests/test_resilience.py
@@ -0,0 +1,295 @@
+"""ORCH-1 resilience tests: preflight, 429-classifier, backoff, circuit breaker.
+
+No real claude/Popen is ever spawned: preflight subprocess and launcher.launch_job
+are mocked. DB is a fresh per-test sqlite file.
+"""
+import os
+import tempfile
+
+import pytest
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_resilience.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ["ORCH_REPOS_DIR"] = tempfile.gettempdir()
+os.environ["ORCH_GITEA_TOKEN"] = "test-token"
+os.environ["ORCH_PLANE_API_TOKEN"] = "test-token"
+
+import src.db as db
+from src.db import (
+    init_db, enqueue_job, claim_next_job, get_job, count_running_jobs,
+    mark_job_transient,
+)
+from src import preflight, error_classifier
+from src.error_classifier import classify_text, parse_retry_after, classify_log_file
+from src.queue_worker import QueueWorker, CircuitBreaker
+from src.agents.launcher import AgentLauncher
+
+
+@pytest.fixture(autouse=True)
+def fresh_db(tmp_path, monkeypatch):
+    monkeypatch.setattr(db.settings, "db_path", str(tmp_path / "res.db"))
+    init_db()
+    preflight.reset_cache()
+    yield
+
+
+# ---------------------------------------------------------------------------
+# A. Preflight
+# ---------------------------------------------------------------------------
+class TestPreflight:
+    def test_fail_when_bin_missing(self, monkeypatch):
+        monkeypatch.setattr(preflight, "_claude_bin", lambda: "/no/such/claude")
+        ok, reason = preflight.check(force=True)
+        assert ok is False
+        assert "not found" in reason.lower()
+
+    def test_ok_when_version_succeeds(self, monkeypatch, tmp_path):
+        fake_bin = tmp_path / "claude"
+        fake_bin.write_text("#!/bin/sh\necho v1\n")
+        monkeypatch.setattr(preflight, "_claude_bin", lambda: str(fake_bin))
+        monkeypatch.setattr(preflight, "_run_version", lambda b: (True, "1.2.3"))
+        ok, reason = preflight.check(force=True)
+        assert ok is True
+
+    def test_cache_does_not_recheck_within_ttl(self, monkeypatch, tmp_path):
+        fake_bin = tmp_path / "claude"
+        fake_bin.write_text("x")
+        monkeypatch.setattr(preflight, "_claude_bin", lambda: str(fake_bin))
+        monkeypatch.setattr(db.settings, "preflight_cache_ttl", 999)
+
+        calls = {"n": 0}
+
+        def counting_version(b):
+            calls["n"] += 1
+            return True, "ok"
+
+        monkeypatch.setattr(preflight, "_run_version", counting_version)
+        preflight.reset_cache()
+        preflight.check()           # first -> runs version
+        preflight.check()           # cached -> no extra version call
+        preflight.check()
+        assert calls["n"] == 1
+
+    def test_force_bypasses_cache(self, monkeypatch, tmp_path):
+        fake_bin = tmp_path / "claude"
+        fake_bin.write_text("x")
+        monkeypatch.setattr(preflight, "_claude_bin", lambda: str(fake_bin))
+        calls = {"n": 0}
+        monkeypatch.setattr(preflight, "_run_version",
+                            lambda b: (calls.__setitem__("n", calls["n"] + 1), (True, "ok"))[1])
+        preflight.reset_cache()
+        preflight.check()
+        preflight.check(force=True)
+        assert calls["n"] == 2
+
+    def test_worker_does_not_claim_when_preflight_fails(self, monkeypatch):
+        # Preflight FAIL -> job stays queued, launch_job never called.
+        monkeypatch.setattr("src.queue_worker.preflight.check",
+                            lambda *a, **k: (False, "down"))
+        called = {"launch": False}
+        monkeypatch.setattr("src.queue_worker.launcher.launch_job",
+                            lambda job: called.__setitem__("launch", True))
+        jid = enqueue_job("analyst", "r")
+        QueueWorker(max_concurrency=1, poll_interval=0.01)._drain_once()
+        assert called["launch"] is False
+        assert get_job(jid)["status"] == "queued"
+        assert count_running_jobs() == 0
+
+
+# ---------------------------------------------------------------------------
+# B. Error classifier
+# ---------------------------------------------------------------------------
+class TestClassifier:
+    @pytest.mark.parametrize("text", [
+        "Error: 429 Too Many Requests",
+        "anthropic rate limit exceeded",
+        "overloaded_error: server is overloaded",
+        "API quota exhausted",
+        "503 Service Unavailable",
+        "connection reset by peer",
+    ])
+    def test_transient_patterns(self, text):
+        assert classify_text(text) == "transient"
+
+    @pytest.mark.parametrize("text", [
+        "Traceback: KeyError 'foo'",
+        "SyntaxError: invalid syntax",
+        "assertion failed in test",
+        "",
+    ])
+    def test_permanent_patterns(self, text):
+        assert classify_text(text) == "permanent"
+
+    def test_retry_after_header(self):
+        assert parse_retry_after("HTTP/1.1 429\nRetry-After: 42\n") == 42
+
+    def test_retry_after_json(self):
+        assert parse_retry_after('{"error":{"type":"rate_limit","retry_after": 7}}') == 7
+
+    def test_retry_after_absent(self):
+        assert parse_retry_after("just an error") is None
+
+    def test_classify_log_file(self, tmp_path):
+        p = tmp_path / "run.log"
+        p.write_text("...lots of output...\n429 rate limit. Retry-After: 30\n")
+        kind, ra = classify_log_file(str(p))
+        assert kind == "transient"
+        assert ra == 30
+
+    def test_classify_missing_file_is_permanent(self):
+        kind, ra = classify_log_file("/no/such/log")
+        assert kind == "permanent"
+        assert ra is None
+
+
+# ---------------------------------------------------------------------------
+# C. Backoff + available_at gating
+# ---------------------------------------------------------------------------
+class TestBackoff:
+    def test_backoff_grows_exponentially(self):
+        lr = AgentLauncher()
+        # base=10, cap=600 (defaults)
+        b1 = lr._backoff_seconds(1)
+        b2 = lr._backoff_seconds(2)
+        b3 = lr._backoff_seconds(3)
+        assert b1 == 20      # 2^1*10
+        assert b2 == 40      # 2^2*10
+        assert b3 == 80      # 2^3*10
+        assert b2 > b1 and b3 > b2
+
+    def test_backoff_capped(self):
+        lr = AgentLauncher()
+        assert lr._backoff_seconds(20) == 600  # capped at backoff_max_seconds
+
+    def test_retry_after_respected_when_larger(self):
+        lr = AgentLauncher()
+        # transient_attempts=1 -> base backoff 20; Retry-After=120 wins.
+        assert lr._backoff_seconds(1, retry_after=120) == 120
+
+    def test_retry_after_ignored_when_smaller(self):
+        lr = AgentLauncher()
+        assert lr._backoff_seconds(3, retry_after=5) == 80  # backoff bigger
+
+    def test_transient_requeue_sets_future_available_at_and_claim_skips(self):
+        jid = enqueue_job("developer", "r")
+        claim_next_job()
+        # Big backoff -> available_at far in the future.
+        mark_job_transient(jid, 3600, error="429")
+        job = get_job(jid)
+        assert job["status"] == "queued"
+        assert job["transient_attempts"] == 1
+        assert job["available_at"] is not None
+        # claim must NOT pick it up while available_at is in the future.
+        assert claim_next_job() is None
+
+    def test_transient_requeue_claimable_when_due(self):
+        jid = enqueue_job("developer", "r")
+        claim_next_job()
+        mark_job_transient(jid, -5, error="429")  # available_at in the past
+        c = claim_next_job()
+        assert c is not None and c["id"] == jid
+
+
+# ---------------------------------------------------------------------------
+# D. Launcher transient/permanent finalize (no Popen)
+# ---------------------------------------------------------------------------
+class TestFinalizeClassified:
+    def test_transient_failure_backoff_requeue(self, tmp_path, monkeypatch):
+        monkeypatch.setattr("src.notifications.send_telegram", lambda *a, **k: None)
+        log = tmp_path / "1.log"
+        log.write_text("Error 429 rate limit exceeded\n")
+        jid = enqueue_job("developer", "r", max_attempts=2)
+        claim_next_job()
+        AgentLauncher()._finalize_job(jid, "developer", run_id=1, exit_code=1,
+                                      output_path=str(log))
+        job = get_job(jid)
+        assert job["status"] == "queued"
+        assert job["transient_attempts"] == 1
+        assert job["available_at"] is not None     # backoff-gated
+        assert job["attempts"] == 1                 # code-fault budget NOT burned
+
+    def test_permanent_failure_uses_normal_attempts(self, tmp_path, monkeypatch):
+        monkeypatch.setattr("src.notifications.send_telegram", lambda *a, **k: None)
+        log = tmp_path / "2.log"
+        log.write_text("Traceback: ValueError\n")
+        jid = enqueue_job("developer", "r", max_attempts=2)
+        claim_next_job()
+        AgentLauncher()._finalize_job(jid, "developer", run_id=2, exit_code=1,
+                                      output_path=str(log))
+        job = get_job(jid)
+        assert job["status"] == "queued"
+        assert job["transient_attempts"] == 0       # not transient
+        assert job["available_at"] is None          # no backoff for code-fault
+
+    def test_transient_exhausts_to_failed(self, tmp_path, monkeypatch):
+        monkeypatch.setattr("src.notifications.send_telegram", lambda *a, **k: None)
+        monkeypatch.setattr(db.settings, "transient_max_attempts", 2)
+        log = tmp_path / "3.log"
+        log.write_text("overloaded_error\n")
+        lr = AgentLauncher()
+        jid = enqueue_job("developer", "r")
+        claim_next_job()
+        lr._finalize_job(jid, "developer", 1, exit_code=1, output_path=str(log))
+        assert get_job(jid)["status"] == "queued"   # transient 1 -> requeue
+        # Make the job due immediately; this call also bumps transient_attempts to 2.
+        mark_job_transient(jid, -1)
+        claim_next_job()
+        lr._finalize_job(jid, "developer", 2, exit_code=1, output_path=str(log))
+        assert get_job(jid)["status"] == "failed"    # transient budget exhausted
+
+
+# ---------------------------------------------------------------------------
+# E. Circuit breaker
+# ---------------------------------------------------------------------------
+class TestCircuitBreaker:
+    def test_opens_after_threshold(self):
+        cb = CircuitBreaker(threshold=3, pause_seconds=300)
+        assert cb.allow_claim() is True
+        cb.record_transient()
+        cb.record_transient()
+        assert cb.state == "closed"
+        cb.record_transient()                 # 3rd -> open
+        assert cb.state == "open"
+        assert cb.allow_claim() is False      # paused, no CLI calls
+
+    def test_recovered_resets_streak(self):
+        cb = CircuitBreaker(threshold=3)
+        cb.record_transient()
+        cb.record_transient()
+        cb.record_recovered()
+        assert cb.consecutive_transient == 0
+        assert cb.state == "closed"
+
+    def test_half_open_after_pause_then_closed_on_success(self):
+        cb = CircuitBreaker(threshold=2, pause_seconds=300)
+        cb.record_transient()
+        cb.record_transient()                 # open
+        assert cb.state == "open"
+        # Simulate the pause elapsing.
+        cb.opened_at -= 301
+        assert cb.allow_claim() is True       # -> half-open (probe)
+        assert cb.state == "half-open"
+        cb.record_recovered()                 # probe succeeded
+        assert cb.state == "closed"
+
+    def test_half_open_reopens_on_transient(self):
+        cb = CircuitBreaker(threshold=2, pause_seconds=300)
+        cb.record_transient(); cb.record_transient()   # open
+        cb.opened_at -= 301
+        cb.allow_claim()                      # half-open
+        assert cb.state == "half-open"
+        cb.record_transient()                 # probe failed -> re-open
+        assert cb.state == "open"
+
+    def test_breaker_blocks_worker_claim(self, monkeypatch):
+        monkeypatch.setattr("src.queue_worker.preflight.check",
+                            lambda *a, **k: (True, "ok"))
+        called = {"launch": False}
+        monkeypatch.setattr("src.queue_worker.launcher.launch_job",
+                            lambda job: called.__setitem__("launch", True))
+        cb = CircuitBreaker(threshold=1, pause_seconds=300)
+        cb.record_transient()                 # open immediately
+        w = QueueWorker(max_concurrency=1, poll_interval=0.01, breaker=cb)
+        enqueue_job("analyst", "r")
+        w._drain_once()
+        assert called["launch"] is False      # breaker open -> no claim, no CLI
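The backoff and circuit-breaker tests above pin down the observable behavior: delays of 20/40/80 seconds capped at 600, a `Retry-After` value that wins only when it is larger, and a closed -> open -> half-open -> closed/open state cycle. The actual code is in `src/agents/launcher.py` and `src/queue_worker.py`, which this section does not show. The sketch below is an assumption-laden reconstruction that satisfies those assertions: the free function `backoff_seconds`, the `clock` parameter, and all internals are hypothetical, and whether a large `Retry-After` should also be capped is not settled by the tests.

```python
import time


def backoff_seconds(transient_attempts, retry_after=None, base=10, cap=600):
    """Exponential backoff: base * 2^attempts, capped, never below Retry-After."""
    delay = min(cap, base * 2 ** transient_attempts)
    if retry_after is not None:
        # Assumption: a server-provided Retry-After is honored even above the cap.
        delay = max(delay, retry_after)
    return delay


class CircuitBreaker:
    """Pauses claims after `threshold` consecutive transient failures."""

    def __init__(self, threshold=3, pause_seconds=300, clock=time.monotonic):
        self.threshold = threshold
        self.pause_seconds = pause_seconds
        self.clock = clock  # hypothetical injection point for tests
        self.state = "closed"
        self.consecutive_transient = 0
        self.opened_at = None

    def allow_claim(self):
        if self.state == "open":
            if self.clock() - self.opened_at >= self.pause_seconds:
                # Pause elapsed: let one probe through.
                self.state = "half-open"
                return True
            return False
        return True

    def record_transient(self):
        self.consecutive_transient += 1
        # A failed probe re-opens immediately; otherwise open at the threshold.
        if self.state == "half-open" or self.consecutive_transient >= self.threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def record_recovered(self):
        self.consecutive_transient = 0
        self.state = "closed"
        self.opened_at = None
```

Keeping the breaker in memory (rather than in the DB) matches how the tests construct it and pass it to `QueueWorker(breaker=cb)`; whether the real worker persists breaker state across restarts is not visible here.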
--- a/tests/test_stage_engine.py
+++ b/tests/test_stage_engine.py
@@ -0,0 +1,543 @@
+"""ORCH-4 / M-3: tests for the unified stage engine (src/stage_engine.advance_stage).
+
+These verify the MERGED behavior of what used to be two diverged
+_try_advance_stage implementations (launcher sync + plane async):
+
+  * happy-path advance for every stage launches the CORRECT agent
+    (the ORCH-4 fix: agent = get_agent_for_stage(current_stage), NOT next_stage);
+  * a QG failure does not advance;
+  * reviewer REQUEST_CHANGES -> rollback to development + enqueue developer;
+  * developer retries > 3 -> telegram alert, no further enqueue;
+  * tester FAIL -> rollback to development + enqueue developer;
+  * architect conflict (10-conflict.md) -> rollback to analysis + enqueue analyst;
+  * launcher AND plane both delegate to the engine.
+
+Network/Plane/Telegram side effects are mocked at the src.stage_engine level so
+the engine runs against a real isolated sqlite DB.
+"""
+
+import os
+import tempfile
+
+import pytest
+
+# Isolated test DB (same convention as the other suites).
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_stage_engine.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ["ORCH_REPOS_DIR"] = tempfile.gettempdir()
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+from unittest.mock import MagicMock, patch  # noqa: E402
+
+import src.db as _db  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import stage_engine  # noqa: E402
+from src.stage_engine import advance_stage  # noqa: E402
+from src.stages import get_agent_for_stage  # noqa: E402
+
+
+# ---------------------------------------------------------------------------
+# Fixtures
+# ---------------------------------------------------------------------------
+@pytest.fixture(autouse=True)
+def fresh_db(monkeypatch):
+    """Fresh isolated DB per test."""
+    monkeypatch.setattr(_db.settings, "db_path", _test_db)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    yield
+
+
+@pytest.fixture(autouse=True)
+def silence_side_effects(monkeypatch):
+    """Mock all Plane/Telegram/notification side effects in the engine.
+
+    Everything imported into src.stage_engine that touches the network or sends
+    a message becomes a no-op MagicMock so tests are deterministic and offline.
+    """
+    for name in (
+        "notify_stage_change",
+        "notify_qg_failure",
+        "notify_approve_requested",
+        "send_telegram",
+        "plane_notify_stage",
+        "plane_notify_qg",
+        "plane_add_comment",
+        "set_issue_in_review",
+        "set_issue_needs_input",
+        "set_issue_in_progress",
+        "set_issue_blocked",
+        "set_issue_done",
+    ):
+        monkeypatch.setattr(stage_engine, name, MagicMock())
+
+
+def _make_task(stage, repo="enduro-trails", branch="feature/ET-001-x", wi="ET-001"):
+    conn = get_db()
+    cur = conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage) "
+        "VALUES (?, ?, ?, ?, ?)",
+        (f"plane-{wi}", wi, repo, branch, stage),
+    )
+    task_id = cur.lastrowid
+    conn.commit()
+    conn.close()
+    return task_id
+
+
+def _stage(task_id):
+    conn = get_db()
+    row = conn.execute("SELECT stage FROM tasks WHERE id=?", (task_id,)).fetchone()
+    conn.close()
+    return row[0]
+
+
+def _jobs():
+    conn = get_db()
+    rows = conn.execute("SELECT agent, repo, task_id FROM jobs ORDER BY id").fetchall()
+    conn.close()
+    return [dict(r) for r in rows]
+
+
+def _add_developer_runs(task_id, n):
+    conn = get_db()
+    for _ in range(n):
+        conn.execute(
+            "INSERT INTO agent_runs (task_id, agent) VALUES (?, 'developer')",
+            (task_id,),
+        )
+    conn.commit()
+    conn.close()
+
+
+def _pass(*a, **k):
+    return (True, "ok")
+
+
+def _fail(reason):
+    def _f(*a, **k):
+        return (False, reason)
+    return _f
+
+
+# ---------------------------------------------------------------------------
+# Happy path: each stage advances and launches the CORRECT agent (ORCH-4 fix)
+# ---------------------------------------------------------------------------
+class TestHappyPathAgentSelection:
+    """Agent selection after the ORCH-4 fix: when advancing FROM current_stage, the
+    engine must enqueue get_agent_for_stage(current_stage), NOT the agent for next_stage.
+    """
+
+    @pytest.mark.parametrize(
+        "current_stage,expected_next,expected_agent",
+        [
+            ("architecture", "development", "developer"),
+            ("development", "review", "reviewer"),
+            ("review", "testing", "tester"),
+            ("testing", "deploy", "deployer"),
+        ],
+    )
+    def test_advance_launches_current_stage_agent(
+        self, monkeypatch, current_stage, expected_next, expected_agent
+    ):
+        # All QG checks pass for this happy-path suite.
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {k: _pass for k in stage_engine.QG_CHECKS},
+        )
+        task_id = _make_task(current_stage)
+
+        res = advance_stage(
+            task_id, current_stage, "enduro-trails", "ET-001",
+            "feature/ET-001-x", finished_agent=None,
+        )
+
+        assert res.advanced is True
+        assert res.to_stage == expected_next
+        assert _stage(task_id) == expected_next
+        # The ORCH-4 fix: correct agent == get_agent_for_stage(current_stage).
+        assert expected_agent == get_agent_for_stage(current_stage)
+        assert res.enqueued_agent == expected_agent
+        jobs = _jobs()
+        assert len(jobs) == 1
+        assert jobs[0]["agent"] == expected_agent
+
+    def test_deploy_to_done_no_agent(self, monkeypatch):
+        """deploy -> done advances but launches no agent (terminal-ish)."""
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {k: _pass for k in stage_engine.QG_CHECKS},
+        )
+        task_id = _make_task("deploy")
+        res = advance_stage(task_id, "deploy", "enduro-trails", "ET-001",
+                            "feature/ET-001-x", finished_agent=None)
+        assert res.advanced is True
+        assert _stage(task_id) == "done"
+        assert res.enqueued_agent is None
+        assert _jobs() == []
+
+    def test_deploy_success_syncs_plane_to_terminal_done(self, monkeypatch):
+        """FIX 3: a successful deploy->done forces the Plane issue to terminal Done.
+
+        Previously the task could stick on In Progress because the merge webhook
+        completed it out-of-band. Now the engine drives set_issue_done() on the
+        deploy->done success transition.
+        """
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {k: _pass for k in stage_engine.QG_CHECKS},
+        )
+        task_id = _make_task("deploy", wi="ET-012")
+        res = advance_stage(
+            task_id, "deploy", "enduro-trails", "ET-012",
+            "feature/ET-012-x", finished_agent="deployer",
+        )
+        assert res.advanced is True
+        assert _stage(task_id) == "done"
+        # The terminal Plane sync was invoked with the work item id.
+        stage_engine.set_issue_done.assert_called_once_with("ET-012")
+
+    def test_non_terminal_advance_does_not_force_plane_done(self, monkeypatch):
+        """set_issue_done must only fire on the terminal deploy->done transition."""
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {k: _pass for k in stage_engine.QG_CHECKS},
+        )
+        task_id = _make_task("review")
+        advance_stage(
+            task_id, "review", "enduro-trails", "ET-001",
+            "feature/ET-001-x", finished_agent=None,
+        )
+        stage_engine.set_issue_done.assert_not_called()
+
+    def test_done_is_terminal(self):
+        task_id = _make_task("done")
+        res = advance_stage(task_id, "done", "enduro-trails", "ET-001",
+                            "feature/ET-001-x", finished_agent=None)
+        assert res.advanced is False
+        assert _stage(task_id) == "done"
+
+
+# ---------------------------------------------------------------------------
+# QG failure: do not advance
+# ---------------------------------------------------------------------------
+class TestQgFailureDoesNotAdvance:
+    def test_qg_fail_keeps_stage(self, monkeypatch):
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_architecture_done": _fail("not done")},
+        )
+        task_id = _make_task("architecture")
+        res = advance_stage(task_id, "architecture", "enduro-trails", "ET-001",
+                            "feature/ET-001-x", finished_agent="architect")
+        assert res.advanced is False
+        assert res.qg_passed is False
+        assert _stage(task_id) == "architecture"
+        assert _jobs() == []
+
+    def test_webhook_path_emits_qg_failure_notification(self, monkeypatch):
+        """finished_agent=None -> generic QG-failure notification fires (plane parity).
+
+        development stage QG is now check_ci_green (was check_tests_local).
+        """
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_ci_green": _fail("ci red")},
+        )
+        task_id = _make_task("development")
+        advance_stage(task_id, "development", "enduro-trails", "ET-001",
+                     "feature/ET-001-x", finished_agent=None)
+        assert stage_engine.notify_qg_failure.called
+        assert stage_engine.plane_notify_qg.called
+
+    def test_launcher_path_no_generic_qg_notification(self, monkeypatch):
+        """finished_agent set -> NO generic QG notification (launcher parity)."""
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_architecture_done": _fail("not done")},
+        )
+        task_id = _make_task("architecture")
+        advance_stage(task_id, "architecture", "enduro-trails", "ET-001",
+                     "feature/ET-001-x", finished_agent="architect")
+        assert not stage_engine.notify_qg_failure.called
+
+
+# ---------------------------------------------------------------------------
+# Reviewer REQUEST_CHANGES -> rollback to development + enqueue developer
+# ---------------------------------------------------------------------------
+class TestReviewerRequestChanges:
+    def test_rollback_and_enqueue_developer(self, monkeypatch):
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS,
+             "check_reviewer_verdict": _fail("verdict: REQUEST_CHANGES")},
+        )
+        task_id = _make_task("review")
+        res = advance_stage(task_id, "review", "enduro-trails", "ET-001",
+                           "feature/ET-001-x", finished_agent="reviewer")
+        assert res.advanced is False
+        assert res.rolled_back_to == "development"
+        assert _stage(task_id) == "development"
+        jobs = _jobs()
+        assert len(jobs) == 1
+        assert jobs[0]["agent"] == "developer"
+
+    def test_retry_over_3_alerts_no_enqueue(self, monkeypatch):
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS,
+             "check_reviewer_verdict": _fail("verdict: REQUEST_CHANGES")},
+        )
+        task_id = _make_task("review")
+        _add_developer_runs(task_id, 3)  # already at the max
+        res = advance_stage(task_id, "review", "enduro-trails", "ET-001",
+                           "feature/ET-001-x", finished_agent="reviewer")
+        assert res.rolled_back_to == "development"
+        assert res.alerted is True
+        assert stage_engine.send_telegram.called
+        # No new developer job enqueued past the retry cap.
+        assert _jobs() == []
+
+
+# ---------------------------------------------------------------------------
+# Tester FAIL -> rollback to development + enqueue developer
+# ---------------------------------------------------------------------------
+class TestTesterFail:
+    def test_rollback_and_enqueue_developer(self, monkeypatch):
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_tests_passed": _fail("2 tests failed")},
+        )
+        task_id = _make_task("testing")
+        res = advance_stage(task_id, "testing", "enduro-trails", "ET-001",
+                           "feature/ET-001-x", finished_agent="tester")
+        assert res.advanced is False
+        assert res.rolled_back_to == "development"
+        assert _stage(task_id) == "development"
+        jobs = _jobs()
+        assert len(jobs) == 1
+        assert jobs[0]["agent"] == "developer"
+
+    def test_retry_over_3_blocks_and_alerts(self, monkeypatch):
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_tests_passed": _fail("still failing")},
+        )
+        task_id = _make_task("testing")
+        _add_developer_runs(task_id, 3)
+        res = advance_stage(task_id, "testing", "enduro-trails", "ET-001",
+                            "feature/ET-001-x", finished_agent="tester")
+        assert res.rolled_back_to == "development"
+        assert res.alerted is True
+        assert stage_engine.set_issue_blocked.called
+        assert _jobs() == []
+
+
+# ---------------------------------------------------------------------------
+# BUG 8: deploy verdict gates deploy -> done (not the LLM exit code)
+# ---------------------------------------------------------------------------
+class TestDeployVerdict:
+    """deploy -> done must be gated on check_deploy_status (the deployer's
+    machine-readable verdict), NOT on the LLM exit code (always 0)."""
+
+    def test_failed_verdict_rolls_back_to_development(self, monkeypatch):
+        # deployer finished (exit_code 0 from launcher), but verdict is FAILED.
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS,
+             "check_deploy_status": _fail("Deploy status: FAILED")},
+        )
+        task_id = _make_task("deploy")
+        res = advance_stage(task_id, "deploy", "enduro-trails", "ET-011",
+                            "feature/ET-011-x", finished_agent="deployer")
+        assert res.advanced is False
+        assert res.rolled_back_to == "development"
+        assert _stage(task_id) == "development"   # NOT done
+        assert res.alerted is True
+        assert stage_engine.set_issue_blocked.called
+        assert stage_engine.send_telegram.called
+
+    def test_no_deploy_log_rolls_back(self, monkeypatch):
+        # No frontmatter field / no file -> check returns False -> rollback.
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS,
+             "check_deploy_status": _fail("Deploy log not found (14-deploy-log.md)")},
+        )
+        task_id = _make_task("deploy")
+        res = advance_stage(task_id, "deploy", "enduro-trails", "ET-011",
+                            "feature/ET-011-x", finished_agent="deployer")
+        assert res.advanced is False
+        assert _stage(task_id) == "development"
+
+    def test_success_verdict_advances_to_done(self, monkeypatch):
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS,
+             "check_deploy_status": _pass},
+        )
+        task_id = _make_task("deploy")
+        res = advance_stage(task_id, "deploy", "enduro-trails", "ET-011",
+                            "feature/ET-011-x", finished_agent="deployer")
+        assert res.advanced is True
+        assert res.to_stage == "done"
+        assert _stage(task_id) == "done"
+        assert res.enqueued_agent is None   # no agent leaves deploy
+        assert _jobs() == []
+
+
+# ---------------------------------------------------------------------------
+# Architect conflict -> rollback to analysis + enqueue analyst
+# ---------------------------------------------------------------------------
+class TestArchitectConflict:
+    def test_conflict_rolls_back_to_analysis(self, monkeypatch, tmp_path):
+        # 10-conflict.md must exist in the worktree path the engine inspects.
+        wt = tmp_path / "wt"
+        conflict_dir = wt / "docs" / "work-items" / "ET-001"
+        conflict_dir.mkdir(parents=True)
+        (conflict_dir / "10-conflict.md").write_text("conflict with TRZ")
+
+        monkeypatch.setattr(stage_engine, "get_worktree_path", lambda repo, branch: str(wt))
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_architecture_done": _fail("conflict")},
+        )
+        task_id = _make_task("architecture")
+        res = advance_stage(task_id, "architecture", "enduro-trails", "ET-001",
+                            "feature/ET-001-x", finished_agent="architect")
+        assert res.advanced is False
+        assert res.rolled_back_to == "analysis"
+        assert _stage(task_id) == "analysis"
+        jobs = _jobs()
+        assert len(jobs) == 1
+        assert jobs[0]["agent"] == "analyst"
+
+    def test_no_conflict_file_no_rollback(self, monkeypatch, tmp_path):
+        wt = tmp_path / "wt"
+        (wt / "docs").mkdir(parents=True)
+        monkeypatch.setattr(stage_engine, "get_worktree_path", lambda repo, branch: str(wt))
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_architecture_done": _fail("incomplete")},
+        )
+        task_id = _make_task("architecture")
+        res = advance_stage(task_id, "architecture", "enduro-trails", "ET-001",
+                            "feature/ET-001-x", finished_agent="architect")
+        assert res.advanced is False
+        assert res.rolled_back_to is None
+        assert _stage(task_id) == "architecture"
+        assert _jobs() == []
+
+
+# ---------------------------------------------------------------------------
+# Analyst approved-flow (analysis gate): never auto-advances
+# ---------------------------------------------------------------------------
+class TestAnalysisApprovedFlow:
+    def test_artifacts_ready_requests_approval_no_advance(self, monkeypatch):
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {**stage_engine.QG_CHECKS, "check_analysis_complete": _pass},
+        )
+        task_id = _make_task("analysis")
+        res = advance_stage(task_id, "analysis", "enduro-trails", "ET-001",
+                            "feature/ET-001-x", finished_agent="analyst")
+        assert res.advanced is False
+        assert _stage(task_id) == "analysis"
+        assert stage_engine.set_issue_in_review.called
+        assert stage_engine.notify_approve_requested.called
+        assert _jobs() == []
+
+    def test_approved_verdict_advances_analysis_to_architecture(self, monkeypatch):
+        """BUG 4: a human Approved STATUS (webhook path, finished_agent=None)
+        must satisfy the analysis gate and advance analysis -> architecture,
+        enqueuing the architect. The status-only approval must NOT re-run
+        check_analysis_approved (which looks for an :approved: COMMENT and would
+        otherwise wrongly block the advance).
+        """
+        # Make check_analysis_approved FAIL if it is ever called: the webhook
+        # path must bypass it entirely (status == approval). If the engine were
+        # to re-run the gate, this would block the advance and fail the test.
+        monkeypatch.setattr(
+            stage_engine, "QG_CHECKS",
+            {
+                **stage_engine.QG_CHECKS,
+                "check_analysis_approved": _fail("no :approved: comment"),
+            },
+        )
+        # Guard: the approval-flow (launcher-only) must NOT be invoked here.
+        flow = MagicMock()
+        monkeypatch.setattr(stage_engine, "_handle_analysis_approved_flow", flow)
+
+        task_id = _make_task("analysis")
+        res = advance_stage(
+            task_id, "analysis", "enduro-trails", "ET-001",
+            "feature/ET-001-x", finished_agent=None,
+        )
+
+        assert res.advanced is True
+        assert res.to_stage == "architecture"
+        assert _stage(task_id) == "architecture"
+        assert res.enqueued_agent == "architect"
+        # Sanity: agent for analysis is architect, never analyst (no re-run loop).
+        assert get_agent_for_stage("analysis") == "architect"
+        jobs = _jobs()
+        assert len(jobs) == 1
+        assert jobs[0]["agent"] == "architect"
+        # The launcher-only approval-flow was NOT called on the webhook path.
+        flow.assert_not_called()
+
+    def test_launcher_path_does_not_advance_and_calls_flow(self, monkeypatch):
+        """Regression: the launcher path (finished_agent='analyst') still routes
+        into _handle_analysis_approved_flow and does NOT advance.
+        """
+        flow = MagicMock()
+        monkeypatch.setattr(stage_engine, "_handle_analysis_approved_flow", flow)
+
+        task_id = _make_task("analysis")
+        res = advance_stage(
+            task_id, "analysis", "enduro-trails", "ET-001",
+            "feature/ET-001-x", finished_agent="analyst",
+        )
+
+        assert res.advanced is not True
+        assert _stage(task_id) == "analysis"
+        assert _jobs() == []
+        flow.assert_called_once()
+
+
+# ---------------------------------------------------------------------------
+# launcher + plane both delegate to the engine
+# ---------------------------------------------------------------------------
+class TestDelegation:
+    def test_launcher_calls_engine(self):
+        from src.agents.launcher import AgentLauncher
+        task_id = _make_task("development", branch="feature/ET-777-deleg")
+        with patch("src.stage_engine.advance_stage") as m:
+            AgentLauncher()._try_advance_stage(
+                run_id=1, agent="developer", repo="enduro-trails",
+                branch="feature/ET-777-deleg",
+            )
+        m.assert_called_once()
+        kwargs = m.call_args.kwargs
+        assert kwargs["task_id"] == task_id
+        assert kwargs["current_stage"] == "development"
+        assert kwargs["finished_agent"] == "developer"
+
+    def test_plane_calls_engine(self):
+        import asyncio
+        from src.webhooks import plane as plane_mod
+        with patch("src.stage_engine.advance_stage") as m:
+            asyncio.run(
+                plane_mod._try_advance_stage(
+                    task_id=5, current_stage="analysis", repo="enduro-trails",
+                    work_item_id="ET-001", branch="feature/ET-001-x",
+                )
+            )
+        m.assert_called_once()
+        # plane passes positional args; finished_agent (last positional) is None.
+        args = m.call_args.args
+        assert args[0] == 5
+        assert args[1] == "analysis"
+        assert args[-1] is None
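The rollback tests above pin down one rule: a failed review or testing gate sends the task back to development and re-enqueues the developer, unless the developer has already run three times, in which case the engine alerts and enqueues nothing. A minimal sketch of that decision follows. The names `RollbackDecision`, `decide_rollback` and `MAX_DEVELOPER_RETRIES` are hypothetical, and the `>=` comparison is an assumption that the tests suggest but do not spell out; the real `stage_engine` may structure this differently.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the cap is 3 developer runs, matching _add_developer_runs(task_id, 3).
MAX_DEVELOPER_RETRIES = 3


@dataclass(frozen=True)
class RollbackDecision:
    rolled_back_to: str
    enqueue_agent: Optional[str]  # None means "do not enqueue anything"
    alerted: bool


def decide_rollback(developer_runs: int) -> RollbackDecision:
    """Hypothetical helper: outcome of a failed review/testing gate."""
    if developer_runs >= MAX_DEVELOPER_RETRIES:
        # Retry cap reached: still roll back, but alert instead of looping.
        return RollbackDecision("development", None, True)
    # Below the cap: roll back and hand the task to the developer again.
    return RollbackDecision("development", "developer", False)
```

Keeping the decision in a pure function like this would let the cap be tested without a database or mocked notifiers.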
--- a/tests/test_stage_visibility.py
+++ b/tests/test_stage_visibility.py
@@ -0,0 +1,94 @@
+"""Feature 3: stage visibility on the Plane board.
+
+  * PLANE_STATES carries the 6 new per-stage / verdict UUIDs.
+  * STAGE_TO_STATE maps architecture/development/review/testing to their
+    dedicated board statuses (not all -> In Progress anymore).
+  * set_issue_stage_state(work_item_id, stage) PATCHes the correct state UUID
+    for a visible stage, and is a no-op for stages without one (analysis/deploy).
+  * Needs Input / In Review / Blocked remain higher priority: their explicit
+    setters use their own state, never overwritten by the stage map.
+
+httpx is mocked; no network.
+"""
+
+import os
+
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+
+from unittest.mock import patch, MagicMock  # noqa: E402
+
+from src import plane_sync as PS  # noqa: E402
+
+
+EXPECTED_UUIDS = {
+    "architecture": "3020bbb7-6122-4663-930c-0315ba8dfa3d",
+    "development": "9920609b-f140-4e46-ab95-89acda8412c8",
+    "review": "ba0d802c-5218-41d4-ab43-978b0ea123ed",
+    "testing": "7855d807-b1bf-42ef-8dae-6cde0df92d02",
+    "approved": "a519a341-dada-4a91-8910-7604f82b79c5",
+    "rejected": "ba958f3c-5db5-461d-8f82-89425e413b97",
+}
+
+
+def test_plane_states_has_new_uuids():
+    for key, uuid in EXPECTED_UUIDS.items():
+        assert PS.PLANE_STATES[key] == uuid
+
+
+def test_stage_to_state_maps_visible_stages():
+    assert PS.STAGE_TO_STATE["architecture"] == EXPECTED_UUIDS["architecture"]
+    assert PS.STAGE_TO_STATE["development"] == EXPECTED_UUIDS["development"]
+    assert PS.STAGE_TO_STATE["review"] == EXPECTED_UUIDS["review"]
+    assert PS.STAGE_TO_STATE["testing"] == EXPECTED_UUIDS["testing"]
+    # analysis / deploy stay on In Progress; done stays Done.
+    assert PS.STAGE_TO_STATE["analysis"] == PS.PLANE_STATES["in_progress"]
+    assert PS.STAGE_TO_STATE["deploy"] == PS.PLANE_STATES["in_progress"]
+    assert PS.STAGE_TO_STATE["done"] == PS.PLANE_STATES["done"]
+
+
+def _patch_resolution(monkey_targets):
+    """Placeholder: returns its argument unchanged. The tests below patch
+    find_issue_id and _resolve_project_id with @patch decorators instead."""
+    return monkey_targets
+
+
+@patch("src.plane_sync.httpx.patch")
+@patch("src.plane_sync.find_issue_id", return_value="issue-uuid")
+@patch("src.plane_sync._resolve_project_id", return_value="proj-1")
+def test_set_issue_stage_state_patches_correct_uuid(mock_proj, mock_find, mock_patch):
+    resp = MagicMock()
+    resp.raise_for_status.return_value = None
+    mock_patch.return_value = resp
+
+    PS.set_issue_stage_state("ET-1", "development")
+    # the PATCH carried the development state UUID
+    _, kwargs = mock_patch.call_args
+    assert kwargs["json"]["state"] == EXPECTED_UUIDS["development"]
+
+
+@patch("src.plane_sync.httpx.patch")
+@patch("src.plane_sync.find_issue_id", return_value="issue-uuid")
+@patch("src.plane_sync._resolve_project_id", return_value="proj-1")
+def test_set_issue_stage_state_noop_for_analysis(mock_proj, mock_find, mock_patch):
+    # analysis has no dedicated board status -> no PATCH at all.
+    PS.set_issue_stage_state("ET-1", "analysis")
+    mock_patch.assert_not_called()
+    PS.set_issue_stage_state("ET-1", "deploy")
+    mock_patch.assert_not_called()
+
+
+@patch("src.plane_sync.httpx.patch")
+@patch("src.plane_sync.find_issue_id", return_value="issue-uuid")
+@patch("src.plane_sync._resolve_project_id", return_value="proj-1")
+def test_priority_states_use_their_own_uuid(mock_proj, mock_find, mock_patch):
+    """Needs Input / In Review / Blocked are set explicitly and take priority."""
+    resp = MagicMock()
+    resp.raise_for_status.return_value = None
+    mock_patch.return_value = resp
+
+    PS.set_issue_needs_input("ET-1")
+    assert mock_patch.call_args.kwargs["json"]["state"] == PS.PLANE_STATES["needs_input"]
+
+    PS.set_issue_in_review("ET-1")
+    assert mock_patch.call_args.kwargs["json"]["state"] == PS.PLANE_STATES["in_review"]
+
+    PS.set_issue_blocked("ET-1")
+    assert mock_patch.call_args.kwargs["json"]["state"] == PS.PLANE_STATES["blocked"]
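The visibility tests above describe a setter that PATCHes a state only for stages with a dedicated board status and does nothing for the rest. A minimal sketch of that shape follows. The placeholder state values, the `STATE_BY_VISIBLE_STAGE` table, and the injected `send_patch` callable are all assumptions for illustration; the real `plane_sync.set_issue_stage_state` uses `httpx.patch` and the Plane state UUIDs.

```python
from typing import Callable, Dict

# Placeholder identifiers; the real values are the Plane state UUIDs.
STATE_BY_VISIBLE_STAGE: Dict[str, str] = {
    "architecture": "state-architecture",
    "development": "state-development",
    "review": "state-review",
    "testing": "state-testing",
}


def set_stage_state(
    work_item_id: str,
    stage: str,
    send_patch: Callable[[str, Dict[str, str]], None],
) -> bool:
    """Hypothetical helper: PATCH the board state for a visible stage.

    Returns False and sends nothing for stages without a dedicated status
    (for example analysis or deploy).
    """
    state = STATE_BY_VISIBLE_STAGE.get(stage)
    if state is None:
        return False  # no-op: no dedicated board status
    send_patch(work_item_id, {"state": state})
    return True
```

Passing the transport in as a callable is only a convenience for the sketch; it plays the role that the mocked `httpx.patch` plays in the tests.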
--- a/tests/test_status_only_verdict.py
+++ b/tests/test_status_only_verdict.py
@@ -0,0 +1,200 @@
+"""Status-only verdict model (bug 3 fix).
+
+The comment-based control mechanism (:approved: / :rejected: / answer-to-questions)
+was removed. The pipeline is driven SOLELY by Plane status changes. These tests
+lock in the new behaviour:
+
+  * test_inreview_comment_does_not_revert       — bug 3 root: an In Review task,
+    any comment arrives -> status NOT reverted, no agent launched.
+  * test_any_comment_no_pipeline_action         — :approved: / :rejected: / plain
+    text comment -> no status change, no enqueue.
+  * test_approved_status_advances_without_inprogress_reset — Approved status
+    advances WITHOUT an intermediate set_issue_in_progress reset.
+  * test_rejected_status_pulls_reason_from_comment — Rejected status pulls the
+    reason from the issue's latest comment (mocked GET comments).
+"""
+
+import os
+import tempfile
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_status_only.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+import pytest  # noqa: E402
+from unittest.mock import patch, AsyncMock  # noqa: E402
+from fastapi.testclient import TestClient  # noqa: E402
+
+from src.main import app  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import projects as P  # noqa: E402
+from src.projects import reload_projects  # noqa: E402
+
+ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
+APPROVED = "a519a341-dada-4a91-8910-7604f82b79c5"
+REJECTED = "ba958f3c-5db5-461d-8f82-89425e413b97"
+IN_REVIEW = "38fb1f64-aa1e-48a3-92e0-0b109679046b"
+
+client = TestClient(app)
+
+
+@pytest.fixture(autouse=True)
+def setup(monkeypatch):
+    monkeypatch.setattr(P.settings, "db_path", _test_db)
+    import src.db as _db
+    monkeypatch.setattr(_db.settings, "db_path", _test_db)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
+    registry_json = (
+        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
+        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
+    )
+    monkeypatch.setattr(P.settings, "projects_json", registry_json)
+    reload_projects()
+    # Seed a task at the 'review' stage for plane_id 'r-1'.
+    conn = get_db()
+    conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
+        "VALUES (?, ?, ?, ?, ?, ?)",
+        ("r-1", "ET-700", "enduro-trails", "feature/ET-700-x", "review", "r-1"),
+    )
+    conn.commit()
+    conn.close()
+    yield
+    reload_projects()
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+class _FakeResp:
+    def __init__(self, status_code, payload):
+        self.status_code = status_code
+        self._payload = payload
+
+    def json(self):
+        return self._payload
+
+
+def _comment(text, plane_id="r-1"):
+    return client.post("/webhook/plane", json={
+        "event": "issue_comment", "action": "created",
+        "data": {"work_item_id": plane_id, "comment_stripped": text,
+                 "project": ENDURO_PLANE_ID},
+    })
+
+
+def _status(state_id, plane_id="r-1", old="prev"):
+    return client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": plane_id, "name": "Status task", "project": ENDURO_PLANE_ID,
+            "state": {"id": state_id, "name": "X", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": state_id, "old_value": old},
+    })
+
+
+def _stage(plane_id="r-1"):
+    conn = get_db()
+    row = conn.execute("SELECT stage FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()
+    conn.close()
+    return row[0] if row else None
+
+
+# --------------------------------------------------------------------------- #
+# Bug 3 root: In Review must not revert on a comment.
+# --------------------------------------------------------------------------- #
+@patch("src.webhooks.plane.enqueue_job")
+@patch("src.plane_sync.set_issue_in_progress")
+@patch("src.plane_sync._set_issue_state_direct")
+@patch("src.plane_sync.update_issue_state")
+def test_inreview_comment_does_not_revert(
+    mock_update_state, mock_set_direct, mock_sip, mock_enqueue
+):
+    """Bug 3: a task is In Review and ANY comment arrives -> status NOT reverted
+    to In Progress, NO agent launched. The analyst's own 'waiting for approval'
+    comment used to echo back via the webhook and revert In Review -> In Progress.
+    """
+    # analyst's own echo comment
+    resp = _comment("Done, waiting for approved")
+    assert resp.status_code == 200
+    # no status changes whatsoever
+    mock_sip.assert_not_called()
+    mock_set_direct.assert_not_called()
+    mock_update_state.assert_not_called()
+    # no agent launched
+    mock_enqueue.assert_not_called()
+    # stage untouched
+    assert _stage() == "review"
+
+
+# --------------------------------------------------------------------------- #
+# Any comment -> zero pipeline side-effects.
+# --------------------------------------------------------------------------- #
+@pytest.mark.parametrize("text", [":approved:", ":rejected: bad", "plain text", ""])
+@patch("src.webhooks.plane.enqueue_job")
+@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
+@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
+@patch("src.plane_sync.set_issue_in_progress")
+@patch("src.plane_sync._set_issue_state_direct")
+def test_any_comment_no_pipeline_action(
+    mock_set_direct, mock_sip, mock_rollback, mock_advance, mock_enqueue, text
+):
+    resp = _comment(text)
+    assert resp.status_code == 200
+    mock_advance.assert_not_called()
+    mock_rollback.assert_not_called()
+    mock_sip.assert_not_called()
+    mock_set_direct.assert_not_called()
+    mock_enqueue.assert_not_called()
+    assert _stage() == "review"
+
+
+# --------------------------------------------------------------------------- #
+# Approved status advances WITHOUT in_progress reset.
+# --------------------------------------------------------------------------- #
+@patch("src.plane_sync.set_issue_in_progress")
+@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
+def test_approved_status_advances_without_inprogress_reset(mock_advance, mock_sip):
+    resp = _status(APPROVED)
+    assert resp.status_code == 200
+    mock_advance.assert_awaited_once()
+    # work_item_id passed positionally
+    assert "ET-700" in mock_advance.call_args.args
+    # bug 3 (cause B): NO intermediate set_issue_in_progress before advance.
+    mock_sip.assert_not_called()
+
+
+# --------------------------------------------------------------------------- #
+# Rejected status pulls reason from latest comment.
+# --------------------------------------------------------------------------- #
+@patch("src.webhooks.plane.httpx.get")
+@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
+def test_rejected_status_pulls_reason_from_comment(mock_rollback, mock_get):
+    mock_get.return_value = _FakeResp(200, {"results": [
+        {"comment_stripped": "old comment", "created_at": "2026-06-03T09:00:00Z"},
+        {"comment_html": "<p>Needs more test coverage</p>",
+         "created_at": "2026-06-03T11:30:00Z"},
+    ]})
+    resp = _status(REJECTED)
+    assert resp.status_code == 200
+    mock_rollback.assert_awaited_once()
+    reason = mock_rollback.call_args.args[-1]
+    # latest by created_at, HTML stripped
+    assert "Needs more test coverage" in reason
+    assert "<p>" not in reason
+
+
+@patch("src.webhooks.plane.httpx.get")
+@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
+def test_rejected_status_no_comment_uses_fallback(mock_rollback, mock_get):
+    mock_get.return_value = _FakeResp(200, {"results": []})
+    resp = _status(REJECTED)
+    assert resp.status_code == 200
+    mock_rollback.assert_awaited_once()
+    reason = mock_rollback.call_args.args[-1]
+    assert "no reason comment" in reason
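The two Rejected tests above expect the rollback reason to come from the newest comment (by `created_at`), with HTML removed, and a fallback when there are no comments. A minimal sketch of that selection follows. The function name `latest_comment_reason`, the exact fallback wording, and the regex-based tag stripping are assumptions; the real handler in `src/webhooks/plane.py` may differ.

```python
import re
from datetime import datetime
from typing import Any, Dict, List

# Assumption: the tests only require that the fallback contains "no reason comment".
FALLBACK_REASON = "Rejected (no reason comment found)"

# Naive tag stripper, adequate for simple comment markup only.
_TAG_RE = re.compile(r"<[^>]+>")


def _created_at(comment: Dict[str, Any]) -> datetime:
    # fromisoformat() before Python 3.11 rejects a trailing "Z", so normalise it.
    return datetime.fromisoformat(comment["created_at"].replace("Z", "+00:00"))


def latest_comment_reason(results: List[Dict[str, Any]]) -> str:
    """Hypothetical helper: reason text taken from the newest comment."""
    if not results:
        return FALLBACK_REASON
    latest = max(results, key=_created_at)
    # Prefer the pre-stripped text; otherwise strip tags from the HTML body.
    text = latest.get("comment_stripped") or _TAG_RE.sub("", latest.get("comment_html", ""))
    return text.strip() or FALLBACK_REASON
```

With the payload used in `test_rejected_status_pulls_reason_from_comment`, this picks the 11:30 comment and drops the `<p>` tags.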
--- a/tests/test_status_trigger.py
+++ b/tests/test_status_trigger.py
@@ -0,0 +1,243 @@
+"""Feature 1: pipeline starts on status -> In Progress, not on creation.
+
+  * work_item.created / issue created -> NO task, NO branch, NO analyst.
+  * issue updated -> In Progress (from backlog) -> task created + analyst enqueued.
+  * a second In Progress update while the agent is busy -> NO duplicate, NO
+    restart (busy-guard).
+  * In Progress returned from Needs Input (agent idle) -> agent RELAUNCHED.
+
+launcher / Gitea network are mocked. Real FastAPI endpoint via TestClient.
+"""
+
+import os
+import tempfile
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_status_trigger.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+import pytest  # noqa: E402
+from unittest.mock import patch, AsyncMock  # noqa: E402
+from fastapi.testclient import TestClient  # noqa: E402
+
+from src.main import app  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import projects as P  # noqa: E402
+from src.projects import reload_projects  # noqa: E402
+
+ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
+IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
+BACKLOG = "113b24f6-cce8-4be9-9a22-a359b9cf0122"
+
+client = TestClient(app)
+
+
+@pytest.fixture(autouse=True)
+def setup(monkeypatch):
+    monkeypatch.setattr(P.settings, "db_path", _test_db)
+    import src.db as _db
+    monkeypatch.setattr(_db.settings, "db_path", _test_db)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
+    registry_json = (
+        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
+        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
+    )
+    monkeypatch.setattr(P.settings, "projects_json", registry_json)
+    reload_projects()
+    yield
+    reload_projects()
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+def _created(plane_id="st-created"):
+    return client.post("/webhook/plane", json={
+        "event": "issue", "action": "created",
+        "data": {
+            "id": plane_id, "name": "A valid backlog item title",
+            "description_stripped": "A sufficiently long description for QG-0.",
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": BACKLOG, "name": "Backlog", "group": "backlog"},
+        },
+    })
+
+
+def _to_in_progress(plane_id="st-1"):
+    return client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": plane_id, "name": "A valid backlog item title",
+            "description_stripped": "A sufficiently long description for QG-0.",
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
+    })
+
+
+def _count(plane_id):
+    conn = get_db()
+    n = conn.execute("SELECT COUNT(*) FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()[0]
+    conn.close()
+    return n
+
+
+# --------------------------------------------------------------------------- #
+@patch("src.webhooks.plane.enqueue_job")
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+def test_created_does_not_start_pipeline(mock_branch, mock_docs, mock_enqueue):
+    resp = _created("st-created")
+    assert resp.status_code == 200
+    assert resp.json()["status"] == "accepted"
+    # No task, no branch, no analyst enqueue.
+    assert _count("st-created") == 0
+    mock_branch.assert_not_called()
+    mock_enqueue.assert_not_called()
+
+
+@patch("src.webhooks.plane.enqueue_job")
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=5)
+def test_in_progress_starts_pipeline(mock_seq, mock_branch, mock_docs, mock_enqueue):
+    mock_enqueue.return_value = 1
+    resp = _to_in_progress("st-1")
+    assert resp.status_code == 200
+    assert resp.json()["status"] == "accepted"
+    assert _count("st-1") == 1
+    conn = get_db()
+    task = conn.execute("SELECT * FROM tasks WHERE plane_id='st-1'").fetchone()
+    conn.close()
+    assert task["stage"] == "analysis"
+    assert task["repo"] == "enduro-trails"
+    mock_branch.assert_called_once()
+    # analyst enqueued exactly once
+    assert mock_enqueue.call_count == 1
+    assert mock_enqueue.call_args.args[0] == "analyst"
+
+
+@patch("src.webhooks.plane.enqueue_job")
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=5)
+def test_repeat_in_progress_while_job_active_does_not_relaunch(
+    mock_seq, mock_branch, mock_docs, mock_enqueue
+):
+    """Status-only model busy-guard: a duplicate In Progress webhook that arrives
+    while the stage agent still has a queued/running job must NOT relaunch the
+    agent (no double launch).
+    """
+    mock_enqueue.return_value = 1
+    _to_in_progress("st-2")
+    assert _count("st-2") == 1
+    assert mock_enqueue.call_count == 1
+
+    # enqueue_job is mocked above, so no real job row exists. Seed an ACTIVE
+    # (queued) job for the task so has_active_job_for_task() reports the agent as
+    # busy -> the busy-guard fires.
+    conn = get_db()
+    task_id = conn.execute(
+        "SELECT id FROM tasks WHERE plane_id='st-2'"
+    ).fetchone()[0]
+    conn.execute(
+        "INSERT INTO jobs (agent, repo, task_id, status) VALUES (?, ?, ?, 'queued')",
+        ("analyst", "enduro-trails", task_id),
+    )
+    conn.commit()
+    conn.close()
+
+    # Second In Progress update. DISTINCT body (different activity old_value) so
+    # webhook dedup does NOT short-circuit it — this exercises the busy-guard in
+    # handle_status_start, not the delivery-dedup layer.
+    resp = client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "st-2", "name": "A valid backlog item title",
+            "description_stripped": "A sufficiently long description for QG-0.",
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": "some-other-state"},
+    })
+    assert resp.status_code == 200
+    assert _count("st-2") == 1          # still exactly one task
+    assert mock_enqueue.call_count == 1  # analyst NOT re-enqueued (busy-guard)
+
+
+@patch("src.webhooks.plane.add_comment", create=True)
+@patch("src.webhooks.plane.enqueue_job")
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=5)
+def test_inprogress_from_needs_input_relaunches_analyst(
+    mock_seq, mock_branch, mock_docs, mock_enqueue, mock_comment
+):
+    """Status-only answer-to-questions flow: an existing analysis task whose agent
+    is IDLE (no active job — it went to Needs Input) is returned to In Progress
+    -> the analyst is relaunched to read Slava's fresh comments.
+
+    Also covers double-webhook protection: a second In Progress webhook that
+    arrives while the relaunch job is active does NOT relaunch the analyst again.
+    """
+    mock_enqueue.return_value = 1
+    # First In Progress: starts the pipeline (creates task + enqueues analyst).
+    _to_in_progress("st-ni")
+    assert _count("st-ni") == 1
+    assert mock_enqueue.call_count == 1
+
+    # The analyst finished and asked questions -> Needs Input. In our model that
+    # means NO active job for the task (enqueue_job is mocked, so no job row).
+    conn = get_db()
+    task_id = conn.execute(
+        "SELECT id FROM tasks WHERE plane_id='st-ni'"
+    ).fetchone()[0]
+    has_job = conn.execute(
+        "SELECT COUNT(*) FROM jobs WHERE task_id=? AND status IN ('queued','running')",
+        (task_id,),
+    ).fetchone()[0]
+    conn.close()
+    assert has_job == 0  # agent idle
+
+    # Slava answers + returns the issue to In Progress (distinct body).
+    resp = client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "st-ni", "name": "A valid backlog item title",
+            "description_stripped": "A sufficiently long description for QG-0.",
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": "needs-input"},
+    })
+    assert resp.status_code == 200
+    assert _count("st-ni") == 1               # no duplicate task
+    assert mock_enqueue.call_count == 2        # analyst RELAUNCHED
+    assert mock_enqueue.call_args.args[0] == "analyst"
+
+    # Seed an active job for the relaunch, then a SECOND In Progress webhook must
+    # NOT relaunch again (busy-guard against double webhooks).
+    conn = get_db()
+    conn.execute(
+        "INSERT INTO jobs (agent, repo, task_id, status) VALUES (?, ?, ?, 'running')",
+        ("analyst", "enduro-trails", task_id),
+    )
+    conn.commit()
+    conn.close()
+    resp2 = client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "st-ni", "name": "A valid backlog item title",
+            "description_stripped": "A sufficiently long description for QG-0.",
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": "x-y-z"},
+    })
+    assert resp2.status_code == 200
+    assert mock_enqueue.call_count == 2        # still 2 — busy-guard held
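Review note: the busy-guard exercised by the test above can be sketched in isolation as follows. This is a minimal sketch, not the code in `src/webhooks/plane.py`: `has_active_job`, `maybe_relaunch`, and the trimmed `jobs` schema are hypothetical names introduced for illustration; only the `status IN ('queued','running')` query shape is taken from the test.

```python
import sqlite3


def has_active_job(conn, task_id):
    """True when the task has a queued or running job (query shape from the test)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM jobs WHERE task_id=? AND status IN ('queued','running')",
        (task_id,),
    ).fetchone()
    return row[0] > 0


def maybe_relaunch(conn, task_id, enqueue):
    """Hypothetical guard: relaunch the analyst only when the agent is idle."""
    if has_active_job(conn, task_id):
        return False  # busy: a duplicate webhook must not start a second run
    enqueue("analyst", task_id)
    return True


# Trimmed, assumed schema; the real jobs table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, agent TEXT, task_id INTEGER, status TEXT)"
)
launched = []


def enqueue(agent, task_id):
    # Stand-in for enqueue_job: record the launch and create an active job row.
    launched.append(agent)
    conn.execute(
        "INSERT INTO jobs (agent, task_id, status) VALUES (?, ?, 'running')",
        (agent, task_id),
    )


first = maybe_relaunch(conn, 1, enqueue)   # idle agent: relaunch happens
second = maybe_relaunch(conn, 1, enqueue)  # active job exists: guard holds
```

The second call is rejected because the first one left a `running` row behind, which mirrors the seeded job in the test.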
--- a/tests/test_taskmd_description.py
+++ b/tests/test_taskmd_description.py
@@ -0,0 +1,138 @@
+"""Tests for fix/taskmd-description (3 bugs at the analyst pipeline entry/exit):
+
+BUG A: start_pipeline built the analyst .task.md WITHOUT the description body
+       (only Title), so the analyst received a ~101-byte file and reported that
+       the "business request is empty". task_desc must now carry the description.
+
+BUG B: issue.updated ships only changed fields, so `name` is usually absent ->
+       slug/branch became "untitled". start_pipeline must pull the real name
+       from the Plane API (single fetch_issue_fields GET, above the slug build)
+       so the branch slug is NOT "untitled".
+
+BUG C: the analyst "artifacts ready" comment used the obsolete ":approved:"
+       wording. Under the status-only model it must ask for the **Approved**
+       status (not ":approved:", not "In Progress") and link the docs that
+       actually exist.
+"""
+
+import os
+import tempfile
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_taskmd_desc.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+import pytest  # noqa: E402
+from unittest.mock import patch, AsyncMock  # noqa: E402
+from fastapi.testclient import TestClient  # noqa: E402
+
+from src.main import app  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import projects as P  # noqa: E402
+from src.projects import reload_projects  # noqa: E402
+
+ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
+IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
+BACKLOG = "113b24f6-cce8-4be9-9a22-a359b9cf0122"
+
+client = TestClient(app)
+
+
+@pytest.fixture(autouse=True)
+def setup(monkeypatch):
+    monkeypatch.setattr(P.settings, "db_path", _test_db)
+    import src.db as _db
+    monkeypatch.setattr(_db.settings, "db_path", _test_db)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
+    registry_json = (
+        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
+        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
+    )
+    monkeypatch.setattr(P.settings, "projects_json", registry_json)
+    reload_projects()
+    yield
+    reload_projects()
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+def _task(plane_id):
+    conn = get_db()
+    row = conn.execute("SELECT * FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()
+    conn.close()
+    return row
+
+
+# --------------------------------------------------------------------------- #
+# BUG A: description reaches the analyst .task.md
+# --------------------------------------------------------------------------- #
+@patch("src.webhooks.plane.enqueue_job", return_value=1)
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=11)
+@patch("src.plane_sync.fetch_issue_fields",
+       return_value=("ET-011 real title",
+                     "REAL BUSINESS REQUEST BODY: user wants GPX upload with "
+                     "validation and a results map."))
+def test_taskdesc_includes_description(
+    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
+):
+    resp = client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "taskA",
+            # status change payload: NO name, NO description (only changed field)
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
+    })
+    assert resp.status_code == 200
+    mock_enqueue.assert_called_once()
+    # task_desc is the 3rd positional arg of enqueue_job(agent, repo, task_desc, ...)
+    task_desc = mock_enqueue.call_args.args[2]
+    assert "Description:" in task_desc
+    # the actual description body (not just the Title) is in the file
+    assert "REAL BUSINESS REQUEST BODY" in task_desc
+    assert "results map" in task_desc
+
+
+# --------------------------------------------------------------------------- #
+# BUG B: name fetched from Plane API when payload is empty -> slug not untitled
+# --------------------------------------------------------------------------- #
+@patch("src.webhooks.plane.enqueue_job", return_value=1)
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=11)
+@patch("src.plane_sync.fetch_issue_fields",
+       return_value=("GPX upload feature",
+                     "A sufficiently long description so QG-0 passes cleanly."))
+def test_name_fetched_when_payload_empty(
+    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
+):
+    resp = client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "taskB",
+            # NO name, NO description in the payload (Plane status-change shape)
+            "project": ENDURO_PLANE_ID,
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
+    })
+    assert resp.status_code == 200
+    mock_fields.assert_called_once()
+    row = _task("taskB")
+    assert row is not None
+    branch = row["branch"]
+    # slug derived from the fetched name -> "gpx-upload-feature", NOT untitled
+    assert "untitled" not in branch
+    assert "gpx-upload-feature" in branch
+    # Title in the analyst task file is the fetched name, not "untitled"
+    task_desc = mock_enqueue.call_args.args[2]
+    assert "Title: GPX upload feature" in task_desc
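Review note: BUG A and BUG B both come down to filling missing fields before the slug and the task file are built. The sketch below illustrates that order under stated assumptions: `slugify`, `resolve_fields`, and `build_task_desc` are hypothetical helpers, and `fetch_fields` stands in for `fetch_issue_fields`, assumed to return a `(name, description)` tuple as the mocks in this file do. The real `start_pipeline` may differ.

```python
import re


def slugify(name):
    """Lowercase, collapse non-alphanumerics to '-', fall back to 'untitled'."""
    slug = re.sub(r"[^a-z0-9]+", "-", (name or "").lower()).strip("-")
    return slug or "untitled"


def resolve_fields(payload, fetch_fields):
    """Prefer payload values; fetch once from the API when either is missing."""
    data = payload.get("data", {})
    name = data.get("name")
    description = data.get("description_stripped")
    if not name or not description:
        fetched_name, fetched_desc = fetch_fields(data["id"])
        name = name or fetched_name
        description = description or fetched_desc
    return name, description


def build_task_desc(title, description):
    """Assumed layout: the tests only require 'Title: ...' and 'Description:'."""
    return f"Title: {title}\nDescription:\n{description}\n"


# Status-change payload: no name, no description (only the changed field).
payload = {"data": {"id": "taskB"}}
name, description = resolve_fields(
    payload, lambda issue_id: ("GPX upload feature", "Upload GPX with validation.")
)
branch_slug = slugify(name)
task_desc = build_task_desc(name, description)
```

Resolving the name before calling `slugify` is the point of BUG B: with the payload alone, `slugify(None)` would yield `untitled`.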
--- a/tests/test_telegram_tracker.py
+++ b/tests/test_telegram_tracker.py
@@ -0,0 +1,518 @@
+"""feat/telegram-live-tracker: tests for the live Telegram task tracker.
+
+Covers (per DEV_TASK_TELEGRAM_TRACKER.md):
+  * short_model_name: provider/claude- prefix trimming.
+  * render_task_tracker: per-stage line format (in↓/out↑, model, cost, minutes),
+    the "⏸️ Ревью БРД · твоё время" ("BRD review · your time") line, the 💰
+    totals, and the finish block (⏱️ three times + 🔗/📦).
+  * first message -> sendMessage stores message_id; transition -> editMessageText.
+  * fallback: editMessageText fails -> a NEW message is sent and the id updated.
+  * which alerts go out SEPARATELY (approve-gate / deploy-fail / agent-fail /
+    error) vs which do NOT (QG-pending / agent-start / stage-transition).
+
+Isolated temp DB; no network (httpx is patched).
+"""
+
+import os
+import tempfile
+
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_tracker.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+
+from unittest.mock import MagicMock, patch  # noqa: E402
+
+import pytest  # noqa: E402
+
+import src.db as db_module  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import notifications as N  # noqa: E402
+from src import usage as U  # noqa: E402
+
+
+@pytest.fixture(autouse=True)
+def setup_db(monkeypatch):
+    monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    # conftest stubs send_telegram to a no-op; tests that need it patch
+    # send_telegram / httpx / the lower-level helpers explicitly per case.
+    yield
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+# --------------------------------------------------------------------------- #
+# helpers to build a task + runs in the DB
+# --------------------------------------------------------------------------- #
+def _mk_task(stage="development", title="\u0422\u0440\u0435\u043a\u0438 \u0441 \u0437\u0443\u043c\u0430 z5",
+             wid="ET-012", brd_start=None, brd_end=None):
+    conn = get_db()
+    cur = conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, title, "
+        "brd_review_started_at, brd_review_ended_at) "
+        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
+        ("p1", wid, "enduro-trails", "feature/ET-012-x", stage, title,
+         brd_start, brd_end),
+    )
+    tid = cur.lastrowid
+    conn.commit()
+    conn.close()
+    return tid
+
+
+def _mk_run(task_id, agent, started, finished, in_tok, out_tok,
+            cache_read=0, cache_creation=0, cost=0.0, model=None, exit_code=0):
+    conn = get_db()
+    cur = conn.execute(
+        "INSERT INTO agent_runs (task_id, agent, started_at, finished_at, "
+        "exit_code, input_tokens, output_tokens, cache_read_tokens, "
+        "cache_creation_tokens, cost_usd, model) "
+        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
+        (task_id, agent, started, finished, exit_code, in_tok, out_tok,
+         cache_read, cache_creation, cost, model),
+    )
+    rid = cur.lastrowid
+    conn.commit()
+    conn.close()
+    return rid
+
+
+# --------------------------------------------------------------------------- #
+# short_model_name
+# --------------------------------------------------------------------------- #
+def test_short_model_name():
+    assert U.short_model_name("tokenator/claude-opus-4-8") == "opus-4-8"
+    assert U.short_model_name("vibecode/claude-sonnet-4.6") == "sonnet-4.6"
+    assert U.short_model_name("claude-opus-4-8") == "opus-4-8"
+    assert U.short_model_name("opus-4-8") == "opus-4-8"
+    assert U.short_model_name(None) == ""
+    assert U.short_model_name("") == ""
+
+
+def test_parse_usage_extracts_model_from_modelusage():
+    blob = (
+        '{"total_cost_usd":0.01,'
+        '"usage":{"input_tokens":10,"output_tokens":5},'
+        '"modelUsage":{"claude-opus-4-8":{"inputTokens":10,"outputTokens":5}}}'
+    )
+    u = U.parse_usage_from_text(blob)
+    assert u["model"] == "claude-opus-4-8"
+
+
+# --------------------------------------------------------------------------- #
+# render_task_tracker
+# --------------------------------------------------------------------------- #
+def test_render_in_progress_stage_lines_and_totals():
+    tid = _mk_task(stage="deploy", brd_start="2026-06-04 10:00:00",
+                   brd_end="2026-06-04 10:08:00")
+    # Analysis: 10 min (rendered "10м"), 1.1M in (mostly cache) / 39.6k out, $2.38, opus-4-8
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=1000, out_tok=39600, cache_read=1_100_000, cost=2.38,
+            model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "architect", "2026-06-04 10:08:00", "2026-06-04 10:17:00",
+            in_tok=500, out_tok=34400, cache_read=1_500_000, cost=2.24,
+            model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "developer", "2026-06-04 10:17:00", "2026-06-04 10:28:00",
+            in_tok=400, out_tok=45800, cache_read=8_400_000, cost=7.29,
+            model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "reviewer", "2026-06-04 10:28:00", "2026-06-04 10:31:00",
+            in_tok=300, out_tok=12900, cache_read=1_200_000, cost=1.53,
+            model="vibecode/claude-sonnet-4.6")
+    _mk_run(tid, "tester", "2026-06-04 10:31:00", "2026-06-04 10:36:00",
+            in_tok=200, out_tok=19500, cache_read=1_200_000, cost=1.51,
+            model="vibecode/claude-sonnet-4.6")
+    # deployer started but not finished -> active "идёт" ("in progress") line.
+    _mk_run(tid, "deployer", "2026-06-04 10:36:00", None,
+            in_tok=0, out_tok=0, model=None, exit_code=None)
+
+    text = N.render_task_tracker(tid)
+
+    # Header in-progress
+    assert text.startswith("\U0001f6e0\ufe0f ET-012 \u00b7 \u0422\u0440\u0435\u043a\u0438")
+    # Per-stage format: in↓/out↑ · cost · model
+    assert "\u2705 Analysis" in text
+    assert "10\u043c" in text          # analysis duration
+    assert "39.6k\u2191" in text       # analysis out
+    assert "$2.38" in text
+    assert "opus-4-8" in text
+    assert "sonnet-4.6" in text        # reviewer/tester model
+    # BRD review line (human time, ended)
+    assert "\u0420\u0435\u0432\u044c\u044e \u0411\u0420\u0414" in text
+    assert "\u0442\u0432\u043e\u0451 \u0432\u0440\u0435\u043c\u044f" in text
+    # Active stage
+    assert "\U0001f504 Deploy" in text
+    assert "\u0438\u0434\u0451\u0442" in text
+    # Totals line present with 💰
+    assert "\U0001f4b0" in text
+    # In-progress: no final ⏱️ line
+    assert "\u0412\u0441\u0435\u0433\u043e" not in text
+
+
+def test_render_brd_review_waiting_shows_hourglass():
+    tid = _mk_task(stage="analysis", brd_start="2026-06-04 10:00:00",
+                   brd_end=None)
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=1000, out_tok=39600, cache_read=1_100_000, cost=2.38,
+            model="tokenator/claude-opus-4-8")
+    text = N.render_task_tracker(tid)
+    assert "\u0420\u0435\u0432\u044c\u044e \u0411\u0420\u0414" in text
+    assert "\u23f3" in text  # hourglass while waiting
+
+
+def test_render_done_has_times_and_links():
+    tid = _mk_task(stage="done", brd_start="2026-06-04 10:00:00",
+                   brd_end="2026-06-04 10:08:00")
+    # set created/updated to compute wall clock
+    conn = get_db()
+    conn.execute(
+        "UPDATE tasks SET created_at='2026-06-04 09:00:00', "
+        "updated_at='2026-06-04 09:56:00' WHERE id=?", (tid,))
+    conn.commit()
+    conn.close()
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=1000, out_tok=39600, cache_read=1_100_000, cost=2.38,
+            model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "deployer", "2026-06-04 09:50:00", "2026-06-04 09:56:00",
+            in_tok=400, out_tok=22400, cache_read=1_600_000, cost=1.73,
+            model="tokenator/claude-opus-4-8")
+
+    with patch("src.notifications.httpx") as _hx:
+        # No PR found -> just "📦 deployed"
+        _resp = MagicMock(status_code=200)
+        _resp.json.return_value = []
+        _hx.get.return_value = _resp
+        text = N.render_task_tracker(tid)
+
+    assert text.startswith("\U0001f389 ET-012")
+    assert "\u0413\u041e\u0422\u041e\u0412\u041e" in text
+    # ⏱️ with three times
+    assert "\u23f1\ufe0f" in text
+    assert "\u0412\u0441\u0435\u0433\u043e" in text
+    assert "\u0430\u0433\u0435\u043d\u0442\u044b" in text
+    assert "\u0442\u0432\u043e\u0451" in text
+    # 📦 deployed line
+    assert "\U0001f4e6" in text
+
+
+def test_render_escapes_html_in_title():
+    tid = _mk_task(stage="analysis", title="A <b>& B</b>")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.0)
+    text = N.render_task_tracker(tid)
+    assert "&lt;b&gt;" in text
+    assert "&amp;" in text
+
+
+def test_render_omits_model_when_unknown():
+    tid = _mk_task(stage="analysis")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.0, model=None)
+    text = N.render_task_tracker(tid)
+    # No trailing " · <model>" — line ends at cost.
+    line = [ln for ln in text.splitlines() if ln.startswith("\u2705 Analysis")][0]
+    assert line.rstrip().endswith("$0.00")
+
+
+# --------------------------------------------------------------------------- #
+# tracker send / edit / fallback
+# --------------------------------------------------------------------------- #
+def test_first_call_sends_message_and_stores_id(monkeypatch):
+    tid = _mk_task(stage="analysis")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", None, in_tok=0, out_tok=0,
+            exit_code=None)
+
+    sent = {}
+    def _fake_send(text, disable_notification=False):
+        sent["text"] = text
+        sent["silent"] = disable_notification
+        return 555
+    monkeypatch.setattr(N, "send_telegram", _fake_send)
+    monkeypatch.setattr(N, "edit_telegram", lambda *a, **k: (_ for _ in ()).throw(AssertionError("should not edit on first call")))
+
+    N.update_task_tracker(tid)
+
+    from src.db import get_tracker_message_id
+    assert get_tracker_message_id(tid) == 555
+    assert sent["silent"] is True  # tracker is silent
+
+
+def test_second_call_edits_existing_message(monkeypatch):
+    tid = _mk_task(stage="development")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1)
+    from src.db import set_tracker_message_id
+    set_tracker_message_id(tid, 777)
+
+    edited = {}
+    monkeypatch.setattr(N, "edit_telegram",
+                        lambda mid, text: edited.update(mid=mid) or N.EDIT_OK)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("should not send when edit succeeds")))
+
+    N.update_task_tracker(tid)
+    assert edited["mid"] == 777
+
+
+def test_fallback_to_new_message_when_edit_gone(monkeypatch):
+    """edit returns 'gone' (message deleted/too old) -> send NEW + update id."""
+    tid = _mk_task(stage="development")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1)
+    from src.db import set_tracker_message_id, get_tracker_message_id
+    set_tracker_message_id(tid, 100)
+
+    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_GONE)
+    monkeypatch.setattr(N, "send_telegram", lambda text, disable_notification=False: 200)
+
+    N.update_task_tracker(tid)
+    assert get_tracker_message_id(tid) == 200  # id updated to the new message
+
+
+def test_not_modified_does_not_send_new_message(monkeypatch):
+    """edit returns 'not_modified' -> NO new message, id unchanged (no dupe)."""
+    tid = _mk_task(stage="development")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1)
+    from src.db import set_tracker_message_id, get_tracker_message_id
+    set_tracker_message_id(tid, 100)
+
+    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_NOT_MODIFIED)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("must not send on not_modified")))
+
+    N.update_task_tracker(tid)
+    assert get_tracker_message_id(tid) == 100  # unchanged, no duplicate
+
+
+def test_transient_edit_failure_does_not_send_new_message(monkeypatch):
+    """edit returns 'failed' (network/timeout/5xx) -> NO new message (no dupe)."""
+    tid = _mk_task(stage="development")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1)
+    from src.db import set_tracker_message_id, get_tracker_message_id
+    set_tracker_message_id(tid, 100)
+
+    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_FAILED)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("must not send on transient failure")))
+
+    N.update_task_tracker(tid)
+    assert get_tracker_message_id(tid) == 100  # unchanged, no duplicate
+
+
+# --------------------------------------------------------------------------- #
+# edit_telegram outcome classification (httpx mocked)
+# --------------------------------------------------------------------------- #
+def _edit_resp(ok, description=None):
+    resp = MagicMock()
+    body = {"ok": ok}
+    if description is not None:
+        body["description"] = description
+    resp.json.return_value = body
+    return resp
+
+
+def _patch_tg_creds(monkeypatch):
+    monkeypatch.setattr(N._get_settings(), "telegram_bot_token", "T", raising=False)
+    monkeypatch.setattr(N._get_settings(), "telegram_chat_id", "C", raising=False)
+
+
+def test_edit_telegram_ok(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(True)
+        assert N.edit_telegram(1, "x") == N.EDIT_OK
+
+
+def test_edit_telegram_not_modified_is_success(monkeypatch):
+    # 400 "message is not modified" -> success, not gone, no duplicate
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: message is not modified: ...")
+        assert N.edit_telegram(1, "x") == N.EDIT_NOT_MODIFIED
+
+
+def test_edit_telegram_exactly_the_same_is_not_modified(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: specified new message content and reply markup "
+                   "are exactly the same")
+        assert N.edit_telegram(1, "x") == N.EDIT_NOT_MODIFIED
+
+
+def test_edit_telegram_message_not_found_is_gone(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: message to edit not found")
+        assert N.edit_telegram(1, "x") == N.EDIT_GONE
+
+
+def test_edit_telegram_cant_be_edited_is_gone(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: message can't be edited")
+        assert N.edit_telegram(1, "x") == N.EDIT_GONE
+
+
+def test_edit_telegram_unknown_400_is_failed(monkeypatch):
+    # unknown 400 -> failed (NOT gone) -> caller won't duplicate
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: some other unexpected error")
+        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
+
+
+def test_edit_telegram_timeout_is_failed(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.side_effect = Exception("read timeout")
+        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
+
+
+def test_edit_telegram_5xx_is_failed(monkeypatch):
+    # Telegram 5xx still returns ok:false w/o gone/not_modified markers
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(False, "Internal Server Error")
+        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
+
+
+# --------------------------------------------------------------------------- #
+# render: repeated stage attempt shows "попытка N" ("attempt N")
+# --------------------------------------------------------------------------- #
+_POPYTKA = "\u043f\u043e\u043f\u044b\u0442\u043a\u0430"  # "попытка" ("attempt")
+
+
+def test_render_active_stage_shows_attempt_on_second_run():
+    # Two reviewer runs while in review -> active line shows attempt 2.
+    tid = _mk_task(stage="review")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:20:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    # First review run finished (sent back to dev), second review run active.
+    _mk_run(tid, "reviewer", "2026-06-04 09:20:00", "2026-06-04 09:25:00",
+            in_tok=10, out_tok=5, cost=0.1, model="vibecode/claude-sonnet-4.6",
+            exit_code=0)
+    _mk_run(tid, "reviewer", "2026-06-04 09:30:00", None,
+            in_tok=0, out_tok=0, exit_code=None)
+
+    text = N.render_task_tracker(tid)
+    active = [ln for ln in text.splitlines()
+              if ln.startswith("\U0001f504") and "Review" in ln][0]
+    assert _POPYTKA in active
+    assert "2" in active
+    assert "\u0438\u0434\u0451\u0442" in active
+
+
+def test_render_active_stage_no_attempt_on_first_run():
+    # Single reviewer run -> active line has NO attempt marker.
+    tid = _mk_task(stage="review")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:20:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "reviewer", "2026-06-04 09:20:00", None,
+            in_tok=0, out_tok=0, exit_code=None)
+
+    text = N.render_task_tracker(tid)
+    active = [ln for ln in text.splitlines()
+              if ln.startswith("\U0001f504") and "Review" in ln][0]
+    assert _POPYTKA not in active
+    assert "\u0438\u0434\u0451\u0442" in active
+
+
+def test_render_finished_lines_unaffected_by_attempt_logic():
+    # Completed (checkmark) lines never carry an attempt marker.
+    tid = _mk_task(stage="review")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    # developer ran twice (retry) but is a FINISHED stage now.
+    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:15:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "developer", "2026-06-04 09:16:00", "2026-06-04 09:20:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    text = N.render_task_tracker(tid)
+    for ln in text.splitlines():
+        if ln.startswith("\u2705"):
+            assert _POPYTKA not in ln
+
+
+# --------------------------------------------------------------------------- #
+# which alerts are SEPARATE vs tracker-only
+# --------------------------------------------------------------------------- #
+def test_approve_gate_sends_separate_message_and_starts_brd_clock(monkeypatch):
+    tid = _mk_task(stage="analysis")
+    calls = []
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: calls.append((text, disable_notification)) or 1)
+    monkeypatch.setattr(N, "update_task_tracker", lambda task_id: None)
+
+    N.notify_approve_requested(tid)
+
+    # exactly one SEPARATE (notifying) send for the approve gate
+    assert len(calls) == 1
+    assert calls[0][1] is False  # notifying
+    assert "Approved" in calls[0][0]
+    # BRD clock started
+    conn = get_db()
+    row = conn.execute("SELECT brd_review_started_at FROM tasks WHERE id=?", (tid,)).fetchone()
+    conn.close()
+    assert row[0] is not None
+
+
+def test_error_sends_separate_message(monkeypatch):
+    tid = _mk_task(stage="development")
+    calls = []
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: calls.append((text, disable_notification)) or 1)
+    N.notify_error(tid, "boom")
+    assert len(calls) == 1
+    assert calls[0][1] is False  # notifying
+    assert "ERROR" in calls[0][0]
+
+
+def test_stage_change_does_not_send_separate_message(monkeypatch):
+    tid = _mk_task(stage="development")
+    sent = []
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: sent.append(text) or 1)
+    # tracker refresh is allowed (edit/send silent) but must NOT use send_telegram
+    # for a separate notification; stub update to isolate.
+    refreshed = []
+    monkeypatch.setattr(N, "update_task_tracker", lambda task_id: refreshed.append(task_id))
+
+    N.notify_stage_change(tid, "development", "review")
+    assert sent == []            # no separate message
+    assert refreshed == [tid]    # tracker refreshed instead
+
+
+def test_agent_started_does_not_send_separate_message(monkeypatch):
+    tid = _mk_task(stage="analysis")
+    sent = []
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: sent.append(text) or 1)
+    refreshed = []
+    monkeypatch.setattr(N, "update_task_tracker", lambda task_id: refreshed.append(task_id))
+
+    N.notify_agent_started(1, "analyst", tid)
+    assert sent == []
+    assert refreshed == [tid]
+
+
+def test_qg_failure_does_not_send_separate_message(monkeypatch):
+    tid = _mk_task(stage="development")
+    sent = []
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda text, disable_notification=False: sent.append(text) or 1)
+    N.notify_qg_failure(tid, "development", "check_ci_green", "CI state: pending")
+    assert sent == []  # QG-pending is log-only, never a separate ping
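Review note: the edit-outcome tests above imply a small classification table plus one policy rule (only a "gone" message triggers a new send). The sketch below restates that table as code. It is an illustration under assumptions, not the body of `edit_telegram`: the constant values, the marker tuples, and the function names are hypothetical, and the marker strings are copied from the error descriptions used in the tests.

```python
# Assumed values; the tests only reference N.EDIT_OK / N.EDIT_NOT_MODIFIED / ...
EDIT_OK = "ok"
EDIT_NOT_MODIFIED = "not_modified"
EDIT_GONE = "gone"
EDIT_FAILED = "failed"

# Substrings taken from the descriptions exercised by the tests.
NOT_MODIFIED_MARKERS = ("message is not modified", "are exactly the same")
GONE_MARKERS = ("message to edit not found", "message can't be edited")


def classify_edit_response(body):
    """Map a Telegram-style JSON body ({'ok': ..., 'description': ...}) to an outcome."""
    if body.get("ok"):
        return EDIT_OK
    description = (body.get("description") or "").lower()
    if any(marker in description for marker in NOT_MODIFIED_MARKERS):
        return EDIT_NOT_MODIFIED
    if any(marker in description for marker in GONE_MARKERS):
        return EDIT_GONE
    # Unknown 400s, 5xx bodies, and anything else: treat as transient.
    return EDIT_FAILED


def should_send_new_message(outcome):
    """Only a vanished message justifies a replacement; everything else would duplicate."""
    return outcome == EDIT_GONE


outcomes = [
    classify_edit_response({"ok": True}),
    classify_edit_response({"ok": False, "description": "Bad Request: message is not modified: ..."}),
    classify_edit_response({"ok": False, "description": "Bad Request: message to edit not found"}),
    classify_edit_response({"ok": False, "description": "Internal Server Error"}),
]
```

Defaulting unknown errors to `EDIT_FAILED` rather than `EDIT_GONE` is the conservative choice here: a missed refresh is cheaper than a duplicate tracker message.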
--- a/tests/test_usage.py
+++ b/tests/test_usage.py
@@ -0,0 +1,309 @@
+"""Feature 4: token / cost accounting tests.
+
+Covers:
+  * parse_usage_from_text on a REAL claude --output-format json result blob
+    (captured live from CLI 2.1.142), including a leading text line.
+  * parse on garbage / missing JSON -> None (never raises).
+  * record_usage writes the columns; NULLs when usage is None.
+  * fmt_tokens / fmt_cost formatting.
+  * usage_comment string format.
+  * task_usage_summary / task_summary_comment aggregate over agent_runs.
+
+DB is an isolated temp file; no network or subprocess.
+"""
+
+import os
+import tempfile
+
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_usage.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+
+import pytest  # noqa: E402
+
+from src import db as db_module  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import usage as U  # noqa: E402
+
+
+# Real claude --output-format json result object (captured from CLI 2.1.142).
+REAL_RESULT_JSON = (
+    '{"type":"result","subtype":"success","is_error":false,"duration_ms":1795,'
+    '"num_turns":1,"result":"Hi!","session_id":"abc",'
+    '"total_cost_usd":0.0560175,'
+    '"usage":{"input_tokens":45231,"cache_creation_input_tokens":7418,'
+    '"cache_read_input_tokens":18500,"output_tokens":12100,'
+    '"service_tier":"standard"},'
+    '"modelUsage":{"claude-opus-4-7":{"inputTokens":6,"outputTokens":7}},'
+    '"permission_denials":[]}'
+)
+
+
+@pytest.fixture(autouse=True)
+def setup_db(monkeypatch):
+    # get_db() reads settings.db_path live; pin it to our isolated DB.
+    monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    yield
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+# --------------------------------------------------------------------------- #
+# parsing
+# --------------------------------------------------------------------------- #
+def test_parse_real_result_json():
+    u = U.parse_usage_from_text(REAL_RESULT_JSON)
+    assert u is not None
+    assert u["input_tokens"] == 45231
+    assert u["output_tokens"] == 12100
+    assert u["cache_read_tokens"] == 18500
+    # FIX 2: cache_creation slice must now be parsed (was dropped before).
+    assert u["cache_creation_tokens"] == 7418
+    assert abs(u["cost_usd"] - 0.0560175) < 1e-9
+
+
+def test_parse_cache_creation_present():
+    u = U.parse_usage_from_text(REAL_RESULT_JSON)
+    assert u["cache_creation_tokens"] == 7418
+
+
+def test_parse_cache_creation_missing_defaults_zero():
+    blob = (
+        '{"total_cost_usd":0.01,'
+        '"usage":{"input_tokens":10,"output_tokens":5,'
+        '"cache_read_input_tokens":100}}'
+    )
+    u = U.parse_usage_from_text(blob)
+    assert u["cache_creation_tokens"] == 0
+    assert u["cache_read_tokens"] == 100
+
+
+def test_parse_with_leading_text():
+    """The agent may print text before the trailing JSON; we still find it."""
+    text = "some agent stdout line\nanother line\n" + REAL_RESULT_JSON
+    u = U.parse_usage_from_text(text)
+    assert u is not None
+    assert u["input_tokens"] == 45231
+    assert u["output_tokens"] == 12100
+
+
+def test_parse_garbage_returns_none():
+    assert U.parse_usage_from_text("not json at all { broken") is None
+    assert U.parse_usage_from_text("") is None
+    assert U.parse_usage_from_text(None) is None
+
+
+def test_parse_json_without_usage_returns_none():
+    assert U.parse_usage_from_text('{"hello":"world"}') is None
+
+
+def test_parse_from_log_missing_file_returns_none():
+    assert U.parse_usage_from_log("/no/such/file.log") is None
+
+
+# --------------------------------------------------------------------------- #
+# record_usage
+# --------------------------------------------------------------------------- #
+def _new_run(agent="developer", task_id=1):
+    conn = get_db()
+    cur = conn.execute("INSERT INTO agent_runs (task_id, agent) VALUES (?, ?)", (task_id, agent))
+    rid = cur.lastrowid
+    conn.commit()
+    conn.close()
+    return rid
+
+
+def test_record_usage_writes_columns():
+    rid = _new_run()
+    u = U.parse_usage_from_text(REAL_RESULT_JSON)
+    U.record_usage(rid, u)
+    conn = get_db()
+    row = conn.execute(
+        "SELECT input_tokens, output_tokens, cache_read_tokens, "
+        "cache_creation_tokens, cost_usd "
+        "FROM agent_runs WHERE id=?", (rid,)
+    ).fetchone()
+    conn.close()
+    assert row["input_tokens"] == 45231
+    assert row["output_tokens"] == 12100
+    assert row["cache_read_tokens"] == 18500
+    # FIX 2: cache_creation column is now persisted.
+    assert row["cache_creation_tokens"] == 7418
+    assert abs(row["cost_usd"] - 0.0560175) < 1e-9
+
+
+def test_record_usage_none_writes_nulls():
+    rid = _new_run()
+    U.record_usage(rid, None)  # must not raise
+    conn = get_db()
+    row = conn.execute("SELECT input_tokens, cost_usd FROM agent_runs WHERE id=?", (rid,)).fetchone()
+    conn.close()
+    assert row["input_tokens"] is None
+    assert row["cost_usd"] is None
+
+
+# --------------------------------------------------------------------------- #
+# formatting
+# --------------------------------------------------------------------------- #
+def test_fmt_tokens():
+    assert U.fmt_tokens(6) == "6"
+    assert U.fmt_tokens(1234) == "1.2k"
+    assert U.fmt_tokens(45231) == "45.2k"
+    assert U.fmt_tokens(2_500_000) == "2.5M"
+    assert U.fmt_tokens(None) == "0"
+
+
+def test_fmt_cost():
+    assert U.fmt_cost(0.21) == "$0.21"
+    assert U.fmt_cost(0.0560175) == "$0.06"
+    assert U.fmt_cost(None) == "$0.00"
+
+
+def test_usage_comment_format():
+    # No cache -> in_total == input_tokens, no cached breakdown shown.
+    u = {"input_tokens": 45231, "output_tokens": 12100, "cost_usd": 0.21}
+    c = U.usage_comment("developer", u)
+    assert "Developer" in c
+    assert "45.2k in" in c
+    assert "cached" not in c
+    assert "12.1k out" in c
+    assert "$0.21" in c
+
+
+def test_usage_comment_shows_full_input_with_cached():
+    """FIX 2: in = input + cache_read + cache_creation, with cached breakdown."""
+    u = {
+        "input_tokens": 81,
+        "cache_read_tokens": 8_400_000,
+        "cache_creation_tokens": 100_000,
+        "output_tokens": 45_800,
+        "cost_usd": 7.29,
+    }
+    c = U.usage_comment("developer", u)
+    # total in = 8_500_081 -> 8.5M ; cached = 8_500_000 -> 8.5M
+    assert "8.5M in (8.5M cached)" in c
+    assert "45.8k out" in c
+    assert "$7.29" in c
+
+
+def test_usage_comment_no_cached_when_zero():
+    u = {"input_tokens": 1234, "cache_read_tokens": 0,
+         "cache_creation_tokens": 0, "output_tokens": 50, "cost_usd": 0.01}
+    c = U.usage_comment("developer", u)
+    assert "1.2k in" in c
+    assert "cached" not in c
+
+
+# --------------------------------------------------------------------------- #
+# FIX 4: per-agent artifact links in finish comments
+# --------------------------------------------------------------------------- #
+def _ctx():
+    return dict(repo="enduro-trails", branch="feature/ET-012-x",
+               work_item_id="ET-012")
+
+
+def test_usage_comment_reviewer_links_review_doc():
+    c = U.usage_comment("reviewer", {"input_tokens": 5}, **_ctx())
+    assert "12-review.md" in c
+    assert "ET-012" in c
+
+
+def test_usage_comment_tester_links_test_report():
+    c = U.usage_comment("tester", {"input_tokens": 5}, **_ctx())
+    assert "13-test-report.md" in c
+
+
+def test_usage_comment_deployer_links_deploy_log():
+    c = U.usage_comment("deployer", {"input_tokens": 5}, **_ctx())
+    assert "14-deploy-log.md" in c
+
+
+def test_usage_comment_developer_links_pr_and_branch():
+    c = U.usage_comment("developer", {"input_tokens": 5}, pr_number=7, **_ctx())
+    assert "pulls/7" in c
+    assert "feature/ET-012-x" in c
+
+
+def test_usage_comment_architect_links_adr():
+    c = U.usage_comment("architect", {"input_tokens": 5}, **_ctx())
+    assert "06-adr" in c
+
+
+def test_usage_comment_no_links_without_context():
+    """Without repo/branch context, no links are appended (no crash)."""
+    c = U.usage_comment("reviewer", {"input_tokens": 5})
+    assert "12-review.md" not in c
+    assert "http" not in c
+
+
+# --------------------------------------------------------------------------- #
+# task summary
+# --------------------------------------------------------------------------- #
+def test_task_summary_aggregates_over_agents():
+    # two runs for the same task: developer + tester
+    for agent, ti, to, cost in [("developer", 1000, 200, 0.10), ("tester", 500, 100, 0.05)]:
+        rid = _new_run(agent=agent, task_id=42)
+        U.record_usage(rid, {"input_tokens": ti, "output_tokens": to,
+                             "cache_read_tokens": 0, "cost_usd": cost})
+
+    s = U.task_usage_summary(42)
+    assert s["total_in"] == 1500
+    assert s["total_out"] == 300
+    assert abs(s["total_cost"] - 0.15) < 1e-9
+    agents = {a for a, *_ in s["per_agent"]}
+    assert agents == {"developer", "tester"}
+
+    comment = U.task_summary_comment(42)
+    assert "1.5k" in comment       # total in
+    assert "$0.15" in comment       # total cost
+    assert "Developer" in comment
+    assert "Tester" in comment
+
+
+def test_task_summary_sums_all_three_input_components():
+    """FIX 2: total_in = SUM(input + cache_read + cache_creation); total_cached too."""
+    rid = _new_run(agent="developer", task_id=77)
+    U.record_usage(rid, {
+        "input_tokens": 100,
+        "cache_read_tokens": 2000,
+        "cache_creation_tokens": 900,
+        "output_tokens": 50,
+        "cost_usd": 0.10,
+    })
+    rid2 = _new_run(agent="tester", task_id=77)
+    U.record_usage(rid2, {
+        "input_tokens": 10,
+        "cache_read_tokens": 500,
+        "cache_creation_tokens": 0,
+        "output_tokens": 5,
+        "cost_usd": 0.05,
+    })
+    s = U.task_usage_summary(77)
+    # total_in = (100+2000+900) + (10+500+0) = 3510
+    assert s["total_in"] == 3510
+    # total_cached = (2000+900) + (500+0) = 3400
+    assert s["total_cached"] == 3400
+    assert s["total_out"] == 55
+    comment = U.task_summary_comment(77)
+    assert "cached" in comment
+
+
+def test_task_summary_handles_null_cache_creation():
+    """Pre-existing rows (NULL cache_creation) must not break aggregation."""
+    rid = _new_run(agent="developer", task_id=88)
+    conn = get_db()
+    conn.execute(
+        "UPDATE agent_runs SET input_tokens=100, cache_read_tokens=200, "
+        "cache_creation_tokens=NULL, output_tokens=10, cost_usd=0.01 WHERE id=?",
+        (rid,),
+    )
+    conn.commit()
+    conn.close()
+    s = U.task_usage_summary(88)  # must not raise
+    assert s["total_in"] == 300   # 100 + 200 + (NULL->0)
+    assert s["total_cached"] == 200
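Reviewer note on tests/test_usage.py: the formatting and aggregation tests above pin down behaviour without showing the shape of the code under test. The following is a minimal sketch of helpers that would satisfy those assertions, not the contents of `src/usage.py`. The function bodies, the `summarize` name, and the reduced `agent_runs` column set are assumptions made for illustration.

```python
import sqlite3


def fmt_tokens(n):
    """Compact token count: 6 -> '6', 1234 -> '1.2k', 2_500_000 -> '2.5M'."""
    n = int(n or 0)  # None is treated as zero
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.1f}k"
    return str(n)


def fmt_cost(cost):
    """USD with two decimals; None is treated as zero."""
    return f"${(cost or 0.0):.2f}"


def summarize(conn, task_id):
    """NULL-safe totals over a hypothetical agent_runs table.

    Every column is wrapped in COALESCE so rows written before the
    cache_creation_tokens column existed (NULL) still aggregate.
    """
    row = conn.execute(
        "SELECT "
        "COALESCE(SUM(COALESCE(input_tokens, 0) + COALESCE(cache_read_tokens, 0)"
        " + COALESCE(cache_creation_tokens, 0)), 0), "
        "COALESCE(SUM(COALESCE(cache_read_tokens, 0)"
        " + COALESCE(cache_creation_tokens, 0)), 0), "
        "COALESCE(SUM(COALESCE(output_tokens, 0)), 0) "
        "FROM agent_runs WHERE task_id = ?",
        (task_id,),
    ).fetchone()
    return {"total_in": row[0], "total_cached": row[1], "total_out": row[2]}
```

The outer COALESCE covers the case where a task has no runs at all, since SUM over zero rows yields NULL in SQLite.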
--- a/tests/test_verdict_status.py
+++ b/tests/test_verdict_status.py
@@ -0,0 +1,171 @@
+"""Status-only verdict model: verdict statuses Approved / Rejected.
+
+  * issue updated -> Approved  : calls _try_advance_stage, with NO intermediate
+    set_issue_in_progress reset (bug 3 fix).
+  * issue updated -> Rejected  : calls _rollback_stage, with the reason pulled
+    from the issue's latest comment.
+  * COMMENTS NEVER trigger the pipeline: a :approved: / :rejected: comment is a
+    pure no-op (the comment-based control mechanism was removed).
+
+We mock the shared engine entry points (_try_advance_stage / _rollback_stage)
+and assert they fire ONLY for the status trigger, never for a comment.
+"""
+
+import os
+import tempfile
+
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_verdict.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
+os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
+os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
+
+import pytest  # noqa: E402
+from unittest.mock import patch, AsyncMock  # noqa: E402
+from fastapi.testclient import TestClient  # noqa: E402
+
+from src.main import app  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import projects as P  # noqa: E402
+from src.projects import reload_projects  # noqa: E402
+
+ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
+APPROVED = "a519a341-dada-4a91-8910-7604f82b79c5"
+REJECTED = "ba958f3c-5db5-461d-8f82-89425e413b97"
+
+client = TestClient(app)
+
+
+@pytest.fixture(autouse=True)
+def setup(monkeypatch):
+    monkeypatch.setattr(P.settings, "db_path", _test_db)
+    import src.db as _db
+    monkeypatch.setattr(_db.settings, "db_path", _test_db)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
+    registry_json = (
+        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
+        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
+    )
+    monkeypatch.setattr(P.settings, "projects_json", registry_json)
+    reload_projects()
+    # Seed a task at the 'review' stage for plane_id 'v-1'.
+    conn = get_db()
+    conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
+        "VALUES (?, ?, ?, ?, ?, ?)",
+        ("v-1", "ET-500", "enduro-trails", "feature/ET-500-x", "review", "v-1"),
+    )
+    conn.commit()
+    conn.close()
+    yield
+    reload_projects()
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+def _status(state_id, plane_id="v-1", old="prev"):
+    return client.post("/webhook/plane", json={
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": plane_id, "name": "Verdict task", "project": ENDURO_PLANE_ID,
+            "state": {"id": state_id, "name": "X", "group": "started"},
+        },
+        "activity": {"field": "state", "new_value": state_id, "old_value": old},
+    })
+
+
+def _comment(text, plane_id="v-1"):
+    return client.post("/webhook/plane", json={
+        "event": "issue_comment", "action": "created",
+        "data": {"work_item_id": plane_id, "comment_stripped": text,
+                 "project": ENDURO_PLANE_ID},
+    })
+
+
+class _FakeResp:
+    def __init__(self, status_code, payload):
+        self.status_code = status_code
+        self._payload = payload
+
+    def json(self):
+        return self._payload
+
+
+def _comments_response(comments):
+    return _FakeResp(200, {"results": comments})
+
+
+# --------------------------------------------------------------------------- #
+# Approved status -> advance (no in_progress reset)
+# --------------------------------------------------------------------------- #
+@patch("src.plane_sync.set_issue_in_progress")
+@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
+def test_approved_status_advances(mock_advance, mock_sip):
+    resp = _status(APPROVED)
+    assert resp.status_code == 200
+    mock_advance.assert_awaited_once()
+    # advanced the right task (ET-500 at review)
+    args = mock_advance.call_args.args
+    assert "ET-500" in args  # work_item_id is passed positionally
+    # bug 3 fix: handle_verdict no longer resets the status to In Progress.
+    mock_sip.assert_not_called()
+
+
+@patch("src.plane_sync.set_issue_in_progress")
+@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
+@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
+def test_approved_comment_is_noop(mock_advance, mock_rollback, mock_sip):
+    """Status-only model: a :approved: comment NEVER advances the pipeline."""
+    resp = _comment(":approved:")
+    assert resp.status_code == 200
+    mock_advance.assert_not_called()
+    mock_rollback.assert_not_called()
+    mock_sip.assert_not_called()
+
+
+# --------------------------------------------------------------------------- #
+# Rejected status -> rollback (reason from latest comment)
+# --------------------------------------------------------------------------- #
+@patch("src.webhooks.plane.httpx.get")
+@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
+def test_rejected_status_rolls_back(mock_rollback, mock_get):
+    mock_get.return_value = _comments_response(
+        [{"comment_stripped": "ADR missing tradeoffs",
+          "created_at": "2026-06-03T10:00:00Z"}]
+    )
+    resp = _status(REJECTED)
+    assert resp.status_code == 200
+    mock_rollback.assert_awaited_once()
+    # reason pulled from the latest comment
+    reason = mock_rollback.call_args.args[-1]
+    assert "ADR missing tradeoffs" in reason
+
+
+@patch("src.webhooks.plane.httpx.get")
+@patch("src.plane_sync.set_issue_in_progress")
+@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
+@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
+def test_rejected_comment_is_noop(mock_advance, mock_rollback, mock_sip, mock_get):
+    """Status-only model: a :rejected: comment NEVER rolls back the pipeline."""
+    resp = _comment(":rejected: bad ADR")
+    assert resp.status_code == 200
+    mock_advance.assert_not_called()
+    mock_rollback.assert_not_called()
+    mock_sip.assert_not_called()
+    mock_get.assert_not_called()
+
+
+# --------------------------------------------------------------------------- #
+# Unknown verdict status -> no-op
+# --------------------------------------------------------------------------- #
+@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
+@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
+def test_other_status_no_verdict_action(mock_advance, mock_rollback):
+    # In Review status is not a verdict -> neither advance nor rollback.
+    resp = _status("38fb1f64-aa1e-48a3-92e0-0b109679046b")  # in_review
+    assert resp.status_code == 200
+    mock_advance.assert_not_called()
+    mock_rollback.assert_not_called()
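Reviewer note on tests/test_verdict_status.py: the status-only model exercised above can be summarised as a pure decision function. This is a sketch under assumptions, not the handler in `src/webhooks/plane.py`: `classify_verdict` and `latest_reason` are hypothetical names, the state IDs are copied from the test constants, and picking the newest comment by comparing `created_at` strings assumes all timestamps share one ISO-8601 format.

```python
APPROVED = "a519a341-dada-4a91-8910-7604f82b79c5"
REJECTED = "ba958f3c-5db5-461d-8f82-89425e413b97"


def classify_verdict(payload):
    """Return 'advance', 'rollback', or None for a Plane webhook payload."""
    if payload.get("event") != "issue" or payload.get("action") != "updated":
        return None  # comments and other events never drive the pipeline
    state = (payload.get("data") or {}).get("state") or {}
    return {APPROVED: "advance", REJECTED: "rollback"}.get(state.get("id"))


def latest_reason(comments, default=""):
    """Text of the newest comment, judged by its created_at string."""
    if not comments:
        return default
    newest = max(comments, key=lambda c: c.get("created_at") or "")
    return newest.get("comment_stripped") or default
```

Keeping the decision separate from the side effects (advance, rollback, HTTP fetch of comments) is one way to make the "comments are a no-op" rule testable without mocks.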
--- a/tests/test_webhook_dedup.py
+++ b/tests/test_webhook_dedup.py
@@ -0,0 +1,284 @@
+"""ORCH-5 (M-7): webhook delivery de-duplication tests.
+
+A retried/replayed webhook delivery must be processed exactly once. We mock
+enqueue_job (imported into the gitea/plane module namespaces) and assert its
+call_count does not grow on a repeat. HMAC is bypassed here by forcing the
+webhook secrets empty (the 9 pre-existing 401 webhook tests are a separate
+baseline and are NOT touched). Two dedicated tests (Gitea and Plane) keep the
+401-on-bad-signature guarantee by re-enabling the secret.
+"""
+
+import os
+import tempfile
+from unittest.mock import patch, AsyncMock
+
+import pytest
+
+# Override DB path + project registry BEFORE importing app (same pattern as
+# tests/test_webhooks.py).
+_test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_dedup.db")
+os.environ["ORCH_DB_PATH"] = _test_db
+os.environ["ORCH_REPOS_DIR"] = tempfile.gettempdir()
+os.environ["ORCH_GITEA_TOKEN"] = "test-token"
+os.environ["ORCH_PLANE_API_TOKEN"] = "test-token"
+os.environ["ORCH_GITEA_OWNER"] = "admin"
+os.environ["ORCH_DEFAULT_REPO"] = "enduro-trails"
+os.environ["ORCH_PROJECTS_JSON"] = (
+    '[{"plane_project_id": "proj-1", "repo": "enduro-trails", '
+    '"work_item_prefix": "ET", "name": "enduro-trails"}]'
+)
+
+from fastapi.testclient import TestClient  # noqa: E402
+from src.main import app  # noqa: E402
+from src.db import init_db, get_db  # noqa: E402
+from src import db as db_module  # noqa: E402
+from src.webhooks import gitea as gitea_mod  # noqa: E402
+from src.webhooks import plane as plane_mod  # noqa: E402
+from src import projects as projects_mod  # noqa: E402
+
+
+@pytest.fixture(autouse=True)
+def setup_db(monkeypatch):
+    # settings is a process-wide singleton; another test module may have fixed
+    # settings.db_path to its own file at import time. get_db() reads it live, so
+    # pin it to OUR db for the duration of each test here.
+    monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    init_db()
+    yield
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+
+
+@pytest.fixture(autouse=True)
+def proj_registry():
+    """Pin the shared project registry to proj-1/enduro-trails.
+
+    The registry (projects.PROJECTS / _BY_PLANE_ID) is a process-wide singleton
+    built at import; test_projects.py rebuilds it via reload_projects(), which can
+    leave it on the built-in default where proj-1 is unknown -> ORCH-6 would
+    ignore our fixtures. Force ours for each test, then rebuild after.
+    """
+    os.environ["ORCH_PROJECTS_JSON"] = (
+        '[{"plane_project_id": "proj-1", "repo": "enduro-trails", '
+        '"work_item_prefix": "ET", "name": "enduro-trails"}]'
+    )
+    projects_mod.settings.projects_json = os.environ["ORCH_PROJECTS_JSON"]
+    projects_mod.reload_projects()
+    yield
+    projects_mod.reload_projects()
+
+
+@pytest.fixture(autouse=True)
+def no_hmac(monkeypatch):
+    """Bypass HMAC so dedup behavior (not signing) is under test.
+
+    settings is shared, so override the secret on the module-level settings that
+    each verify_* function reads.
+    """
+    monkeypatch.setattr(gitea_mod.settings, "gitea_webhook_secret", "", raising=False)
+    monkeypatch.setattr(plane_mod.settings, "plane_webhook_secret", "", raising=False)
+    yield
+
+
+client = TestClient(app)
+
+
+def _events_count():
+    conn = get_db()
+    n = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
+    conn.close()
+    return n
+
+
+# ---------------------------------------------------------------------------
+# Migration
+# ---------------------------------------------------------------------------
+
+def test_migration_adds_delivery_id_and_index():
+    """events has delivery_id + a partial unique index idx_events_delivery."""
+    conn = get_db()
+    cols = [r[1] for r in conn.execute("PRAGMA table_info(events)").fetchall()]
+    idxs = [r[1] for r in conn.execute("PRAGMA index_list(events)").fetchall()]
+    conn.close()
+    assert "delivery_id" in cols
+    assert "idx_events_delivery" in idxs
+
+
+def test_migration_on_old_db_without_column_does_not_crash():
+    """init_db() over a pre-existing events table WITHOUT delivery_id is safe."""
+    if os.path.exists(_test_db):
+        os.unlink(_test_db)
+    import sqlite3
+    conn = sqlite3.connect(_test_db)
+    # Old-shape events table (no delivery_id) + two legacy rows that get a NULL delivery_id.
+    conn.executescript(
+        """
+        CREATE TABLE events (
+            id INTEGER PRIMARY KEY AUTOINCREMENT,
+            timestamp TEXT DEFAULT (datetime('now')),
+            source TEXT NOT NULL,
+            event_type TEXT NOT NULL,
+            payload TEXT NOT NULL,
+            processed INTEGER DEFAULT 0
+        );
+        INSERT INTO events (source, event_type, payload) VALUES ('plane','old','{}');
+        INSERT INTO events (source, event_type, payload) VALUES ('gitea','old2','{}');
+        """
+    )
+    conn.commit()
+    conn.close()
+
+    # Should add the column + index without raising and keep the legacy rows.
+    init_db()
+
+    conn = get_db()
+    cols = [r[1] for r in conn.execute("PRAGMA table_info(events)").fetchall()]
+    n = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
+    conn.close()
+    assert "delivery_id" in cols
+    assert n == 2  # legacy NULL-delivery rows preserved, partial index lets them coexist
+
+
+# ---------------------------------------------------------------------------
+# Gitea dedup
+# ---------------------------------------------------------------------------
+
+@patch.object(gitea_mod, "enqueue_job")
+def test_gitea_duplicate_delivery_id_skips_dispatch(mock_enqueue):
+    """Repeated X-Gitea-Delivery -> first processed, second {"status":"duplicate"}."""
+    # Task at architecture so the ADR push would enqueue.
+    conn = get_db()
+    conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage) "
+        "VALUES (?, ?, ?, ?, ?)",
+        ("gd-001", "ET-100", "enduro-trails", "feature/ET-100-x", "architecture"),
+    )
+    conn.commit()
+    conn.close()
+
+    body = {
+        "ref": "refs/heads/feature/ET-100-x",
+        "repository": {"name": "enduro-trails"},
+        "commits": [
+            {"added": ["docs/work-items/ET-100/06-adr/001-d.md"], "modified": []}
+        ],
+    }
+    hdrs = {"X-Gitea-Event": "push", "X-Gitea-Delivery": "guid-AAA"}
+
+    r1 = client.post("/webhook/gitea", json=body, headers=hdrs)
+    assert r1.status_code == 200
+    assert r1.json()["status"] == "accepted"
+    assert mock_enqueue.call_count == 1
+    assert _events_count() == 1
+
+    # Same delivery id again -> duplicate, no new enqueue, no new event row.
+    r2 = client.post("/webhook/gitea", json=body, headers=hdrs)
+    assert r2.status_code == 200
+    assert r2.json()["status"] == "duplicate"
+    assert mock_enqueue.call_count == 1
+    assert _events_count() == 1
+
+
+@patch.object(gitea_mod, "enqueue_job")
+def test_gitea_two_distinct_delivery_ids_both_processed(mock_enqueue):
+    body = {"ref": "refs/heads/feature/none", "repository": {"name": "enduro-trails"}, "commits": []}
+    r1 = client.post("/webhook/gitea", json=body,
+                     headers={"X-Gitea-Event": "push", "X-Gitea-Delivery": "guid-1"})
+    r2 = client.post("/webhook/gitea", json=body,
+                     headers={"X-Gitea-Event": "push", "X-Gitea-Delivery": "guid-2"})
+    assert r1.json()["status"] == "accepted"
+    assert r2.json()["status"] == "accepted"
+    assert _events_count() == 2
+
+
+def test_gitea_fallback_hash_when_no_delivery_header():
+    """No X-Gitea-Delivery -> sha256 fallback; identical body repeat = duplicate."""
+    body = {"ref": "refs/heads/feature/none", "repository": {"name": "enduro-trails"}, "commits": []}
+    r1 = client.post("/webhook/gitea", json=body, headers={"X-Gitea-Event": "push"})
+    r2 = client.post("/webhook/gitea", json=body, headers={"X-Gitea-Event": "push"})
+    assert r1.json()["status"] == "accepted"
+    assert r2.json()["status"] == "duplicate"
+    assert _events_count() == 1
+
+
+# ---------------------------------------------------------------------------
+# Plane dedup
+# ---------------------------------------------------------------------------
+
+@patch.object(plane_mod, "enqueue_job")
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+def test_plane_fallback_hash_dedup(mock_docs, mock_branch, mock_enqueue):
+    """Repeated identical Plane body -> first accepted+enqueue, repeat duplicate.
+
+    Feature 1: the pipeline now starts on a status change to In Progress, not on
+    creation, so this drives the dedup test with an 'issue updated' event.
+    """
+    IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
+    body = {
+        "event": "issue",
+        "action": "updated",
+        "data": {
+            "id": "pd-001",
+            "name": "Dedup plane task",
+            "description_stripped": "A sufficiently long description for QG-0 to pass.",
+            "project": "proj-1",
+            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
+        },
+    }
+    r1 = client.post("/webhook/plane", json=body)
+    assert r1.status_code == 200
+    assert r1.json()["status"] == "accepted"
+    assert mock_enqueue.call_count == 1
+    assert _events_count() == 1
+
+    r2 = client.post("/webhook/plane", json=body)
+    assert r2.status_code == 200
+    assert r2.json()["status"] == "duplicate"
+    assert mock_enqueue.call_count == 1  # not re-enqueued
+    assert _events_count() == 1
+
+
+@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
+@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
+def test_plane_unknown_project_first_delivery_still_ignored(mock_docs, mock_branch):
+    """ORCH-6 intact: first delivery of an unknown project -> {"status":"ignored"}."""
+    body = {
+        "event": "work_item.created",
+        "data": {"id": "unk-001", "name": "Unknown project task", "project": "proj-UNKNOWN"},
+    }
+    r1 = client.post("/webhook/plane", json=body)
+    assert r1.status_code == 200
+    assert r1.json()["status"] == "ignored"
+    # Event WAS logged (dedup happens before the project filter), so a retry of the
+    # SAME body is a duplicate, not re-evaluated.
+    assert _events_count() == 1
+    r2 = client.post("/webhook/plane", json=body)
+    assert r2.json()["status"] == "duplicate"
+    assert _events_count() == 1
+
+
+# ---------------------------------------------------------------------------
+# HMAC still guarded (acceptance #4) — independent of the dedup path
+# ---------------------------------------------------------------------------
+
+def test_gitea_invalid_signature_still_401(monkeypatch):
+    monkeypatch.setattr(gitea_mod.settings, "gitea_webhook_secret", "s3cr3t", raising=False)
+    r = client.post(
+        "/webhook/gitea",
+        json={"ref": "refs/heads/feature/x", "repository": {"name": "enduro-trails"}, "commits": []},
+        headers={"X-Gitea-Event": "push", "X-Gitea-Signature": "deadbeef"},
+    )
+    assert r.status_code == 401
+
+
+def test_plane_invalid_signature_still_401(monkeypatch):
+    monkeypatch.setattr(plane_mod.settings, "plane_webhook_secret", "s3cr3t", raising=False)
+    r = client.post(
+        "/webhook/plane",
+        json={"event": "work_item.created", "data": {"id": "z", "project": "proj-1"}},
+        headers={"X-Plane-Signature": "deadbeef"},
+    )
+    assert r.status_code == 401
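Reviewer note on tests/test_webhook_dedup.py: the tests above rely on a partial unique index on `events.delivery_id` plus a sha256 fallback when no delivery header is sent. The sketch below illustrates that technique with an in-memory SQLite database. It is an assumption about how such dedup can be built, not the code in `src/db.py` or the webhook handlers: `make_db`, `delivery_key`, and `record_once` are hypothetical names, and the key format is invented for the example.

```python
import hashlib
import sqlite3


def make_db():
    """In-memory events table with a partial unique index on delivery_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "source TEXT NOT NULL, event_type TEXT NOT NULL, "
        "payload TEXT NOT NULL, delivery_id TEXT)"
    )
    # Partial index: uniqueness applies only to non-NULL ids, so legacy
    # rows with NULL delivery_id can coexist.
    conn.execute(
        "CREATE UNIQUE INDEX idx_events_delivery ON events(delivery_id) "
        "WHERE delivery_id IS NOT NULL"
    )
    return conn


def delivery_key(source, header_id, body):
    """Prefer the sender's delivery id; otherwise hash the raw body."""
    if header_id:
        return f"{source}:{header_id}"
    return f"{source}:sha256:{hashlib.sha256(body).hexdigest()}"


def record_once(conn, source, event_type, body, header_id=None):
    """Insert the event; report 'duplicate' if the key was already seen."""
    key = delivery_key(source, header_id, body)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events (source, event_type, payload, delivery_id) "
                "VALUES (?, ?, ?, ?)",
                (source, event_type, body.decode("utf-8", "replace"), key),
            )
    except sqlite3.IntegrityError:
        return "duplicate"
    return "accepted"
```

Letting the database enforce uniqueness avoids a check-then-insert race between two concurrent deliveries of the same event.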
--- a/tests/test_webhooks.py
+++ b/tests/test_webhooks.py
@@ -1,4 +1,5 @@
 import pytest
+import asyncio
 import os
 import tempfile
 from unittest.mock import patch, MagicMock, AsyncMock
@@ -95,27 +96,32 @@ def test_plane_webhook_generates_sequential_ids(mock_docs, mock_branch):
    assert ids[1] == "ET-002"


+APPROVED_STATE = "a519a341-dada-4a91-8910-7604f82b79c5"
+REJECTED_STATE = "ba958f3c-5db5-461d-8f82-89425e413b97"
+
+
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane.launcher")
 def test_plane_approved_advances_stage(mock_launcher, mock_docs, mock_branch, tmp_path, monkeypatch):
-    """Comment :approved: at stage=analysis → advance to architecture."""
+    """Status-only model: Approved STATUS at stage=analysis -> advance to
+    architecture. A comment never triggers this.
+    """
    # Patch repos_dir for QG check
    monkeypatch.setattr("src.qg.checks.settings.repos_dir", str(tmp_path))

-    # Create task first
-    client.post("/webhook/plane", json={
-        "event": "work_item.created",
-        "data": {"id": "adv-001", "name": "Advance test", "project": "proj-1"}
-    })
-
-    # Get the task to find work_item_id
+    # Seed an analysis task directly (creation no longer makes a task post-PR#11).
    conn = get_db()
-    task = conn.execute("SELECT * FROM tasks WHERE plane_id = 'adv-001'").fetchone()
+    conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
+        "VALUES (?, ?, ?, ?, ?, ?)",
+        ("adv-001", "ET-001", "enduro-trails", "feature/ET-001-x", "analysis", "adv-001"),
+    )
+    conn.commit()
    conn.close()
-    work_item_id = task["work_item_id"]
+    work_item_id = "ET-001"

-    # Create required analysis files
+    # Create required analysis files so the analysis QG passes.
    wi_dir = tmp_path / "enduro-trails" / "docs" / "work-items" / work_item_id
    wi_dir.mkdir(parents=True)
    (wi_dir / "01-brd.md").write_text("# BRD")
@@ -123,16 +129,15 @@ def test_plane_approved_advances_stage(mock_launcher, mock_docs, mock_branch, tm
    (wi_dir / "03-acceptance-criteria.md").write_text("# AC")
    (wi_dir / "04-test-plan.yaml").write_text("tests: []")

-    # Mock launcher
    mock_launcher.launch.return_value = 1

-    # Send approved comment
+    # Send Approved STATUS change.
    resp = client.post("/webhook/plane", json={
-        "event": "comment.created",
+        "event": "issue", "action": "updated",
        "data": {
-            "work_item_id": "adv-001",
-            "comment": "Looks good :approved:"
-        }
+            "id": "adv-001", "name": "Advance test", "project": "proj-1",
+            "state": {"id": APPROVED_STATE, "name": "Approved", "group": "completed"},
+        },
    })
    assert resp.status_code == 200

@@ -143,29 +148,39 @@ def test_plane_approved_advances_stage(mock_launcher, mock_docs, mock_branch, tm
    assert task["stage"] == "architecture"


+@patch("src.webhooks.plane.httpx.get")
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
-def test_plane_rejected_rolls_back(mock_docs, mock_branch):
-    """Comment :rejected: rolls back stage."""
-    # Create task
-    client.post("/webhook/plane", json={
-        "event": "work_item.created",
-        "data": {"id": "rej-001", "name": "Reject test", "project": "proj-1"}
-    })
+def test_plane_rejected_rolls_back(mock_docs, mock_branch, mock_get):
+    """Status-only model: Rejected STATUS rolls back stage. A comment never
+    triggers this; the reason is pulled from the latest comment.
+    """
+    class _R:
+        status_code = 200
+        @staticmethod
+        def json():
+            return {"results": [
+                {"comment_stripped": "missing ADR", "created_at": "2026-06-03T10:00:00Z"}
+            ]}
+    mock_get.return_value = _R()

-    # Manually set stage to architecture
+    # Seed an architecture task directly.
    conn = get_db()
-    conn.execute("UPDATE tasks SET stage = 'architecture' WHERE plane_id = 'rej-001'")
+    conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
+        "VALUES (?, ?, ?, ?, ?, ?)",
+        ("rej-001", "ET-002", "enduro-trails", "feature/ET-002-x", "architecture", "rej-001"),
+    )
    conn.commit()
    conn.close()

-    # Send rejected comment
+    # Send Rejected STATUS change.
    resp = client.post("/webhook/plane", json={
-        "event": "comment.created",
+        "event": "issue", "action": "updated",
        "data": {
-            "work_item_id": "rej-001",
-            "comment": "Not ready :rejected:"
-        }
+            "id": "rej-001", "name": "Reject test", "project": "proj-1",
+            "state": {"id": REJECTED_STATE, "name": "Rejected", "group": "cancelled"},
+        },
    })
    assert resp.status_code == 200

@@ -258,6 +273,46 @@ def test_gitea_ci_success_advances_to_review(mock_launcher, mock_ci):
    assert task["stage"] == "review"


+@patch("src.webhooks.gitea.notify_qg_failure")
+@patch("src.webhooks.gitea.launcher")
+def test_gitea_ci_failure_on_development_notifies_qg_failure(mock_launcher, mock_notify):
+    """BUG 6: a CI failure at development is now an authoritative quality gate (QG) failure.
+
+    The handler must report the QG failure (not silently suppress it) and must NOT advance the stage.
+    """
+    conn = get_db()
+    conn.execute(
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage) VALUES (?, ?, ?, ?, ?)",
+        ("ci-fail-001", "ET-011", "enduro-trails", "feature/ET-011-test", "development"),
+    )
+    conn.commit()
+    conn.close()
+
+    resp = client.post(
+        "/webhook/gitea",
+        json={
+            "state": "failure",
+            "branches": [{"name": "feature/ET-011-test"}],
+            "repository": {"name": "enduro-trails"},
+        },
+        headers={"X-Gitea-Event": "status"},
+    )
+    assert resp.status_code == 200
+
+    # QG failure was reported for the development stage with check_ci_green.
+    assert mock_notify.called
+    args, kwargs = mock_notify.call_args
+    call = list(args) + list(kwargs.values())
+    assert "development" in call
+    assert "check_ci_green" in call
+
+    # Stage did NOT advance.
+    conn = get_db()
+    task = conn.execute("SELECT * FROM tasks WHERE plane_id = 'ci-fail-001'").fetchone()
+    conn.close()
+    assert task["stage"] == "development"
+
+
 def test_gitea_webhook_pr():
    """PR event is accepted."""
    resp = client.post(
@@ -287,3 +342,158 @@ def test_plane_webhook_event_logged():
    conn.close()
    assert event is not None
    assert event["source"] == "plane"
+
+
+# ---------------------------------------------------------------------------
+# BUG 7: red CI on development must bounce the task back to the developer
+# (capped retries, symmetric to review REQUEST_CHANGES). These are pure-logic
+# tests: they invoke handle_ci_status() directly with mocked helpers so they do
+# not pass through the TestClient HMAC barrier (baseline 401s are off-limits).
+# ---------------------------------------------------------------------------
+
+def _ci_failure_payload():
+    return {
+        "state": "failure",
+        "branches": [{"name": "feature/ET-011-test"}],
+        "repository": {"name": "enduro-trails"},
+    }
+
+
+def _mock_db_with_retry_count(count):
+    """Build a get_db() mock whose retry_count query returns `count`."""
+    conn = MagicMock()
+    conn.execute.return_value.fetchone.return_value = {"cnt": count}
+    return conn
+
+
+@patch("src.webhooks.gitea.notify_error")
+@patch("src.webhooks.gitea.notify_qg_failure")
+@patch("src.webhooks.gitea.enqueue_job")
+@patch("src.webhooks.gitea.update_task_stage")
+@patch("src.webhooks.gitea.get_db")
+@patch("src.webhooks.gitea.get_task_by_repo_branch")
+@patch("src.webhooks.gitea.get_project_by_repo")
+def test_ci_failure_development_retries_developer_under_limit(
+    mock_proj, mock_task, mock_get_db, mock_update_stage,
+    mock_enqueue, mock_qg, mock_err,
+):
+    """retry_count < MAX_DEV_RETRIES → relaunch developer, stage untouched."""
+    from src.webhooks.gitea import handle_ci_status
+
+    mock_proj.return_value = {"repo": "enduro-trails"}
+    mock_task.return_value = {
+        "id": 1, "stage": "development", "work_item_id": "ET-011",
+    }
+    mock_get_db.return_value = _mock_db_with_retry_count(0)
+    mock_enqueue.return_value = 42
+
+    asyncio.run(handle_ci_status(_ci_failure_payload()))
+
+    # QG failure was still reported (Slava sees both the failure and the retry).
+    assert mock_qg.called
+    # developer was re-enqueued.
+    assert mock_enqueue.called
+    assert mock_enqueue.call_args[0][0] == "developer"
+    # No escalation.
+    assert not mock_err.called
+    # Stage stays on development — no update_task_stage in the CI-failure path.
+    assert not mock_update_stage.called
+
+
+@patch("src.webhooks.gitea.notify_error")
+@patch("src.webhooks.gitea.notify_qg_failure")
+@patch("src.webhooks.gitea.enqueue_job")
+@patch("src.webhooks.gitea.update_task_stage")
+@patch("src.webhooks.gitea.get_db")
+@patch("src.webhooks.gitea.get_task_by_repo_branch")
+@patch("src.webhooks.gitea.get_project_by_repo")
+def test_ci_failure_development_escalates_at_limit(
+    mock_proj, mock_task, mock_get_db, mock_update_stage,
+    mock_enqueue, mock_qg, mock_err,
+):
+    """retry_count >= MAX_DEV_RETRIES → escalate via notify_error, no relaunch."""
+    from src.webhooks.gitea import handle_ci_status, MAX_DEV_RETRIES
+
+    mock_proj.return_value = {"repo": "enduro-trails"}
+    mock_task.return_value = {
+        "id": 1, "stage": "development", "work_item_id": "ET-011",
+    }
+    mock_get_db.return_value = _mock_db_with_retry_count(MAX_DEV_RETRIES)
+
+    asyncio.run(handle_ci_status(_ci_failure_payload()))
+
+    # QG failure still reported.
+    assert mock_qg.called
+    # developer NOT re-enqueued at the cap.
+    assert not mock_enqueue.called
+    # Escalation message mentions CI failure.
+    assert mock_err.called
+    err_msg = " ".join(str(a) for a in mock_err.call_args[0])
+    assert "Max developer retries" in err_msg
+    assert "after CI failure" in err_msg
+    # Stage untouched.
+    assert not mock_update_stage.called
+
+
+# ---------------------------------------------------------------------------
+# BUG 8 (second door): a merged-PR webhook must NOT fake-complete a task that is
+# still in the deploy stage. On `deploy` done is gated by the deployer's verdict
+# (check_deploy_status via advance_stage), not by the merge event. For every
+# other stage the merge->done behaviour is preserved. Pure-logic tests: invoke
+# handle_pr() directly with mocked helpers (no HMAC barrier).
+# ---------------------------------------------------------------------------
+
+def _merged_pr_payload(branch="feature/ET-012-x"):
+    return {
+        "action": "closed",
+        "pull_request": {
+            "merged": True,
+            "number": 7,
+            "head": {"ref": branch},
+        },
+        "repository": {"name": "enduro-trails"},
+    }
+
+
+@patch("src.webhooks.gitea.notify_stage_change")
+@patch("src.webhooks.gitea.update_task_stage")
+@patch("src.webhooks.gitea.get_task_by_repo_branch")
+@patch("src.webhooks.gitea.get_project_by_repo")
+def test_merge_on_deploy_stage_does_not_set_done(
+    mock_proj, mock_task, mock_update_stage, mock_notify,
+):
+    """FIX 1: merge at deploy stage is ignored — done is gated by deployer verdict."""
+    from src.webhooks.gitea import handle_pr
+
+    mock_proj.return_value = {"repo": "enduro-trails"}
+    mock_task.return_value = {
+        "id": 1, "stage": "deploy", "work_item_id": "ET-012",
+    }
+
+    asyncio.run(handle_pr(_merged_pr_payload()))
+
+    # The merge-driven done path must NOT run on deploy.
+    assert not mock_update_stage.called
+    assert not mock_notify.called
+
+
+@patch("src.webhooks.gitea.notify_stage_change")
+@patch("src.webhooks.gitea.update_task_stage")
+@patch("src.webhooks.gitea.get_task_by_repo_branch")
+@patch("src.webhooks.gitea.get_project_by_repo")
+def test_merge_on_non_deploy_stage_sets_done(
+    mock_proj, mock_task, mock_update_stage, mock_notify,
+):
+    """FIX 1: merge behaviour is preserved for non-deploy stages (e.g. review)."""
+    from src.webhooks.gitea import handle_pr
+
+    mock_proj.return_value = {"repo": "enduro-trails"}
+    mock_task.return_value = {
+        "id": 2, "stage": "review", "work_item_id": "ET-013",
+    }
+
+    asyncio.run(handle_pr(_merged_pr_payload(branch="feature/ET-013-x")))
+
+    # Non-deploy stages still get the merge-driven done.
+    mock_update_stage.assert_called_once_with(2, "done")
+    assert mock_notify.called