fix(tracker): no duplicate Telegram messages on not-modified/transient edits

edit_telegram now returns a distinguishable outcome (ok|not_modified|gone|failed) instead of a bare bool. update_task_tracker sends a NEW message only when the original is truly gone; not_modified and transient failures no longer spawn duplicate trackers or orphan the live one.

render_task_tracker shows "попытка N" ("attempt N") on an actively re-run stage (>=2 agent runs) so the text changes between review<->development cycles. Finished (✅) lines are unchanged.

Tests: edit_telegram classification (ok/not_modified/gone/failed via mocked httpx), update_task_tracker (not_modified/failed -> no send, gone -> send+id), render attempt marker.
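
The outcome classification described in this commit message can be sketched as a pure function over the decoded editMessageText response body. This is a minimal illustration, not the code in the diff: `classify_edit_response` and `should_send_new` are hypothetical names introduced here, while the outcome strings and marker substrings are the ones listed in src/notifications.py.

```python
# Sketch only: classify_edit_response and should_send_new are hypothetical
# helper names; the outcome codes and marker strings mirror the diff.
EDIT_OK = "ok"
EDIT_NOT_MODIFIED = "not_modified"
EDIT_GONE = "gone"
EDIT_FAILED = "failed"

_NOT_MODIFIED_MARKERS = ("message is not modified", "exactly the same")
_GONE_MARKERS = (
    "message to edit not found",
    "message can't be edited",
    "message_id_invalid",
)


def classify_edit_response(data: dict) -> str:
    """Map a decoded editMessageText JSON body to one of the four outcomes."""
    if data.get("ok"):
        return EDIT_OK
    # Compare case-insensitively against the known description fragments.
    desc = str(data.get("description") or "").lower()
    if any(marker in desc for marker in _NOT_MODIFIED_MARKERS):
        return EDIT_NOT_MODIFIED
    if any(marker in desc for marker in _GONE_MARKERS):
        return EDIT_GONE
    # Anything unrecognised is treated as transient.
    return EDIT_FAILED


def should_send_new(outcome: str) -> bool:
    """Only a permanently un-editable message justifies a fresh tracker."""
    return outcome == EDIT_GONE
```

Keeping the classification separate from the HTTP call makes the four branches testable without mocking the network; the caller only has to ask `should_send_new(outcome)` before falling back to a new message.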
Merge pull request 'feat(telegram): live editable task tracker (Variant B+)' (#21) from feat/telegram-live-tracker into main
2026-06-04 13:20:40 +03:00 · 2026-06-04 11:46:21 +03:00 · 2026-06-04 11:42:46 +03:00 · 2026-06-04 11:21:50 +03:00 · 2026-06-04 11:17:58 +03:00 · 2026-06-04 02:46:52 +03:00
28 changed files with 4708 additions and 260 deletions
--- a/src/agents/launcher.py
+++ b/src/agents/launcher.py
@@ -209,9 +209,15 @@ class AgentLauncher:
        # No git fetch/checkout here: ensure_worktree() already put the worktree on
        # the right branch. The agent simply runs inside its isolated work_path.
        # Feature 4 (token usage): --output-format json makes claude emit a single
        # result JSON (with usage + total_cost_usd) at the end of stdout. The log
        # still captures it; _monitor_agent parses the trailing JSON after the run
        # to record per-agent tokens/cost. _monitor_agent's failure handling keys
        # off the process exit_code (not stdout shape), so this is safe.
        cmd = (
            f'cd {work_path} && '
            f'{self.CLAUDE_BIN} --print '
            f'--output-format json '
            f'{model_flag}'
            f'"$(cat {task_file})" '
            f'--system-prompt "$(cat {system_prompt})" '
@@ -400,6 +406,17 @@ class AgentLauncher:
        notify_agent_finished(run_id, agent, exit_code, task_id=_task_id, duration_s=_duration_s)
        # Feature 4: parse token usage / cost from the (json) run log and record
        # it on the agent_runs row. Never fatal — a garbled/missing JSON records
        # NULLs and logs a warning so a broken run can't crash the monitor.
        try:
            from ..usage import parse_usage_from_log, record_usage
            _usage = parse_usage_from_log(output_path) if output_path else None
            record_usage(run_id, _usage)
        except Exception as e:
            logger.warning(f"run_id={run_id}: usage accounting failed: {e}")
            _usage = None
        # Commit and push any changes — in the per-branch worktree (ORCH-2 / S-4),
        # NOT in the shared /repos/<repo>. The worktree is already on `branch`
        # (ensure_worktree did the checkout), so no checkout is needed here.
@@ -471,7 +488,8 @@ class AgentLauncher:
                set_issue_blocked(_wid)
                plane_add_comment(
                    _wid,
-                    "\u274c Deploy FAILED (smoke/healthcheck). Rolled back. Developer \u043d\u0443\u0436\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430."
+                    "\u274c Deploy FAILED (smoke/healthcheck). Rolled back. Developer \u043d\u0443\u0436\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430.",
                    author="deployer",
                )
                from ..notifications import send_telegram
                send_telegram(f"\U0001f6a8 {_wid}: Deploy failed! Rolled back. Needs fix.")
@@ -489,6 +507,14 @@ class AgentLauncher:
                from ..notifications import send_telegram
                send_telegram(f"\u26a0\ufe0f {_wid}: Agent {agent} failed (exit_code={exit_code}). Check logs: /app/data/runs/{run_id}.log")
        # Feature 4: post the per-agent usage comment under that agent's bot, and
        # — for the deployer finishing the task — the per-task usage summary.
        if exit_code == 0:
            try:
                self._post_usage_comments(run_id, agent, repo, branch, _usage)
            except Exception as e:
                logger.warning(f"run_id={run_id}: usage comment failed: {e}")
        # Auto-advance stage if agent finished successfully and QG passes
        if exit_code == 0:
            self._try_advance_stage(run_id, agent, repo, branch)
@@ -653,6 +679,69 @@ class AgentLauncher:
            logger.error(f"Auto-advance failed for run_id={run_id}: {e}")
    def _post_usage_comments(self, run_id, agent, repo, branch, usage):
        """Feature 4: post the per-agent usage comment (and Deployer summary).
        - Always (on success, with a work_item_id): a per-agent finish comment
          with token/cost, authored by the finishing agent's Plane bot.
        - When the deployer finishes: also a per-task summary (SUM over
          agent_runs GROUP BY agent), authored by the deployer.
        """
        from ..usage import usage_comment, task_summary_comment
        conn = get_db()
        row = conn.execute(
            "SELECT id, work_item_id FROM tasks WHERE repo=? AND branch=?",
            (repo, branch),
        ).fetchone()
        conn.close()
        if not row:
            return
        task_id, work_item_id = row[0], row[1]
        if not work_item_id:
            return
        # Observability: every agent's finish comment links its artifact(s)
        # (reviewer->12-review, tester->13-test-report, deployer->14-deploy-log,
        # architect->ADR, developer->PR/branch). For the developer we resolve the
        # open PR number so the link points straight at it.
        pr_number = None
        if agent == "developer":
            pr_number = self._open_pr_number(repo, branch)
        plane_add_comment(
            work_item_id,
            usage_comment(
                agent,
                usage,
                repo=repo,
                branch=branch,
                work_item_id=work_item_id,
                pr_number=pr_number,
            ),
            author=agent,
        )
        if agent == "deployer":
            plane_add_comment(
                work_item_id, task_summary_comment(task_id), author="deployer"
            )
    def _open_pr_number(self, repo: str, branch: str):
        """Return the open PR number for `branch`, or None. Never raises."""
        try:
            import httpx
            owner = settings.gitea_owner
            headers = {"Authorization": f"token {settings.gitea_token}"}
            resp = httpx.get(
                f"{settings.gitea_url}/api/v1/repos/{owner}/{repo}/pulls",
                params={"state": "open", "head": branch},
                headers=headers, timeout=5,
            )
            if resp.status_code == 200:
                prs = resp.json()
                if prs:
                    return prs[0].get("number")
        except Exception:
            pass
        return None
    def _ensure_pr(self, repo: str, branch: str, run_id: int):
        import httpx
        owner = settings.gitea_owner
--- a/src/config.py
+++ b/src/config.py
@@ -9,8 +9,20 @@ class Settings(BaseSettings):
    plane_webhook_secret: str = ""
    plane_project_id: str = ""
    # Per-agent Plane bot tokens (feat: per-agent comment authorship).
    # When set, add_comment posts under the matching bot so Plane shows the
    # real author (Analyst/Architect/...). Empty -> fallback to plane_api_token.
    plane_bot_analyst: str = ""
    plane_bot_architect: str = ""
    plane_bot_developer: str = ""
    plane_bot_reviewer: str = ""
    plane_bot_tester: str = ""
    plane_bot_deployer: str = ""
    plane_bot_stream: str = ""
    # Gitea
    gitea_url: str = "http://localhost:3000"
    gitea_public_url: str = ""  # external URL for clickable links in comments; falls back to gitea_url
    gitea_token: str = ""
    gitea_webhook_secret: str = ""
    gitea_owner: str = "admin"
--- a/src/db.py
+++ b/src/db.py
@@ -77,6 +77,38 @@ def init_db():
        "CREATE UNIQUE INDEX IF NOT EXISTS idx_events_delivery "
        "ON events(delivery_id) WHERE delivery_id IS NOT NULL"
    )
    # Feature 4 (token usage): per-run token / cost accounting. Parsed from the
    # claude --output-format json result by the launcher monitor. Idempotent
    # ALTERs (no-op once the columns exist) so this is safe on the live prod DB.
    _ensure_column(conn, "agent_runs", "input_tokens", "INTEGER")
    _ensure_column(conn, "agent_runs", "output_tokens", "INTEGER")
    _ensure_column(conn, "agent_runs", "cache_read_tokens", "INTEGER")
    # Observability fix: also persist cache-CREATION input tokens. Claude CLI
    # reports the real input split across input_tokens (fresh, ~tens) +
    # cache_read_input_tokens (cache hit, millions) + cache_creation_input_tokens
    # (writing new cache). Without this column the cache_creation slice is lost
    # and the "X in" figure understates the true prompt size. Idempotent ALTER.
    _ensure_column(conn, "agent_runs", "cache_creation_tokens", "INTEGER")
    _ensure_column(conn, "agent_runs", "cost_usd", "REAL")
    # Telegram live tracker (feat/telegram-live-tracker): persist the FULL model
    # name (e.g. "tokenator/claude-opus-4-8") per agent_runs row so the tracker
    # can render a short model tag per stage. Parsed from the run-log result JSON
    # (modelUsage key) by the launcher monitor; NULL when unknown. Idempotent ALTER.
    _ensure_column(conn, "agent_runs", "model", "TEXT")
    # Telegram live tracker: one editable Telegram message per task. We store its
    # message_id so each stage transition can editMessageText the same message
    # instead of spamming a new one. Idempotent ALTER (safe on the live prod DB).
    _ensure_column(conn, "tasks", "tracker_message_id", "INTEGER")
    # Telegram live tracker: human-readable task title for the tracker header
    # ("🛠️ ET-012 · <title>"). Populated from the Plane work-item name at task
    # creation; falls back to the work_item_id when absent. Idempotent ALTER.
    _ensure_column(conn, "tasks", "title", "TEXT")
    # Telegram live tracker: "BRD review" is the only HUMAN gate time — the delta
    # between "BRD ready / approve requested" and the analysis->architecture
    # advance (human flipped Plane to Approved). Persisted on the task so the
    # tracker can show "твоё время" ("your time") without recomputing from activity history.
    _ensure_column(conn, "tasks", "brd_review_started_at", "TEXT")
    _ensure_column(conn, "tasks", "brd_review_ended_at", "TEXT")
    conn.commit()
    conn.close()
@@ -124,6 +156,71 @@ def update_task_stage(task_id: int, stage: str):
    conn.close()
 # ---------------------------------------------------------------------------
 # Telegram live tracker helpers (feat/telegram-live-tracker)
 # ---------------------------------------------------------------------------
 def get_tracker_message_id(task_id: int) -> int | None:
    """Return the stored Telegram tracker message_id for a task, or None."""
    conn = get_db()
    try:
        row = conn.execute(
            "SELECT tracker_message_id FROM tasks WHERE id=?", (task_id,)
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row and row[0] is not None else None
 def set_tracker_message_id(task_id: int, message_id: int) -> None:
    """Persist the Telegram tracker message_id for a task (idempotent overwrite)."""
    conn = get_db()
    try:
        conn.execute(
            "UPDATE tasks SET tracker_message_id=? WHERE id=?",
            (message_id, task_id),
        )
        conn.commit()
    finally:
        conn.close()
 def mark_brd_review_started(task_id: int) -> None:
    """Stamp when BRD review (the human approve gate) started, if not already set.
    Idempotent: only sets it the first time (a retried analyst run must not reset
    the clock). The delta to brd_review_ended_at is the only "твоё время" ("your time").
    """
    conn = get_db()
    try:
        conn.execute(
            "UPDATE tasks SET brd_review_started_at=datetime('now') "
            "WHERE id=? AND brd_review_started_at IS NULL",
            (task_id,),
        )
        conn.commit()
    finally:
        conn.close()
 def mark_brd_review_ended(task_id: int) -> None:
    """Stamp when BRD review ended (analysis->architecture advance / Approved).
    Idempotent: only sets it the first time and only if a start exists.
    """
    conn = get_db()
    try:
        conn.execute(
            "UPDATE tasks SET brd_review_ended_at=datetime('now') "
            "WHERE id=? AND brd_review_started_at IS NOT NULL "
            "AND brd_review_ended_at IS NULL",
            (task_id,),
        )
        conn.commit()
    finally:
        conn.close()
 def get_next_work_item_id(repo: str, prefix: str = "ET") -> str:
    """Generate next work item ID (e.g., ET-003 / ORCH-001).
@@ -152,6 +249,44 @@ def get_next_work_item_id(repo: str, prefix: str = "ET") -> str:
    return f"{prefix}-{next_num:03d}"
 def ensure_unique_work_item_id(work_item_id: str, repo: str) -> str:
    """BUG 2a: guarantee work_item_id uniqueness within (repo) over M-6 derive.
    M-6 derives the work_item_id from the Plane sequence_id. That number can
    collide (e.g. an issue was deleted and the sequence reused, or two issues
    map to the same number) -> the SAME ET-NNN gets handed to two different
    tasks, which then physically share a branch/worktree slug prefix and step on
    each other (see ET-006: task 8 and task 25).
    This is a guard LAYERED ON TOP of the M-6 derive (it does NOT replace it):
    given the derived id, if that exact <PREFIX>-NNN already exists in the tasks
    table for this repo, walk forward (ET-007, ET-008, ...) until a free number
    is found and return that instead. If the derived id is free, it is returned
    unchanged.
    """
    if not work_item_id or "-" not in work_item_id:
        return work_item_id
    prefix, num_str = work_item_id.rsplit("-", 1)
    try:
        num = int(num_str)
    except ValueError:
        return work_item_id
    width = len(num_str)
    conn = get_db()
    try:
        candidate = work_item_id
        while conn.execute(
            "SELECT 1 FROM tasks WHERE repo = ? AND work_item_id = ? LIMIT 1",
            (repo, candidate),
        ).fetchone() is not None:
            num += 1
            candidate = f"{prefix}-{num:0{width}d}"
        return candidate
    finally:
        conn.close()
 # ---------------------------------------------------------------------------
 # ORCH-5 (M-7): idempotent webhook event logging
 # ---------------------------------------------------------------------------
@@ -306,6 +441,23 @@ def mark_job(
    conn.close()
 def has_active_job_for_task(task_id: int) -> bool:
    """True if the task already has a queued or running job.
    Used by the status-only verdict model (handle_status_start) to guard against
    double-launching an agent when a duplicate In Progress webhook arrives or a
    job is still in flight. The events de-dup absorbs identical webhook bodies;
    this guards against distinct webhooks while a job is pending/running.
    """
    conn = get_db()
    row = conn.execute(
        "SELECT 1 FROM jobs WHERE task_id = ? AND status IN ('queued','running') LIMIT 1",
        (task_id,),
    ).fetchone()
    conn.close()
    return row is not None
 def count_running_jobs() -> int:
    """Number of jobs currently in 'running' status (for max_concurrency)."""
    conn = get_db()
--- a/src/notifications.py
+++ b/src/notifications.py
@@ -1,6 +1,24 @@
-"""Notifications and logging for orchestrator events."""
+"""Notifications and logging for orchestrator events.
 feat/telegram-live-tracker (Variant B+): instead of ~15 separate Telegram
 messages per task (agent start / finish / stage transition / QG-pending / tech
 noise), the orchestrator now maintains ONE live tracker message per task that is
 edited in place (editMessageText) on every stage transition. Only events that
 NEED Slava's attention are sent as SEPARATE, notifying messages:
  * approve-gate  (notify_approve_requested)  — BRD/TZ/AC ready, flip to Approved
  * deploy failed / rolled back               — send_telegram from launcher/engine
  * agent failed (exit_code != 0)             — send_telegram from launcher
  * task error    (notify_error)
 The tracker itself is edited SILENTLY (disable_notification: true). Stage-change,
 agent-start, agent-finish and QG-pending no longer emit their own messages — they
 just refresh the tracker (or are log-only).
 """
 import html
 import logging
 import httpx
 logger = logging.getLogger("orchestrator")
@@ -17,25 +35,115 @@ def _get_settings():
    return _settings
-def send_telegram(text: str):
+# --------------------------------------------------------------------------- #
-    """Send notification to Telegram. Fire-and-forget, never raises."""
+# Low-level Telegram primitives
 # --------------------------------------------------------------------------- #
 def send_telegram(text: str, disable_notification: bool = False):
    """Send a notification to Telegram. Fire-and-forget, never raises.
    Returns the Telegram message_id on success, else None (so callers that want
    to track the message — the tracker — can store it; legacy callers ignore it).
    """
    s = _get_settings()
    if not s.telegram_bot_token or not s.telegram_chat_id:
-        return
+        return None
    try:
        url = f"https://api.telegram.org/bot{s.telegram_bot_token}/sendMessage"
-        httpx.post(
+        resp = httpx.post(
            url,
            json={
                "chat_id": s.telegram_chat_id,
                "text": text,
                "parse_mode": "HTML",
-                "disable_notification": False,
+                "disable_notification": disable_notification,
            },
            timeout=5,
        )
        data = resp.json()
        if data.get("ok"):
            return data["result"]["message_id"]
    except Exception:
        pass  # Never crash orchestrator due to notification failure
    return None
 # edit_telegram outcome codes -> let update_task_tracker decide what to do:
 #   "ok"           edit applied -> nothing else to do
 #   "not_modified" Telegram says text is identical (400 "message is not
 #                  modified" / "exactly the same") -> success, NO new message
 #   "gone"         original message can't be edited (deleted / too old /
 #                  invalid id) -> caller must fall back to a NEW message
 #   "failed"       transient failure (network / timeout / 5xx / unknown 400)
 #                  -> caller must NOT send a new message (avoid duplicates)
 EDIT_OK = "ok"
 EDIT_NOT_MODIFIED = "not_modified"
 EDIT_GONE = "gone"
 EDIT_FAILED = "failed"
 # Telegram error descriptions that mean the message is permanently un-editable
 # (it is gone / orphaned) -> fall back to a fresh message.
 _GONE_MARKERS = (
    "message to edit not found",
    "message can't be edited",
    "message_id_invalid",
 )
 # Telegram "nothing changed" -> treat as success, never a duplicate.
 _NOT_MODIFIED_MARKERS = (
    "message is not modified",
    "exactly the same",
 )
 def edit_telegram(message_id: int, text: str) -> str:
    """Edit an existing Telegram message. Never raises.
    Returns a distinguishable outcome (see EDIT_* constants) so the caller can
    tell apart "all good" / "nothing changed" / "message gone" / "transient
    failure" and only fall back to a NEW message when the original is truly gone.
    """
    s = _get_settings()
    if not s.telegram_bot_token or not s.telegram_chat_id:
        return EDIT_FAILED
    try:
        url = f"https://api.telegram.org/bot{s.telegram_bot_token}/editMessageText"
        resp = httpx.post(
            url,
            json={
                "chat_id": s.telegram_chat_id,
                "message_id": message_id,
                "text": text,
                "parse_mode": "HTML",
            },
            timeout=5,
        )
        data = resp.json()
        if data.get("ok"):
            return EDIT_OK
        # ok:false -> inspect the description to classify the 400.
        desc = str(data.get("description") or "").lower()
        if any(m in desc for m in _NOT_MODIFIED_MARKERS):
            # Text is identical between transitions (e.g. repeat review cycle
            # renders the same line). Nothing to do, NOT a duplicate.
            logger.debug(
                f"edit_telegram(mid={message_id}): not modified, skipping"
            )
            return EDIT_NOT_MODIFIED
        if any(m in desc for m in _GONE_MARKERS):
            logger.warning(
                f"edit_telegram(mid={message_id}): message gone ({desc!r}), "
                f"will fall back to a new message"
            )
            return EDIT_GONE
        # Unknown 400 / other non-ok -> transient/unknown, do NOT duplicate.
        logger.warning(
            f"edit_telegram(mid={message_id}): edit failed ({desc!r})"
        )
        return EDIT_FAILED
    except Exception as e:
        # Network / timeout / 5xx -> transient, do NOT duplicate.
        logger.warning(f"edit_telegram(mid={message_id}): transient error: {e}")
        return EDIT_FAILED
 def _get_work_item_id(task_id: int) -> str:
@@ -50,26 +158,355 @@ def _get_work_item_id(task_id: int) -> str:
        return f"task-{task_id}"
 # --------------------------------------------------------------------------- #
 # Live task tracker
 # --------------------------------------------------------------------------- #
 # Pipeline stages shown in the tracker, in order, with their display label and
 # the agent whose agent_runs rows describe that stage's work. "Ревью БРД" is NOT
 # an agent stage — it is the human approve gate rendered between Analysis and
 # Architecture from the task's brd_review_* timestamps.
 _TRACKER_STAGES = [
    ("analysis", "Analysis", "analyst"),
    ("architecture", "Architecture", "architect"),
    ("development", "Development", "developer"),
    ("review", "Review", "reviewer"),
    ("testing", "Testing", "tester"),
    ("deploy", "Deploy", "deployer"),
 ]
 _BRD_LABEL = "\u0420\u0435\u0432\u044c\u044e \u0411\u0420\u0414"  # "Ревью БРД"
 # Map a pipeline stage -> the agent that is RUNNING while the task sits in it.
 # (development is entered after architecture finishes, etc.) Used to render the
 # "🔄 <Stage> … идёт" line for the currently-active stage.
 _STAGE_ACTIVE_AGENT = {
    "analysis": "analyst",
    "architecture": "architect",
    "development": "developer",
    "review": "reviewer",
    "testing": "tester",
    "deploy": "deployer",
 }
 def _fmt_minutes(seconds) -> str:
    """Render a duration in whole minutes: <=0s -> '0м', 1..59s -> '<1м', else '<n>м'."""
    try:
        seconds = int(seconds or 0)
    except (TypeError, ValueError):
        seconds = 0
    if seconds <= 0:
        return "0м"
    if seconds < 60:
        return "<1м"
    return f"{seconds // 60}\u043c"
 def _parse_sql_ts(ts):
    """Parse a SQLite 'YYYY-MM-DD HH:MM:SS' UTC timestamp -> aware datetime/None."""
    if not ts:
        return None
    from datetime import datetime, timezone
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(str(ts)[:19], fmt).replace(tzinfo=timezone.utc)
        except (ValueError, TypeError):
            continue
    return None
 def _duration_seconds(started, finished):
    """Seconds between two SQL timestamps; None if either is missing/unparseable."""
    a = _parse_sql_ts(started)
    b = _parse_sql_ts(finished)
    if a is None or b is None:
        return None
    return max(int((b - a).total_seconds()), 0)
 def render_task_tracker(task_id: int) -> str:
    """Build the full live-tracker text for a task from the DB (stateless render).
    Pulls the task header (work_item_id, title, stage), every agent_runs row, and
    the BRD-review timestamps, then renders:
      - one '✅ <Stage> <dur> · <in>↓/<out>↑ · <cost> · <model>' line per finished
        stage (latest run per stage),
      - the '⏸️ Ревью БРД <dur> · твоё время[ ⏳]' line between Analysis/Architecture,
      - a '🔄 <Stage> … идёт' line for the active (in-progress) stage,
      - the '💰 <in>↓ / <out>↑ · <cost>' totals,
      - on done: '⏱️ Всего .. · агенты .. · твоё ..' and a '🔗 PR / 📦' line.
    Never raises (returns a minimal fallback string on error).
    """
    from .db import get_db
    from .usage import fmt_tokens, fmt_cost, _input_total, short_model_name
    try:
        conn = get_db()
        task = conn.execute(
            "SELECT id, work_item_id, title, stage, created_at, updated_at, "
            "brd_review_started_at, brd_review_ended_at "
            "FROM tasks WHERE id=?",
            (task_id,),
        ).fetchone()
        if not task:
            conn.close()
            return f"task-{task_id}"
        runs = conn.execute(
            "SELECT agent, started_at, finished_at, exit_code, input_tokens, "
            "output_tokens, cache_read_tokens, cache_creation_tokens, cost_usd, model "
            "FROM agent_runs WHERE task_id=? ORDER BY id ASC",
            (task_id,),
        ).fetchall()
        conn.close()
    except Exception as e:
        logger.warning(f"render_task_tracker({task_id}) DB error: {e}")
        return f"task-{task_id}"
    work_item_id = task["work_item_id"] or f"task-{task_id}"
    title = task["title"] or work_item_id
    stage = task["stage"] or "created"
    done = stage == "done"
    # Latest completed run per agent (a stage may have multiple runs on retry;
    # we show the most recent FINISHED, successful run for the stage line).
    last_done = {}
    agent_runs_by_agent = {}
    for r in runs:
        agent_runs_by_agent.setdefault(r["agent"], []).append(r)
        if r["finished_at"] and (r["exit_code"] == 0 or r["exit_code"] is None):
            last_done[r["agent"]] = r
    # Totals across ALL runs (every input/output token + cost counts).
    total_in = 0
    total_out = 0
    total_cost = 0.0
    agent_seconds = 0
    for r in runs:
        usage = {
            "input_tokens": r["input_tokens"],
            "cache_read_tokens": r["cache_read_tokens"],
            "cache_creation_tokens": r["cache_creation_tokens"],
        }
        total_in += _input_total(usage)
        total_out += int(r["output_tokens"] or 0)
        total_cost += float(r["cost_usd"] or 0.0)
        d = _duration_seconds(r["started_at"], r["finished_at"])
        if d is not None:
            agent_seconds += d
    esc_title = html.escape(title)
    header = (
        f"\U0001f389 {html.escape(work_item_id)} \u00b7 {esc_title} \u2014 \u0413\u041e\u0422\u041e\u0412\u041e"
        if done
        else f"\U0001f6e0\ufe0f {html.escape(work_item_id)} \u00b7 {esc_title}"
    )
    bar = "\u2501" * 22
    lines = [header, bar]
    def _stage_line(label, run):
        usage = {
            "input_tokens": run["input_tokens"],
            "cache_read_tokens": run["cache_read_tokens"],
            "cache_creation_tokens": run["cache_creation_tokens"],
        }
        in_tok = fmt_tokens(_input_total(usage))
        out_tok = fmt_tokens(run["output_tokens"])
        cost = fmt_cost(run["cost_usd"])
        dur = _fmt_minutes(_duration_seconds(run["started_at"], run["finished_at"]))
        model = short_model_name(run["model"])
        model_suffix = f" \u00b7 {model}" if model else ""
        return (
            f"\u2705 {label:<13} {dur} \u00b7 "
            f"{in_tok}\u2193/{out_tok}\u2191 \u00b7 {cost}{model_suffix}"
        )
    # BRD review line: between Analysis and Architecture, only once Analysis has
    # produced a run (i.e. the gate is live). Time = human review delta.
    brd_started = task["brd_review_started_at"]
    brd_ended = task["brd_review_ended_at"]
    review_seconds = _duration_seconds(brd_started, brd_ended)
    for stage_key, label, agent in _TRACKER_STAGES:
        run = last_done.get(agent)
        # The stage is "in progress" only when it is the task's current stage AND
        # there is an unfinished run for its agent (the agent is actually still
        # working). A finished run with no in-flight run -> show the ✅ result,
        # even if the task still sits in that stage (just-finished snapshot).
        agent_runs = agent_runs_by_agent.get(agent, [])
        has_inflight = any(ar["finished_at"] is None for ar in agent_runs)
        is_active_stage = (
            _STAGE_ACTIVE_AGENT.get(stage) == agent
            and stage == stage_key
            and (has_inflight or run is None)
        )
        if is_active_stage:
            # Live "🔄 ... идёт" line. Count how many times THIS stage's
            # agent has run for this task; a 2nd+ run means we're re-doing the
            # stage (e.g. review->development->review), so show "попытка N"
            # to make the text change between cycles and to honestly show Slava
            # the stage is being re-worked.
            attempt = len(agent_runs)
            if attempt >= 2:
                lines.append(
                    f"\U0001f504 {label} \u00b7 \u043f\u043e\u043f\u044b\u0442\u043a\u0430 {attempt} "
                    f"\u2026 \u0438\u0434\u0451\u0442"
                )
            else:
                lines.append(
                    f"\U0001f504 {label:<13} \u2026   \u00b7 \u0438\u0434\u0451\u0442"
                )
        elif run is not None:
            lines.append(_stage_line(label, run))
        # else: not started yet -> not shown.
        # Insert the BRD review line right after Analysis.
        if stage_key == "analysis" and brd_started:
            brd_label = f"{_BRD_LABEL:<13}"
            if review_seconds is not None:
                dur = _fmt_minutes(review_seconds)
                lines.append(
                    f"\u23f8\ufe0f {brd_label} {dur} \u00b7 \u0442\u0432\u043e\u0451 \u0432\u0440\u0435\u043c\u044f"
                )
            else:
                # Still waiting on the human (ended not stamped yet).
                from datetime import datetime, timezone
                start_dt = _parse_sql_ts(brd_started)
                waited = None
                if start_dt is not None:
                    waited = int(
                        (datetime.now(timezone.utc) - start_dt).total_seconds()
                    )
                dur = _fmt_minutes(waited) if waited is not None else "\u2026"
                lines.append(
                    f"\u23f8\ufe0f {brd_label} {dur} \u00b7 \u0442\u0432\u043e\u0451 \u0432\u0440\u0435\u043c\u044f \u23f3"
                )
    lines.append(bar)
    lines.append(
        f"\U0001f4b0 {fmt_tokens(total_in)}\u2193 / {fmt_tokens(total_out)}\u2191 \u00b7 "
        f"{fmt_cost(total_cost)}"
    )
    if done:
        wall = _duration_seconds(task["created_at"], task["updated_at"])
        wall_str = _fmt_minutes(wall) if wall is not None else "?"
        review_str = _fmt_minutes(review_seconds) if review_seconds else "0м"
        lines.append(
            f"\u23f1\ufe0f \u0412\u0441\u0435\u0433\u043e {wall_str} \u00b7 "
            f"\u0430\u0433\u0435\u043d\u0442\u044b {_fmt_minutes(agent_seconds)} \u00b7 "
            f"\u0442\u0432\u043e\u0451 {review_str}"
        )
        link = _done_link(task_id, task["work_item_id"])
        if link:
            lines.append(link)
    return "\n".join(lines)
 def _done_link(task_id: int, work_item_id) -> str | None:
    """Build the final '🔗 PR #n · 📦 deployed' line. Never raises -> None."""
    try:
        from .config import settings
        from .db import get_db
        conn = get_db()
        row = conn.execute(
            "SELECT repo, branch FROM tasks WHERE id=?", (task_id,)
        ).fetchone()
        conn.close()
        if not row:
            return None
        repo, branch = row["repo"], row["branch"]
        pr_part = None
        try:
            owner = settings.gitea_owner
            headers = {"Authorization": f"token {settings.gitea_token}"}
            resp = httpx.get(
                f"{settings.gitea_url}/api/v1/repos/{owner}/{repo}/pulls",
                params={"state": "all", "head": branch},
                headers=headers, timeout=5,
            )
            if resp.status_code == 200:
                prs = resp.json()
                if prs:
                    pr_part = f"\U0001f517 PR #{prs[0].get('number')}"
        except Exception:
            pr_part = None
        parts = []
        if pr_part:
            parts.append(pr_part)
        parts.append("\U0001f4e6 deployed")
        return " \u00b7 ".join(parts)
    except Exception:
        return None
 def update_task_tracker(task_id: int):
    """Render + push the live tracker for a task. Never raises.
    First call (no stored tracker_message_id): sendMessage (silent) and store the
    returned message_id. Subsequent calls: editMessageText the stored message.
    A NEW message is sent ONLY when the original is truly gone (deleted / too old
    / invalid id). On "not modified" (text unchanged) or transient failures
    (network / timeout / 5xx / unknown 400) we do NOT send a new message — that
    is exactly what produced duplicate trackers and orphaned (lagging) messages.
    The tracker is always sent with disable_notification so it never pings —
    only the dedicated alert helpers ping.
    """
    try:
        from .db import get_tracker_message_id, set_tracker_message_id
        text = render_task_tracker(task_id)
        mid = get_tracker_message_id(task_id)
        if mid is not None:
            result = edit_telegram(mid, text)
            if result in (EDIT_OK, EDIT_NOT_MODIFIED):
                # Edited in place (or nothing to change) -> done, no duplicate.
                return
            if result == EDIT_FAILED:
                # Transient -> don't duplicate; tracker redraws next transition.
                logger.debug(
                    f"update_task_tracker({task_id}): edit failed transiently, "
                    f"keeping message {mid}"
                )
                return
            # result == EDIT_GONE -> the stored message is gone; fall through
            # to send a fresh one and re-point tracker_message_id at it.
        new_mid = send_telegram(text, disable_notification=True)
        if new_mid is not None:
            set_tracker_message_id(task_id, new_mid)
    except Exception as e:
        logger.warning(f"update_task_tracker({task_id}) failed: {e}")
 # --------------------------------------------------------------------------- #
 # Stage / agent lifecycle notifications  (now tracker-only, no separate message)
 # --------------------------------------------------------------------------- #
 def notify_stage_change(task_id: int, old_stage: str, new_stage: str, agent: str = None):
-    """Log and notify stage transition."""
+    """Log a stage transition and refresh the live tracker (no separate message)."""
    work_item_id = _get_work_item_id(task_id)
    msg = f"\U0001f504 {work_item_id}: {old_stage} \u2192 {new_stage}"
    if agent:
        msg += f" (\u0437\u0430\u043f\u0443\u0449\u0435\u043d {agent})"
    logger.info(msg)
-    send_telegram(msg)
+    update_task_tracker(task_id)
 def notify_agent_started(run_id: int, agent: str, task_id: int):
-    """Notify agent launch."""
+    """Log an agent launch and refresh the tracker (no separate message)."""
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\U0001f680 {work_item_id}: {agent} \u0437\u0430\u043f\u0443\u0449\u0435\u043d (run_id={run_id})"
-    logger.info(msg)
-    send_telegram(msg)
+    logger.info(f"\U0001f680 {work_item_id}: {agent} \u0437\u0430\u043f\u0443\u0449\u0435\u043d (run_id={run_id})")
+    if task_id:
+        update_task_tracker(task_id)
 def notify_agent_finished(run_id: int, agent: str, exit_code: int, task_id: int = None, duration_s: int = None):
-    """Notify agent completion."""
+    """Log agent completion and refresh the tracker (no separate message).
    The agent-FAILED alert (exit_code != 0) is still sent separately by the
    launcher via send_telegram; this helper itself only logs + refreshes.
    """
    work_item_id = _get_work_item_id(task_id) if task_id else "?"
    if exit_code == 0:
        dur = f" ({duration_s // 60} \u043c\u0438\u043d)" if duration_s else ""
@@ -79,47 +516,66 @@ def notify_agent_finished(run_id: int, agent: str, exit_code: int, task_id: int
    else:
        msg = f"\u274c {work_item_id}: {agent} \u0443\u043f\u0430\u043b (exit_code={exit_code})"
    logger.info(msg)
-    send_telegram(msg)
+    if task_id:
+        update_task_tracker(task_id)
 def notify_qg_result(task_id: int, check: str, passed: bool, reason: str = None):
-    """Notify QG check result."""
+    """Log a QG check result (NO separate Telegram message: QG-pending is noise).
    Kept for callers; QG outcomes are log-only now and reflected by the tracker
    through the resulting stage transition.
    """
    work_item_id = _get_work_item_id(task_id)
    if passed:
-        msg = f"\u2705 {work_item_id}: QG {check} \u2014 passed"
+        logger.info(f"\u2705 {work_item_id}: QG {check} \u2014 passed")
    else:
-        msg = f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}"
+        logger.warning(f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}")
-    logger.info(msg)
-    send_telegram(msg)
 def notify_qg_failure(task_id: int, stage: str, check: str, reason: str):
-    """Log and notify QG check failure."""
+    """Log a QG check failure (log-only).
    QG-pending / QG-failed are NOT pinged as separate messages anymore (they are
    not actionable for Slava). Real rollbacks/deploy-fails are alerted by their
    own dedicated send_telegram calls in the engine/launcher.
    """
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}"
+    logger.warning(f"\u26a0\ufe0f {work_item_id}: QG {check} \u2014 failed: {reason}")
-    logger.warning(msg)
-    send_telegram(msg)
 def notify_approve_requested(task_id: int):
-    """Notify that analyst requests :approved:."""
+    """ALERT (separate, notifying): BRD/TZ/AC ready -> flip Plane to Approved.
    Also starts the BRD-review clock and refreshes the tracker so the
    '⏸️ Ревью БРД · твоё время ⏳' line appears.
    """
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\U0001f4cb {work_item_id}: BRD/\u0422\u0417/AC \u0433\u043e\u0442\u043e\u0432\u044b. \u0416\u0434\u0443 :approved: \u0432 Plane"
+    try:
        from .db import mark_brd_review_started
        mark_brd_review_started(task_id)
    except Exception as e:
        logger.warning(f"notify_approve_requested: brd clock start failed: {e}")
    msg = (
        f"\U0001f4cb {work_item_id}: BRD/\u0422\u0417/AC \u0433\u043e\u0442\u043e\u0432\u044b. "
        f"\u041f\u0435\u0440\u0435\u0432\u0435\u0434\u0438\u0442\u0435 \u0437\u0430\u0434\u0430\u0447\u0443 \u0432 \u0441\u0442\u0430\u0442\u0443\u0441 Approved "
        f"\u0432 Plane \u0434\u043b\u044f \u043f\u0440\u043e\u0434\u043e\u043b\u0436\u0435\u043d\u0438\u044f."
    )
    logger.info(msg)
-    send_telegram(msg)
+    update_task_tracker(task_id)
    send_telegram(msg)  # separate, notifying
 def notify_done(task_id: int):
-    """Notify task completion."""
+    """Task completion: refresh the tracker to its final ГОТОВО form (no separate ping)."""
    work_item_id = _get_work_item_id(task_id)
-    msg = f"\U0001f389 {work_item_id}: \u0437\u0430\u0434\u0430\u0447\u0430 \u0437\u0430\u0432\u0435\u0440\u0448\u0435\u043d\u0430!"
-    logger.info(msg)
-    send_telegram(msg)
+    logger.info(f"\U0001f389 {work_item_id}: \u0437\u0430\u0434\u0430\u0447\u0430 \u0437\u0430\u0432\u0435\u0440\u0448\u0435\u043d\u0430!")
+    update_task_tracker(task_id)
 def notify_error(task_id: int, error: str):
-    """Log and notify error for a task."""
+    """ALERT (separate, notifying): task error."""
    work_item_id = _get_work_item_id(task_id) if task_id else "system"
    msg = f"\U0001f534 {work_item_id}: ERROR \u2014 {error}"
    logger.error(msg)
-    send_telegram(msg)
+    send_telegram(msg)  # separate, notifying
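The send-versus-edit rule described in the update_task_tracker docstring can be condensed into a small decision function. The sketch below is illustrative only: next_action is a hypothetical helper that does not exist in the repository, and the string values of the EDIT_* constants are assumed from the commit message (ok|not_modified|gone|failed).

```python
# Assumed values, taken from the commit message; the real constants live in the notifier module.
EDIT_OK = "ok"
EDIT_NOT_MODIFIED = "not_modified"
EDIT_GONE = "gone"
EDIT_FAILED = "failed"


def next_action(stored_message_id, edit_outcome):
    """Hypothetical helper: return "send" or "keep" for one tracker refresh.

    "send" -> post a new silent tracker message and store its id.
    "keep" -> leave the stored message alone (edited, unchanged, or transient error).
    """
    if stored_message_id is None:
        # First refresh for this task: nothing to edit yet.
        return "send"
    if edit_outcome == EDIT_GONE:
        # The stored message was deleted, is too old, or the id is invalid.
        return "send"
    # ok / not_modified / failed: never spawn a duplicate tracker.
    return "keep"
```

Only the first refresh and a truly gone message lead to a new message; every other outcome keeps the stored message id untouched.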
--- a/src/plane_sync.py
+++ b/src/plane_sync.py
@@ -15,6 +15,44 @@ EMOJI_DONE = "\u2705"           # task completed
 PLANE_BASE = f"{settings.plane_api_url}/api/v1"
 PLANE_HEADERS = {"X-API-Key": settings.plane_api_token}
 WORKSPACE = settings.plane_workspace_slug
 # feat(plane): per-agent comment authorship.
 # Map an agent role -> its dedicated Plane bot token (read from config / env).
 # When the token is present, add_comment() POSTs under that bot so Plane shows
 # the real author. Empty/unknown role -> fallback to the shared orchestrator
 # token (PLANE_HEADERS), so commenting stays autonomous.
 PLANE_BOT_TOKENS = {
    "analyst": settings.plane_bot_analyst,
    "architect": settings.plane_bot_architect,
    "developer": settings.plane_bot_developer,
    "reviewer": settings.plane_bot_reviewer,
    "tester": settings.plane_bot_tester,
    "deployer": settings.plane_bot_deployer,
    "stream": settings.plane_bot_stream,
 }
 # Map a pipeline stage -> the agent role that owns work in that stage. Used to
 # pick an author for rollback/stage notifications targeting a specific stage.
 STAGE_AUTHORS = {
    "analysis": "analyst",
    "architecture": "architect",
    "development": "developer",
    "review": "reviewer",
    "testing": "tester",
    "deploy": "deployer",
 }
 def _headers_for(author: str | None) -> dict:
    """Return X-API-Key headers for the given agent role.
    Falls back to the shared orchestrator token (PLANE_HEADERS /
    settings.plane_api_token) when the role is None, unknown, or its bot token
    is not configured. This keeps comment posting autonomous: a comment is
    always written, just attributed to the orchestrator if no bot is set.
    """
    tok = PLANE_BOT_TOKENS.get(author or "") if author else None
    return {"X-API-Key": tok} if tok else PLANE_HEADERS
 PROJECT_ID = settings.plane_project_id or "7a79f0a9-5278-49cd-9007-9a338f238f9c"
@@ -46,7 +84,12 @@ def _resolve_project_id(work_item_id: str = None, project_id: str = None) -> str
            logger.debug(f"_resolve_project_id fallback for {work_item_id}: {e}")
    return PROJECT_ID
-# Plane state IDs
+# Plane state IDs.
 # TODO(ORCH-10): these UUIDs are PER-PROJECT. The 6 stage-visibility / verdict
 # statuses below were created only in the enduro project (7a79f0a9-...). One
 # project is in prod today, so a single global dict is acceptable. When more
 # projects are onboarded these must be resolved per project (see ORCH-10 in
 # BACKLOG.md / the ORCH-6 project registry) — do NOT hardcode globally then.
 PLANE_STATES = {
    "backlog": "113b24f6-cce8-4be9-9a22-a359b9cf0122",
    "todo": "2c7d3df3-9eb9-419b-92b7-d7d560bcdd10",
@@ -56,16 +99,39 @@ PLANE_STATES = {
    "blocked": "6c4543f9-ac47-4ef7-ae0f-070020dc9920",
    "done": "381a2833-3c4e-4be5-bd0f-be84cb946ad8",
    "cancelled": "b1cae7f9-961d-4889-a179-f3acea697d17",
    # Feature 3 (stage visibility) — per-stage statuses on the board.
    "architecture": "3020bbb7-6122-4663-930c-0315ba8dfa3d",
    "development": "9920609b-f140-4e46-ab95-89acda8412c8",
    "review": "ba0d802c-5218-41d4-ab43-978b0ea123ed",
    "testing": "7855d807-b1bf-42ef-8dae-6cde0df92d02",
    # Feature 2 (verdict statuses) — Approved / Rejected.
    "approved": "a519a341-dada-4a91-8910-7604f82b79c5",
    "rejected": "ba958f3c-5db5-461d-8f82-89425e413b97",
 }
-# Map orchestrator stages to Plane states
+# Feature 3: map an orchestrator stage -> the Plane status to show on the board
 # when the pipeline ENTERS that stage. analysis stays driven by the existing
 # in_progress/in_review/needs_input logic (no dedicated status). deploy keeps
 # in_progress until done. Needs Input / In Review / Blocked remain higher
 # priority and are set explicitly elsewhere — do NOT override them from here.
 STAGE_VISIBILITY_STATE = {
    "architecture": "architecture",
    "development": "development",
    "review": "review",
    "testing": "testing",
 }
 # Map orchestrator stages to Plane states (used by update_issue_state /
 # notify_stage_change). Feature 3: architecture/development/review/testing now
 # point at their dedicated board statuses so the task physically moves across
 # columns. analysis -> in_progress, deploy -> in_progress, done -> done.
 STAGE_TO_STATE = {
    "created": PLANE_STATES["todo"],
    "analysis": PLANE_STATES["in_progress"],
-    "architecture": PLANE_STATES["in_progress"],
+    "architecture": PLANE_STATES["architecture"],
-    "development": PLANE_STATES["in_progress"],
+    "development": PLANE_STATES["development"],
-    "review": PLANE_STATES["in_progress"],
+    "review": PLANE_STATES["review"],
-    "testing": PLANE_STATES["in_progress"],
+    "testing": PLANE_STATES["testing"],
    "deploy": PLANE_STATES["in_progress"],
    "done": PLANE_STATES["done"],
 }
@@ -89,6 +155,84 @@ def fetch_issue_sequence_id(issue_id: str, project_id: str) -> int | None:
        return None
 import re as _re
 def _strip_html(html: str) -> str:
    """Crude HTML -> text: drop tags and collapse whitespace. Good enough to
    feed QG-0's length check when Plane only gives us description_html."""
    if not html:
        return ""
    text = _re.sub(r"<[^>]+>", " ", html)
    return _re.sub(r"\s+", " ", text).strip()
 def fetch_issue_description(issue_id: str, project_id: str) -> str:
    """BUG 1: GET the Plane issue by UUID and return its description text.
    Plane's ``issue.updated`` webhook (e.g. a status change) only carries the
    CHANGED fields, so ``description``/``description_stripped`` are usually
    absent there. start_pipeline calls this to pull the full description from the
    issue detail endpoint so QG-0 does not blow up on an empty payload field.
    Reuses the exact GET issue detail endpoint / shared token already used by
    ``fetch_issue_sequence_id`` (same URL, same PLANE_HEADERS). Prefers
    ``description_stripped``; falls back to stripping ``description_html``.
    Returns "" on network error, non-2xx, or a missing field - never raises, so
    a Plane outage degrades to the honest "empty description" QG-0 path instead
    of crashing the webhook.
    """
    url = f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{project_id}/issues/{issue_id}/"
    try:
        resp = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
        resp.raise_for_status()
        body = resp.json()
        desc = body.get("description_stripped")
        if desc and desc.strip():
            return desc
        return _strip_html(body.get("description_html") or "")
    except Exception as e:
        logger.warning(f"fetch_issue_description failed for {issue_id}: {e}")
        return ""
 def fetch_issue_fields(issue_id: str, project_id: str) -> tuple[str, str]:
    """BUG B: GET the Plane issue by UUID ONCE and return (name, description).
    Plane's ``issue.updated`` webhook (e.g. a status change) only carries the
    CHANGED fields, so BOTH ``name`` and ``description`` are usually absent in
    the payload. start_pipeline needs the real title (for the branch slug) and
    the real description (for the analyst .task.md). To avoid issuing two
    separate issue-detail GETs (one for name, one for description), this single
    request returns both.
    Reuses the exact GET issue detail endpoint / shared token already used by
    ``fetch_issue_sequence_id`` / ``fetch_issue_description``. For the
    description it applies the same logic as ``fetch_issue_description``
    (prefer ``description_stripped``, fall back to stripping
    ``description_html``).
    Returns ("", "") on network error, non-2xx, or missing body - never raises,
    so a Plane outage degrades gracefully (caller keeps its payload fallbacks).
    """
    url = f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{project_id}/issues/{issue_id}/"
    try:
        resp = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
        resp.raise_for_status()
        body = resp.json()
        name = (body.get("name") or "").strip()
        desc = body.get("description_stripped")
        if desc and desc.strip():
            description = desc
        else:
            description = _strip_html(body.get("description_html") or "")
        return name, description
    except Exception as e:
        logger.warning(f"fetch_issue_fields failed for {issue_id}: {e}")
        return "", ""
 def find_issue_id(work_item_id: str, project_id: str = None) -> str | None:
    """Find Plane issue UUID by work_item_id (e.g. 'ET-002')."""
    project_id = _resolve_project_id(work_item_id, project_id)
@@ -159,8 +303,14 @@ def update_issue_state(work_item_id: str, stage: str, project_id: str = None):
        logger.error(f"Failed to update Plane state for {work_item_id}: {e}")
-def add_comment(work_item_id: str, text: str, project_id: str = None):
+def add_comment(work_item_id: str, text: str, project_id: str = None, author: str = None):
-    """Add a comment to Plane issue."""
+    """Add a comment to a Plane issue.
    feat(plane): when ``author`` (an agent role) maps to a configured bot
    token, the comment is POSTed under that bot so Plane shows the real author.
    Otherwise it falls back to the shared orchestrator token (see
    ``_headers_for``). GET/PATCH calls elsewhere keep using PLANE_HEADERS.
    """
    project_id = _resolve_project_id(work_item_id, project_id)
    issue_id = find_issue_id(work_item_id, project_id)
    if not issue_id:
@@ -170,9 +320,9 @@ def add_comment(work_item_id: str, text: str, project_id: str = None):
    url = f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{project_id}/issues/{issue_id}/comments/"
    html = f"<p>{text}</p>"
    try:
-        resp = httpx.post(url, headers=PLANE_HEADERS, json={"comment_html": html}, timeout=10)
+        resp = httpx.post(url, headers=_headers_for(author), json={"comment_html": html}, timeout=10)
        resp.raise_for_status()
-        logger.info(f"Plane: comment added to {work_item_id}")
+        logger.info(f"Plane: comment added to {work_item_id} (author={author or 'orchestrator'})")
    except Exception as e:
        logger.error(f"Failed to add comment to {work_item_id}: {e}")
@@ -193,11 +343,37 @@ def set_issue_blocked(work_item_id: str, project_id: str = None):
    _set_issue_state_direct(work_item_id, PLANE_STATES["blocked"], project_id)
 def set_issue_done(work_item_id: str, project_id: str = None):
    """Observability fix: force the issue into the TERMINAL Done state.
    Used by the deploy->done success path so a completed task always reaches the
    terminal Plane state (it used to stick on In Progress because the merge
    webhook bypassed the stage engine). Uses the existing PLANE_STATES['done']
    UUID — the mapping itself is NOT changed.
    """
    _set_issue_state_direct(work_item_id, PLANE_STATES["done"], project_id)
 def set_issue_in_progress(work_item_id: str, project_id: str = None):
    """Set issue to 'In Progress' state — agent working."""
    _set_issue_state_direct(work_item_id, PLANE_STATES["in_progress"], project_id)
 def set_issue_stage_state(work_item_id: str, stage: str, project_id: str = None):
    """Feature 3: move the issue to the board status for a pipeline stage.
    Only the visible-stage statuses (architecture/development/review/testing)
    are driven here — stages without a dedicated status (analysis/deploy) are a
    no-op so the existing in_progress/in_review/needs_input logic stays in
    charge. By design this does NOT touch Needs Input / In Review / Blocked,
    which are higher priority and set explicitly by their own helpers.
    """
    state_key = STAGE_VISIBILITY_STATE.get(stage)
    if not state_key:
        return
    _set_issue_state_direct(work_item_id, PLANE_STATES[state_key], project_id)
 def _set_issue_state_direct(work_item_id: str, state_id: str, project_id: str = None):
    """Set issue state directly by state_id."""
    project_id = _resolve_project_id(work_item_id, project_id)
@@ -252,16 +428,29 @@ def notify_stage_change(work_item_id: str, old_stage: str, new_stage: str, agent
    except Exception:
        pass
-    add_comment(work_item_id, msg, project_id)
+    # Stage transition is the orchestrator's own voice -> attribute to stream.
    add_comment(work_item_id, msg, project_id, author="stream")
 def notify_qg_failure(work_item_id: str, stage: str, check: str, reason: str, project_id: str = None):
    """Notify Plane about QG failure."""
-    add_comment(work_item_id, f"{EMOJI_QG_FAIL} QG failed at {stage}: {check} — {reason}", project_id)
+    # QG failure belongs to the agent that owns the failing stage.
    add_comment(
        work_item_id,
        f"{EMOJI_QG_FAIL} QG failed at {stage}: {check} — {reason}",
        project_id,
        author=STAGE_AUTHORS.get(stage, "stream"),
    )
 def notify_done(work_item_id: str, project_id: str = None):
    """Mark issue as Done in Plane."""
    project_id = _resolve_project_id(work_item_id, project_id)
    update_issue_state(work_item_id, "done", project_id)
-    add_comment(work_item_id, f"{EMOJI_DONE} Task completed! PR merged and deployed.", project_id)
+    # Deploy finished the task -> attribute the completion comment to Deployer.
    add_comment(
        work_item_id,
        f"{EMOJI_DONE} Task completed! PR merged and deployed.",
        project_id,
        author="deployer",
    )
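The per-agent authorship fallback in _headers_for can be tried in isolation. This is a minimal sketch with made-up token values: SHARED_HEADERS, BOT_TOKENS and headers_for are stand-ins for PLANE_HEADERS, PLANE_BOT_TOKENS and _headers_for, not the project's real configuration.

```python
# Stand-in for PLANE_HEADERS (shared orchestrator token); the value is invented.
SHARED_HEADERS = {"X-API-Key": "shared-orchestrator-token"}

# Stand-in for PLANE_BOT_TOKENS; an empty string models an unconfigured bot.
BOT_TOKENS = {
    "analyst": "analyst-bot-token",
    "deployer": "",
}


def headers_for(author):
    """Return bot headers when the role has a token, else the shared headers."""
    token = BOT_TOKENS.get(author) if author else None
    return {"X-API-Key": token} if token else SHARED_HEADERS
```

A role with a token gets its own header, while None, an unknown role, or an empty token all resolve to the shared headers, so a comment is always posted.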
--- a/src/qg/checks.py
+++ b/src/qg/checks.py
@@ -249,9 +249,17 @@ def check_reviewer_verdict(repo: str, work_item_id: str, branch: str | None = No
 def check_tests_local(repo: str, branch: str) -> tuple[bool, str]:
    """
    DEPRECATED: replaced by check_ci_green on the development stage (CI is now
    configured). Kept for backward-compat; not wired to any stage.
    S-1 fix: run the project test suite locally and judge by exit code, instead of
    depending on Gitea CI (which was not configured at the time -> always false).
    BUG 5 fix: invoke pytest directly instead of make test. make is not installed
    in the orchestrator container, so the previous ["make", "test"] call raised
    FileNotFoundError. This reproduces the Makefile test target 1:1
    (cd src/api && python -m pytest ../../tests/ -v).
    ORCH-2 / S-4: tests run inside the per-branch worktree (ensure_worktree), so this
    is safe for concurrent active tasks — no shared /repos checkout race.
    """
@@ -259,7 +267,8 @@ def check_tests_local(repo: str, branch: str) -> tuple[bool, str]:
    try:
        repo_path = ensure_worktree(repo, branch)
        r = subprocess.run(
-            ["make", "test"], cwd=repo_path,
+            ["python", "-m", "pytest", "../../tests/", "-v"],
+            cwd=os.path.join(repo_path, "src", "api"),
            capture_output=True, text=True, timeout=600,
        )
        if r.returncode == 0:
@@ -272,6 +281,44 @@ def check_tests_local(repo: str, branch: str) -> tuple[bool, str]:
        return False, f"Local test run error: {e}"
 def check_deploy_status(repo: str, work_item_id: str, branch: str | None = None) -> tuple[bool, str]:
    """
    BUG 8 fix: gate the deploy -> done transition on the deployer's machine-readable
    verdict in 14-deploy-log.md frontmatter, NOT on the LLM process exit code
    (which is always 0 on a successful agent session even when the deploy failed).
    Mirrors check_reviewer_verdict (S-5): reads ONLY `deploy_status:` from YAML
    frontmatter. Returns:
      (True, ...)  -> deploy_status: SUCCESS
      (False, ...) -> deploy_status: FAILED, missing field, or no frontmatter
    """
    import yaml
    repo_path = _repo_path(repo, branch)
    log_path = os.path.join(repo_path, f"docs/work-items/{work_item_id}/14-deploy-log.md")
    if not os.path.isfile(log_path):
        return False, "Deploy log not found (14-deploy-log.md)"
    try:
        with open(log_path, "r") as f:
            content = f.read()
        status = None
        if content.startswith("---"):
            parts = content.split("---", 2)
            if len(parts) >= 3:
                try:
                    fm = yaml.safe_load(parts[1]) or {}
                except yaml.YAMLError as e:
                    return False, f"Invalid YAML frontmatter in deploy log: {e}"
                status = str(fm.get("deploy_status", "")).upper().strip()
        if status == "SUCCESS":
            return True, "Deploy status: SUCCESS"
        if status == "FAILED":
            return False, "Deploy status: FAILED"
        return False, f"No machine-readable deploy_status in frontmatter (got: {status!r})"
    except OSError as e:
        return False, f"Error reading deploy log: {e}"
 # Registry for dynamic lookup by name
 QG_CHECKS = {
    "check_analysis_approved": check_analysis_approved,
@@ -282,4 +329,5 @@ QG_CHECKS = {
    "check_tests_passed": check_tests_passed,
    "check_reviewer_verdict": check_reviewer_verdict,
    "check_tests_local": check_tests_local,
    "check_deploy_status": check_deploy_status,
 }
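The frontmatter gate in check_deploy_status comes down to splitting on the first two --- markers and reading one key. The sketch below deliberately swaps yaml.safe_load for a naive key: value line scan so it runs on the standard library alone; read_deploy_status is a hypothetical name, and the scan does not handle nested or multi-line YAML.

```python
def read_deploy_status(content):
    """Return the upper-cased deploy_status from frontmatter, or None if absent.

    Simplification: a plain "key: value" line scan replaces the YAML parser
    that the real check uses.
    """
    if not content.startswith("---"):
        return None  # no frontmatter at all
    parts = content.split("---", 2)
    if len(parts) < 3:
        return None  # opening marker without a closing one
    for line in parts[1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "deploy_status":
            return value.strip().strip("'\"").upper()
    return None  # frontmatter present, field missing


def deploy_gate(content):
    """Pass only on an explicit SUCCESS, mirroring the gate's verdict rule."""
    return read_deploy_status(content) == "SUCCESS"
```

Anything other than an explicit SUCCESS (FAILED, a missing field, or no frontmatter) keeps the gate closed, which matches the return contract listed in the docstring.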
--- a/src/stage_engine.py
+++ b/src/stage_engine.py
@@ -47,6 +47,7 @@ from .plane_sync import (
    set_issue_needs_input,
    set_issue_in_progress,
    set_issue_blocked,
    set_issue_done,
 )
 from .config import settings
@@ -189,36 +190,48 @@ def advance_stage(
        # --- Quality gate ----------------------------------------------------
        if qg_name and qg_name in QG_CHECKS:
-            # Human-approval gate: special analyst approved-flow (launcher only).
+            # Human-approval gate: split by path.
            if qg_name == "check_analysis_approved":
-                _handle_analysis_approved_flow(
-                    task_id, current_stage, repo, work_item_id, branch, agent, result
-                )
-                return result
-            passed, reason = _run_qg(qg_name, repo, work_item_id, branch)
-            result.qg_passed = passed
-            result.qg_reason = reason
-            if not passed:
-                logger.info(
-                    f"Task {task_id}: QG '{qg_name}' not passed after {agent}: {reason}"
-                )
-                # Behaviour parity:
-                #  - webhook path (finished_agent is None): emit the generic
-                #    QG-failure notification, exactly like the old plane handler.
-                #  - launcher path (finished_agent set): NO generic notification;
-                #    the rollback branches below own their own messaging, exactly
-                #    like the old launcher handler.
-                if agent is None:
-                    notify_qg_failure(task_id, current_stage, qg_name, reason)
-                    plane_notify_qg(work_item_id, current_stage, qg_name, reason)
-                _handle_qg_failure_rollbacks(
-                    task_id, current_stage, repo, work_item_id, branch,
-                    agent, qg_name, reason, result,
-                )
-                return result
+                # Launcher path (analyst just finished): set In Review + ask for
+                # the Approved status. This gate never advances on its own -- a
+                # human Approved verdict does that.
+                if agent == "analyst":
+                    _handle_analysis_approved_flow(
+                        task_id, current_stage, repo, work_item_id, branch, agent, result
+                    )
+                    return result
+                # Webhook Approved-verdict path (agent is None): the human flipped
+                # the Plane status to Approved, which IS the approval. The gate is
+                # satisfied -- do NOT re-run check_analysis_approved (it looks for
+                # an :approved: *comment* and would block on a status-only
+                # approval). Mark it passed and fall through to the Advance block.
+                result.qg_name = qg_name
+                result.qg_passed = True
+                result.qg_reason = "approved-via-status"
+            else:
+                passed, reason = _run_qg(qg_name, repo, work_item_id, branch)
+                result.qg_passed = passed
+                result.qg_reason = reason
+                if not passed:
+                    logger.info(
+                        f"Task {task_id}: QG '{qg_name}' not passed after {agent}: {reason}"
+                    )
+                    # Behaviour parity:
+                    #  - webhook path (finished_agent is None): emit the generic
+                    #    QG-failure notification, exactly like the old plane handler.
+                    #  - launcher path (finished_agent set): NO generic notification;
+                    #    the rollback branches below own their own messaging, exactly
+                    #    like the old launcher handler.
+                    if agent is None:
+                        notify_qg_failure(task_id, current_stage, qg_name, reason)
+                        plane_notify_qg(work_item_id, current_stage, qg_name, reason)
+                    _handle_qg_failure_rollbacks(
+                        task_id, current_stage, repo, work_item_id, branch,
+                        agent, qg_name, reason, result,
+                    )
+                    return result
        elif qg_name:
            # QG name set but not registered — do not advance (launcher behavior).
@@ -227,6 +240,15 @@ def advance_stage(
        # --- Advance ---------------------------------------------------------
        update_task_stage(task_id, next_stage)
        # Telegram live tracker: the analysis->architecture advance is the human
        # Approved gate clearing -> stamp the END of "Ревью БРД" (the only
        # human time). Idempotent: only the first stamp counts.
        if current_stage == "analysis" and next_stage == "architecture":
            try:
                from .db import mark_brd_review_ended
                mark_brd_review_ended(task_id)
            except Exception as e:
                logger.warning(f"Task {task_id}: brd review end stamp failed: {e}")
        notify_stage_change(task_id, current_stage, next_stage)
        plane_notify_stage(work_item_id, current_stage, next_stage)
        result.advanced = True
@@ -235,6 +257,22 @@ def advance_stage(
            f"(auto-advance after {agent})"
        )
        # --- Terminal sync: deploy -> done must reach Plane's Done -----------
        # When the deployer's check_deploy_status passes we advance to the
        # terminal 'done' stage. Previously a merged-PR webhook completed the
        # task out-of-band and Plane stuck on In Progress. Now done flows through
        # here, so explicitly drive the Plane issue into the terminal Done state
        # (PLANE_STATES['done'] — mapping unchanged) in addition to the
        # stage-change comment above.
        if next_stage == "done" and work_item_id:
            try:
                set_issue_done(work_item_id)
                logger.info(
                    f"Task {task_id}: deploy->done, Plane state forced to Done"
                )
            except Exception as e:
                logger.error(f"Task {task_id}: failed to set Plane Done: {e}")
        # --- Launch the next agent (ORCH-4 fix: current_stage, not next) -----
        next_agent = get_agent_for_stage(current_stage)
        if next_agent:
@@ -257,6 +295,58 @@ def advance_stage(
        return result
 def _build_analyst_ready_comment(repo: str, work_item_id: str, branch: str) -> str:
    """BUG C: HTML comment posted when analyst artifacts are ready.
    Status-only model (PR #12): approval is the **Approved** status, NOT a
    ``:approved:`` comment and NOT moving back to In Progress. The comment asks
    the stakeholder to flip the status and links the documents the analyst
    actually produced.
    Links point at the Gitea web view:
      {gitea_url}/{owner}/{repo}/src/branch/{branch}/docs/work-items/{wid}/<file>
    Only files that REALLY exist in the worktree are listed (no invented docs).
    """
    text = (
        "\u2705 BRD/\u0422\u0417/AC \u0433\u043e\u0442\u043e\u0432\u044b. "
        "\u0414\u043b\u044f \u043f\u0440\u043e\u0434\u0432\u0438\u0436\u0435\u043d\u0438\u044f "
        "\u043f\u0435\u0440\u0435\u0432\u0435\u0434\u0438\u0442\u0435 \u0437\u0430\u0434\u0430\u0447\u0443 "
        "\u0432 \u0441\u0442\u0430\u0442\u0443\u0441 Approved. "
        "\u0414\u043b\u044f \u043e\u0442\u043a\u043b\u043e\u043d\u0435\u043d\u0438\u044f \u2014 "
        "\u043d\u0430\u043f\u0438\u0448\u0438\u0442\u0435 \u043f\u0440\u0438\u0447\u0438\u043d\u0443 "
        "\u043a\u043e\u043c\u043c\u0435\u043d\u0442\u043e\u043c \u0438 \u043f\u0435\u0440\u0435\u0432\u0435\u0434\u0438\u0442\u0435 "
        "\u0432 Rejected."
    )
    # Candidate analyst artifacts (label -> filename). Only existing ones linked.
    candidates = [
        ("Business request", "00-business-request.md"),
        ("BRD", "01-brd.md"),
        ("\u0422\u0417 (TRZ)", "02-trz.md"),
        ("Acceptance Criteria", "03-acceptance-criteria.md"),
        ("Test Plan", "04-test-plan.yaml"),
        ("UI Test Cases", "04b-ui-test-cases.md"),
    ]
    rel_dir = f"docs/work-items/{work_item_id}"
    try:
        wt_dir = os.path.join(get_worktree_path(repo, branch), rel_dir)
    except Exception:
        wt_dir = None
    owner = getattr(settings, "gitea_owner", "admin")
    base = (getattr(settings, "gitea_public_url", "") or settings.gitea_url).rstrip("/")
    links = []
    for label, fname in candidates:
        if wt_dir and not os.path.isfile(os.path.join(wt_dir, fname)):
            continue
        href = f"{base}/{owner}/{repo}/src/branch/{branch}/{rel_dir}/{fname}"
        links.append(f'<li><a href="{href}">{label}</a></li>')
    if links:
        text += "<br><b>\u0414\u043e\u043a\u0443\u043c\u0435\u043d\u0442\u044b:</b><ul>" + "".join(links) + "</ul>"
    return text
 def _handle_analysis_approved_flow(
    task_id, current_stage, repo, work_item_id, branch, agent, result: AdvanceResult
 ):
@@ -279,18 +369,17 @@ def _handle_analysis_approved_flow(
    files_ok, _ = files_check(repo, work_item_id, branch)
    if files_ok:
-        # Full artifacts ready -> In Review, ask for :approved:.
+        # Full artifacts ready -> In Review, ask for the Approved STATUS (BUG C).
        set_issue_in_review(work_item_id)
        plane_add_comment(
            work_item_id,
-            "\U0001f4cb BRD/\u0422\u0417/AC/TestPlan \u0433\u043e\u0442\u043e\u0432\u044b. "
-            "\u041f\u0440\u043e\u0448\u0443 review \u0438 \u0440\u0435\u0430\u043a\u0446\u0438\u044e :approved: "
-            "\u0434\u043b\u044f \u043f\u0440\u043e\u0434\u0432\u0438\u0436\u0435\u043d\u0438\u044f \u0432 Architecture.",
+            _build_analyst_ready_comment(repo, work_item_id, branch),
+            author="analyst",
        )
        notify_approve_requested(task_id)
        result.note = "analysis-in-review"
        logger.info(
-            f"Task {task_id}: analyst finished, requested :approved: in Plane"
+            f"Task {task_id}: analyst finished, requested Approved status in Plane"
        )
        return
@@ -305,6 +394,7 @@ def _handle_analysis_approved_flow(
        plane_add_comment(
            work_item_id,
            f"\u2753 Analyst \u043d\u0443\u0436\u0434\u0430\u0435\u0442\u0441\u044f \u0432 \u0443\u0442\u043e\u0447\u043d\u0435\u043d\u0438\u0438:\n\n{questions_text}",
            author="analyst",
        )
        send_telegram(
            f"\u2753 {work_item_id}: Analyst \u0437\u0430\u0434\u0430\u0451\u0442 \u0432\u043e\u043f\u0440\u043e\u0441\u044b. \u041e\u0442\u0432\u0435\u0442\u044c \u0432 Plane."
@@ -316,6 +406,7 @@ def _handle_analysis_approved_flow(
    plane_add_comment(
        work_item_id,
        "\u26a0\ufe0f Analyst \u0437\u0430\u0432\u0435\u0440\u0448\u0438\u043b\u0441\u044f \u0431\u0435\u0437 \u0430\u0440\u0442\u0435\u0444\u0430\u043a\u0442\u043e\u0432 \u0438 \u0431\u0435\u0437 \u0432\u043e\u043f\u0440\u043e\u0441\u043e\u0432. \u041f\u0440\u043e\u0432\u0435\u0440\u044c\u0442\u0435 \u043b\u043e\u0433.",
        author="analyst",
    )
    result.note = "analysis-empty"
@@ -370,6 +461,7 @@ def _handle_qg_failure_rollbacks(
            work_item_id,
            f"\u274c \u0422\u0435\u0441\u0442\u044b \u043d\u0435 \u043f\u0440\u043e\u0448\u043b\u0438: {reason}. "
            f"Developer \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430.",
            author="tester",
        )
        retry_count = _developer_retry_count(task_id)
        if retry_count < MAX_DEVELOPER_RETRIES:
@@ -410,6 +502,7 @@ def _handle_qg_failure_rollbacks(
                work_item_id,
                f"\u26a0\ufe0f Architect \u043d\u0430\u0448\u0451\u043b \u043a\u043e\u043d\u0444\u043b\u0438\u043a\u0442 \u0441 \u0422\u0417. "
                f"\u0412\u043e\u0437\u0432\u0440\u0430\u0442 \u0432 Analysis.\n\n{conflict_text}",
                author="architect",
            )
            task_desc = (
                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
@@ -423,3 +516,31 @@ def _handle_qg_failure_rollbacks(
                f"Task {task_id}: architect conflict, enqueued analyst "
                f"(job_id={new_job})"
            )
    # BUG 8: deployer verdict FAILED -> roll deploy back to development.
    # The launcher's exit_code-based guard (launcher.py:475) never fires because
    # the LLM process exit code is always 0; this gate fires on the machine-readable
    # deploy_status verdict in 14-deploy-log.md instead. Mirrors the launcher block
    # (rollback + set_issue_blocked + notify) but is driven by the VERDICT.
    if agent == "deployer" and qg_name == "check_deploy_status":
        update_task_stage(task_id, "development")
        notify_stage_change(task_id, current_stage, "development")
        plane_notify_stage(work_item_id, current_stage, "development")
        result.rolled_back_to = "development"
        set_issue_blocked(work_item_id)
        notify_qg_failure(task_id, "deploy", "check_deploy_status", reason)
        plane_add_comment(
            work_item_id,
            f"\u274c Deploy FAILED ({reason}). Rolled back to development. "
            f"Developer \u043d\u0443\u0436\u0435\u043d \u0434\u043b\u044f \u0444\u0438\u043a\u0441\u0430.",
            author="deployer",
        )
        send_telegram(
            f"\U0001f6a8 {work_item_id}: Deploy FAILED ({reason}). "
            f"Rolled back to development. Needs fix."
        )
        result.alerted = True
        logger.error(
            f"Task {task_id}: deployer verdict FAILED, rolled back deploy -> "
            f"development ({reason})"
        )
--- a/src/stages.py
+++ b/src/stages.py
@@ -13,10 +13,10 @@ STAGE_TRANSITIONS = {
    "created": {"next": "analysis", "agent": "analyst", "qg": None},
    "analysis": {"next": "architecture", "agent": "architect", "qg": "check_analysis_approved"},
    "architecture": {"next": "development", "agent": "developer", "qg": "check_architecture_done"},
-    "development": {"next": "review", "agent": "reviewer", "qg": "check_tests_local"},
+    "development": {"next": "review", "agent": "reviewer", "qg": "check_ci_green"},
    "review": {"next": "testing", "agent": "tester", "qg": "check_reviewer_verdict"},
    "testing": {"next": "deploy", "agent": "deployer", "qg": "check_tests_passed"},
-    "deploy": {"next": "done", "agent": None, "qg": None},
+    "deploy": {"next": "done", "agent": None, "qg": "check_deploy_status"},
    "done": {"next": None, "agent": None, "qg": None},
 }
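Review note: a minimal sketch of how the table reads after this hunk. `walk_pipeline` is a hypothetical helper, not part of the PR, and the table below is a copy of the new-side lines above. It only illustrates that every stage from development through deploy now carries a quality gate.

```python
from __future__ import annotations

# Copy of STAGE_TRANSITIONS as it reads after this hunk.
STAGE_TRANSITIONS = {
    "created": {"next": "analysis", "agent": "analyst", "qg": None},
    "analysis": {"next": "architecture", "agent": "architect", "qg": "check_analysis_approved"},
    "architecture": {"next": "development", "agent": "developer", "qg": "check_architecture_done"},
    "development": {"next": "review", "agent": "reviewer", "qg": "check_ci_green"},
    "review": {"next": "testing", "agent": "tester", "qg": "check_reviewer_verdict"},
    "testing": {"next": "deploy", "agent": "deployer", "qg": "check_tests_passed"},
    "deploy": {"next": "done", "agent": None, "qg": "check_deploy_status"},
    "done": {"next": None, "agent": None, "qg": None},
}


def walk_pipeline(start: str = "created") -> list[tuple[str, str | None]]:
    """Hypothetical helper: list (stage, quality gate) pairs from `start` to the end."""
    path = []
    stage = start
    while stage is not None:
        step = STAGE_TRANSITIONS[stage]
        path.append((stage, step["qg"]))
        stage = step["next"]
    return path


for stage, gate in walk_pipeline("development"):
    print(f"{stage}: {gate}")
```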
--- a/src/usage.py
+++ b/src/usage.py
@@ -0,0 +1,464 @@
 """Feature 4: token / cost accounting for agent runs.
 claude --output-format json emits a single result JSON object at the end of the
 run log with fields:
  total_cost_usd
  usage.input_tokens / output_tokens / cache_read_input_tokens /
       cache_creation_input_tokens
  modelUsage, num_turns, duration_ms
 This module parses that JSON out of a (text-or-json) run log, records the usage
 on the agent_runs row, formats a Plane comment for the finishing agent, and
 builds the per-task summary the Deployer posts on deploy/done.
 Everything here is defensive: a missing/garbled JSON never raises; we record
 NULL/0 and log a warning so a broken agent run can't crash the monitor.
 """
 import json
 import logging
 from .db import get_db
 logger = logging.getLogger("orchestrator.usage")
 def parse_usage_from_text(text: str) -> dict | None:
    """Extract the claude result-JSON usage from a run log's text.
    The log may contain plain text before/after the JSON; with
    --output-format json the JSON is the final object. We scan for the LAST
    top-level '{' ... '}' that parses and carries usage/total_cost_usd.
    Returns a normalised dict
      {input_tokens, output_tokens, cache_read_tokens, cache_creation_tokens,
       cost_usd}
    (ints / float, missing fields -> 0 / 0.0), or None if no usable JSON found.
    """
    if not text:
        return None
    candidate = _extract_last_json_object(text)
    if candidate is None:
        return None
    usage = candidate.get("usage") or {}
    if not isinstance(usage, dict):
        usage = {}
    cost = candidate.get("total_cost_usd")
    if cost is None:
        cost = candidate.get("cost_usd")
    # If there is neither a usage block nor a cost, this isn't a result object.
    if not usage and cost is None:
        return None
    def _int(v):
        try:
            return int(v)
        except (TypeError, ValueError):
            return 0
    def _float(v):
        try:
            return float(v)
        except (TypeError, ValueError):
            return 0.0
    return {
        "input_tokens": _int(usage.get("input_tokens")),
        "output_tokens": _int(usage.get("output_tokens")),
        "cache_read_tokens": _int(
            usage.get("cache_read_input_tokens", usage.get("cache_read_tokens"))
        ),
        # The cache-CREATION slice (writing new cache entries) is part of the
        # REAL input and used to be dropped on the floor. Persist it so the
        # "X in" figure reflects the full prompt size, not just fresh tokens.
        "cache_creation_tokens": _int(
            usage.get("cache_creation_input_tokens", usage.get("cache_creation_tokens"))
        ),
        "cost_usd": _float(cost),
        # Telegram live tracker: the model the run actually used. claude
        # --output-format json reports it under modelUsage (a dict keyed by the
        # full model id) and/or a top-level "model" field. We keep the FULL name
        # here; short_model_name() trims it for the tracker. None when unknown.
        "model": _extract_model(candidate),
    }
 def _extract_model(candidate: dict) -> str | None:
    """Best-effort: pull the model id out of a claude result JSON object.
    Prefers modelUsage (a dict keyed by full model ids, e.g.
    {"claude-opus-4-8": {...}}) and returns the key with the most output
    tokens; falls back to a top-level "model" string. Never raises -> None.
    """
    try:
        mu = candidate.get("modelUsage")
        if isinstance(mu, dict) and mu:
            def _out(v):
                try:
                    return int((v or {}).get("outputTokens", 0))
                except (TypeError, ValueError, AttributeError):
                    return 0
            best = max(mu.items(), key=lambda kv: _out(kv[1]))
            if best and best[0]:
                return str(best[0])
        model = candidate.get("model")
        if isinstance(model, str) and model:
            return model
    except Exception:
        pass
    return None
 def short_model_name(full: str | None) -> str:
    """Trim a full model id to a short tag for the tracker.
    'tokenator/claude-opus-4-8'  -> 'opus-4-8'
    'vibecode/claude-sonnet-4.6' -> 'sonnet-4.6'
    'claude-opus-4-8'            -> 'opus-4-8'
    Returns '' when full is falsy so callers can omit the ' · <model>' suffix.
    """
    if not full:
        return ""
    name = str(full).strip()
    # Drop any provider prefix up to and including the last '/'.
    if "/" in name:
        name = name.rsplit("/", 1)[-1]
    # Drop a leading 'claude-' marketing prefix.
    if name.startswith("claude-"):
        name = name[len("claude-"):]
    return name
 def _extract_last_json_object(text: str) -> dict | None:
    """Return the last balanced top-level JSON object in `text` that parses.
    Scans from the end for '}' and walks back to the matching '{' using a depth
    counter (string-aware), trying json.loads on each candidate. Robust to log
    lines or text emitted before the JSON.
    """
    # Fast path: the whole stripped text is the JSON.
    stripped = text.strip()
    try:
        obj = json.loads(stripped)
        if isinstance(obj, dict):
            return obj
    except (ValueError, TypeError):
        pass
    # Otherwise find the last balanced { ... } block.
    end = len(text)
    while True:
        close = text.rfind("}", 0, end)
        if close == -1:
            return None
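        depth = 0
        in_str = False
        start = None
        for i in range(close, -1, -1):
            ch = text[i]
            if in_str:
                # Walking backwards, a quote ends the string only when it is
                # preceded by an even number of backslashes (i.e. not escaped).
                if ch == '"':
                    j = i - 1
                    while j >= 0 and text[j] == "\\":
                        j -= 1
                    if (i - 1 - j) % 2 == 0:
                        in_str = False
                continue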
            if ch == '"':
                in_str = True
            elif ch == "}":
                depth += 1
            elif ch == "{":
                depth -= 1
                if depth == 0:
                    start = i
                    break
        if start is not None:
            blob = text[start:close + 1]
            try:
                obj = json.loads(blob)
                if isinstance(obj, dict):
                    return obj
            except (ValueError, TypeError):
                pass
        end = close  # keep scanning earlier in the text
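Review note: a backward brace walk has to reason about string escapes in reverse, which is easy to get wrong. One possible alternative, sketched here under the assumption that a forward scan is acceptable for log-sized inputs, is to let the standard library's `json.JSONDecoder.raw_decode` find where each object ends. `last_json_object` is a hypothetical name and is not part of this PR.

```python
from __future__ import annotations

import json


def last_json_object(text: str) -> dict | None:
    """Hypothetical alternative: return the last top-level JSON object in `text`."""
    decoder = json.JSONDecoder()
    found = None
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at `pos` and reports
            # the index just past it, so strings and escapes are handled for us.
            obj, end = decoder.raw_decode(text, pos)
        except ValueError:
            # Not valid JSON from this brace; try the next one.
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            found = obj
        # Skip past the parsed object so nested objects are not revisited.
        pos = text.find("{", end)
    return found
```

The trade-off is that stray braces in log text each cost one failed parse attempt, so this favours correctness over speed on very noisy logs.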
 def parse_usage_from_log(path: str) -> dict | None:
    """Read a run log file and parse usage from it. Never raises."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            return parse_usage_from_text(f.read())
    except OSError as e:
        logger.warning(f"parse_usage_from_log: cannot read {path}: {e}")
        return None
 def record_usage(run_id: int, usage: dict | None):
    """Write parsed usage onto the agent_runs row. NULLs if usage is None."""
    if usage is None:
        logger.warning(f"run_id={run_id}: no usage JSON parsed, recording NULLs")
        usage = {}
    conn = get_db()
    try:
        conn.execute(
            "UPDATE agent_runs SET input_tokens=?, output_tokens=?, "
            "cache_read_tokens=?, cache_creation_tokens=?, cost_usd=?, "
            "model=COALESCE(?, model) WHERE id=?",
            (
                usage.get("input_tokens"),
                usage.get("output_tokens"),
                usage.get("cache_read_tokens"),
                usage.get("cache_creation_tokens"),
                usage.get("cost_usd"),
                usage.get("model"),
                run_id,
            ),
        )
        conn.commit()
    finally:
        conn.close()
 def fmt_tokens(n) -> str:
    """Format a token count compactly: 1234 -> '1.2k', 2_500_000 -> '2.5M'."""
    try:
        n = int(n or 0)
    except (TypeError, ValueError):
        n = 0
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.1f}k"
    return str(n)
 def fmt_cost(c) -> str:
    """Format USD cost with 2 decimals: '$0.21'."""
    try:
        c = float(c or 0.0)
    except (TypeError, ValueError):
        c = 0.0
    return f"${c:.2f}"
 # Pretty agent names for comments (mirrors STAGE_AUTHORS roles).
 AGENT_DISPLAY = {
    "analyst": "Analyst",
    "architect": "Architect",
    "developer": "Developer",
    "reviewer": "Reviewer",
    "tester": "Tester",
    "deployer": "Deployer",
 }
 def _input_total(usage: dict) -> int:
    """FULL input = fresh input + cache-read + cache-creation tokens."""
    def _i(k):
        try:
            return int(usage.get(k) or 0)
        except (TypeError, ValueError):
            return 0
    return _i("input_tokens") + _i("cache_read_tokens") + _i("cache_creation_tokens")
 def _cached_total(usage: dict) -> int:
    """Cached portion of the input = cache-read + cache-creation tokens."""
    def _i(k):
        try:
            return int(usage.get(k) or 0)
        except (TypeError, ValueError):
            return 0
    return _i("cache_read_tokens") + _i("cache_creation_tokens")
 def fmt_in(usage: dict) -> str:
    """Render the input figure as full total with a cached breakdown.
    '8.5M in (8.4M cached)' when there is a cache; '45.2k in' when cached==0.
    """
    total = _input_total(usage)
    cached = _cached_total(usage)
    if cached > 0:
        return f"{fmt_tokens(total)} in ({fmt_tokens(cached)} cached)"
    return f"{fmt_tokens(total)} in"
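Review note: a small worked example of the input figure. The two functions below restate the thresholds and sums from this file inline so the snippet runs without the package; the sample token counts are made up for illustration and do not come from a real run.

```python
def fmt_tokens(n: int) -> str:
    # Same thresholds as fmt_tokens in this file.
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.1f}k"
    return str(n)


def fmt_in(usage: dict) -> str:
    # Cached portion = cache-read + cache-creation; full input adds fresh tokens.
    cached = usage.get("cache_read_tokens", 0) + usage.get("cache_creation_tokens", 0)
    total = usage.get("input_tokens", 0) + cached
    if cached > 0:
        return f"{fmt_tokens(total)} in ({fmt_tokens(cached)} cached)"
    return f"{fmt_tokens(total)} in"


# Illustrative numbers only.
warm = {"input_tokens": 100_000, "cache_read_tokens": 8_000_000, "cache_creation_tokens": 400_000}
cold = {"input_tokens": 45_200}
print(fmt_in(warm))  # 8.5M in (8.4M cached)
print(fmt_in(cold))  # 45.2k in
```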
 def usage_comment(
    agent: str,
    usage: dict | None,
    repo: str | None = None,
    branch: str | None = None,
    work_item_id: str | None = None,
    pr_number=None,
 ) -> str:
    """Build the per-agent finish comment, e.g.
    '\U0001f4bb Developer \u0433\u043e\u0442\u043e\u0432 \u00b7 8.5M in (8.4M cached) / 45.8k out \u00b7 $7.29'.
    When repo/branch/work_item_id are supplied, the agent's artifact link(s) are
    appended (previously only the analyst comment linked its docs). Missing
    artifacts are silently skipped; link building never raises.
    """
    usage = usage or {}
    name = AGENT_DISPLAY.get(agent, agent.capitalize())
    icon = AGENT_ICON.get(agent, "\u2705")
    line = (
        f"{icon} {name} \u0433\u043e\u0442\u043e\u0432 \u00b7 "
        f"{fmt_in(usage)} / "
        f"{fmt_tokens(usage.get('output_tokens'))} out \u00b7 "
        f"{fmt_cost(usage.get('cost_usd'))}"
    )
    links = artifact_links(agent, repo, branch, work_item_id, pr_number)
    if links:
        line += "\n" + "\n".join(links)
    return line
 # Per-agent artifact file under docs/work-items/{wid}/ (architect/developer use
 # special handling for ADR dirs / PR links, see artifact_links()).
 AGENT_ARTIFACT = {
    "reviewer": ("Review", "12-review.md"),
    "tester": ("Test report", "13-test-report.md"),
    "deployer": ("Deploy log", "14-deploy-log.md"),
 }
 def artifact_links(
    agent: str,
    repo: str | None,
    branch: str | None,
    work_item_id: str | None,
    pr_number=None,
 ) -> list[str]:
    """Markdown link(s) to the finishing agent's artifact(s) in Gitea.
    Uses gitea_public_url (falls back to gitea_url) for clickable links, mirroring
    the analyst doc links. Returns [] (never raises) when there is nothing to
    link or the required context is missing. analyst is intentionally NOT handled
    here — its richer doc list lives in stage_engine._build_analyst_ready_comment.
    """
    try:
        from .config import settings
        owner = getattr(settings, "gitea_owner", "admin")
        base = (
            getattr(settings, "gitea_public_url", "") or getattr(settings, "gitea_url", "")
        ).rstrip("/")
        if not base or not repo:
            return []
        links: list[str] = []
        if agent == "developer":
            if branch:
                links.append(
                    f"\U0001f4c2 [Branch {branch}]({base}/{owner}/{repo}/src/branch/{branch})"
                )
            if pr_number:
                links.append(
                    f"\U0001f517 [PR #{pr_number}]({base}/{owner}/{repo}/pulls/{pr_number})"
                )
            return links
        if agent == "architect":
            if branch and work_item_id:
                adr_dir = (
                    f"{base}/{owner}/{repo}/src/branch/{branch}/"
                    f"docs/work-items/{work_item_id}/06-adr"
                )
                links.append(f"\U0001f4d0 [ADR]({adr_dir})")
            return links
        spec = AGENT_ARTIFACT.get(agent)
        if spec and branch and work_item_id:
            label, fname = spec
            href = (
                f"{base}/{owner}/{repo}/src/branch/{branch}/"
                f"docs/work-items/{work_item_id}/{fname}"
            )
            links.append(f"\U0001f4c4 [{label}]({href})")
        return links
    except Exception:
        return []
 AGENT_ICON = {
    "analyst": "\U0001f50d",
    "architect": "\U0001f4d0",
    "developer": "\U0001f4bb",
    "reviewer": "\U0001f50e",
    "tester": "\U0001f9ea",
    "deployer": "\U0001f680",
 }
 def task_usage_summary(task_id: int) -> dict:
    """Aggregate agent_runs usage for a task.
    total_in counts the FULL input (input + cache_read + cache_creation), and
    total_cached counts the cached portion (cache_read + cache_creation).
    COALESCE(...,0) keeps pre-existing rows (NULL cache_creation) from breaking.
    Returns {total_in, total_cached, total_out, total_cost,
             per_agent: [(agent, in, cached, out, cost), ...]}.
    """
    conn = get_db()
    try:
        rows = conn.execute(
            "SELECT agent, "
            "COALESCE(SUM(input_tokens),0) "
            "  + COALESCE(SUM(cache_read_tokens),0) "
            "  + COALESCE(SUM(cache_creation_tokens),0), "
            "COALESCE(SUM(cache_read_tokens),0) "
            "  + COALESCE(SUM(cache_creation_tokens),0), "
            "COALESCE(SUM(output_tokens),0), "
            "COALESCE(SUM(cost_usd),0.0) "
            "FROM agent_runs WHERE task_id=? GROUP BY agent ORDER BY agent",
            (task_id,),
        ).fetchall()
    finally:
        conn.close()
    per_agent = [(r[0], int(r[1]), int(r[2]), int(r[3]), float(r[4])) for r in rows]
    total_in = sum(r[1] for r in per_agent)
    total_cached = sum(r[2] for r in per_agent)
    total_out = sum(r[3] for r in per_agent)
    total_cost = sum(r[4] for r in per_agent)
    return {
        "total_in": total_in,
        "total_cached": total_cached,
        "total_out": total_out,
        "total_cost": total_cost,
        "per_agent": per_agent,
    }
 def task_summary_comment(task_id: int) -> str:
    """Build the Deployer end-of-task summary comment (Feature 4, variant B)."""
    s = task_usage_summary(task_id)
    cached = s.get("total_cached", 0)
    head_in = (
        f"{fmt_tokens(s['total_in'])} \u0432\u0445\u043e\u0434 ({fmt_tokens(cached)} cached)"
        if cached > 0
        else f"{fmt_tokens(s['total_in'])} \u0432\u0445\u043e\u0434"
    )
    lines = [
        f"\U0001f4ca \u0418\u0442\u043e\u0433\u043e \u043f\u043e \u0437\u0430\u0434\u0430\u0447\u0435: "
        f"{head_in} / "
        f"{fmt_tokens(s['total_out'])} \u0432\u044b\u0445\u043e\u0434 \u00b7 "
        f"{fmt_cost(s['total_cost'])}"
    ]
    for agent, ti, tc, to, cost in s["per_agent"]:
        name = AGENT_DISPLAY.get(agent, agent.capitalize())
        in_str = (
            f"{fmt_tokens(ti)} in ({fmt_tokens(tc)} cached)"
            if tc > 0
            else f"{fmt_tokens(ti)} in"
        )
        lines.append(
            f"\u2022 {name}: {in_str} / {fmt_tokens(to)} out \u00b7 {fmt_cost(cost)}"
        )
    return "\n".join(lines)
--- a/src/webhooks/gitea.py
+++ b/src/webhooks/gitea.py
@@ -216,12 +216,31 @@ async def handle_ci_status(payload: dict):
        else:
            notify_qg_failure(task_id, current_stage, "check_ci_green", reason)
-    elif state == "failure":
-        # S-1: Gitea CI is NOT the authoritative gate anymore (the orchestrator runs
-        # tests locally via check_tests_local). Gitea CI is often unconfigured, so a
-        # "failure"/empty status here is not actionable. Log only, do not alert.
-        logger.debug(f"Task {task_id}: Gitea CI state='failure' on branch '{branch}' "
-                     f"(non-authoritative, suppressed — local tests are the gate)")
+    elif state == "failure" and current_stage == "development":
+        # CI is the authoritative gate for development -> review.
+        # On red CI: notify, then bounce the task back to the developer (capped retries),
+        # symmetric to the review REQUEST_CHANGES path.
+        notify_qg_failure(task_id, current_stage, "check_ci_green", f"Gitea CI failed on branch '{branch}'")
+        conn = get_db()
        retry_count = conn.execute(
            "SELECT COUNT(*) as cnt FROM agent_runs WHERE task_id = ? AND agent = 'developer'",
            (task_id,),
        ).fetchone()["cnt"]
        conn.close()
        if retry_count < MAX_DEV_RETRIES:
            # task already on 'development' — no stage change needed, just relaunch developer
            try:
                task_desc = (
                    f"Work item: {work_item_id}\nRepo: {repo_name}\nBranch: {branch}\n"
                    f"Stage: development\nNote: CI failed, fix and re-push (attempt {retry_count + 1}/{MAX_DEV_RETRIES})"
                )
                job_id = enqueue_job("developer", repo_name, task_desc, task_id=task_id)
                logger.info(f"Task {task_id}: CI failed, enqueued developer (attempt {retry_count + 1}, job_id={job_id})")
            except Exception as e:
                notify_error(task_id, f"Failed to relaunch developer after CI failure: {e}")
        else:
            notify_error(task_id, f"Max developer retries ({MAX_DEV_RETRIES}) reached after CI failure, escalating")
            logger.error(f"Task {task_id}: max retries reached after CI failure, needs manual intervention")
 async def handle_pr(payload: dict):
@@ -315,6 +334,20 @@ async def handle_pr(payload: dict):
                logger.error(f"Task {task_id}: max retries reached, needs manual intervention")
    elif action == "closed" and pr.get("merged", False):
        # BUG 8 (second door): at the deploy stage `done` is gated by the
        # deployer's verdict (check_deploy_status via advance_stage), NOT by the
        # fact that the PR was merged. The deployer merges the PR at the START of
        # its run, so a merged webhook arrives ~30s later while the deployer is
        # still working — blindly setting done here would fake-complete the task
        # and discard a later deploy_status: FAILED verdict. advance_stage will
        # drive deploy→done (and Plane→Done) when the deployer job finishes.
        # For every OTHER stage the merge-driven done behaviour is preserved.
        if current_stage == "deploy":
            logger.info(
                f"Task {task_id}: PR merged at deploy stage — done gated by "
                f"deployer verdict (check_deploy_status), ignoring merge-driven done."
            )
            return
        update_task_stage(task_id, "done")
        notify_stage_change(task_id, current_stage, "done")
        logger.info(f"Task {task_id}: PR merged, stage → done")
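Review note: the two webhook decisions in this file can be summarised as pure functions. This is a sketch only, going by what the hunks above show. `ci_failure_action` and `stage_after_merge` are hypothetical names, the returned labels are illustrative, and `max_retries` stands in for `MAX_DEV_RETRIES`, whose value is not visible in this diff.

```python
from __future__ import annotations


def ci_failure_action(current_stage: str, developer_runs: int, max_retries: int) -> str:
    """Hypothetical: what a red CI status leads to, per the handle_ci_status hunk."""
    if current_stage != "development":
        # The new branch only matches failures during development.
        return "ignore"
    if developer_runs < max_retries:
        return "relaunch_developer"
    return "escalate"


def stage_after_merge(current_stage: str) -> str | None:
    """Hypothetical: stage to set on a merged-PR webhook, or None to defer."""
    if current_stage == "deploy":
        # done is gated by the deployer verdict (check_deploy_status).
        return None
    return "done"
```

Keeping such decisions free of database and network calls would make them easy to cover with unit tests, though whether that refactor is worthwhile is a call for the author.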
--- a/src/webhooks/plane.py
+++ b/src/webhooks/plane.py
@@ -13,6 +13,7 @@ from ..db import (
    get_db,
    get_task_by_plane_id,
    get_next_work_item_id,
    ensure_unique_work_item_id,
    update_task_stage,
    enqueue_job,
    insert_event_dedup,
@@ -92,38 +93,264 @@ async def plane_webhook(request: Request):
        return {"status": "ignored", "reason": "unknown project"}
    if (event == "work_item.created") or (event == "issue" and action == "created"):
        # Feature 1: creation NO LONGER starts the pipeline. Slava keeps the
        # backlog until he moves an issue to In Progress. We only run a soft
        # QG-0 sanity log here (no branch, no analyst, no task row).
        await handle_work_item_created(data, project_id)
    elif (event == "work_item.updated") or (event == "issue" and action == "updated"):
        # Status-only verdict model: status changes drive the pipeline.
        #   Backlog/Todo/Triage -> In Progress : START pipeline, or relaunch the
        #                                        stage agent if returned from
        #                                        Needs Input.
        #   -> Approved                         : advance to the next stage.
        #   -> Rejected                         : rollback (reason from latest comment).
        await handle_issue_updated(data, project_id)
    elif (event == "comment.created") or (event == "issue_comment" and action == "created"):
        await handle_comment(data, project_id)
    return {"status": "accepted"}
-async def handle_work_item_created(data: dict, project_id: str = ""):
+def _state_id(data: dict) -> str:
    """Extract the new Plane state UUID from an 'issue updated' payload.
    Real payload (verified from prod events): data.state is
    {id, name, color, group}. Some payloads carry state as a bare UUID string.
    """
-    New work item created in Plane.
-    QG-0: validate title, description, priority.
-    If valid: create branch, init docs, launch analyst.
-    If invalid: comment with what's missing, set Blocked.
+    state = data.get("state")
+    if isinstance(state, dict):
+        return state.get("id", "") or ""
+    if isinstance(state, str):
        return state
    return ""
 async def handle_issue_updated(data: dict, project_id: str = ""):
    """Feature 1 & 2: react to a Plane issue status change.
    Routes the NEW state UUID (data.state.id) to:
      - in_progress  : start the pipeline if this issue has no task yet; if a
        task already exists and the stage agent is idle (returned from Needs
        Input), relaunch the stage agent so it reads Slava's fresh comments.
      - approved     : advance to the next stage.
      - rejected     : rollback to the previous stage (reason from latest comment).
    Any other status (Needs Input, In Review, Blocked, Done, board stages, etc.)
    is ignored here — those are statuses the orchestrator itself sets.
    """
    from ..plane_sync import PLANE_STATES
    plane_id = str(data.get("id") or "")
    new_state = _state_id(data)
    if not plane_id or not new_state:
        logger.info("issue updated without id/state, ignoring")
        return
    if new_state == PLANE_STATES["in_progress"]:
        await handle_status_start(data, project_id)
    elif new_state == PLANE_STATES["approved"]:
        await handle_verdict(data, project_id, approved=True)
    elif new_state == PLANE_STATES["rejected"]:
        await handle_verdict(data, project_id, approved=False)
    else:
        logger.info(f"issue {plane_id} updated to state {new_state[:8]}..., no pipeline action")
 async def handle_status_start(data: dict, project_id: str = ""):
    """An issue moved into In Progress.
    Two cases under the status-only verdict model:
      1. No task yet for this plane_id  -> START the pipeline (start_pipeline).
      2. A task already exists          -> this is Slava returning the issue from
         Needs Input to In Progress after answering the analyst's questions. We
         must RELAUNCH the current stage's agent so it reads the fresh comments
         from Plane (the answer-to-questions flow used to live in handle_comment;
         it is now status-driven).
    KEY FORK — telling "answer to questions" apart from a plain duplicate In
    Progress webhook (the dedup-protection case):
      The tasks table stores no Plane status, and the issue.updated payload only
      carries the NEW state (In Progress), so we cannot read the previous status
      from here. Instead we use the only reliable local signal: whether the
      stage's agent is currently in flight.
      - The orchestrator sets In Progress itself while an agent runs. When the
        agent FINISHES it leaves the issue in Needs Input or In Review and has
        NO queued/running job. So: an existing task with NO active job means the
        agent is idle / waiting -> a return to In Progress is a genuine relaunch
        request -> enqueue the stage agent.
      - If a queued/running job already exists for the task, the agent is busy
        (or a duplicate webhook arrived) -> SKIP (no double launch). The events
        de-dup at the top of plane_webhook already absorbs identical webhook
        bodies; this job guard additionally covers distinct webhooks fired while
        a job is still pending/running.
    """
    from ..db import has_active_job_for_task
    plane_id = str(data.get("id") or "")
    existing = get_task_by_plane_id(plane_id)
    if not existing:
        logger.info(f"Status->In Progress for {plane_id}: starting pipeline")
        await start_pipeline(data, project_id)
        return
    task_id = existing["id"]
    current_stage = existing["stage"]
    repo = existing["repo"]
    work_item_id = existing.get("work_item_id", "")
    branch = existing.get("branch", "")
    # Duplicate / busy guard: a job is already pending or running for this task.
    if has_active_job_for_task(task_id):
        logger.info(
            f"Status->In Progress for {plane_id}: task {task_id} already has an "
            f"active job (stage={current_stage}), not relaunching"
        )
        return
    # Agent is idle -> Slava answered questions and returned the issue to In
    # Progress. Relaunch the current stage's agent to read the fresh comments.
    from ..plane_sync import STAGE_AUTHORS, add_comment as _add_comment
    stage_agent = STAGE_AUTHORS.get(current_stage)
    if not stage_agent:
        logger.info(
            f"Status->In Progress for {plane_id}: no agent for stage "
            f"'{current_stage}', not relaunching"
        )
        return
    task_desc = (
        f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
        f"Stage: {current_stage}\nNote: Stakeholder returned the issue to In "
        f"Progress (answered your questions). Read the latest comments in Plane "
        f"and revise your artifacts."
    )
    job_id = enqueue_job(stage_agent, repo, task_desc, task_id=task_id)
    logger.info(
        f"Task {task_id}: returned to In Progress (Needs Input answered), "
        f"relaunched {stage_agent} for stage {current_stage} (job_id={job_id})"
    )
    try:
        _add_comment(
            work_item_id,
            "\U0001f504 \u0410\u0433\u0435\u043d\u0442 \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d \u0441 \u043e\u0442\u0432\u0435\u0442\u0430\u043c\u0438 \u0441\u0442\u0435\u0439\u043a\u0445\u043e\u043b\u0434\u0435\u0440\u0430.",
            author=stage_agent,
        )
    except Exception as e:
        logger.error(f"Failed to post relaunch comment for {work_item_id}: {e}")
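Review note: the branching described in the handle_status_start docstring can be restated as a pure decision function, which makes the three outcomes easy to unit-test without a database or Plane. This is only a sketch under assumptions: `StartAction` and `decide_status_start` are hypothetical names that do not exist in the repository, and the three arguments stand in for `get_task_by_plane_id`, `has_active_job_for_task` and the `STAGE_AUTHORS` lookup.

```python
from enum import Enum
from typing import Optional


class StartAction(Enum):
    """Hypothetical outcome labels for an In Progress status webhook."""
    START_PIPELINE = "start_pipeline"
    SKIP = "skip"
    RELAUNCH = "relaunch"


def decide_status_start(
    task_exists: bool, has_active_job: bool, stage_agent: Optional[str]
) -> StartAction:
    """Follow the same check order as handle_status_start, without side effects."""
    if not task_exists:
        # Case 1: first move into In Progress, nothing exists yet.
        return StartAction.START_PIPELINE
    if has_active_job:
        # Agent is busy, or a distinct duplicate webhook arrived: no double launch.
        return StartAction.SKIP
    if not stage_agent:
        # The current stage has no owning agent to relaunch.
        return StartAction.SKIP
    # Idle agent + return to In Progress: Needs Input was answered.
    return StartAction.RELAUNCH
```

The order matters: the active-job check comes before the agent lookup, so a busy task is skipped regardless of which stage it is in.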
 async def handle_verdict(data: dict, project_id: str, approved: bool):
    """Status-only verdict: a Plane status change drives advance / rollback.
    Approved status -> _try_advance_stage. We do NOT touch the issue status here:
    _try_advance_stage -> advance_stage -> plane_notify_stage already PATCHes the
    issue to the NEXT stage's status. The old set_issue_in_progress call reset
    the status to In Progress first, which made the board flicker In Progress
    before the next stage (part of bug 3); it is removed.
    Rejected status -> rollback to the previous stage. The reason is pulled from
    the issue's latest comment (Slava writes the reason in a comment before/with
    flipping the status to Rejected).
    """
    plane_id = str(data.get("id") or "")
    task = get_task_by_plane_id(plane_id)
    if not task:
        logger.warning(f"Verdict status for {plane_id} but no task found, ignoring")
        return
    task_id = task["id"]
    current_stage = task["stage"]
    repo = task["repo"]
    work_item_id = task.get("work_item_id", "")
    branch = task.get("branch", "")
    if approved:
        # NOTE: no set_issue_in_progress here — _try_advance_stage sets the next
        # stage's status itself (advance_stage -> plane_notify_stage).
        logger.info(f"Task {task_id}: Approved status -> advance from {current_stage}")
        await _try_advance_stage(task_id, current_stage, repo, work_item_id, branch)
        return
    # Rejected: pull the rejection reason from the issue's latest comment.
    issue_id = task.get("plane_issue_id") or task.get("plane_id") or plane_id
    reason = _latest_comment_reason(issue_id, repo, project_id)
    await _rollback_stage(
        task_id, current_stage, repo, work_item_id, branch, reason
    )
 def _latest_comment_reason(issue_id: str, repo: str, project_id: str = "") -> str:
    """Fetch the issue's most recent comment text (HTML stripped) as the reject
    reason. Slava writes the reason in a comment before/with flipping the status
    to Rejected.
    Returns a fixed fallback when there is no comment / the API call fails.
    """
    from ..plane_sync import (
        PLANE_BASE,
        PLANE_HEADERS,
        WORKSPACE,
        PROJECT_ID as _DEFAULT_PROJECT_ID,
    )
    fallback = "Rejected via status, no reason comment"
    if not issue_id:
        return fallback
    _proj = get_project_by_repo(repo)
    pid = _proj.plane_project_id if _proj else (project_id or _DEFAULT_PROJECT_ID)
    url = (
        f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{pid}/issues/"
        f"{issue_id}/comments/"
    )
    try:
        resp = httpx.get(url, headers=PLANE_HEADERS, timeout=10)
        if resp.status_code != 200:
            logger.warning(
                f"reject-reason: GET comments for {issue_id} returned "
                f"{resp.status_code}"
            )
            return fallback
        payload = resp.json()
        comments = payload.get("results", payload) if isinstance(payload, dict) else payload
        if not comments:
            return fallback
        latest = max(comments, key=lambda c: c.get("created_at", "") or "")
        raw = (
            latest.get("comment_stripped")
            or latest.get("comment_html")
            or latest.get("comment")
            or ""
        )
        text = re.sub(r"<[^>]+>", "", raw).strip()
        return text[:300] if text else fallback
    except Exception as e:
        logger.error(f"reject-reason: failed to fetch comments for {issue_id}: {e}")
        return fallback
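Review note: the comment-selection part of _latest_comment_reason can be exercised without HTTP. The sketch below copies only the payload handling (latest by `created_at`, tag stripping, 300-character cap, fixed fallback); `pick_reject_reason` and `FALLBACK` are hypothetical names introduced for illustration, and the sample payloads in the usage note are made up.

```python
import re

# Same fallback text as in the diff above.
FALLBACK = "Rejected via status, no reason comment"


def pick_reject_reason(payload, limit=300):
    """Return the newest comment's text with HTML tags removed, or FALLBACK."""
    # Accept either a paginated dict ({"results": [...]}) or a bare list.
    comments = payload.get("results", payload) if isinstance(payload, dict) else payload
    if not comments:
        return FALLBACK
    # Timestamps are compared as strings; missing/None values sort first.
    latest = max(comments, key=lambda c: c.get("created_at", "") or "")
    raw = (
        latest.get("comment_stripped")
        or latest.get("comment_html")
        or latest.get("comment")
        or ""
    )
    text = re.sub(r"<[^>]+>", "", raw).strip()
    return text[:limit] if text else FALLBACK
```

One caveat worth keeping in mind: picking the maximum by string comparison is only chronological when every `created_at` uses the same timestamp format and timezone, which the code assumes rather than checks.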
 async def handle_work_item_created(data: dict, project_id: str = ""):
    """Feature 1: creation does NOT start the pipeline anymore.
    The pipeline is started when Slava moves the issue into In Progress
    (handle_status_start -> start_pipeline). On creation we only run a SOFT QG-0
    sanity check and log the result — NO branch, NO docs, NO analyst, NO task row
    — so the issue can sit in the backlog until Slava is ready.
    """
    plane_id = data.get("id", "")
    name = data.get("name", "untitled")
    description = data.get("description_stripped", data.get("description", ""))
-    priority = data.get("priority", {})
+    errors = _qg0_errors(name, description)
-    priority_name = priority if isinstance(priority, str) else priority.get("name", "")
+    if errors:
        logger.info(f"work_item.created {plane_id}: soft QG-0 warnings: {errors}")
    else:
        logger.info(f"work_item.created {plane_id} ('{name}'): in backlog, awaiting In Progress")
    # ORCH-6: resolve repo / prefix / Plane project from the registry instead of
    # the single hardcoded default_repo.
    if not project_id:
        project_id = data.get("project") or data.get("project_id") or ""
    proj = get_project_by_plane_id(project_id)
    if not proj:
        logger.warning(f"handle_work_item_created: unknown project '{project_id}', ignoring {plane_id}")
        return
    repo = proj.repo
    plane_project_id = proj.plane_project_id
-    # QG-0 validation
+def _qg0_errors(name: str, description: str) -> list:
    """QG-0 validation: returns a list of human-readable problems (empty = OK)."""
    errors = []
    if not name or len(name) < 5:
        errors.append("Title \u0441\u043b\u0438\u0448\u043a\u043e\u043c \u043a\u043e\u0440\u043e\u0442\u043a\u0438\u0439 (\u043d\u0443\u0436\u043d\u043e >= 5 \u0441\u0438\u043c\u0432\u043e\u043b\u043e\u0432)")
@@ -132,6 +359,66 @@ async def handle_work_item_created(data: dict, project_id: str = ""):
    if not description or len(description.strip()) < 20:
        errors.append("Description \u0441\u043b\u0438\u0448\u043a\u043e\u043c \u043a\u043e\u0440\u043e\u0442\u043a\u0438\u0439 (\u043d\u0443\u0436\u043d\u043e >= 20 \u0441\u0438\u043c\u0432\u043e\u043b\u043e\u0432)")
    return errors
 async def start_pipeline(data: dict, project_id: str = ""):
    """Feature 1: start the pipeline for an issue (moved to In Progress).
    This is the body extracted from the old handle_work_item_created: resolve the
    project, run QG-0 (hard — blocks on failure), create the work item id +
    branch + initial docs, insert the task row, and enqueue the analyst.
    Callers (handle_status_start) already guarantee no existing task for this
    plane_id, so this never duplicates.
    """
    plane_id = data.get("id", "")
    name = data.get("name", "untitled")
    description = data.get("description_stripped", data.get("description", ""))
    # ORCH-6: resolve repo / prefix / Plane project from the registry instead of
    # the single hardcoded default_repo.
    if not project_id:
        project_id = data.get("project") or data.get("project_id") or ""
    proj = get_project_by_plane_id(project_id)
    if not proj:
        logger.warning(f"start_pipeline: unknown project '{project_id}', ignoring {plane_id}")
        return
    repo = proj.repo
    plane_project_id = proj.plane_project_id
    # BUG 1 + BUG B: Plane's issue.updated webhook (status change -> In Progress)
    # sends only the CHANGED fields, so BOTH description / description_stripped
    # AND name are usually empty here even though the issue HAS them. Pull the
    # full title + description from the Plane issue detail API in a SINGLE GET
    # (fetch_issue_fields: same endpoint + shared token already used by
    # fetch_issue_sequence_id) before QG-0 and before the branch slug is built.
    # If the API is also empty, QG-0 legitimately fails (truly empty ticket) and
    # name falls back to "untitled".
    name_missing = (not name) or name.strip().lower() == "untitled" or len(name.strip()) < 3
    desc_missing = (not description) or len(description.strip()) < 20
    if name_missing or desc_missing:
        from ..plane_sync import fetch_issue_fields
        fetched_name, fetched_desc = fetch_issue_fields(plane_id, plane_project_id)
        if desc_missing and fetched_desc and len(fetched_desc.strip()) >= len((description or "").strip()):
            description = fetched_desc
            logger.info(
                f"start_pipeline: pulled description from Plane API for {plane_id} "
                f"({len(description.strip())} chars)"
            )
        if name_missing and fetched_name and len(fetched_name.strip()) >= 3:
            name = fetched_name
            logger.info(
                f"start_pipeline: pulled name from Plane API for {plane_id} "
                f"('{name}')"
            )
    # BUG B fallback: if name is still empty/blank after the API pull, keep the
    # legacy "untitled" so the slug/branch build never crashes on an empty name.
    if not name or not name.strip():
        name = "untitled"
    # QG-0 validation (hard gate on pipeline start)
    errors = _qg0_errors(name, description)
    if errors:
        # QG-0 failed
        error_text = "\u26a0\ufe0f QG-0 failed:\n" + "\n".join(f"\u2022 {e}" for e in errors)
@@ -169,15 +456,47 @@ async def handle_work_item_created(data: dict, project_id: str = ""):
            f"fell back to DB increment: {work_item_id}"
        )
    # BUG 2a: uniqueness-guard LAYERED ON TOP of the M-6 derive above (the derive
    # itself is untouched). If the derived ET-NNN is already taken by another
    # task in this repo (collision -> two tasks would share branch/worktree, see
    # ET-006), bump to the next free number.
    _derived = work_item_id
    work_item_id = ensure_unique_work_item_id(work_item_id, repo)
    if work_item_id != _derived:
        logger.warning(
            f"work_item_id collision: derived {_derived} already in use for "
            f"{repo}; reassigned {plane_id} -> {work_item_id}"
        )
    # Create slug from name
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")[:30]
    branch = f"feature/{work_item_id}-{slug}"
    # BUG 2b (defense-in-depth): the worktree/path is keyed by BRANCH
    # (git_worktree.get_worktree_path) and tasks are reverse-resolved by
    # (repo, branch). With 2a the work_item_id is unique, so the branch prefix is
    # too; but the slug could still collide (e.g. two issues with the same title
    # under different ids -> fine) or, worse, an identical branch could already exist.
    # Guard physically: if this exact branch is already owned by another task in
    # this repo, disambiguate with the (now unique) work_item_id so two tasks can
    # never share a worktree.
    _conn_b = get_db()
    _branch_taken = _conn_b.execute(
        "SELECT 1 FROM tasks WHERE repo = ? AND branch = ? LIMIT 1", (repo, branch)
    ).fetchone()
    _conn_b.close()
    if _branch_taken is not None:
        branch = f"feature/{work_item_id}-{plane_id[:8]}"
        logger.warning(
            f"branch collision for {repo}; disambiguated to unique branch {branch}"
        )
    # Insert task into DB
    conn = get_db()
    conn.execute(
-        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) VALUES (?, ?, ?, ?, ?, ?)",
+        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id, title) "
-        (plane_id, work_item_id, repo, branch, "analysis", plane_id),
+        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (plane_id, work_item_id, repo, branch, "analysis", plane_id, name),
    )
    conn.commit()
    conn.close()
@@ -204,133 +523,104 @@ async def handle_work_item_created(data: dict, project_id: str = ""):
        task_row = get_db().execute("SELECT id FROM tasks WHERE work_item_id=?", (work_item_id,)).fetchone()
        if task_row:
            task_id = task_row[0]
-            task_desc = f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\nStage: analysis\nTitle: {name}"
+            task_desc = (
                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
                f"Stage: analysis\nTitle: {name}\n\nDescription:\n{description}"
            )
            job_id = enqueue_job("analyst", repo, task_desc, task_id=task_id)
            logger.info(f"Task {task_id}: enqueued analyst (job_id={job_id})")
            # Post start comment to Plane
            from ..plane_sync import add_comment as _add_comment
-            _add_comment(work_item_id, "\U0001f50d Analyst \u0437\u0430\u043f\u0443\u0449\u0435\u043d. BRD/\u0422\u0417/AC/TestPlan \u0432 \u0440\u0430\u0431\u043e\u0442\u0435 (\u043e\u0436\u0438\u0434\u0430\u0439\u0442\u0435 8-15 \u043c\u0438\u043d).")
+            _add_comment(work_item_id, "\U0001f50d Analyst \u0437\u0430\u043f\u0443\u0449\u0435\u043d. BRD/\u0422\u0417/AC/TestPlan \u0432 \u0440\u0430\u0431\u043e\u0442\u0435 (\u043e\u0436\u0438\u0434\u0430\u0439\u0442\u0435 8-15 \u043c\u0438\u043d).", author="analyst")
    except Exception as e:
        logger.error(f"Failed to launch analyst for {work_item_id}: {e}")
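Review note: the slug and branch-collision logic in start_pipeline is easy to check in isolation. The sketch below reproduces the same regex and the same fallback to the first 8 characters of the Plane id; `build_branch` is a hypothetical helper, and the `taken_branches` set is an assumption that stands in for the `SELECT 1 FROM tasks WHERE repo = ? AND branch = ?` lookup.

```python
import re


def build_branch(work_item_id, name, plane_id, taken_branches):
    """Build feature/<id>-<slug>; fall back to the Plane id prefix on collision."""
    # Lowercase, collapse every run of non [a-z0-9] characters to "-", trim, cap at 30.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")[:30]
    branch = f"feature/{work_item_id}-{slug}"
    if branch in taken_branches:
        # Same disambiguation as the diff: unique work_item_id + Plane id prefix.
        branch = f"feature/{work_item_id}-{plane_id[:8]}"
    return branch
```

A side effect of the ASCII-only character class is that a title with no Latin letters or digits (for example a purely Cyrillic one) yields an empty slug, so the branch ends in a trailing hyphen. Whether that matters in practice depends on how titles are written in Plane.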
 async def handle_comment(data: dict, project_id: str = ""):
    """Status-only verdict model: comments NEVER drive the pipeline.
    The whole comment-based control mechanism (``:approved:`` / ``:rejected:``
    and the analysis answer-to-questions flow) was removed. It caused bug 3
    (echo self-hit): the analyst posts its own "waiting for approval" comment,
    handle_comment catches its own comment and reverts In Review -> In Progress.
    Comments are now logged only — no status change, no enqueue, no side effect.
    The pipeline is driven solely by status changes (handle_issue_updated):
      - Approved  -> advance
      - Rejected  -> rollback (reason pulled from the latest comment)
      - In Progress (returned from Needs Input) -> relaunch the stage agent
    """
-    Handle comment event — check for :approved: or :rejected:.
+    plane_id = str(
-    Advance or rollback stage accordingly.
+        data.get("work_item_id") or data.get("issue_id") or data.get("issue") or ""
    )
    logger.info(
        f"comment.created for {plane_id}: logged only, no pipeline action "
        f"(status-only verdict model)"
    )
 async def _rollback_stage(
    task_id: int, current_stage: str, repo: str, work_item_id: str, branch: str,
    reason: str,
 ):
    """Rollback triggered by a status change to Rejected.
      - at analysis: relaunch the analyst with the rejection reason;
      - otherwise: roll back to the previous stage and relaunch its agent
        (via the existing rollback notify + an enqueue of the prev-stage agent).
    """
-    comment_body = data.get("comment_stripped", data.get("comment", data.get("body", data.get("comment_html", ""))))
+    if current_stage == "analysis":
-    plane_id = str(data.get("work_item_id") or data.get("issue_id") or data.get("issue") or "")
+        # Already in analysis — just relaunch analyst with rejection reason
    if not plane_id:
        logger.warning("Comment event without work_item_id, skipping")
        return
    task = get_task_by_plane_id(plane_id)
    if not task:
        logger.warning(f"No task found for plane_id={plane_id}")
        return
    task_id = task["id"]
    current_stage = task["stage"]
    repo = task["repo"]
    work_item_id = task.get("work_item_id", "")
    branch = task.get("branch", "")
    if ":rejected:" in comment_body:
        # Extract reason (text after :rejected:)
        reason = comment_body.split(":rejected:", 1)[-1].strip()[:300]
        if current_stage == "analysis":
            # Already in analysis — just relaunch analyst with rejection reason
            from ..plane_sync import set_issue_in_progress
            set_issue_in_progress(work_item_id)
            task_desc = (
                f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
                f"Stage: analysis\nNote: Stakeholder REJECTED your artifacts. "
                f"Reason: {reason}\nRevise and improve."
            )
            new_job = enqueue_job("analyst", repo, task_desc, task_id=task_id)
            from ..plane_sync import add_comment as _plane_comment
            _plane_comment(work_item_id, f"\U0001f504 Analyst \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d. \u041f\u0440\u0438\u0447\u0438\u043d\u0430 \u043e\u0442\u043a\u043b\u043e\u043d\u0435\u043d\u0438\u044f: {reason}")
            logger.info(f"Task {task_id}: rejected at analysis, enqueued analyst (job_id={new_job})")
        else:
            # Rollback to previous stage
            prev_stage = get_previous_stage(current_stage)
            if prev_stage:
                update_task_stage(task_id, prev_stage)
                from ..plane_sync import set_issue_in_progress
                set_issue_in_progress(work_item_id)
                notify_stage_change(task_id, current_stage, prev_stage)
                plane_notify_stage(work_item_id, current_stage, prev_stage)
                from ..plane_sync import add_comment as _plane_comment
                _plane_comment(work_item_id, f"\U0001f504 \u041e\u0442\u043a\u0430\u0442: {current_stage} \u2192 {prev_stage}. \u041f\u0440\u0438\u0447\u0438\u043d\u0430: {reason}")
                logger.info(f"Task {task_id}: rejected, rolled back {current_stage} \u2192 {prev_stage}")
        return
    if ":approved:" in comment_body:
        from ..plane_sync import set_issue_in_progress
        set_issue_in_progress(work_item_id)
-        # Try to advance stage
+        task_desc = (
-        await _try_advance_stage(task_id, current_stage, repo, work_item_id, branch)
+            f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
            f"Stage: analysis\nNote: Stakeholder REJECTED your artifacts. "
            f"Reason: {reason}\nRevise and improve."
        )
        new_job = enqueue_job("analyst", repo, task_desc, task_id=task_id)
        from ..plane_sync import add_comment as _plane_comment
        _plane_comment(work_item_id, f"\U0001f504 Analyst \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d. \u041f\u0440\u0438\u0447\u0438\u043d\u0430 \u043e\u0442\u043a\u043b\u043e\u043d\u0435\u043d\u0438\u044f: {reason}", author="analyst")
        logger.info(f"Task {task_id}: rejected at analysis, enqueued analyst (job_id={new_job})")
        return
-    # Task 3: If neither :approved: nor :rejected: — check if this is an answer to questions
+    # Rollback to previous stage
-    if current_stage == "analysis":
+    prev_stage = get_previous_stage(current_stage)
-        from ..plane_sync import PLANE_STATES, set_issue_in_progress
+    if not prev_stage:
-        issue_id = task.get("plane_issue_id") or task.get("plane_id")
+        logger.info(f"Task {task_id}: rejected at {current_stage} but no previous stage")
-        if not issue_id:
+        return
-            issue_id = plane_id
+    update_task_stage(task_id, prev_stage)
-        if issue_id:
+    notify_stage_change(task_id, current_stage, prev_stage)
-            from ..plane_sync import PLANE_BASE, PLANE_HEADERS, WORKSPACE
+    # Feature 3: plane_notify_stage moves the board to the prev stage's status.
-            from ..plane_sync import PROJECT_ID as _DEFAULT_PROJECT_ID
+    plane_notify_stage(work_item_id, current_stage, prev_stage)
-            # ORCH-6: route to this task's own Plane project (resolved from repo).
+    # Then put it back to In Progress so the relaunched agent is clearly working.
-            _proj = get_project_by_repo(repo)
+    from ..plane_sync import set_issue_in_progress
-            _pid = _proj.plane_project_id if _proj else (project_id or _DEFAULT_PROJECT_ID)
+    set_issue_in_progress(work_item_id)
-            import httpx as _httpx
+    from ..plane_sync import add_comment as _plane_comment, STAGE_AUTHORS
-            try:
+    _plane_comment(
-                _resp = _httpx.get(
+        work_item_id,
-                    f"{PLANE_BASE}/workspaces/{WORKSPACE}/projects/{_pid}/issues/{issue_id}/",
+        f"\U0001f504 \u041e\u0442\u043a\u0430\u0442: {current_stage} \u2192 {prev_stage}. \u041f\u0440\u0438\u0447\u0438\u043d\u0430: {reason}",
-                    headers=PLANE_HEADERS, timeout=10
+        author=STAGE_AUTHORS.get(prev_stage, "stream"),
-                )
+    )
-                if _resp.status_code == 200:
+    # Relaunch the previous stage's agent so the rollback actually re-runs work.
-                    issue_data = _resp.json()
+    # STAGE_AUTHORS maps a stage directly to the role that OWNS work in it
-                    if issue_data.get("state") == PLANE_STATES["needs_input"]:
+    # (analysis->analyst, architecture->architect, ...), which is exactly the
-                        # Task 11: Check analyst retry count (max 3 question rounds)
+    # agent we must re-run on a rollback into prev_stage.
-                        conn3 = get_db()
+    from ..plane_sync import STAGE_AUTHORS as _STAGE_AUTHORS
-                        analyst_runs = conn3.execute(
+    prev_agent = _STAGE_AUTHORS.get(prev_stage)
-                            "SELECT COUNT(*) FROM agent_runs WHERE task_id=? AND agent='analyst'",
+    if prev_agent:
-                            (task_id,)
+        task_desc = (
-                        ).fetchone()[0]
+            f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
-                        conn3.close()
+            f"Stage: {prev_stage}\nNote: Stakeholder REJECTED. Reason: {reason}\n"
-
+            f"Revise and improve."
-                        if analyst_runs >= 4:  # initial + 3 retries
+        )
-                            from ..plane_sync import set_issue_blocked, add_comment as _pc
+        new_job = enqueue_job(prev_agent, repo, task_desc, task_id=task_id)
-                            set_issue_blocked(work_item_id)
+        logger.info(
-                            _pc(
+            f"Task {task_id}: rejected, rolled back {current_stage} \u2192 {prev_stage}, "
-                                work_item_id,
+            f"enqueued {prev_agent} (job_id={new_job})"
-                                "\U0001f6a8 3 \u0440\u0430\u0443\u043d\u0434\u0430 \u0443\u0442\u043e\u0447\u043d\u0435\u043d\u0438\u0439 \u0438\u0441\u0447\u0435\u0440\u043f\u0430\u043d\u044b. Analyst \u043d\u0435 \u043c\u043e\u0436\u0435\u0442 \u0441\u0444\u043e\u0440\u043c\u0438\u0440\u043e\u0432\u0430\u0442\u044c \u0422\u0417. "
+        )
-                                "\u0422\u0440\u0435\u0431\u0443\u0435\u0442\u0441\u044f \u0431\u043e\u043b\u0435\u0435 \u0434\u0435\u0442\u0430\u043b\u044c\u043d\u043e\u0435 \u043e\u043f\u0438\u0441\u0430\u043d\u0438\u0435 \u0438\u043b\u0438 \u0432\u0441\u0442\u0440\u0435\u0447\u0430."
+    else:
-                            )
+        logger.info(f"Task {task_id}: rejected, rolled back {current_stage} \u2192 {prev_stage}")
                            from ..notifications import send_telegram
                            send_telegram(f"\U0001f6a8 {work_item_id}: 3 \u0440\u0430\u0443\u043d\u0434\u0430 \u0432\u043e\u043f\u0440\u043e\u0441\u043e\u0432 analyst'\u0430 \u0438\u0441\u0447\u0435\u0440\u043f\u0430\u043d\u044b. \u041d\u0443\u0436\u043d\u0430 \u043f\u043e\u043c\u043e\u0449\u044c.")
                            return
                        # This is an answer to analyst's questions — relaunch
                        set_issue_in_progress(work_item_id)
                        task_desc = (
                            f"Work item: {work_item_id}\nRepo: {repo}\nBranch: {branch}\n"
                            f"Stage: analysis\nNote: Stakeholder answered your questions. "
                            f"Read the latest comment in Plane and revise your artifacts.\n"
                            f"Answer: {comment_body[:500]}"
                        )
                        new_job = enqueue_job("analyst", repo, task_desc, task_id=task_id)
                        from ..plane_sync import add_comment as _pc2
                        _pc2(work_item_id, "\U0001f504 Analyst \u043f\u0435\u0440\u0435\u0437\u0430\u043f\u0443\u0449\u0435\u043d \u0441 \u043e\u0442\u0432\u0435\u0442\u0430\u043c\u0438 \u0441\u0442\u0435\u0439\u043a\u0445\u043e\u043b\u0434\u0435\u0440\u0430.")
                        logger.info(f"Task {task_id}: stakeholder answered questions, enqueued analyst (job_id={new_job})")
                        return
            except Exception as e:
                logger.error(f"Failed to check issue state: {e}")
 async def _try_advance_stage(
@@ -343,10 +633,10 @@ async def _try_advance_stage(
    is synchronous. We run it off the event loop via asyncio.to_thread so there
    is exactly one implementation shared with the launcher.
-    finished_agent is None on this webhook path (a human :approved: comment, not
+    finished_agent is None on this webhook path (a human Approved status change,
-    a finished agent), so the agent-specific rollback branches inside the engine
+    not a finished agent), so the agent-specific rollback branches inside the
-    intentionally do not trigger — identical to the old plane behavior, which
+    engine intentionally do not trigger — the webhook path only runs the QG and
-    only ran the QG and either advanced or reported the failure.
+    either advances or reports the failure.
    """
    import asyncio
    from ..stage_engine import advance_stage
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -0,0 +1,40 @@
 """Global pytest fixtures.
 test(conftest): mute Telegram in ALL tests to stop prod leakage.
 Background: a pytest run on prod was sending REAL Telegram messages to Slava,
 because some tests (e.g. test_webhook_dedup advancing a stage) reach
 notify_stage_change -> send_telegram, which reads the live .env
 telegram_bot_token/chat_id and actually POSTs to Telegram.
 This autouse fixture stubs send_telegram to a no-op for every test:
  - "src.notifications.send_telegram" is the SOURCE. All the notify_* helpers in
    notifications.py call the module-global send_telegram, and every other module
    that does a *local* `from .notifications import send_telegram` inside a
    function resolves it live at call time -> covered by patching the source.
  - "src.stage_engine.send_telegram" is patched too, because stage_engine binds
    send_telegram as a MODULE-LEVEL name (from .notifications import send_telegram
    at import), so a patch of the source alone would not intercept its 3 direct
    calls. webhooks/plane and launcher import it locally inside functions, so the
    source patch already covers them; they are patched defensively with
    raising=False anyway in case that ever changes.
 raising=False so a module that doesn't (yet) expose the name never breaks setup.
 """
 import pytest
@pytest.fixture(autouse=True)
 def _no_telegram(monkeypatch):
    _noop = lambda *a, **k: None  # noqa: E731
    # Source of truth (covers notifications.notify_* and all local re-imports).
    monkeypatch.setattr("src.notifications.send_telegram", _noop, raising=False)
    # Module-level binding in stage_engine (and defensive coverage elsewhere).
    monkeypatch.setattr("src.stage_engine.send_telegram", _noop, raising=False)
    monkeypatch.setattr("src.webhooks.plane.send_telegram", _noop, raising=False)
    monkeypatch.setattr("src.agents.launcher.send_telegram", _noop, raising=False)
    monkeypatch.setattr("src.queue_worker.send_telegram", _noop, raising=False)
    yield
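Review note: the reason the fixture patches `src.stage_engine.send_telegram` in addition to the source comes down to how Python binds imported names. The sketch below reproduces the effect with two throwaway module objects; `notifications` and `stage_engine` here are hypothetical stand-ins built with `types.ModuleType`, not the real project modules, and `run_binding_demo` is an illustrative name.

```python
import types


def run_binding_demo():
    """Return what was 'sent' after patching the source only, then both names."""
    sent = []

    # Hypothetical stand-in for src.notifications.
    notifications = types.ModuleType("notifications")
    notifications.send_telegram = lambda text: sent.append(text)
    # notify_* style helper: resolves send_telegram on the module at call time.
    notifications.notify = lambda text: notifications.send_telegram(text)

    # Hypothetical stand-in for src.stage_engine. Copying the attribute mimics
    # a module-level `from .notifications import send_telegram`.
    stage_engine = types.ModuleType("stage_engine")
    stage_engine.send_telegram = notifications.send_telegram
    stage_engine.advance = lambda text: stage_engine.send_telegram(text)

    noop = lambda *a, **k: None  # noqa: E731

    # Patch the source only: the helper is muted, the copied binding is not.
    notifications.send_telegram = noop
    notifications.notify("via helper")
    stage_engine.advance("via stage_engine")
    after_source_patch = list(sent)

    # Patch the importing module as well: now both paths are muted.
    stage_engine.send_telegram = noop
    stage_engine.advance("via stage_engine again")
    after_both_patches = list(sent)

    return after_source_patch, after_both_patches
```

After the source-only patch the message routed through the copied binding still gets through, which is the leak the conftest docstring describes; only the second patch silences it.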
--- a/tests/test_analyst_comment.py
+++ b/tests/test_analyst_comment.py
@@ -0,0 +1,74 @@
 """BUG C: analyst "artifacts ready" comment under the status-only model.
 The comment must ask for the **Approved** status (not the obsolete
 ":approved:" reaction, not moving back to "In Progress") and link only the
 docs that actually exist in the worktree.
 """
 import os
 import tempfile
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 def test_analyst_comment_asks_approved_with_links(monkeypatch, tmp_path):
    from src import stage_engine as SE
    # Worktree with only SOME of the candidate docs present.
    wt = tmp_path / "wt"
    docs = wt / "docs" / "work-items" / "ET-011"
    docs.mkdir(parents=True)
    for fname in ("00-business-request.md", "01-brd.md", "02-trz.md",
                  "03-acceptance-criteria.md", "04-test-plan.yaml"):
        (docs / fname).write_text("x")
    # 04b-ui-test-cases.md intentionally absent -> must NOT be linked
    monkeypatch.setattr(SE, "get_worktree_path", lambda repo, branch: str(wt))
    # public URL set -> links must be built from it (not gitea_url)
    monkeypatch.setattr(SE.settings, "gitea_url", "http://localhost:3000")
    monkeypatch.setattr(SE.settings, "gitea_public_url", "https://git.mva154.duckdns.org")
    monkeypatch.setattr(SE.settings, "gitea_owner", "admin")
    html = SE._build_analyst_ready_comment(
        "enduro-trails", "ET-011", "feature/ET-011-gpx-upload-feature"
    )
    # text asks for the Approved STATUS, not the obsolete mechanisms
    assert "Approved" in html
    assert ":approved:" not in html
    assert "In Progress" not in html
    assert "Rejected" in html
    # clickable links to docs that ACTUALLY exist
    assert "<a href=" in html
    base = ("https://git.mva154.duckdns.org/admin/enduro-trails/src/branch/"
            "feature/ET-011-gpx-upload-feature/docs/work-items/ET-011/")
    assert base + "01-brd.md" in html
    assert base + "04-test-plan.yaml" in html
    # the missing file is NOT invented
    assert "04b-ui-test-cases.md" not in html
    # internal git url must NOT appear in clickable links
    assert "localhost:3000" not in html
 def test_analyst_comment_falls_back_to_gitea_url(monkeypatch, tmp_path):
    """When gitea_public_url is empty, links fall back to gitea_url."""
    from src import stage_engine as SE
    wt = tmp_path / "wt"
    docs = wt / "docs" / "work-items" / "ET-011"
    docs.mkdir(parents=True)
    (docs / "01-brd.md").write_text("x")
    monkeypatch.setattr(SE, "get_worktree_path", lambda repo, branch: str(wt))
    monkeypatch.setattr(SE.settings, "gitea_url", "http://localhost:3000")
    monkeypatch.setattr(SE.settings, "gitea_public_url", "")
    monkeypatch.setattr(SE.settings, "gitea_owner", "admin")
    html = SE._build_analyst_ready_comment(
        "enduro-trails", "ET-011", "feature/ET-011-gpx-upload-feature"
    )
    base = ("http://localhost:3000/admin/enduro-trails/src/branch/"
            "feature/ET-011-gpx-upload-feature/docs/work-items/ET-011/")
    assert base + "01-brd.md" in html
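Note on the two tests above: they pin down a fallback rule (prefer the public URL, otherwise the internal one) and an existence rule (link only files that are really in the worktree). A minimal sketch of that behaviour follows. `build_doc_links` and `DOC_NAMES` are hypothetical names made up for illustration; this is not the body of `_build_analyst_ready_comment`.

```python
from pathlib import Path

# Hypothetical list of candidate docs; the real comment builder may use another set.
DOC_NAMES = (
    "01-brd.md",
    "02-trz.md",
    "03-acceptance-criteria.md",
    "04-test-plan.yaml",
    "04b-ui-test-cases.md",
)


def build_doc_links(worktree, repo, branch, work_item_id, owner,
                    gitea_url, gitea_public_url=""):
    """Return <a> tags for the docs that exist in the worktree."""
    # An empty public URL falls back to the internal Gitea URL.
    host = (gitea_public_url or gitea_url).rstrip("/")
    base = (f"{host}/{owner}/{repo}/src/branch/{branch}"
            f"/docs/work-items/{work_item_id}/")
    docs_dir = Path(worktree) / "docs" / "work-items" / work_item_id
    # Only files that are actually present get a link; missing ones are skipped.
    return [
        f'<a href="{base}{name}">{name}</a>'
        for name in DOC_NAMES
        if (docs_dir / name).is_file()
    ]
```

Checking the filesystem before emitting each link is what keeps a missing `04b-ui-test-cases.md` out of the comment.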
--- a/tests/test_m6_sequence.py
+++ b/tests/test_m6_sequence.py
@@ -102,16 +102,22 @@ def test_fetch_sequence_id_missing_field_returns_none():
 # handle_work_item_created: seq available -> prefix-NNN
 # ---------------------------------------------------------------------------
 # Feature 1: pipeline starts on a status change to In Progress, not on creation.
 _IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
 def _post(plane_id, plane_project_id=ORCH_PLANE_ID, name="A valid work item title"):
    return client.post(
        "/webhook/plane",
        json={
-            "event": "work_item.created",
+            "event": "issue",
            "action": "updated",
            "data": {
                "id": plane_id,
                "name": name,
                "description_stripped": "This is a sufficiently long description.",
                "project": plane_project_id,
                "state": {"id": _IN_PROGRESS, "name": "In Progress", "group": "started"},
            },
        },
    )
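Note on the helper above: it now posts an `issue` / `updated` event whose state is In Progress, because the pipeline starts on a status change rather than on creation. One way such a trigger could be recognised is sketched below. `is_pipeline_start` is a hypothetical name, and the real handler may also inspect the `activity` block; only the payload shape and the state UUID are taken from the tests.

```python
# State UUID as used by the tests in this diff.
IN_PROGRESS_STATE_ID = "b873d9eb-993c-48cd-97ac-99a9b1623967"


def is_pipeline_start(payload):
    """Return True when the webhook payload is an update into In Progress."""
    if payload.get("event") != "issue" or payload.get("action") != "updated":
        return False
    # Tolerate payloads where "data" or "state" is missing or null.
    state = (payload.get("data") or {}).get("state") or {}
    return state.get("id") == IN_PROGRESS_STATE_ID
```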
--- a/tests/test_pipeline_start_bugs.py
+++ b/tests/test_pipeline_start_bugs.py
@@ -0,0 +1,213 @@
 """Tests for the two pipeline-start bugs surfaced by the ET-006 live run.
 BUG 1: issue.updated (status -> In Progress) ships a payload WITHOUT the
       description, so start_pipeline must pull it from the Plane issue API
       before QG-0 runs (otherwise QG-0 wrongly blocks the issue).
 BUG 2a: M-6 derives work_item_id from the Plane sequence_id, which can collide.
        ensure_unique_work_item_id() must hand out the next FREE id instead of
        reusing one that is already in the tasks table.
 BUG 2b: two tasks with an (artificially) identical work_item_id must not share a
        branch/worktree.
 The launcher, Gitea, and Plane network calls are mocked. The BUG 1 and BUG 2a
 end-to-end paths go through the real FastAPI endpoint via TestClient.
 """
 import os
 import tempfile
 _test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_pipeline_bugs.db")
 os.environ["ORCH_DB_PATH"] = _test_db
 os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 import pytest  # noqa: E402
 from unittest.mock import patch, AsyncMock  # noqa: E402
 from fastapi.testclient import TestClient  # noqa: E402
 from src.main import app  # noqa: E402
 from src.db import init_db, get_db, ensure_unique_work_item_id  # noqa: E402
 from src import projects as P  # noqa: E402
 from src.projects import reload_projects  # noqa: E402
 from src.git_worktree import get_worktree_path  # noqa: E402
 ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
 IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
 BACKLOG = "113b24f6-cce8-4be9-9a22-a359b9cf0122"
 client = TestClient(app)
@pytest.fixture(autouse=True)
 def setup(monkeypatch):
    monkeypatch.setattr(P.settings, "db_path", _test_db)
    import src.db as _db
    monkeypatch.setattr(_db.settings, "db_path", _test_db)
    if os.path.exists(_test_db):
        os.unlink(_test_db)
    init_db()
    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
    registry_json = (
        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
    )
    monkeypatch.setattr(P.settings, "projects_json", registry_json)
    reload_projects()
    yield
    reload_projects()
    if os.path.exists(_test_db):
        os.unlink(_test_db)
 def _insert_task(work_item_id, branch, plane_id="x"):
    conn = get_db()
    conn.execute(
        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (plane_id, work_item_id, "enduro-trails", branch, "analysis", plane_id),
    )
    conn.commit()
    conn.close()
 def _count(plane_id):
    conn = get_db()
    n = conn.execute("SELECT COUNT(*) FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()[0]
    conn.close()
    return n
 def _task(plane_id):
    conn = get_db()
    row = conn.execute("SELECT * FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()
    conn.close()
    return row
 # --------------------------------------------------------------------------- #
 # BUG 1
 # --------------------------------------------------------------------------- #
 def _to_in_progress_no_desc(plane_id="bug1"):
    """issue.updated payload WITHOUT description (only changed fields)."""
    return client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": plane_id, "name": "A valid backlog item title",
            # NO description / description_stripped here, exactly like Plane sends
            # on a status change.
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
    })
@patch("src.webhooks.plane.enqueue_job", return_value=1)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=42)
@patch("src.plane_sync.fetch_issue_fields",
       return_value=("A valid backlog item title",
                     "This is a sufficiently long description fetched from Plane API."))
 def test_status_start_fetches_description(
    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
 ):
    """BUG 1: empty description in payload -> start_pipeline pulls it from the
    Plane API (single fetch_issue_fields GET) -> QG-0 passes -> task created +
    analyst enqueued (NOT blocked)."""
    resp = _to_in_progress_no_desc("bug1")
    assert resp.status_code == 200
    # name + description were pulled from the API in one call
    mock_fields.assert_called_once()
    # QG-0 passed -> task created and analyst launched (NOT set_issue_blocked)
    assert _count("bug1") == 1
    assert _task("bug1")["stage"] == "analysis"
    mock_enqueue.assert_called_once()
    assert mock_enqueue.call_args.args[0] == "analyst"
@patch("src.webhooks.plane.enqueue_job", return_value=1)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=42)
@patch("src.plane_sync.fetch_issue_fields", return_value=("", ""))
 def test_status_start_empty_api_still_blocks(
    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
 ):
    """BUG 1 negative path: if the API also returns empty, QG-0 legitimately
    fails -> NO task is created (truly empty ticket)."""
    resp = _to_in_progress_no_desc("bug1-empty")
    assert resp.status_code == 200
    mock_fields.assert_called_once()
    assert _count("bug1-empty") == 0
    mock_enqueue.assert_not_called()
 # --------------------------------------------------------------------------- #
 # BUG 2a
 # --------------------------------------------------------------------------- #
 def test_work_item_id_uniqueness():
    """BUG 2a: if ET-006 is already in tasks, the guard returns the next free
    id (ET-007), not ET-006 again."""
    _insert_task("ET-006", "feature/ET-006-gpx-upload", plane_id="old")
    assert ensure_unique_work_item_id("ET-006", "enduro-trails") == "ET-007"
    # ET-006 AND ET-007 taken -> next free is ET-008.
    _insert_task("ET-007", "feature/ET-007-something", plane_id="old2")
    assert ensure_unique_work_item_id("ET-006", "enduro-trails") == "ET-008"
    # A free id is returned unchanged.
    assert ensure_unique_work_item_id("ET-099", "enduro-trails") == "ET-099"
    # Per-repo isolation: a different repo with the same id is not a collision.
    assert ensure_unique_work_item_id("ET-006", "other-repo") == "ET-006"
@patch("src.webhooks.plane.enqueue_job", return_value=1)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=6)
@patch("src.plane_sync.fetch_issue_fields",
       return_value=("Popup enduro trails feature",
                     "A sufficiently long description for QG-0 to pass cleanly."))
 def test_collision_reassigns_in_start_pipeline(
    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
 ):
    """BUG 2a end-to-end: ET-006 already exists -> a new In Progress issue whose
    Plane sequence_id is also 6 must NOT reuse ET-006."""
    _insert_task("ET-006", "feature/ET-006-gpx-upload", plane_id="task8")
    resp = client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": "task25", "name": "Popup enduro trails feature",
            "description_stripped": "A sufficiently long description for QG-0.",
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
    })
    assert resp.status_code == 200
    new_id = _task("task25")["work_item_id"]
    assert new_id != "ET-006"
    assert new_id == "ET-007"
 # --------------------------------------------------------------------------- #
 # BUG 2b
 # --------------------------------------------------------------------------- #
 def test_worktree_per_task():
    """BUG 2b: two tasks must not resolve to the same worktree path. With the
    uniqueness guard the branches differ, so the worktree paths differ too."""
    _insert_task("ET-006", "feature/ET-006-gpx-upload", plane_id="task8")
    # The second task gets a unique id via the guard...
    new_id = ensure_unique_work_item_id("ET-006", "enduro-trails")
    assert new_id == "ET-007"
    branch_a = "feature/ET-006-gpx-upload"
    branch_b = f"feature/{new_id}-popup-enduro-trails"
    wt_a = get_worktree_path("enduro-trails", branch_a)
    wt_b = get_worktree_path("enduro-trails", branch_b)
    assert wt_a != wt_b, "two tasks must not share a worktree path"
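Note on BUG 2a: the "next FREE id" rule the tests assert can be sketched without a database. `next_free_work_item_id` is a hypothetical stand-in, not the real `ensure_unique_work_item_id`. It assumes ids look like `PREFIX-NNN` and that the caller passes only the ids already taken in the same repo, which is how per-repo isolation would be preserved.

```python
import re


def next_free_work_item_id(candidate, taken):
    """Return candidate if it is free, else the next unused id with the same prefix."""
    match = re.fullmatch(r"([A-Z]+)-(\d+)", candidate)
    if match is None:
        raise ValueError(f"unexpected work item id: {candidate!r}")
    prefix, digits = match.group(1), match.group(2)
    number, width = int(digits), len(digits)
    # Walk forward until an id that is not in the taken set is found,
    # keeping the zero-padding width of the original id.
    while f"{prefix}-{number:0{width}d}" in taken:
        number += 1
    return f"{prefix}-{number:0{width}d}"
```

Because the second task receives a different id, its branch name and therefore its worktree path differ too, which is the property BUG 2b checks.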
--- a/tests/test_plane_author.py
+++ b/tests/test_plane_author.py
@@ -0,0 +1,99 @@
 """Tests for per-agent Plane comment authorship (feat: per-agent bot author).
 Covers:
  * _headers_for: role -> bot token; None/unknown/empty token -> shared fallback.
  * add_comment: author is propagated into the POST headers; no author keeps
    backward-compatible behaviour (shared orchestrator token).
 GET/PATCH calls are intentionally NOT covered here: they stay on the shared
 token by design and are unchanged by this feature.
 """
 import os
 # Set env defaults before importing app modules (same convention as the other
 # suites) so config/settings load cleanly without a real .env.
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "shared-token")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 from unittest.mock import patch, MagicMock  # noqa: E402
 from src import plane_sync  # noqa: E402
 # --------------------------------------------------------------------------- #
 # _headers_for
 # --------------------------------------------------------------------------- #
 def test_headers_for_known_role_uses_bot_token():
    """A known role with a configured token -> that bot's X-API-Key."""
    with patch.dict(plane_sync.PLANE_BOT_TOKENS, {"analyst": "analyst-tok"}, clear=False):
        assert plane_sync._headers_for("analyst") == {"X-API-Key": "analyst-tok"}
 def test_headers_for_none_falls_back_to_shared():
    """author=None -> shared orchestrator headers."""
    assert plane_sync._headers_for(None) is plane_sync.PLANE_HEADERS
 def test_headers_for_unknown_role_falls_back_to_shared():
    """Unknown role -> shared orchestrator headers."""
    assert plane_sync._headers_for("nope") is plane_sync.PLANE_HEADERS
 def test_headers_for_empty_token_falls_back_to_shared():
    """Known role but empty/unconfigured token -> shared orchestrator headers."""
    with patch.dict(plane_sync.PLANE_BOT_TOKENS, {"tester": ""}, clear=False):
        assert plane_sync._headers_for("tester") is plane_sync.PLANE_HEADERS
 def test_headers_for_empty_string_author_falls_back_to_shared():
    """author='' -> shared orchestrator headers."""
    assert plane_sync._headers_for("") is plane_sync.PLANE_HEADERS
 # --------------------------------------------------------------------------- #
 # add_comment
 # --------------------------------------------------------------------------- #
 def _mock_post_ok():
    resp = MagicMock()
    resp.raise_for_status.return_value = None
    return resp
 def test_add_comment_with_author_posts_with_bot_headers():
    """add_comment(author='developer') -> httpx.post called with the developer
    bot's X-API-Key header."""
    with patch.object(plane_sync, "find_issue_id", return_value="issue-uuid"), \
         patch.object(plane_sync, "_resolve_project_id", return_value="proj-uuid"), \
         patch.dict(plane_sync.PLANE_BOT_TOKENS, {"developer": "dev-tok"}, clear=False), \
         patch.object(plane_sync.httpx, "post", return_value=_mock_post_ok()) as mock_post:
        plane_sync.add_comment("ET-001", "hello", author="developer")
    assert mock_post.called
    _, kwargs = mock_post.call_args
    assert kwargs["headers"] == {"X-API-Key": "dev-tok"}
 def test_add_comment_without_author_uses_shared_token():
    """add_comment without author -> shared orchestrator headers (backward
    compatible)."""
    with patch.object(plane_sync, "find_issue_id", return_value="issue-uuid"), \
         patch.object(plane_sync, "_resolve_project_id", return_value="proj-uuid"), \
         patch.object(plane_sync.httpx, "post", return_value=_mock_post_ok()) as mock_post:
        plane_sync.add_comment("ET-001", "hello")
    assert mock_post.called
    _, kwargs = mock_post.call_args
    assert kwargs["headers"] is plane_sync.PLANE_HEADERS
 def test_add_comment_unknown_author_uses_shared_token():
    """add_comment with an unknown role -> shared orchestrator headers."""
    with patch.object(plane_sync, "find_issue_id", return_value="issue-uuid"), \
         patch.object(plane_sync, "_resolve_project_id", return_value="proj-uuid"), \
         patch.object(plane_sync.httpx, "post", return_value=_mock_post_ok()) as mock_post:
        plane_sync.add_comment("ET-001", "hello", author="ghost")
    assert mock_post.called
    _, kwargs = mock_post.call_args
    assert kwargs["headers"] is plane_sync.PLANE_HEADERS
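Note on the fallback rules above: they reduce to "use the bot token only when the role is known and its token is non-empty". A minimal sketch follows. `SHARED_HEADERS`, `BOT_TOKENS`, and `headers_for` are illustrative stand-ins for `PLANE_HEADERS`, `PLANE_BOT_TOKENS`, and `_headers_for`; the real implementation may differ.

```python
# Illustrative values only; the real tokens come from settings.
SHARED_HEADERS = {"X-API-Key": "shared-token"}
BOT_TOKENS = {"analyst": "analyst-tok", "tester": ""}


def headers_for(author=None):
    """Pick per-bot headers, falling back to the shared orchestrator headers."""
    # None or "" as author, an unknown role, and an empty token all fall back.
    token = BOT_TOKENS.get(author) if author else None
    if not token:
        return SHARED_HEADERS
    return {"X-API-Key": token}
```

Returning the shared dict object itself, rather than a copy, is what lets the tests assert the fallback with `is`.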
--- a/tests/test_plane_webhook.py
+++ b/tests/test_plane_webhook.py
@@ -73,16 +73,24 @@ def setup(monkeypatch):
        os.unlink(_test_db)
 # Feature 1: the pipeline now starts on a status change to In Progress (not on
 # creation). _post_created drives that status-change event so these ORCH-6
 # routing tests still exercise task creation through the new trigger.
 _IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
 def _post_created(plane_project_id, plane_id="wi-1", name="A valid work item title"):
    return client.post(
        "/webhook/plane",
        json={
-            "event": "work_item.created",
+            "event": "issue",
            "action": "updated",
            "data": {
                "id": plane_id,
                "name": name,
                "description_stripped": "This is a sufficiently long description.",
                "project": plane_project_id,
                "state": {"id": _IN_PROGRESS, "name": "In Progress", "group": "started"},
            },
        },
    )
--- a/tests/test_qg.py
+++ b/tests/test_qg.py
@@ -17,7 +17,10 @@ from src.qg.checks import (
    check_ci_green,
    check_review_approved,
    check_tests_passed,
    check_tests_local,
    check_deploy_status,
 )
 from src.stages import get_qg_for_stage
@pytest.fixture(autouse=True)
@@ -186,3 +189,116 @@ class TestCheckTestsPassed:
        passed, reason = check_tests_passed("enduro-trails", "ET-001")
        assert passed is False
        assert "not found" in reason.lower()
 class TestCheckDeployStatus:
    """BUG 8: deploy -> done must be gated on the deployer's machine-readable
    deploy_status verdict in 14-deploy-log.md frontmatter, NOT the LLM exit code
    (always 0). Mirrors check_reviewer_verdict (reads ONLY the frontmatter field)."""
    def _write_log(self, repo_dir, content):
        wi_dir = repo_dir / "docs" / "work-items" / "ET-011"
        wi_dir.mkdir(parents=True)
        (wi_dir / "14-deploy-log.md").write_text(content)
    def test_success_verdict_passes(self, setup_work_item_dir):
        self._write_log(
            setup_work_item_dir,
            "---\ndeploy_status: SUCCESS\nversion: v0.0.3\n---\n\nDeployed OK.\n",
        )
        passed, reason = check_deploy_status("enduro-trails", "ET-011")
        assert passed is True
        assert "SUCCESS" in reason
    def test_failed_verdict_fails(self, setup_work_item_dir):
        self._write_log(
            setup_work_item_dir,
            "---\ndeploy_status: FAILED\nversion: v0.0.3\n---\n\npermission denied.\n",
        )
        passed, reason = check_deploy_status("enduro-trails", "ET-011")
        assert passed is False
        assert "FAILED" in reason
    def test_no_file_fails(self, setup_work_item_dir):
        passed, reason = check_deploy_status("enduro-trails", "ET-011")
        assert passed is False
        assert "not found" in reason.lower()
    def test_no_field_fails(self, setup_work_item_dir):
        # Frontmatter present but no deploy_status field -> must NOT pass.
        self._write_log(
            setup_work_item_dir,
            "---\nversion: v0.0.3\n---\n\nStatus: FAILED (prose only).\n",
        )
        passed, reason = check_deploy_status("enduro-trails", "ET-011")
        assert passed is False
    def test_prose_only_no_frontmatter_fails(self, setup_work_item_dir):
        # Prose mentioning SUCCESS but no machine-readable frontmatter -> fail.
        self._write_log(
            setup_work_item_dir,
            "# Deploy log\n\nStatus: SUCCESS (prose, not frontmatter).\n",
        )
        passed, reason = check_deploy_status("enduro-trails", "ET-011")
        assert passed is False
    def test_deploy_stage_qg_is_check_deploy_status(self):
        assert get_qg_for_stage("deploy") == "check_deploy_status"
    def test_registered_in_qg_checks(self):
        from src.qg.checks import QG_CHECKS
        assert QG_CHECKS.get("check_deploy_status") is check_deploy_status
 class TestDevelopmentStageQG:
    """BUG 6: development stage QG is now check_ci_green (CI is the authoritative
    gate), not the deprecated check_tests_local."""
    def test_development_qg_is_check_ci_green(self):
        assert get_qg_for_stage("development") == "check_ci_green"
    def test_check_tests_local_is_deprecated_and_unwired(self):
        # Kept in the registry for backward-compat, but not wired to any stage.
        from src.qg.checks import QG_CHECKS
        from src.stages import STAGE_TRANSITIONS
        assert "check_tests_local" in QG_CHECKS
        wired = {t.get("qg") for t in STAGE_TRANSITIONS.values()}
        assert "check_tests_local" not in wired
 class TestCheckTestsLocal:
    """BUG 5: check_tests_local must run pytest directly (not make, which is
    not installed in the orchestrator container)."""
    @patch("src.qg.checks.ensure_worktree")
    @patch("subprocess.run")
    def test_passes_on_returncode_zero(self, mock_run, mock_wt, tmp_path):
        mock_wt.return_value = str(tmp_path)
        mock_run.return_value = MagicMock(returncode=0, stdout="ok", stderr="")
        passed, reason = check_tests_local("enduro-trails", "feature/ET-001-x")
        assert passed is True
        assert reason == "Local tests passed"
    @patch("src.qg.checks.ensure_worktree")
    @patch("subprocess.run")
    def test_fails_on_nonzero_returncode(self, mock_run, mock_wt, tmp_path):
        mock_wt.return_value = str(tmp_path)
        mock_run.return_value = MagicMock(returncode=1, stdout="boom", stderr="trace")
        passed, reason = check_tests_local("enduro-trails", "feature/ET-001-x")
        assert passed is False
        assert "Local tests failed" in reason
    @patch("src.qg.checks.ensure_worktree")
    @patch("subprocess.run")
    def test_invokes_pytest_not_make(self, mock_run, mock_wt, tmp_path):
        """The subprocess call must be pytest, from src/api, against ../../tests/."""
        mock_wt.return_value = str(tmp_path)
        mock_run.return_value = MagicMock(returncode=0, stdout="", stderr="")
        check_tests_local("enduro-trails", "feature/ET-001-x")
        args, kwargs = mock_run.call_args
        cmd = args[0]
        assert "make" not in cmd
        assert cmd[:3] == ["python", "-m", "pytest"]
        assert "../../tests/" in cmd
        assert kwargs["cwd"] == os.path.join(str(tmp_path), "src", "api")
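Note on TestCheckDeployStatus: the gate reads only the frontmatter field and ignores any prose in the log body. A sketch of such a frontmatter-only reader follows. `read_deploy_status` is a hypothetical helper, not the code of `check_deploy_status`, and it assumes simple `key: value` lines between two `---` markers rather than full YAML.

```python
def read_deploy_status(text):
    """Return the deploy_status value from the leading frontmatter, or None."""
    lines = text.splitlines()
    # Frontmatter must open on the very first line.
    if not lines or lines[0].strip() != "---":
        return None
    # Locate the closing marker; an unterminated block is treated as absent.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return None
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep and key.strip() == "deploy_status":
            return value.strip()
    return None
```

A gate built on this would pass only when the returned value is `SUCCESS`, so a missing file, a missing field, or prose that merely mentions SUCCESS all fail closed.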
--- a/tests/test_stage_engine.py
+++ b/tests/test_stage_engine.py
@@ -69,6 +69,7 @@ def silence_side_effects(monkeypatch):
        "set_issue_needs_input",
        "set_issue_in_progress",
        "set_issue_blocked",
        "set_issue_done",
    ):
        monkeypatch.setattr(stage_engine, name, MagicMock())
@@ -177,6 +178,40 @@ class TestHappyPathAgentSelection:
        assert res.enqueued_agent is None
        assert _jobs() == []
    def test_deploy_success_syncs_plane_to_terminal_done(self, monkeypatch):
        """FIX 3: a successful deploy->done forces the Plane issue to terminal Done.
        Previously the task could stick on In Progress because the merge webhook
        completed it out-of-band. Now the engine drives set_issue_done() on the
        deploy->done success transition.
        """
        monkeypatch.setattr(
            stage_engine, "QG_CHECKS",
            {k: _pass for k in stage_engine.QG_CHECKS},
        )
        task_id = _make_task("deploy", wi="ET-012")
        res = advance_stage(
            task_id, "deploy", "enduro-trails", "ET-012",
            "feature/ET-012-x", finished_agent="deployer",
        )
        assert res.advanced is True
        assert _stage(task_id) == "done"
        # The terminal Plane sync was invoked with the work item id.
        stage_engine.set_issue_done.assert_called_once_with("ET-012")
    def test_non_terminal_advance_does_not_force_plane_done(self, monkeypatch):
        """set_issue_done must only fire on the terminal deploy->done transition."""
        monkeypatch.setattr(
            stage_engine, "QG_CHECKS",
            {k: _pass for k in stage_engine.QG_CHECKS},
        )
        task_id = _make_task("review")
        advance_stage(
            task_id, "review", "enduro-trails", "ET-001",
            "feature/ET-001-x", finished_agent=None,
        )
        stage_engine.set_issue_done.assert_not_called()
    def test_done_is_terminal(self):
        task_id = _make_task("done")
        res = advance_stage(task_id, "done", "enduro-trails", "ET-001",
@@ -203,10 +238,13 @@ class TestQgFailureDoesNotAdvance:
        assert _jobs() == []
    def test_webhook_path_emits_qg_failure_notification(self, monkeypatch):
-        """finished_agent=None -> generic QG-failure notification fires (plane parity)."""
+        """finished_agent=None -> generic QG-failure notification fires (plane parity).
        development stage QG is now check_ci_green (was check_tests_local).
        """
        monkeypatch.setattr(
            stage_engine, "QG_CHECKS",
-            {**stage_engine.QG_CHECKS, "check_tests_local": _fail("ci red")},
+            {**stage_engine.QG_CHECKS, "check_ci_green": _fail("ci red")},
        )
        task_id = _make_task("development")
        advance_stage(task_id, "development", "enduro-trails", "ET-001",
@@ -297,6 +335,59 @@ class TestTesterFail:
        assert _jobs() == []
 # ---------------------------------------------------------------------------
 # BUG 8: deploy verdict gates deploy -> done (not the LLM exit code)
 # ---------------------------------------------------------------------------
 class TestDeployVerdict:
    """deploy -> done must be gated on check_deploy_status (the deployer's
    machine-readable verdict), NOT on the LLM exit code (always 0)."""
    def test_failed_verdict_rolls_back_to_development(self, monkeypatch):
        # deployer finished (exit_code 0 from launcher), but verdict is FAILED.
        monkeypatch.setattr(
            stage_engine, "QG_CHECKS",
            {**stage_engine.QG_CHECKS,
             "check_deploy_status": _fail("Deploy status: FAILED")},
        )
        task_id = _make_task("deploy")
        res = advance_stage(task_id, "deploy", "enduro-trails", "ET-011",
                            "feature/ET-011-x", finished_agent="deployer")
        assert res.advanced is False
        assert res.rolled_back_to == "development"
        assert _stage(task_id) == "development"   # NOT done
        assert res.alerted is True
        assert stage_engine.set_issue_blocked.called
        assert stage_engine.send_telegram.called
    def test_no_deploy_log_rolls_back(self, monkeypatch):
        # No frontmatter field / no file -> check returns False -> rollback.
        monkeypatch.setattr(
            stage_engine, "QG_CHECKS",
            {**stage_engine.QG_CHECKS,
             "check_deploy_status": _fail("Deploy log not found (14-deploy-log.md)")},
        )
        task_id = _make_task("deploy")
        res = advance_stage(task_id, "deploy", "enduro-trails", "ET-011",
                            "feature/ET-011-x", finished_agent="deployer")
        assert res.advanced is False
        assert _stage(task_id) == "development"
    def test_success_verdict_advances_to_done(self, monkeypatch):
        monkeypatch.setattr(
            stage_engine, "QG_CHECKS",
            {**stage_engine.QG_CHECKS,
             "check_deploy_status": _pass},
        )
        task_id = _make_task("deploy")
        res = advance_stage(task_id, "deploy", "enduro-trails", "ET-011",
                            "feature/ET-011-x", finished_agent="deployer")
        assert res.advanced is True
        assert res.to_stage == "done"
        assert _stage(task_id) == "done"
        assert res.enqueued_agent is None   # no agent leaves deploy
        assert _jobs() == []
 # ---------------------------------------------------------------------------
 # Architect conflict -> rollback to analysis + enqueue analyst
 # ---------------------------------------------------------------------------
@@ -358,6 +449,63 @@ class TestAnalysisApprovedFlow:
        assert stage_engine.notify_approve_requested.called
        assert _jobs() == []
    def test_approved_verdict_advances_analysis_to_architecture(self, monkeypatch):
        """BUG 4: a human Approved STATUS (webhook path, finished_agent=None)
        must satisfy the analysis gate and advance analysis -> architecture,
        enqueuing the architect. The status-only approval must NOT re-run
        check_analysis_approved (which looks for an :approved: COMMENT and would
        otherwise wrongly block the advance).
        """
        # Make check_analysis_approved FAIL if it is ever called: the webhook
        # path must bypass it entirely (status == approval). If the engine were
        # to re-run the gate, this would block the advance and fail the test.
        monkeypatch.setattr(
            stage_engine, "QG_CHECKS",
            {
                **stage_engine.QG_CHECKS,
                "check_analysis_approved": _fail("no :approved: comment"),
            },
        )
        # Guard: the approval-flow (launcher-only) must NOT be invoked here.
        flow = MagicMock()
        monkeypatch.setattr(stage_engine, "_handle_analysis_approved_flow", flow)
        task_id = _make_task("analysis")
        res = advance_stage(
            task_id, "analysis", "enduro-trails", "ET-001",
            "feature/ET-001-x", finished_agent=None,
        )
        assert res.advanced is True
        assert res.to_stage == "architecture"
        assert _stage(task_id) == "architecture"
        assert res.enqueued_agent == "architect"
        # Sanity: agent for analysis is architect, never analyst (no re-run loop).
        assert get_agent_for_stage("analysis") == "architect"
        jobs = _jobs()
        assert len(jobs) == 1
        assert jobs[0]["agent"] == "architect"
        # The launcher-only approval-flow was NOT called on the webhook path.
        flow.assert_not_called()
    def test_launcher_path_does_not_advance_and_calls_flow(self, monkeypatch):
        """Regression: the launcher path (finished_agent='analyst') still routes
        into _handle_analysis_approved_flow and does NOT advance.
        """
        flow = MagicMock()
        monkeypatch.setattr(stage_engine, "_handle_analysis_approved_flow", flow)
        task_id = _make_task("analysis")
        res = advance_stage(
            task_id, "analysis", "enduro-trails", "ET-001",
            "feature/ET-001-x", finished_agent="analyst",
        )
        assert res.advanced is not True
        assert _stage(task_id) == "analysis"
        assert _jobs() == []
        flow.assert_called_once()
 # ---------------------------------------------------------------------------
 # launcher + plane both delegate to the engine
--- a/tests/test_stage_visibility.py
+++ b/tests/test_stage_visibility.py
@@ -0,0 +1,94 @@
 """Feature 3: stage visibility on the Plane board.
  * PLANE_STATES carries the 6 new per-stage / verdict UUIDs.
  * STAGE_TO_STATE maps architecture/development/review/testing to their
    dedicated board statuses (not all -> In Progress anymore).
  * set_issue_stage_state(work_item_id, stage) PATCHes the correct state UUID
    for a visible stage, and is a no-op for stages without one (analysis/deploy).
  * Needs Input / In Review / Blocked remain higher priority: their explicit
    setters use their own state, never overwritten by the stage map.
 httpx is mocked; no network.
 """
 import os
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 from unittest.mock import patch, MagicMock  # noqa: E402
 from src import plane_sync as PS  # noqa: E402
 EXPECTED_UUIDS = {
    "architecture": "3020bbb7-6122-4663-930c-0315ba8dfa3d",
    "development": "9920609b-f140-4e46-ab95-89acda8412c8",
    "review": "ba0d802c-5218-41d4-ab43-978b0ea123ed",
    "testing": "7855d807-b1bf-42ef-8dae-6cde0df92d02",
    "approved": "a519a341-dada-4a91-8910-7604f82b79c5",
    "rejected": "ba958f3c-5db5-461d-8f82-89425e413b97",
 }
 def test_plane_states_has_new_uuids():
    for key, uuid in EXPECTED_UUIDS.items():
        assert PS.PLANE_STATES[key] == uuid
 def test_stage_to_state_maps_visible_stages():
    assert PS.STAGE_TO_STATE["architecture"] == EXPECTED_UUIDS["architecture"]
    assert PS.STAGE_TO_STATE["development"] == EXPECTED_UUIDS["development"]
    assert PS.STAGE_TO_STATE["review"] == EXPECTED_UUIDS["review"]
    assert PS.STAGE_TO_STATE["testing"] == EXPECTED_UUIDS["testing"]
    # analysis / deploy stay on In Progress; done stays Done.
    assert PS.STAGE_TO_STATE["analysis"] == PS.PLANE_STATES["in_progress"]
    assert PS.STAGE_TO_STATE["deploy"] == PS.PLANE_STATES["in_progress"]
    assert PS.STAGE_TO_STATE["done"] == PS.PLANE_STATES["done"]
 def _patch_resolution(monkey_targets):
    """Placeholder helper: returns its argument unchanged. The tests below patch
    find_issue_id and _resolve_project_id via @patch decorators to skip the
    DB/network."""
    return monkey_targets
@patch("src.plane_sync.httpx.patch")
@patch("src.plane_sync.find_issue_id", return_value="issue-uuid")
@patch("src.plane_sync._resolve_project_id", return_value="proj-1")
 def test_set_issue_stage_state_patches_correct_uuid(mock_proj, mock_find, mock_patch):
     resp = MagicMock()
     resp.raise_for_status.return_value = None
    mock_patch.return_value = resp
    PS.set_issue_stage_state("ET-1", "development")
    # the PATCH carried the development state UUID
    _, kwargs = mock_patch.call_args
    assert kwargs["json"]["state"] == EXPECTED_UUIDS["development"]
@patch("src.plane_sync.httpx.patch")
@patch("src.plane_sync.find_issue_id", return_value="issue-uuid")
@patch("src.plane_sync._resolve_project_id", return_value="proj-1")
 def test_set_issue_stage_state_noop_for_analysis(mock_proj, mock_find, mock_patch):
    # analysis has no dedicated board status -> no PATCH at all.
    PS.set_issue_stage_state("ET-1", "analysis")
    mock_patch.assert_not_called()
    PS.set_issue_stage_state("ET-1", "deploy")
    mock_patch.assert_not_called()
@patch("src.plane_sync.httpx.patch")
@patch("src.plane_sync.find_issue_id", return_value="issue-uuid")
@patch("src.plane_sync._resolve_project_id", return_value="proj-1")
 def test_priority_states_use_their_own_uuid(mock_proj, mock_find, mock_patch):
    """Needs Input / In Review / Blocked are set explicitly and take priority."""
     resp = MagicMock()
     resp.raise_for_status.return_value = None
    mock_patch.return_value = resp
    PS.set_issue_needs_input("ET-1")
    assert mock_patch.call_args.kwargs["json"]["state"] == PS.PLANE_STATES["needs_input"]
    PS.set_issue_in_review("ET-1")
    assert mock_patch.call_args.kwargs["json"]["state"] == PS.PLANE_STATES["in_review"]
    PS.set_issue_blocked("ET-1")
    assert mock_patch.call_args.kwargs["json"]["state"] == PS.PLANE_STATES["blocked"]
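Reviewer note: the behaviour these tests pin down (a PATCH for visible stages, a no-op for analysis/deploy) can be sketched roughly as follows. This is a minimal illustration, not the code in src/plane_sync.py: the state IDs are placeholders, `send_patch` is a hypothetical stand-in for the HTTP call, and the rule "skip when the stage maps to the generic In Progress state" is an assumption about how the no-op is decided.

```python
from typing import Callable, Dict

# Hypothetical state IDs for illustration; the real values are Plane UUIDs.
PLANE_STATES: Dict[str, str] = {
    "in_progress": "state-in-progress",
    "architecture": "state-architecture",
    "development": "state-development",
    "review": "state-review",
    "testing": "state-testing",
    "done": "state-done",
}

# Visible stages get a dedicated board state; analysis and deploy fall back
# to the generic In Progress state, mirroring the assertions in the test.
STAGE_TO_STATE: Dict[str, str] = {
    "analysis": PLANE_STATES["in_progress"],
    "architecture": PLANE_STATES["architecture"],
    "development": PLANE_STATES["development"],
    "review": PLANE_STATES["review"],
    "testing": PLANE_STATES["testing"],
    "deploy": PLANE_STATES["in_progress"],
    "done": PLANE_STATES["done"],
}


def set_issue_stage_state(stage: str, send_patch: Callable[[dict], None]) -> bool:
    """Send a state PATCH for stages with a dedicated state; otherwise do nothing.

    Returns True when a PATCH was sent. `send_patch` is a hypothetical
    stand-in for the real HTTP call.
    """
    state = STAGE_TO_STATE.get(stage)
    # Assumption: "no dedicated state" means an unknown stage or one that
    # maps to the generic In Progress state.
    if state is None or state == PLANE_STATES["in_progress"]:
        return False
    send_patch({"state": state})
    return True
```

Passing the sender in as a callable keeps the sketch testable without mocking httpx; the real module may be structured differently.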
--- a/tests/test_status_only_verdict.py
+++ b/tests/test_status_only_verdict.py
@@ -0,0 +1,200 @@
 """Status-only verdict model (bug 3 fix).
 The comment-based control mechanism (:approved: / :rejected: / answer-to-questions)
 was removed. The pipeline is driven SOLELY by Plane status changes. These tests
 lock in the new behaviour:
  * test_inreview_comment_does_not_revert       — bug 3 root: an In Review task,
    any comment arrives -> status NOT reverted, no agent launched.
  * test_any_comment_no_pipeline_action         — :approved: / :rejected: / plain
    text comment -> no status change, no enqueue.
  * test_approved_status_advances_without_inprogress_reset — Approved status
    advances WITHOUT an intermediate set_issue_in_progress reset.
  * test_rejected_status_pulls_reason_from_comment — Rejected status pulls the
    reason from the issue's latest comment (mocked GET comments).
 """
 import os
 import tempfile
 _test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_status_only.db")
 os.environ["ORCH_DB_PATH"] = _test_db
 os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 import pytest  # noqa: E402
 from unittest.mock import patch, AsyncMock  # noqa: E402
 from fastapi.testclient import TestClient  # noqa: E402
 from src.main import app  # noqa: E402
 from src.db import init_db, get_db  # noqa: E402
 from src import projects as P  # noqa: E402
 from src.projects import reload_projects  # noqa: E402
 ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
 APPROVED = "a519a341-dada-4a91-8910-7604f82b79c5"
 REJECTED = "ba958f3c-5db5-461d-8f82-89425e413b97"
 IN_REVIEW = "38fb1f64-aa1e-48a3-92e0-0b109679046b"
 client = TestClient(app)
@pytest.fixture(autouse=True)
 def setup(monkeypatch):
    monkeypatch.setattr(P.settings, "db_path", _test_db)
    import src.db as _db
    monkeypatch.setattr(_db.settings, "db_path", _test_db)
    if os.path.exists(_test_db):
        os.unlink(_test_db)
    init_db()
    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
    registry_json = (
        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
    )
    monkeypatch.setattr(P.settings, "projects_json", registry_json)
    reload_projects()
    # Seed a task at the 'review' stage for plane_id 'r-1'.
    conn = get_db()
    conn.execute(
        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        ("r-1", "ET-700", "enduro-trails", "feature/ET-700-x", "review", "r-1"),
    )
    conn.commit()
    conn.close()
    yield
    reload_projects()
    if os.path.exists(_test_db):
        os.unlink(_test_db)
 class _FakeResp:
    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload
    def json(self):
        return self._payload
 def _comment(text, plane_id="r-1"):
    return client.post("/webhook/plane", json={
        "event": "issue_comment", "action": "created",
        "data": {"work_item_id": plane_id, "comment_stripped": text,
                 "project": ENDURO_PLANE_ID},
    })
 def _status(state_id, plane_id="r-1", old="prev"):
    return client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": plane_id, "name": "Status task", "project": ENDURO_PLANE_ID,
            "state": {"id": state_id, "name": "X", "group": "started"},
        },
        "activity": {"field": "state", "new_value": state_id, "old_value": old},
    })
 def _stage(plane_id="r-1"):
    conn = get_db()
    row = conn.execute("SELECT stage FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()
    conn.close()
    return row[0] if row else None
 # --------------------------------------------------------------------------- #
 # Bug 3 root: In Review must not revert on a comment.
 # --------------------------------------------------------------------------- #
@patch("src.webhooks.plane.enqueue_job")
@patch("src.plane_sync.set_issue_in_progress")
@patch("src.plane_sync._set_issue_state_direct")
@patch("src.plane_sync.update_issue_state")
 def test_inreview_comment_does_not_revert(
     mock_update_state, mock_set_direct, mock_sip, mock_enqueue
 ):
     """Bug 3: a task is In Review and ANY comment arrives -> the status is NOT
     reverted to In Progress and NO agent is launched. Previously the analyst's
     own 'waiting for approval' comment echoed back through the webhook and
     triggered the comment handler, reverting In Review -> In Progress.
     """
     # analyst's own echo comment (Russian for "Done, waiting for approved")
    resp = _comment("Готово, жду approved")
    assert resp.status_code == 200
    # no status changes whatsoever
    mock_sip.assert_not_called()
    mock_set_direct.assert_not_called()
    mock_update_state.assert_not_called()
    # no agent launched
    mock_enqueue.assert_not_called()
    # stage untouched
    assert _stage() == "review"
 # --------------------------------------------------------------------------- #
 # Any comment -> zero pipeline side-effects.
 # --------------------------------------------------------------------------- #
@pytest.mark.parametrize("text", [":approved:", ":rejected: bad", "plain text", ""])
@patch("src.webhooks.plane.enqueue_job")
@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
@patch("src.plane_sync.set_issue_in_progress")
@patch("src.plane_sync._set_issue_state_direct")
 def test_any_comment_no_pipeline_action(
    mock_set_direct, mock_sip, mock_rollback, mock_advance, mock_enqueue, text
 ):
    resp = _comment(text)
    assert resp.status_code == 200
    mock_advance.assert_not_called()
    mock_rollback.assert_not_called()
    mock_sip.assert_not_called()
    mock_set_direct.assert_not_called()
    mock_enqueue.assert_not_called()
    assert _stage() == "review"
 # --------------------------------------------------------------------------- #
 # Approved status advances WITHOUT in_progress reset.
 # --------------------------------------------------------------------------- #
@patch("src.plane_sync.set_issue_in_progress")
@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
 def test_approved_status_advances_without_inprogress_reset(mock_advance, mock_sip):
    resp = _status(APPROVED)
    assert resp.status_code == 200
    mock_advance.assert_awaited_once()
    # work_item_id passed positionally
    assert "ET-700" in mock_advance.call_args.args
    # bug 3 (cause B): NO intermediate set_issue_in_progress before advance.
    mock_sip.assert_not_called()
 # --------------------------------------------------------------------------- #
 # Rejected status pulls reason from latest comment.
 # --------------------------------------------------------------------------- #
@patch("src.webhooks.plane.httpx.get")
@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
 def test_rejected_status_pulls_reason_from_comment(mock_rollback, mock_get):
    mock_get.return_value = _FakeResp(200, {"results": [
        {"comment_stripped": "old comment", "created_at": "2026-06-03T09:00:00Z"},
        {"comment_html": "<p>Needs more test coverage</p>",
         "created_at": "2026-06-03T11:30:00Z"},
    ]})
    resp = _status(REJECTED)
    assert resp.status_code == 200
    mock_rollback.assert_awaited_once()
    reason = mock_rollback.call_args.args[-1]
    # latest by created_at, HTML stripped
    assert "Needs more test coverage" in reason
    assert "<p>" not in reason
@patch("src.webhooks.plane.httpx.get")
@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
 def test_rejected_status_no_comment_uses_fallback(mock_rollback, mock_get):
    mock_get.return_value = _FakeResp(200, {"results": []})
    resp = _status(REJECTED)
    assert resp.status_code == 200
    mock_rollback.assert_awaited_once()
    reason = mock_rollback.call_args.args[-1]
    assert "no reason comment" in reason
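Reviewer note: the two Rejected tests above imply a small selection rule: take the newest comment by created_at, strip HTML, and fall back to a fixed phrase when there is nothing to use. A minimal sketch of that rule follows. The function name and the exact fallback wording are hypothetical; the sketch also assumes all timestamps share one ISO-8601 format and timezone so that string comparison orders them correctly.

```python
import html
import re
from typing import Dict, List

# Hypothetical fallback text; the test only requires "no reason comment" in it.
FALLBACK_REASON = "no reason comment found"


def latest_comment_reason(comments: List[Dict[str, str]]) -> str:
    """Return the text of the newest comment, with HTML tags removed."""
    if not comments:
        return FALLBACK_REASON
    # Assumption: same-format ISO-8601 strings sort chronologically.
    latest = max(comments, key=lambda c: c.get("created_at", ""))
    text = latest.get("comment_stripped") or ""
    if not text:
        # Replace tags with spaces, then decode entities such as &amp;.
        text = re.sub(r"<[^>]+>", " ", latest.get("comment_html", ""))
        text = html.unescape(text)
    text = " ".join(text.split())  # collapse runs of whitespace
    return text or FALLBACK_REASON
```

A regex tag stripper is enough for short review comments; a real HTML parser would be safer for arbitrary markup.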
--- a/tests/test_status_trigger.py
+++ b/tests/test_status_trigger.py
@@ -0,0 +1,243 @@
 """Feature 1: pipeline starts on status -> In Progress, not on creation.
  * work_item.created / issue created -> NO task, NO branch, NO analyst.
  * issue updated -> In Progress (from backlog) -> task created + analyst enqueued.
  * a second In Progress update while the agent is busy -> NO duplicate, NO
    restart (busy-guard).
  * In Progress returned from Needs Input (agent idle) -> agent RELAUNCHED.
 launcher / Gitea network are mocked. Real FastAPI endpoint via TestClient.
 """
 import os
 import tempfile
 _test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_status_trigger.db")
 os.environ["ORCH_DB_PATH"] = _test_db
 os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 import pytest  # noqa: E402
 from unittest.mock import patch, AsyncMock  # noqa: E402
 from fastapi.testclient import TestClient  # noqa: E402
 from src.main import app  # noqa: E402
 from src.db import init_db, get_db  # noqa: E402
 from src import projects as P  # noqa: E402
 from src.projects import reload_projects  # noqa: E402
 ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
 IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
 BACKLOG = "113b24f6-cce8-4be9-9a22-a359b9cf0122"
 client = TestClient(app)
@pytest.fixture(autouse=True)
 def setup(monkeypatch):
    monkeypatch.setattr(P.settings, "db_path", _test_db)
    import src.db as _db
    monkeypatch.setattr(_db.settings, "db_path", _test_db)
    if os.path.exists(_test_db):
        os.unlink(_test_db)
    init_db()
    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
    registry_json = (
        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
    )
    monkeypatch.setattr(P.settings, "projects_json", registry_json)
    reload_projects()
    yield
    reload_projects()
    if os.path.exists(_test_db):
        os.unlink(_test_db)
 def _created(plane_id="st-created"):
    return client.post("/webhook/plane", json={
        "event": "issue", "action": "created",
        "data": {
            "id": plane_id, "name": "A valid backlog item title",
            "description_stripped": "A sufficiently long description for QG-0.",
            "project": ENDURO_PLANE_ID,
            "state": {"id": BACKLOG, "name": "Backlog", "group": "backlog"},
        },
    })
 def _to_in_progress(plane_id="st-1"):
    return client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": plane_id, "name": "A valid backlog item title",
            "description_stripped": "A sufficiently long description for QG-0.",
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
    })
 def _count(plane_id):
    conn = get_db()
    n = conn.execute("SELECT COUNT(*) FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()[0]
    conn.close()
    return n
 # --------------------------------------------------------------------------- #
@patch("src.webhooks.plane.enqueue_job")
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
 def test_created_does_not_start_pipeline(mock_branch, mock_docs, mock_enqueue):
    resp = _created("st-created")
    assert resp.status_code == 200
    assert resp.json()["status"] == "accepted"
    # No task, no branch, no analyst enqueue.
    assert _count("st-created") == 0
    mock_branch.assert_not_called()
    mock_enqueue.assert_not_called()
@patch("src.webhooks.plane.enqueue_job")
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=5)
 def test_in_progress_starts_pipeline(mock_seq, mock_branch, mock_docs, mock_enqueue):
    mock_enqueue.return_value = 1
    resp = _to_in_progress("st-1")
    assert resp.status_code == 200
    assert resp.json()["status"] == "accepted"
    assert _count("st-1") == 1
    conn = get_db()
    task = conn.execute("SELECT * FROM tasks WHERE plane_id='st-1'").fetchone()
    conn.close()
    assert task["stage"] == "analysis"
    assert task["repo"] == "enduro-trails"
    mock_branch.assert_called_once()
    # analyst enqueued exactly once
    assert mock_enqueue.call_count == 1
    assert mock_enqueue.call_args.args[0] == "analyst"
@patch("src.webhooks.plane.enqueue_job")
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=5)
 def test_repeat_in_progress_while_job_active_does_not_relaunch(
    mock_seq, mock_branch, mock_docs, mock_enqueue
 ):
    """Status-only model busy-guard: a duplicate In Progress webhook that arrives
    while the stage agent still has a queued/running job must NOT relaunch the
    agent (no double launch).
    """
    mock_enqueue.return_value = 1
    _to_in_progress("st-2")
    assert _count("st-2") == 1
    assert mock_enqueue.call_count == 1
    # enqueue_job is mocked above, so no real job row exists. Seed an ACTIVE
    # (queued) job for the task so has_active_job_for_task() reports the agent as
    # busy -> the busy-guard fires.
    conn = get_db()
    task_id = conn.execute(
        "SELECT id FROM tasks WHERE plane_id='st-2'"
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO jobs (agent, repo, task_id, status) VALUES (?, ?, ?, 'queued')",
        ("analyst", "enduro-trails", task_id),
    )
    conn.commit()
    conn.close()
    # Second In Progress update. DISTINCT body (different activity old_value) so
    # webhook dedup does NOT short-circuit it — this exercises the busy-guard in
    # handle_status_start, not the delivery-dedup layer.
    resp = client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": "st-2", "name": "A valid backlog item title",
            "description_stripped": "A sufficiently long description for QG-0.",
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": "some-other-state"},
    })
    assert resp.status_code == 200
    assert _count("st-2") == 1          # still exactly one task
    assert mock_enqueue.call_count == 1  # analyst NOT re-enqueued (busy-guard)
@patch("src.webhooks.plane.add_comment", create=True)
@patch("src.webhooks.plane.enqueue_job")
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=5)
 def test_inprogress_from_needs_input_relaunches_analyst(
     mock_seq, mock_branch, mock_docs, mock_enqueue, mock_comment
 ):
     """Status-only answer-to-questions flow: an existing analysis task whose agent
     is IDLE (no active job, because it went to Needs Input) is returned to
     In Progress -> the analyst is relaunched to read Slava's fresh comments.
     Also covers double-webhook protection: a second In Progress that arrives
     while the relaunch job is active does NOT relaunch again.
     """
    mock_enqueue.return_value = 1
    # First In Progress: starts the pipeline (creates task + enqueues analyst).
    _to_in_progress("st-ni")
    assert _count("st-ni") == 1
    assert mock_enqueue.call_count == 1
    # The analyst finished and asked questions -> Needs Input. In our model that
    # means NO active job for the task (enqueue_job is mocked, so no job row).
    conn = get_db()
    task_id = conn.execute(
        "SELECT id FROM tasks WHERE plane_id='st-ni'"
    ).fetchone()[0]
    has_job = conn.execute(
        "SELECT COUNT(*) FROM jobs WHERE task_id=? AND status IN ('queued','running')",
        (task_id,),
    ).fetchone()[0]
    conn.close()
    assert has_job == 0  # agent idle
    # Slava answers + returns the issue to In Progress (distinct body).
    resp = client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": "st-ni", "name": "A valid backlog item title",
            "description_stripped": "A sufficiently long description for QG-0.",
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": "needs-input"},
    })
    assert resp.status_code == 200
    assert _count("st-ni") == 1               # no duplicate task
    assert mock_enqueue.call_count == 2        # analyst RELAUNCHED
    assert mock_enqueue.call_args.args[0] == "analyst"
    # Seed an active job for the relaunch, then a SECOND In Progress webhook must
    # NOT relaunch again (busy-guard against double webhooks).
    conn = get_db()
    conn.execute(
        "INSERT INTO jobs (agent, repo, task_id, status) VALUES (?, ?, ?, 'running')",
        ("analyst", "enduro-trails", task_id),
    )
    conn.commit()
    conn.close()
    resp2 = client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": "st-ni", "name": "A valid backlog item title",
            "description_stripped": "A sufficiently long description for QG-0.",
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": "x-y-z"},
    })
    assert resp2.status_code == 200
    assert mock_enqueue.call_count == 2        # still 2 — busy-guard held
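Reviewer note: the busy-guard exercised above reduces to "relaunch only when the task has no queued/running job". A minimal sketch against an sqlite3 connection is shown below. The source mentions has_active_job_for_task() by name, but its signature here is an assumption, and handle_in_progress plus the `enqueue` callable are hypothetical names used only for illustration.

```python
import sqlite3
from typing import Callable


def has_active_job_for_task(conn: sqlite3.Connection, task_id: int) -> bool:
    """True when the task has at least one queued or running job."""
    row = conn.execute(
        "SELECT COUNT(*) FROM jobs "
        "WHERE task_id=? AND status IN ('queued','running')",
        (task_id,),
    ).fetchone()
    return row[0] > 0


def handle_in_progress(
    conn: sqlite3.Connection,
    task_id: int,
    enqueue: Callable[[str, int], None],
) -> str:
    """Relaunch the analyst unless an agent is already busy on the task."""
    if has_active_job_for_task(conn, task_id):
        # Duplicate or repeated webhook while the agent is working: ignore.
        return "busy"
    enqueue("analyst", task_id)
    return "launched"
```

The same check serves both cases in the tests: a duplicate webhook finds an active job and is ignored, while a return from Needs Input finds none and relaunches.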
--- a/tests/test_taskmd_description.py
+++ b/tests/test_taskmd_description.py
@@ -0,0 +1,138 @@
 """Tests for fix/taskmd-description (3 bugs at the analyst pipeline entry/exit):
 BUG A: start_pipeline built the analyst .task.md WITHOUT the description body
       (only Title), so analyst received a ~101-byte file and reported the
       "business request is empty". task_desc must now carry the description.
 BUG B: issue.updated ships only changed fields, so `name` is usually absent ->
       slug/branch became "untitled". start_pipeline must pull the real name
       from the Plane API (single fetch_issue_fields GET, above the slug build)
       so the branch slug is NOT "untitled".
 BUG C: the analyst "artifacts ready" comment used the obsolete ":approved:"
       wording. Under the status-only model it must ask for the **Approved**
       status (not ":approved:", not "In Progress") and link the docs that
       actually exist.
 """
 import os
 import tempfile
 _test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_taskmd_desc.db")
 os.environ["ORCH_DB_PATH"] = _test_db
 os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 import pytest  # noqa: E402
 from unittest.mock import patch, AsyncMock  # noqa: E402
 from fastapi.testclient import TestClient  # noqa: E402
 from src.main import app  # noqa: E402
 from src.db import init_db, get_db  # noqa: E402
 from src import projects as P  # noqa: E402
 from src.projects import reload_projects  # noqa: E402
 ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
 IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
 BACKLOG = "113b24f6-cce8-4be9-9a22-a359b9cf0122"
 client = TestClient(app)
@pytest.fixture(autouse=True)
 def setup(monkeypatch):
    monkeypatch.setattr(P.settings, "db_path", _test_db)
    import src.db as _db
    monkeypatch.setattr(_db.settings, "db_path", _test_db)
    if os.path.exists(_test_db):
        os.unlink(_test_db)
    init_db()
    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
    registry_json = (
        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
    )
    monkeypatch.setattr(P.settings, "projects_json", registry_json)
    reload_projects()
    yield
    reload_projects()
    if os.path.exists(_test_db):
        os.unlink(_test_db)
 def _task(plane_id):
    conn = get_db()
    row = conn.execute("SELECT * FROM tasks WHERE plane_id=?", (plane_id,)).fetchone()
    conn.close()
    return row
 # --------------------------------------------------------------------------- #
 # BUG A: description reaches the analyst .task.md
 # --------------------------------------------------------------------------- #
@patch("src.webhooks.plane.enqueue_job", return_value=1)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=11)
@patch("src.plane_sync.fetch_issue_fields",
       return_value=("ET-011 real title",
                     "REAL BUSINESS REQUEST BODY: user wants GPX upload with "
                     "validation and a results map."))
 def test_taskdesc_includes_description(
    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
 ):
    resp = client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": "taskA",
            # status change payload: NO name, NO description (only changed field)
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
    })
    assert resp.status_code == 200
    mock_enqueue.assert_called_once()
    # task_desc is the 3rd positional arg of enqueue_job(agent, repo, task_desc, ...)
    task_desc = mock_enqueue.call_args.args[2]
    assert "Description:" in task_desc
    # the actual description body (not just the Title) is in the file
    assert "REAL BUSINESS REQUEST BODY" in task_desc
    assert "results map" in task_desc
 # --------------------------------------------------------------------------- #
 # BUG B: name fetched from Plane API when payload is empty -> slug not untitled
 # --------------------------------------------------------------------------- #
@patch("src.webhooks.plane.enqueue_job", return_value=1)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.plane_sync.fetch_issue_sequence_id", return_value=11)
@patch("src.plane_sync.fetch_issue_fields",
       return_value=("GPX upload feature",
                     "A sufficiently long description so QG-0 passes cleanly."))
 def test_name_fetched_when_payload_empty(
    mock_fields, mock_seq, mock_branch, mock_docs, mock_enqueue
 ):
    resp = client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": "taskB",
            # NO name, NO description in the payload (Plane status-change shape)
            "project": ENDURO_PLANE_ID,
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
        "activity": {"field": "state", "new_value": IN_PROGRESS, "old_value": BACKLOG},
    })
    assert resp.status_code == 200
    mock_fields.assert_called_once()
    row = _task("taskB")
    assert row is not None
    branch = row["branch"]
    # slug derived from the fetched name -> "gpx-upload-feature", NOT untitled
    assert "untitled" not in branch
    assert "gpx-upload-feature" in branch
    # Title in the analyst task file is the fetched name, not "untitled"
    task_desc = mock_enqueue.call_args.args[2]
    assert "Title: GPX upload feature" in task_desc
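Reviewer note: BUG A and BUG B both come down to what gets built from the fetched name and description. The sketch below shows one way the slug, branch and task text could be derived so that the assertions above hold. All three function names are hypothetical, and the branch pattern is inferred from the "feature/ET-700-x" fixture value rather than taken from the implementation.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and join alphanumeric runs with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", (name or "").lower()).strip("-")
    # An empty or missing name is what used to produce "untitled" branches.
    return slug or "untitled"


def build_branch(work_item_id: str, name: str) -> str:
    """Assumed branch pattern: feature/<work item id>-<slug>."""
    return f"feature/{work_item_id}-{slugify(name)}"


def build_task_desc(title: str, description: str) -> str:
    """Task text carrying both the title and the description body."""
    return f"Title: {title}\nDescription:\n{description}\n"
```

For example, build_branch("ET-011", "GPX upload feature") gives "feature/ET-011-gpx-upload-feature", while an empty name still degrades to the "untitled" slug.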
--- a/tests/test_telegram_tracker.py
+++ b/tests/test_telegram_tracker.py
@@ -0,0 +1,518 @@
 """feat/telegram-live-tracker: tests for the live Telegram task tracker.
 Covers (per DEV_TASK_TELEGRAM_TRACKER.md):
  * short_model_name: provider/claude- prefix trimming.
  * render_task_tracker: per-stage line format (in↓/out↑, model, cost, minutes),
    the "⏸️ Ревью БРД · твоё время" ("BRD review · your time") line, the 💰
    totals, and the finish block (⏱️ three times + 🔗/📦).
  * first message -> sendMessage stores message_id; transition -> editMessageText.
  * fallback: editMessageText fails -> a NEW message is sent and the id updated.
  * which alerts go out SEPARATELY (approve-gate / deploy-fail / agent-fail /
    error) vs which do NOT (QG-pending / agent-start / stage-transition).
 Isolated temp DB; no network (httpx is patched).
 """
 import os
 import tempfile
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 _test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_tracker.db")
 os.environ["ORCH_DB_PATH"] = _test_db
 from unittest.mock import MagicMock, patch  # noqa: E402
 import pytest  # noqa: E402
 import src.db as db_module  # noqa: E402
 from src.db import init_db, get_db  # noqa: E402
 from src import notifications as N  # noqa: E402
 from src import usage as U  # noqa: E402
@pytest.fixture(autouse=True)
 def setup_db(monkeypatch):
    monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
    if os.path.exists(_test_db):
        os.unlink(_test_db)
    init_db()
     # conftest stubs send_telegram to a no-op and this fixture does not undo
     # that; tests that need it patch httpx / the lower-level helpers
     # explicitly per case.
    yield
    if os.path.exists(_test_db):
        os.unlink(_test_db)
 # --------------------------------------------------------------------------- #
 # helpers to build a task + runs in the DB
 # --------------------------------------------------------------------------- #
 def _mk_task(stage="development", title="\u0422\u0440\u0435\u043a\u0438 \u0441 \u0437\u0443\u043c\u0430 z5",
             wid="ET-012", brd_start=None, brd_end=None):
    conn = get_db()
    cur = conn.execute(
        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, title, "
        "brd_review_started_at, brd_review_ended_at) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        ("p1", wid, "enduro-trails", "feature/ET-012-x", stage, title,
         brd_start, brd_end),
    )
    tid = cur.lastrowid
    conn.commit()
    conn.close()
    return tid
 def _mk_run(task_id, agent, started, finished, in_tok, out_tok,
            cache_read=0, cache_creation=0, cost=0.0, model=None, exit_code=0):
    conn = get_db()
    cur = conn.execute(
        "INSERT INTO agent_runs (task_id, agent, started_at, finished_at, "
        "exit_code, input_tokens, output_tokens, cache_read_tokens, "
        "cache_creation_tokens, cost_usd, model) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (task_id, agent, started, finished, exit_code, in_tok, out_tok,
         cache_read, cache_creation, cost, model),
    )
    rid = cur.lastrowid
    conn.commit()
    conn.close()
    return rid
 # --------------------------------------------------------------------------- #
 # short_model_name
 # --------------------------------------------------------------------------- #
 def test_short_model_name():
    assert U.short_model_name("tokenator/claude-opus-4-8") == "opus-4-8"
    assert U.short_model_name("vibecode/claude-sonnet-4.6") == "sonnet-4.6"
    assert U.short_model_name("claude-opus-4-8") == "opus-4-8"
    assert U.short_model_name("opus-4-8") == "opus-4-8"
    assert U.short_model_name(None) == ""
    assert U.short_model_name("") == ""
 def test_parse_usage_extracts_model_from_modelusage():
    blob = (
        '{"total_cost_usd":0.01,'
        '"usage":{"input_tokens":10,"output_tokens":5},'
        '"modelUsage":{"claude-opus-4-8":{"inputTokens":10,"outputTokens":5}}}'
    )
    u = U.parse_usage_from_text(blob)
    assert u["model"] == "claude-opus-4-8"
 # --------------------------------------------------------------------------- #
 # render_task_tracker
 # --------------------------------------------------------------------------- #
 def test_render_in_progress_stage_lines_and_totals():
    tid = _mk_task(stage="deploy", brd_start="2026-06-04 10:00:00",
                   brd_end="2026-06-04 10:08:00")
     # Analysis: 10 min, ~1.1M in (mostly cache reads) / 39.6k out, $2.38, opus-4-8
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=1000, out_tok=39600, cache_read=1_100_000, cost=2.38,
            model="tokenator/claude-opus-4-8")
    _mk_run(tid, "architect", "2026-06-04 10:08:00", "2026-06-04 10:17:00",
            in_tok=500, out_tok=34400, cache_read=1_500_000, cost=2.24,
            model="tokenator/claude-opus-4-8")
    _mk_run(tid, "developer", "2026-06-04 10:17:00", "2026-06-04 10:28:00",
            in_tok=400, out_tok=45800, cache_read=8_400_000, cost=7.29,
            model="tokenator/claude-opus-4-8")
    _mk_run(tid, "reviewer", "2026-06-04 10:28:00", "2026-06-04 10:31:00",
            in_tok=300, out_tok=12900, cache_read=1_200_000, cost=1.53,
            model="vibecode/claude-sonnet-4.6")
    _mk_run(tid, "tester", "2026-06-04 10:31:00", "2026-06-04 10:36:00",
            in_tok=200, out_tok=19500, cache_read=1_200_000, cost=1.51,
            model="vibecode/claude-sonnet-4.6")
     # deployer started but not finished -> active "идёт" ("in progress") line.
    _mk_run(tid, "deployer", "2026-06-04 10:36:00", None,
            in_tok=0, out_tok=0, model=None, exit_code=None)
    text = N.render_task_tracker(tid)
    # Header in-progress
    assert text.startswith("\U0001f6e0\ufe0f ET-012 \u00b7 \u0422\u0440\u0435\u043a\u0438")
    # Per-stage format: in↓/out↑ · cost · model
    assert "\u2705 Analysis" in text
    assert "10\u043c" in text          # analysis duration
    assert "39.6k\u2191" in text       # analysis out
    assert "$2.38" in text
    assert "opus-4-8" in text
    assert "sonnet-4.6" in text        # reviewer/tester model
    # BRD review line (human time, ended): "Ревью БРД" ("BRD review"), "твоё время" ("your time")
    assert "\u0420\u0435\u0432\u044c\u044e \u0411\u0420\u0414" in text
    assert "\u0442\u0432\u043e\u0451 \u0432\u0440\u0435\u043c\u044f" in text
    # Active stage: "🔄 Deploy" with "идёт" ("in progress")
    assert "\U0001f504 Deploy" in text
    assert "\u0438\u0434\u0451\u0442" in text
    # Totals line present with 💰
    assert "\U0001f4b0" in text
    # In-progress: no final ⏱️ line, so "Всего" ("Total") must be absent
    assert "\u0412\u0441\u0435\u0433\u043e" not in text
 def test_render_brd_review_waiting_shows_hourglass():
    tid = _mk_task(stage="analysis", brd_start="2026-06-04 10:00:00",
                   brd_end=None)
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=1000, out_tok=39600, cache_read=1_100_000, cost=2.38,
            model="tokenator/claude-opus-4-8")
    text = N.render_task_tracker(tid)
    assert "\u0420\u0435\u0432\u044c\u044e \u0411\u0420\u0414" in text
    assert "\u23f3" in text  # hourglass while waiting
 def test_render_done_has_times_and_links():
    tid = _mk_task(stage="done", brd_start="2026-06-04 10:00:00",
                   brd_end="2026-06-04 10:08:00")
    # Set created_at/updated_at so the wall-clock total can be computed.
    conn = get_db()
    conn.execute(
        "UPDATE tasks SET created_at='2026-06-04 09:00:00', "
        "updated_at='2026-06-04 09:56:00' WHERE id=?", (tid,))
    conn.commit()
    conn.close()
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=1000, out_tok=39600, cache_read=1_100_000, cost=2.38,
            model="tokenator/claude-opus-4-8")
    _mk_run(tid, "deployer", "2026-06-04 09:50:00", "2026-06-04 09:56:00",
            in_tok=400, out_tok=22400, cache_read=1_600_000, cost=1.73,
            model="tokenator/claude-opus-4-8")
    with patch("src.notifications.httpx") as _hx:
        # No PR found -> just "📦 deployed"
        _resp = MagicMock(status_code=200)
        _resp.json.return_value = []
        _hx.get.return_value = _resp
        text = N.render_task_tracker(tid)
    assert text.startswith("\U0001f389 ET-012")
    assert "\u0413\u041e\u0422\u041e\u0412\u041e" in text  # "ГОТОВО" ("DONE")
    # ⏱️ line with three times: total, agents, yours
    assert "\u23f1\ufe0f" in text
    assert "\u0412\u0441\u0435\u0433\u043e" in text  # "Всего" ("Total")
    assert "\u0430\u0433\u0435\u043d\u0442\u044b" in text  # "агенты" ("agents")
    assert "\u0442\u0432\u043e\u0451" in text  # "твоё" ("yours")
    # 📦 deployed line
    assert "\U0001f4e6" in text
 def test_render_escapes_html_in_title():
    tid = _mk_task(stage="analysis", title="A <b>& B</b>")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.0)
    text = N.render_task_tracker(tid)
    assert "&lt;b&gt;" in text
    assert "&amp;" in text
 def test_render_omits_model_when_unknown():
    tid = _mk_task(stage="analysis")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.0, model=None)
    text = N.render_task_tracker(tid)
    # No trailing " · <model>" — line ends at cost.
    line = [l for l in text.splitlines() if l.startswith("\u2705 Analysis")][0]
    assert line.rstrip().endswith("$0.00")
 # --------------------------------------------------------------------------- #
 # tracker send / edit / fallback
 # --------------------------------------------------------------------------- #
 def test_first_call_sends_message_and_stores_id(monkeypatch):
    tid = _mk_task(stage="analysis")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", None, in_tok=0, out_tok=0,
            exit_code=None)
    sent = {}
    def _fake_send(text, disable_notification=False):
        sent["text"] = text
        sent["silent"] = disable_notification
        return 555
    monkeypatch.setattr(N, "send_telegram", _fake_send)
    monkeypatch.setattr(N, "edit_telegram", lambda *a, **k: (_ for _ in ()).throw(AssertionError("should not edit on first call")))
    N.update_task_tracker(tid)
    from src.db import get_tracker_message_id
    assert get_tracker_message_id(tid) == 555
    assert sent["silent"] is True  # tracker is silent
 def test_second_call_edits_existing_message(monkeypatch):
    tid = _mk_task(stage="development")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1)
    from src.db import set_tracker_message_id
    set_tracker_message_id(tid, 777)
    edited = {}
    monkeypatch.setattr(N, "edit_telegram",
                        lambda mid, text: edited.update(mid=mid) or N.EDIT_OK)
    monkeypatch.setattr(N, "send_telegram",
                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("should not send when edit succeeds")))
    N.update_task_tracker(tid)
    assert edited["mid"] == 777
 def test_fallback_to_new_message_when_edit_gone(monkeypatch):
    """edit returns 'gone' (message deleted/too old) -> send NEW + update id."""
    tid = _mk_task(stage="development")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1)
    from src.db import set_tracker_message_id, get_tracker_message_id
    set_tracker_message_id(tid, 100)
    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_GONE)
    monkeypatch.setattr(N, "send_telegram", lambda text, disable_notification=False: 200)
    N.update_task_tracker(tid)
    assert get_tracker_message_id(tid) == 200  # id updated to the new message
 def test_not_modified_does_not_send_new_message(monkeypatch):
    """edit returns 'not_modified' -> NO new message, id unchanged (no dupe)."""
    tid = _mk_task(stage="development")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1)
    from src.db import set_tracker_message_id, get_tracker_message_id
    set_tracker_message_id(tid, 100)
    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_NOT_MODIFIED)
    monkeypatch.setattr(N, "send_telegram",
                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("must not send on not_modified")))
    N.update_task_tracker(tid)
    assert get_tracker_message_id(tid) == 100  # unchanged, no duplicate
 def test_transient_edit_failure_does_not_send_new_message(monkeypatch):
    """edit returns 'failed' (network/timeout/5xx) -> NO new message (no dupe)."""
    tid = _mk_task(stage="development")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1)
    from src.db import set_tracker_message_id, get_tracker_message_id
    set_tracker_message_id(tid, 100)
    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_FAILED)
    monkeypatch.setattr(N, "send_telegram",
                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("must not send on transient failure")))
    N.update_task_tracker(tid)
    assert get_tracker_message_id(tid) == 100  # unchanged, no duplicate
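Review note: the five tests above pin down a small decision table for the tracker refresh. The sketch below restates that table in isolation. It is not the PR's `update_task_tracker`: the function name `refresh_tracker`, the injected `send`/`edit` callables, and the string values of the `EDIT_*` constants are all assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical outcome values; the tests only reference the constant names.
EDIT_OK, EDIT_NOT_MODIFIED, EDIT_GONE, EDIT_FAILED = (
    "ok", "not_modified", "gone", "failed")


def refresh_tracker(
    message_id: Optional[int],
    text: str,
    send: Callable[[str], Optional[int]],
    edit: Callable[[int, str], str],
) -> Optional[int]:
    """Return the message id that should be stored after this refresh."""
    if message_id is None:
        # First call: there is nothing to edit yet, so send the tracker.
        return send(text)
    outcome = edit(message_id, text)
    if outcome == EDIT_GONE:
        # The original message is truly gone: send a replacement.
        # If the send fails too, keep the old id rather than losing it.
        new_id = send(text)
        return new_id if new_id is not None else message_id
    # ok / not_modified / failed: keep the live message, never duplicate it.
    return message_id
```

Only `EDIT_GONE` reaches `send`, which is the property the `not_modified` and transient-failure tests guard.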
 # --------------------------------------------------------------------------- #
 # edit_telegram outcome classification (httpx mocked)
 # --------------------------------------------------------------------------- #
 def _edit_resp(ok, description=None):
    resp = MagicMock()
    body = {"ok": ok}
    if description is not None:
        body["description"] = description
    resp.json.return_value = body
    return resp
 def _patch_tg_creds(monkeypatch):
    monkeypatch.setattr(N._get_settings(), "telegram_bot_token", "T", raising=False)
    monkeypatch.setattr(N._get_settings(), "telegram_chat_id", "C", raising=False)
 def test_edit_telegram_ok(monkeypatch):
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.return_value = _edit_resp(True)
        assert N.edit_telegram(1, "x") == N.EDIT_OK
 def test_edit_telegram_not_modified_is_success(monkeypatch):
    # 400 "message is not modified" -> success, not gone, no duplicate
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.return_value = _edit_resp(
            False, "Bad Request: message is not modified: ...")
        assert N.edit_telegram(1, "x") == N.EDIT_NOT_MODIFIED
 def test_edit_telegram_exactly_the_same_is_not_modified(monkeypatch):
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.return_value = _edit_resp(
            False, "Bad Request: specified new message content and reply markup "
                   "are exactly the same")
        assert N.edit_telegram(1, "x") == N.EDIT_NOT_MODIFIED
 def test_edit_telegram_message_not_found_is_gone(monkeypatch):
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.return_value = _edit_resp(
            False, "Bad Request: message to edit not found")
        assert N.edit_telegram(1, "x") == N.EDIT_GONE
 def test_edit_telegram_cant_be_edited_is_gone(monkeypatch):
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.return_value = _edit_resp(
            False, "Bad Request: message can't be edited")
        assert N.edit_telegram(1, "x") == N.EDIT_GONE
 def test_edit_telegram_unknown_400_is_failed(monkeypatch):
    # unknown 400 -> failed (NOT gone) -> caller won't duplicate
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.return_value = _edit_resp(
            False, "Bad Request: some other unexpected error")
        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
 def test_edit_telegram_timeout_is_failed(monkeypatch):
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.side_effect = Exception("read timeout")
        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
 def test_edit_telegram_5xx_is_failed(monkeypatch):
    # A Telegram 5xx still yields ok:false, but without the gone/not_modified
    # markers, so it must classify as failed.
    _patch_tg_creds(monkeypatch)
    with patch("src.notifications.httpx") as hx:
        hx.post.return_value = _edit_resp(False, "Internal Server Error")
        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
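Review note: the classification tests above imply a substring match on the response `description`. A minimal sketch of such a classifier follows, assuming the body has already been decoded to a dict. The function name `classify_edit_body`, the marker tuples, and the constant values are hypothetical; the marker strings are copied from the descriptions used in the tests, and the real `edit_telegram` may match them differently.

```python
from typing import Optional

# Hypothetical outcome values; the tests only reference the constant names.
EDIT_OK, EDIT_NOT_MODIFIED, EDIT_GONE, EDIT_FAILED = (
    "ok", "not_modified", "gone", "failed")

# Substrings taken from the descriptions exercised by the tests.
_NOT_MODIFIED_MARKERS = ("message is not modified", "are exactly the same")
_GONE_MARKERS = ("message to edit not found", "message can't be edited")


def classify_edit_body(body: Optional[dict]) -> str:
    """Map a decoded edit response body to one of the four outcomes."""
    if not isinstance(body, dict):
        return EDIT_FAILED
    if body.get("ok"):
        return EDIT_OK
    description = str(body.get("description") or "").lower()
    if any(marker in description for marker in _NOT_MODIFIED_MARKERS):
        return EDIT_NOT_MODIFIED
    if any(marker in description for marker in _GONE_MARKERS):
        return EDIT_GONE
    # Unknown 400s and 5xx bodies fall through: the caller must not resend.
    return EDIT_FAILED
```

Defaulting to `EDIT_FAILED` keeps the failure mode conservative: an unrecognized error leaves the existing tracker message alone instead of spawning a second one.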
 # --------------------------------------------------------------------------- #
 # render: repeated stage attempt shows "попытка N" ("attempt N")
 # --------------------------------------------------------------------------- #
 _POPYTKA = "\u043f\u043e\u043f\u044b\u0442\u043a\u0430"  # "попытка" ("attempt")
 def test_render_active_stage_shows_attempt_on_second_run():
    # Two reviewer runs while in review -> active line shows attempt 2.
    tid = _mk_task(stage="review")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:20:00",
            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
    # First review run finished (sent back to dev), second review run active.
    _mk_run(tid, "reviewer", "2026-06-04 09:20:00", "2026-06-04 09:25:00",
            in_tok=10, out_tok=5, cost=0.1, model="vibecode/claude-sonnet-4.6",
            exit_code=0)
    _mk_run(tid, "reviewer", "2026-06-04 09:30:00", None,
            in_tok=0, out_tok=0, exit_code=None)
    text = N.render_task_tracker(tid)
    active = [l for l in text.splitlines()
              if l.startswith("\U0001f504") and "Review" in l][0]
    assert _POPYTKA in active
    assert "2" in active
    assert "\u0438\u0434\u0451\u0442" in active
 def test_render_active_stage_no_attempt_on_first_run():
    # Single reviewer run -> active line has NO attempt marker.
    tid = _mk_task(stage="review")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:20:00",
            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
    _mk_run(tid, "reviewer", "2026-06-04 09:20:00", None,
            in_tok=0, out_tok=0, exit_code=None)
    text = N.render_task_tracker(tid)
    active = [l for l in text.splitlines()
              if l.startswith("\U0001f504") and "Review" in l][0]
    assert _POPYTKA not in active
    assert "\u0438\u0434\u0451\u0442" in active
 def test_render_finished_lines_unaffected_by_attempt_logic():
    # Completed (checkmark) lines never carry an attempt marker.
    tid = _mk_task(stage="review")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
    # developer ran twice (retry) but is a FINISHED stage now.
    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:15:00",
            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
    _mk_run(tid, "developer", "2026-06-04 09:16:00", "2026-06-04 09:20:00",
            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
    text = N.render_task_tracker(tid)
    for line in text.splitlines():
        if line.startswith("\u2705"):
            assert _POPYTKA not in line
 # --------------------------------------------------------------------------- #
 # which alerts are SEPARATE vs tracker-only
 # --------------------------------------------------------------------------- #
 def test_approve_gate_sends_separate_message_and_starts_brd_clock(monkeypatch):
    tid = _mk_task(stage="analysis")
    calls = []
    monkeypatch.setattr(N, "send_telegram",
                        lambda text, disable_notification=False: calls.append((text, disable_notification)) or 1)
    monkeypatch.setattr(N, "update_task_tracker", lambda task_id: None)
    N.notify_approve_requested(tid)
    # exactly one SEPARATE (notifying) send for the approve gate
    assert len(calls) == 1
    assert calls[0][1] is False  # notifying
    assert "Approved" in calls[0][0]
    # BRD clock started
    conn = get_db()
    row = conn.execute("SELECT brd_review_started_at FROM tasks WHERE id=?", (tid,)).fetchone()
    conn.close()
    assert row[0] is not None
 def test_error_sends_separate_message(monkeypatch):
    tid = _mk_task(stage="development")
    calls = []
    monkeypatch.setattr(N, "send_telegram",
                        lambda text, disable_notification=False: calls.append((text, disable_notification)) or 1)
    N.notify_error(tid, "boom")
    assert len(calls) == 1
    assert calls[0][1] is False  # notifying
    assert "ERROR" in calls[0][0]
 def test_stage_change_does_not_send_separate_message(monkeypatch):
    tid = _mk_task(stage="development")
    sent = []
    monkeypatch.setattr(N, "send_telegram",
                        lambda text, disable_notification=False: sent.append(text) or 1)
    # A tracker refresh (silent edit/send) is allowed, but send_telegram must NOT
    # be used for a separate notification; stub update_task_tracker to isolate that.
    refreshed = []
    monkeypatch.setattr(N, "update_task_tracker", lambda task_id: refreshed.append(task_id))
    N.notify_stage_change(tid, "development", "review")
    assert sent == []            # no separate message
    assert refreshed == [tid]    # tracker refreshed instead
 def test_agent_started_does_not_send_separate_message(monkeypatch):
    tid = _mk_task(stage="analysis")
    sent = []
    monkeypatch.setattr(N, "send_telegram",
                        lambda text, disable_notification=False: sent.append(text) or 1)
    refreshed = []
    monkeypatch.setattr(N, "update_task_tracker", lambda task_id: refreshed.append(task_id))
    N.notify_agent_started(1, "analyst", tid)
    assert sent == []
    assert refreshed == [tid]
 def test_qg_failure_does_not_send_separate_message(monkeypatch):
    tid = _mk_task(stage="development")
    sent = []
    monkeypatch.setattr(N, "send_telegram",
                        lambda text, disable_notification=False: sent.append(text) or 1)
    N.notify_qg_failure(tid, "development", "check_ci_green", "CI state: pending")
    assert sent == []  # QG-pending is log-only, never a separate ping
--- a/tests/test_usage.py
+++ b/tests/test_usage.py
@@ -0,0 +1,309 @@
 """Feature 4: token / cost accounting tests.
 Covers:
  * parse_usage_from_text on a REAL claude --output-format json result blob
    (captured live from CLI 2.1.142), including a leading text line.
  * parse on garbage / missing JSON -> None (never raises).
  * record_usage writes the columns; NULLs when usage is None.
  * fmt_tokens / fmt_cost formatting.
  * usage_comment string format.
  * task_usage_summary / task_summary_comment aggregate over agent_runs.
 DB is an isolated temp file; no network or subprocess.
 """
 import os
 import tempfile
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 _test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_usage.db")
 os.environ["ORCH_DB_PATH"] = _test_db
 import pytest  # noqa: E402
 from src import db as db_module  # noqa: E402
 from src.db import init_db, get_db  # noqa: E402
 from src import usage as U  # noqa: E402
 # Real claude --output-format json result object (captured from CLI 2.1.142).
 REAL_RESULT_JSON = (
    '{"type":"result","subtype":"success","is_error":false,"duration_ms":1795,'
    '"num_turns":1,"result":"Hi!","session_id":"abc",'
    '"total_cost_usd":0.0560175,'
    '"usage":{"input_tokens":45231,"cache_creation_input_tokens":7418,'
    '"cache_read_input_tokens":18500,"output_tokens":12100,'
    '"service_tier":"standard"},'
    '"modelUsage":{"claude-opus-4-7":{"inputTokens":6,"outputTokens":7}},'
    '"permission_denials":[]}'
 )
@pytest.fixture(autouse=True)
 def setup_db(monkeypatch):
    # get_db() reads settings.db_path live; pin it to our isolated DB.
    monkeypatch.setattr(db_module.settings, "db_path", _test_db, raising=False)
    if os.path.exists(_test_db):
        os.unlink(_test_db)
    init_db()
    yield
    if os.path.exists(_test_db):
        os.unlink(_test_db)
 # --------------------------------------------------------------------------- #
 # parsing
 # --------------------------------------------------------------------------- #
 def test_parse_real_result_json():
    u = U.parse_usage_from_text(REAL_RESULT_JSON)
    assert u is not None
    assert u["input_tokens"] == 45231
    assert u["output_tokens"] == 12100
    assert u["cache_read_tokens"] == 18500
    # FIX 2: cache_creation slice must now be parsed (was dropped before).
    assert u["cache_creation_tokens"] == 7418
    assert abs(u["cost_usd"] - 0.0560175) < 1e-9
 def test_parse_cache_creation_present():
    u = U.parse_usage_from_text(REAL_RESULT_JSON)
    assert u["cache_creation_tokens"] == 7418
 def test_parse_cache_creation_missing_defaults_zero():
    blob = (
        '{"total_cost_usd":0.01,'
        '"usage":{"input_tokens":10,"output_tokens":5,'
        '"cache_read_input_tokens":100}}'
    )
    u = U.parse_usage_from_text(blob)
    assert u["cache_creation_tokens"] == 0
    assert u["cache_read_tokens"] == 100
 def test_parse_with_leading_text():
    """The agent may print text before the trailing JSON; we still find it."""
    text = "some agent stdout line\nanother line\n" + REAL_RESULT_JSON
    u = U.parse_usage_from_text(text)
    assert u is not None
    assert u["input_tokens"] == 45231
    assert u["output_tokens"] == 12100
 def test_parse_garbage_returns_none():
    assert U.parse_usage_from_text("not json at all { broken") is None
    assert U.parse_usage_from_text("") is None
    assert U.parse_usage_from_text(None) is None
 def test_parse_json_without_usage_returns_none():
    assert U.parse_usage_from_text('{"hello":"world"}') is None
 def test_parse_from_log_missing_file_returns_none():
    assert U.parse_usage_from_log("/no/such/file.log") is None
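Review note: the parsing tests above describe a contract (find the result JSON after leading text, map the cache fields, return None on anything unusable, never raise). One way to satisfy it is sketched below. `parse_trailing_usage` is a hypothetical name, and the sketch assumes the result object sits on a single line, which holds for the captured blob but is not confirmed for every CLI version; `U.parse_usage_from_text` may work differently.

```python
import json
from typing import Optional


def parse_trailing_usage(text: Optional[str]) -> Optional[dict]:
    """Return usage from the last JSON line that has a "usage" dict, else None."""
    if not text:
        return None
    # Scan from the end: the result object is emitted last.
    for line in reversed(text.splitlines()):
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            obj = json.loads(line)
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            continue
        usage = obj.get("usage") if isinstance(obj, dict) else None
        if not isinstance(usage, dict):
            continue
        return {
            "input_tokens": usage.get("input_tokens", 0),
            "output_tokens": usage.get("output_tokens", 0),
            "cache_read_tokens": usage.get("cache_read_input_tokens", 0),
            "cache_creation_tokens": usage.get("cache_creation_input_tokens", 0),
            "cost_usd": obj.get("total_cost_usd"),
        }
    return None
```

Missing cache fields default to 0, matching `test_parse_cache_creation_missing_defaults_zero`.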
 # --------------------------------------------------------------------------- #
 # record_usage
 # --------------------------------------------------------------------------- #
 def _new_run(agent="developer", task_id=1):
    conn = get_db()
    cur = conn.execute("INSERT INTO agent_runs (task_id, agent) VALUES (?, ?)", (task_id, agent))
    rid = cur.lastrowid
    conn.commit()
    conn.close()
    return rid
 def test_record_usage_writes_columns():
    rid = _new_run()
    u = U.parse_usage_from_text(REAL_RESULT_JSON)
    U.record_usage(rid, u)
    conn = get_db()
    row = conn.execute(
        "SELECT input_tokens, output_tokens, cache_read_tokens, "
        "cache_creation_tokens, cost_usd "
        "FROM agent_runs WHERE id=?", (rid,)
    ).fetchone()
    conn.close()
    assert row["input_tokens"] == 45231
    assert row["output_tokens"] == 12100
    assert row["cache_read_tokens"] == 18500
    # FIX 2: cache_creation column is now persisted.
    assert row["cache_creation_tokens"] == 7418
    assert abs(row["cost_usd"] - 0.0560175) < 1e-9
 def test_record_usage_none_writes_nulls():
    rid = _new_run()
    U.record_usage(rid, None)  # must not raise
    conn = get_db()
    row = conn.execute("SELECT input_tokens, cost_usd FROM agent_runs WHERE id=?", (rid,)).fetchone()
    conn.close()
    assert row["input_tokens"] is None
    assert row["cost_usd"] is None
 # --------------------------------------------------------------------------- #
 # formatting
 # --------------------------------------------------------------------------- #
 def test_fmt_tokens():
    assert U.fmt_tokens(6) == "6"
    assert U.fmt_tokens(1234) == "1.2k"
    assert U.fmt_tokens(45231) == "45.2k"
    assert U.fmt_tokens(2_500_000) == "2.5M"
    assert U.fmt_tokens(None) == "0"
 def test_fmt_cost():
    assert U.fmt_cost(0.21) == "$0.21"
    assert U.fmt_cost(0.0560175) == "$0.06"
    assert U.fmt_cost(None) == "$0.00"
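Review note: the expected strings in `test_fmt_tokens` and `test_fmt_cost` are consistent with one-decimal k/M scaling and two-decimal dollar rounding. A minimal sketch under that assumption follows; the `_sketch` names are hypothetical and the thresholds are inferred from the test values only, not taken from `src/usage.py`.

```python
from typing import Optional


def fmt_tokens_sketch(n: Optional[int]) -> str:
    """Compact token count: plain below 1000, then one-decimal k / M."""
    n = n or 0  # None is rendered as "0"
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.1f}k"
    return str(n)


def fmt_cost_sketch(cost: Optional[float]) -> str:
    """Dollar amount rounded to cents; None is rendered as $0.00."""
    return f"${(cost or 0.0):.2f}"
```

The same scaling explains the `usage_comment` expectation further down: 81 + 8_400_000 + 100_000 = 8_500_081 input tokens renders as `8.5M`.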
 def test_usage_comment_format():
    # No cache -> in_total == input_tokens, no cached breakdown shown.
    u = {"input_tokens": 45231, "output_tokens": 12100, "cost_usd": 0.21}
    c = U.usage_comment("developer", u)
    assert "Developer" in c
    assert "45.2k in" in c
    assert "cached" not in c
    assert "12.1k out" in c
    assert "$0.21" in c
 def test_usage_comment_shows_full_input_with_cached():
    """FIX 2: in = input + cache_read + cache_creation, with cached breakdown."""
    u = {
        "input_tokens": 81,
        "cache_read_tokens": 8_400_000,
        "cache_creation_tokens": 100_000,
        "output_tokens": 45_800,
        "cost_usd": 7.29,
    }
    c = U.usage_comment("developer", u)
    # total in = 8_500_081 -> 8.5M ; cached = 8_500_000 -> 8.5M
    assert "8.5M in (8.5M cached)" in c
    assert "45.8k out" in c
    assert "$7.29" in c
 def test_usage_comment_no_cached_when_zero():
    u = {"input_tokens": 1234, "cache_read_tokens": 0,
         "cache_creation_tokens": 0, "output_tokens": 50, "cost_usd": 0.01}
    c = U.usage_comment("developer", u)
    assert "1.2k in" in c
    assert "cached" not in c
 # --------------------------------------------------------------------------- #
 # FIX 4: per-agent artifact links in finish comments
 # --------------------------------------------------------------------------- #
 def _ctx():
    return dict(repo="enduro-trails", branch="feature/ET-012-x",
               work_item_id="ET-012")
 def test_usage_comment_reviewer_links_review_doc():
    c = U.usage_comment("reviewer", {"input_tokens": 5}, **_ctx())
    assert "12-review.md" in c
    assert "ET-012" in c
 def test_usage_comment_tester_links_test_report():
    c = U.usage_comment("tester", {"input_tokens": 5}, **_ctx())
    assert "13-test-report.md" in c
 def test_usage_comment_deployer_links_deploy_log():
    c = U.usage_comment("deployer", {"input_tokens": 5}, **_ctx())
    assert "14-deploy-log.md" in c
 def test_usage_comment_developer_links_pr_and_branch():
    c = U.usage_comment("developer", {"input_tokens": 5}, pr_number=7, **_ctx())
    assert "pulls/7" in c
    assert "feature/ET-012-x" in c
 def test_usage_comment_architect_links_adr():
    c = U.usage_comment("architect", {"input_tokens": 5}, **_ctx())
    assert "06-adr" in c
 def test_usage_comment_no_links_without_context():
    """Without repo/branch context, no links are appended (no crash)."""
    c = U.usage_comment("reviewer", {"input_tokens": 5})
    assert "12-review.md" not in c
    assert "http" not in c
 # --------------------------------------------------------------------------- #
 # task summary
 # --------------------------------------------------------------------------- #
 def test_task_summary_aggregates_over_agents():
    # two runs for the same task: developer + tester
    for agent, ti, to, cost in [("developer", 1000, 200, 0.10), ("tester", 500, 100, 0.05)]:
        rid = _new_run(agent=agent, task_id=42)
        U.record_usage(rid, {"input_tokens": ti, "output_tokens": to,
                             "cache_read_tokens": 0, "cost_usd": cost})
    s = U.task_usage_summary(42)
    assert s["total_in"] == 1500
    assert s["total_out"] == 300
    assert abs(s["total_cost"] - 0.15) < 1e-9
    agents = {a for a, *_ in s["per_agent"]}
    assert agents == {"developer", "tester"}
    comment = U.task_summary_comment(42)
    assert "1.5k" in comment       # total in
    assert "$0.15" in comment       # total cost
    assert "Developer" in comment
    assert "Tester" in comment
 def test_task_summary_sums_all_three_input_components():
    """FIX 2: total_in = SUM(input + cache_read + cache_creation); total_cached too."""
    rid = _new_run(agent="developer", task_id=77)
    U.record_usage(rid, {
        "input_tokens": 100,
        "cache_read_tokens": 2000,
        "cache_creation_tokens": 900,
        "output_tokens": 50,
        "cost_usd": 0.10,
    })
    rid2 = _new_run(agent="tester", task_id=77)
    U.record_usage(rid2, {
        "input_tokens": 10,
        "cache_read_tokens": 500,
        "cache_creation_tokens": 0,
        "output_tokens": 5,
        "cost_usd": 0.05,
    })
    s = U.task_usage_summary(77)
    # total_in = (100+2000+900) + (10+500+0) = 3510
    assert s["total_in"] == 3510
    # total_cached = (2000+900) + (500+0) = 3400
    assert s["total_cached"] == 3400
    assert s["total_out"] == 55
    comment = U.task_summary_comment(77)
    assert "cached" in comment
 def test_task_summary_handles_null_cache_creation():
    """Pre-existing rows (NULL cache_creation) must not break aggregation."""
    rid = _new_run(agent="developer", task_id=88)
    conn = get_db()
    conn.execute(
        "UPDATE agent_runs SET input_tokens=100, cache_read_tokens=200, "
        "cache_creation_tokens=NULL, output_tokens=10, cost_usd=0.01 WHERE id=?",
        (rid,),
    )
    conn.commit()
    conn.close()
    s = U.task_usage_summary(88)  # must not raise
    assert s["total_in"] == 300   # 100 + 200 + (NULL->0)
    assert s["total_cached"] == 200
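Review note: `test_task_summary_handles_null_cache_creation` requires NULL columns to count as 0 in both totals. One common way to get that in SQLite is to wrap each column in `COALESCE` before summing, sketched below against a throwaway in-memory table. The function name `summarize_inputs` and the reduced four-column schema are assumptions for illustration; the real `task_usage_summary` and `agent_runs` table have more fields and may aggregate differently.

```python
import sqlite3


def summarize_inputs(conn: sqlite3.Connection, task_id: int) -> dict:
    """Sum the input components for one task, treating NULL columns as 0."""
    row = conn.execute(
        "SELECT "
        "COALESCE(SUM(COALESCE(input_tokens, 0) + COALESCE(cache_read_tokens, 0) "
        "+ COALESCE(cache_creation_tokens, 0)), 0), "
        "COALESCE(SUM(COALESCE(cache_read_tokens, 0) "
        "+ COALESCE(cache_creation_tokens, 0)), 0) "
        "FROM agent_runs WHERE task_id = ?",
        (task_id,),
    ).fetchone()
    return {"total_in": row[0], "total_cached": row[1]}


# Throwaway in-memory table with only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agent_runs (task_id INTEGER, input_tokens INTEGER, "
    "cache_read_tokens INTEGER, cache_creation_tokens INTEGER)"
)
# A pre-existing row: cache_creation_tokens was never written.
conn.execute("INSERT INTO agent_runs VALUES (88, 100, 200, NULL)")
print(summarize_inputs(conn, 88))  # {'total_in': 300, 'total_cached': 200}
```

Without the inner `COALESCE`, `100 + 200 + NULL` evaluates to NULL in SQLite and the row would drop out of the sum entirely.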
--- a/tests/test_verdict_status.py
+++ b/tests/test_verdict_status.py
@@ -0,0 +1,171 @@
 """Status-only verdict model: verdict statuses Approved / Rejected.
  * issue updated -> Approved  : calls _try_advance_stage, with NO intermediate
    set_issue_in_progress reset (bug 3 fix).
  * issue updated -> Rejected  : calls _rollback_stage, with the reason pulled
    from the issue's latest comment.
  * COMMENTS NEVER trigger the pipeline: a :approved: / :rejected: comment is a
    pure no-op (the comment-based control mechanism was removed).
 We mock the shared engine entry points (_try_advance_stage / _rollback_stage)
 and assert they fire ONLY for the status trigger, never for a comment.
 """
 import os
 import tempfile
 _test_db = os.path.join(tempfile.gettempdir(), "test_orchestrator_verdict.db")
 os.environ["ORCH_DB_PATH"] = _test_db
 os.environ.setdefault("ORCH_PLANE_WEBHOOK_SECRET", "")
 os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
 os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
 import pytest  # noqa: E402
 from unittest.mock import patch, AsyncMock  # noqa: E402
 from fastapi.testclient import TestClient  # noqa: E402
 from src.main import app  # noqa: E402
 from src.db import init_db, get_db  # noqa: E402
 from src import projects as P  # noqa: E402
 from src.projects import reload_projects  # noqa: E402
 ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
 APPROVED = "a519a341-dada-4a91-8910-7604f82b79c5"
 REJECTED = "ba958f3c-5db5-461d-8f82-89425e413b97"
 client = TestClient(app)
@pytest.fixture(autouse=True)
 def setup(monkeypatch):
    monkeypatch.setattr(P.settings, "db_path", _test_db)
    import src.db as _db
    monkeypatch.setattr(_db.settings, "db_path", _test_db)
    if os.path.exists(_test_db):
        os.unlink(_test_db)
    init_db()
    monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
    registry_json = (
        f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
        f' "work_item_prefix": "ET", "name": "enduro-trails"}}]'
    )
    monkeypatch.setattr(P.settings, "projects_json", registry_json)
    reload_projects()
    # Seed a task at the 'review' stage for plane_id 'v-1'.
    conn = get_db()
    conn.execute(
        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        ("v-1", "ET-500", "enduro-trails", "feature/ET-500-x", "review", "v-1"),
    )
    conn.commit()
    conn.close()
    yield
    reload_projects()
    if os.path.exists(_test_db):
        os.unlink(_test_db)
 def _status(state_id, plane_id="v-1", old="prev"):
    return client.post("/webhook/plane", json={
        "event": "issue", "action": "updated",
        "data": {
            "id": plane_id, "name": "Verdict task", "project": ENDURO_PLANE_ID,
            "state": {"id": state_id, "name": "X", "group": "started"},
        },
        "activity": {"field": "state", "new_value": state_id, "old_value": old},
    })
 def _comment(text, plane_id="v-1"):
    return client.post("/webhook/plane", json={
        "event": "issue_comment", "action": "created",
        "data": {"work_item_id": plane_id, "comment_stripped": text,
                 "project": ENDURO_PLANE_ID},
    })
 class _FakeResp:
    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload
    def json(self):
        return self._payload
 def _comments_response(comments):
    return _FakeResp(200, {"results": comments})
 # --------------------------------------------------------------------------- #
 # Approved status -> advance (no in_progress reset)
 # --------------------------------------------------------------------------- #
@patch("src.plane_sync.set_issue_in_progress")
@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
 def test_approved_status_advances(mock_advance, mock_sip):
    resp = _status(APPROVED)
    assert resp.status_code == 200
    mock_advance.assert_awaited_once()
    # advanced the right task (ET-500 at review)
    args = mock_advance.call_args.args
    assert "ET-500" in args  # work_item_id is passed positionally
    # bug 3 fix: handle_verdict no longer resets the status to In Progress.
    mock_sip.assert_not_called()
@patch("src.plane_sync.set_issue_in_progress")
@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
 def test_approved_comment_is_noop(mock_advance, mock_rollback, mock_sip):
    """Status-only model: a :approved: comment NEVER advances the pipeline."""
    resp = _comment(":approved:")
    assert resp.status_code == 200
    mock_advance.assert_not_called()
    mock_rollback.assert_not_called()
    mock_sip.assert_not_called()
 # --------------------------------------------------------------------------- #
 # Rejected status -> rollback (reason from latest comment)
 # --------------------------------------------------------------------------- #
@patch("src.webhooks.plane.httpx.get")
@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
 def test_rejected_status_rolls_back(mock_rollback, mock_get):
    mock_get.return_value = _comments_response(
        [{"comment_stripped": "ADR missing tradeoffs",
          "created_at": "2026-06-03T10:00:00Z"}]
    )
    resp = _status(REJECTED)
    assert resp.status_code == 200
    mock_rollback.assert_awaited_once()
    # reason pulled from the latest comment
    reason = mock_rollback.call_args.args[-1]
    assert "ADR missing tradeoffs" in reason
@patch("src.webhooks.plane.httpx.get")
@patch("src.plane_sync.set_issue_in_progress")
@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
 def test_rejected_comment_is_noop(mock_advance, mock_rollback, mock_sip, mock_get):
    """Status-only model: a :rejected: comment NEVER rolls back the pipeline."""
    resp = _comment(":rejected: bad ADR")
    assert resp.status_code == 200
    mock_advance.assert_not_called()
    mock_rollback.assert_not_called()
    mock_sip.assert_not_called()
    mock_get.assert_not_called()
 # --------------------------------------------------------------------------- #
 # Unknown verdict status -> no-op
 # --------------------------------------------------------------------------- #
@patch("src.webhooks.plane._rollback_stage", new_callable=AsyncMock)
@patch("src.webhooks.plane._try_advance_stage", new_callable=AsyncMock)
 def test_other_status_no_verdict_action(mock_advance, mock_rollback):
    # In Review status is not a verdict -> neither advance nor rollback.
    resp = _status("38fb1f64-aa1e-48a3-92e0-0b109679046b")  # in_review
    assert resp.status_code == 200
    mock_advance.assert_not_called()
    mock_rollback.assert_not_called()
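The tests above pin down the status-only verdict model: only an `issue` status change to Approved or Rejected drives the pipeline, comments are always no-ops, and a rejection takes its reason from the newest comment. A minimal sketch of that decision logic follows. The names `classify_event`, `latest_comment_reason` and `FALLBACK_REASON` are hypothetical and are not taken from `src/webhooks/plane.py`; the two state UUIDs are the ones used by this changeset's tests, and the sketch assumes `created_at` values share one ISO-8601 format so that string comparison orders them correctly.

```python
from typing import Dict, List, Optional

# State UUIDs as they appear in this changeset's tests.
APPROVED_STATE = "a519a341-dada-4a91-8910-7604f82b79c5"
REJECTED_STATE = "ba958f3c-5db5-461d-8f82-89425e413b97"

# Hypothetical fallback text; the real wording is not shown in this diff.
FALLBACK_REASON = "rejected via status (no comment found)"


def latest_comment_reason(comments: List[Dict]) -> str:
    """Return the newest non-empty comment text, or the fallback note."""
    usable = [c for c in comments if (c.get("comment_stripped") or "").strip()]
    if not usable:
        return FALLBACK_REASON
    # Assumes uniformly formatted ISO-8601 timestamps (sortable as strings).
    newest = max(usable, key=lambda c: c.get("created_at") or "")
    return newest["comment_stripped"].strip()


def classify_event(event: str, state_id: Optional[str]) -> str:
    """Map an incoming webhook to 'advance', 'rollback' or 'noop'."""
    if event != "issue":
        return "noop"  # comments never drive the pipeline
    if state_id == APPROVED_STATE:
        return "advance"
    if state_id == REJECTED_STATE:
        return "rollback"
    return "noop"  # e.g. In Review is a status, but not a verdict
```

Keeping the classification separate from the side effects is a design choice of this sketch only; it makes the "comment is a no-op" rule checkable without mocking the advance and rollback helpers.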
--- a/tests/test_webhook_dedup.py
+++ b/tests/test_webhook_dedup.py
@@ -211,14 +211,21 @@ def test_gitea_fallback_hash_when_no_delivery_header():
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
 def test_plane_fallback_hash_dedup(mock_docs, mock_branch, mock_enqueue):
-    """Repeated identical Plane body -> first accepted+enqueue, repeat duplicate."""
+    """Repeated identical Plane body -> first accepted+enqueue, repeat duplicate.
    Feature 1: the pipeline now starts on a status change to In Progress, not on
    creation, so this drives the dedup test with an 'issue updated' event.
    """
    IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
    body = {
-        "event": "work_item.created",
+        "event": "issue",
        "action": "updated",
        "data": {
            "id": "pd-001",
            "name": "Dedup plane task",
            "description_stripped": "A sufficiently long description for QG-0 to pass.",
            "project": "proj-1",
            "state": {"id": IN_PROGRESS, "name": "In Progress", "group": "started"},
        },
    }
    r1 = client.post("/webhook/plane", json=body)
--- a/tests/test_webhooks.py
+++ b/tests/test_webhooks.py
@@ -1,4 +1,5 @@
 import pytest
 import asyncio
 import os
 import tempfile
 from unittest.mock import patch, MagicMock, AsyncMock
@@ -95,27 +96,32 @@ def test_plane_webhook_generates_sequential_ids(mock_docs, mock_branch):
    assert ids[1] == "ET-002"
 APPROVED_STATE = "a519a341-dada-4a91-8910-7604f82b79c5"
 REJECTED_STATE = "ba958f3c-5db5-461d-8f82-89425e413b97"
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
@patch("src.webhooks.plane.launcher")
 def test_plane_approved_advances_stage(mock_launcher, mock_docs, mock_branch, tmp_path, monkeypatch):
-    """Comment :approved: at stage=analysis → advance to architecture."""
+    """Status-only model: Approved STATUS at stage=analysis -> advance to
    architecture. A comment never triggers this.
    """
    # Patch repos_dir for QG check
    monkeypatch.setattr("src.qg.checks.settings.repos_dir", str(tmp_path))
-    # Create task first
+    # Seed an analysis task directly (creation no longer makes a task post-PR#11).
    client.post("/webhook/plane", json={
        "event": "work_item.created",
        "data": {"id": "adv-001", "name": "Advance test", "project": "proj-1"}
    })
    # Get the task to find work_item_id
    conn = get_db()
-    task = conn.execute("SELECT * FROM tasks WHERE plane_id = 'adv-001'").fetchone()
+    conn.execute(
        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        ("adv-001", "ET-001", "enduro-trails", "feature/ET-001-x", "analysis", "adv-001"),
    )
    conn.commit()
    conn.close()
-    work_item_id = task["work_item_id"]
+    work_item_id = "ET-001"
-    # Create required analysis files
+    # Create required analysis files so the analysis QG passes.
    wi_dir = tmp_path / "enduro-trails" / "docs" / "work-items" / work_item_id
    wi_dir.mkdir(parents=True)
    (wi_dir / "01-brd.md").write_text("# BRD")
@@ -123,16 +129,15 @@ def test_plane_approved_advances_stage(mock_launcher, mock_docs, mock_branch, tm
    (wi_dir / "03-acceptance-criteria.md").write_text("# AC")
    (wi_dir / "04-test-plan.yaml").write_text("tests: []")
    # Mock launcher
    mock_launcher.launch.return_value = 1
-    # Send approved comment
+    # Send Approved STATUS change.
    resp = client.post("/webhook/plane", json={
-        "event": "comment.created",
+        "event": "issue", "action": "updated",
        "data": {
-            "work_item_id": "adv-001",
+            "id": "adv-001", "name": "Advance test", "project": "proj-1",
-            "comment": "Looks good :approved:"
+            "state": {"id": APPROVED_STATE, "name": "Approved", "group": "completed"},
-        }
+        },
    })
    assert resp.status_code == 200
@@ -143,29 +148,39 @@ def test_plane_approved_advances_stage(mock_launcher, mock_docs, mock_branch, tm
    assert task["stage"] == "architecture"
@patch("src.webhooks.plane.httpx.get")
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
-def test_plane_rejected_rolls_back(mock_docs, mock_branch):
+def test_plane_rejected_rolls_back(mock_docs, mock_branch, mock_get):
-    """Comment :rejected: rolls back stage."""
+    """Status-only model: Rejected STATUS rolls back stage. A comment never
-    # Create task
+    triggers this; the reason is pulled from the latest comment.
-    client.post("/webhook/plane", json={
+    """
-        "event": "work_item.created",
+    class _R:
-        "data": {"id": "rej-001", "name": "Reject test", "project": "proj-1"}
+        status_code = 200
-    })
+        @staticmethod
        def json():
            return {"results": [
                {"comment_stripped": "missing ADR", "created_at": "2026-06-03T10:00:00Z"}
            ]}
    mock_get.return_value = _R()
-    # Manually set stage to architecture
+    # Seed an architecture task directly.
    conn = get_db()
-    conn.execute("UPDATE tasks SET stage = 'architecture' WHERE plane_id = 'rej-001'")
+    conn.execute(
        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage, plane_issue_id) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        ("rej-001", "ET-002", "enduro-trails", "feature/ET-002-x", "architecture", "rej-001"),
    )
    conn.commit()
    conn.close()
-    # Send rejected comment
+    # Send Rejected STATUS change.
    resp = client.post("/webhook/plane", json={
-        "event": "comment.created",
+        "event": "issue", "action": "updated",
        "data": {
-            "work_item_id": "rej-001",
+            "id": "rej-001", "name": "Reject test", "project": "proj-1",
-            "comment": "Not ready :rejected:"
+            "state": {"id": REJECTED_STATE, "name": "Rejected", "group": "cancelled"},
-        }
+        },
    })
    assert resp.status_code == 200
@@ -258,6 +273,46 @@ def test_gitea_ci_success_advances_to_review(mock_launcher, mock_ci):
    assert task["stage"] == "review"
@patch("src.webhooks.gitea.notify_qg_failure")
@patch("src.webhooks.gitea.launcher")
 def test_gitea_ci_failure_on_development_notifies_qg_failure(mock_launcher, mock_notify):
    """BUG 6: CI failure at development is now the authoritative QG gate failing.
    It must notify QG failure (not silently suppress) and must NOT advance the stage.
    """
    conn = get_db()
    conn.execute(
        "INSERT INTO tasks (plane_id, work_item_id, repo, branch, stage) VALUES (?, ?, ?, ?, ?)",
        ("ci-fail-001", "ET-011", "enduro-trails", "feature/ET-011-test", "development"),
    )
    conn.commit()
    conn.close()
    resp = client.post(
        "/webhook/gitea",
        json={
            "state": "failure",
            "branches": [{"name": "feature/ET-011-test"}],
            "repository": {"name": "enduro-trails"},
        },
        headers={"X-Gitea-Event": "status"},
    )
    assert resp.status_code == 200
    # QG failure was reported for the development stage with check_ci_green.
    assert mock_notify.called
    args, kwargs = mock_notify.call_args
    call = list(args) + list(kwargs.values())
    assert "development" in call
    assert "check_ci_green" in call
    # Stage did NOT advance.
    conn = get_db()
    task = conn.execute("SELECT * FROM tasks WHERE plane_id = 'ci-fail-001'").fetchone()
    conn.close()
    assert task["stage"] == "development"
 def test_gitea_webhook_pr():
    """PR event is accepted."""
    resp = client.post(
@@ -287,3 +342,158 @@ def test_plane_webhook_event_logged():
    conn.close()
    assert event is not None
    assert event["source"] == "plane"
 # ---------------------------------------------------------------------------
 # BUG 7: red CI on development must bounce the task back to the developer
 # (capped retries, symmetric to review REQUEST_CHANGES). These are pure-logic
 # tests: they invoke handle_ci_status() directly with mocked helpers so they do
 # not pass through the TestClient HMAC barrier (baseline 401s are off-limits).
 # ---------------------------------------------------------------------------
 def _ci_failure_payload():
    return {
        "state": "failure",
        "branches": [{"name": "feature/ET-011-test"}],
        "repository": {"name": "enduro-trails"},
    }
 def _mock_db_with_retry_count(count):
    """Build a get_db() mock whose retry_count query returns `count`."""
    conn = MagicMock()
    conn.execute.return_value.fetchone.return_value = {"cnt": count}
    return conn
@patch("src.webhooks.gitea.notify_error")
@patch("src.webhooks.gitea.notify_qg_failure")
@patch("src.webhooks.gitea.enqueue_job")
@patch("src.webhooks.gitea.update_task_stage")
@patch("src.webhooks.gitea.get_db")
@patch("src.webhooks.gitea.get_task_by_repo_branch")
@patch("src.webhooks.gitea.get_project_by_repo")
 def test_ci_failure_development_retries_developer_under_limit(
    mock_proj, mock_task, mock_get_db, mock_update_stage,
    mock_enqueue, mock_qg, mock_err,
 ):
    """retry_count < MAX_DEV_RETRIES → relaunch developer, stage untouched."""
    from src.webhooks.gitea import handle_ci_status
    mock_proj.return_value = {"repo": "enduro-trails"}
    mock_task.return_value = {
        "id": 1, "stage": "development", "work_item_id": "ET-011",
    }
    mock_get_db.return_value = _mock_db_with_retry_count(0)
    mock_enqueue.return_value = 42
    asyncio.run(handle_ci_status(_ci_failure_payload()))
    # QG failure was still reported (Slava sees both the failure and the retry).
    assert mock_qg.called
    # developer was re-enqueued.
    assert mock_enqueue.called
    assert mock_enqueue.call_args[0][0] == "developer"
    # No escalation.
    assert not mock_err.called
    # Stage stays on development — no update_task_stage in the CI-failure path.
    assert not mock_update_stage.called
@patch("src.webhooks.gitea.notify_error")
@patch("src.webhooks.gitea.notify_qg_failure")
@patch("src.webhooks.gitea.enqueue_job")
@patch("src.webhooks.gitea.update_task_stage")
@patch("src.webhooks.gitea.get_db")
@patch("src.webhooks.gitea.get_task_by_repo_branch")
@patch("src.webhooks.gitea.get_project_by_repo")
 def test_ci_failure_development_escalates_at_limit(
    mock_proj, mock_task, mock_get_db, mock_update_stage,
    mock_enqueue, mock_qg, mock_err,
 ):
    """retry_count >= MAX_DEV_RETRIES → escalate via notify_error, no relaunch."""
    from src.webhooks.gitea import handle_ci_status, MAX_DEV_RETRIES
    mock_proj.return_value = {"repo": "enduro-trails"}
    mock_task.return_value = {
        "id": 1, "stage": "development", "work_item_id": "ET-011",
    }
    mock_get_db.return_value = _mock_db_with_retry_count(MAX_DEV_RETRIES)
    asyncio.run(handle_ci_status(_ci_failure_payload()))
    # QG failure still reported.
    assert mock_qg.called
    # developer NOT re-enqueued at the cap.
    assert not mock_enqueue.called
    # Escalation message mentions CI failure.
    assert mock_err.called
    err_msg = " ".join(str(a) for a in mock_err.call_args[0])
    assert "Max developer retries" in err_msg
    assert "after CI failure" in err_msg
    # Stage untouched.
    assert not mock_update_stage.called
 # ---------------------------------------------------------------------------
 # BUG 8 (second door): a merged-PR webhook must NOT fake-complete a task that is
 # still in the deploy stage. On `deploy` done is gated by the deployer's verdict
 # (check_deploy_status via advance_stage), not by the merge event. For every
 # other stage the merge->done behaviour is preserved. Pure-logic tests: invoke
 # handle_pr() directly with mocked helpers (no HMAC barrier).
 # ---------------------------------------------------------------------------
 def _merged_pr_payload(branch="feature/ET-012-x"):
    return {
        "action": "closed",
        "pull_request": {
            "merged": True,
            "number": 7,
            "head": {"ref": branch},
        },
        "repository": {"name": "enduro-trails"},
    }
@patch("src.webhooks.gitea.notify_stage_change")
@patch("src.webhooks.gitea.update_task_stage")
@patch("src.webhooks.gitea.get_task_by_repo_branch")
@patch("src.webhooks.gitea.get_project_by_repo")
 def test_merge_on_deploy_stage_does_not_set_done(
    mock_proj, mock_task, mock_update_stage, mock_notify,
 ):
    """FIX 1: merge at deploy stage is ignored — done is gated by deployer verdict."""
    from src.webhooks.gitea import handle_pr
    mock_proj.return_value = {"repo": "enduro-trails"}
    mock_task.return_value = {
        "id": 1, "stage": "deploy", "work_item_id": "ET-012",
    }
    asyncio.run(handle_pr(_merged_pr_payload()))
    # The merge-driven done path must NOT run on deploy.
    assert not mock_update_stage.called
    assert not mock_notify.called
@patch("src.webhooks.gitea.notify_stage_change")
@patch("src.webhooks.gitea.update_task_stage")
@patch("src.webhooks.gitea.get_task_by_repo_branch")
@patch("src.webhooks.gitea.get_project_by_repo")
 def test_merge_on_non_deploy_stage_sets_done(
    mock_proj, mock_task, mock_update_stage, mock_notify,
 ):
    """FIX 1: merge behaviour is preserved for non-deploy stages (e.g. review)."""
    from src.webhooks.gitea import handle_pr
    mock_proj.return_value = {"repo": "enduro-trails"}
    mock_task.return_value = {
        "id": 2, "stage": "review", "work_item_id": "ET-013",
    }
    asyncio.run(handle_pr(_merged_pr_payload(branch="feature/ET-013-x")))
    # Non-deploy stages still get the merge-driven done.
    mock_update_stage.assert_called_once_with(2, "done")
    assert mock_notify.called
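The BUG 7 and BUG 8 tests above describe two small decisions in the Gitea webhook: a red CI on `development` always reports the QG failure and then either relaunches the developer or escalates once the retry cap is hit, and a merged PR marks the task done on every stage except `deploy`. The sketch below restates those rules under stated assumptions. The function names `on_ci_failure` and `merge_sets_done`, the injected callbacks, the value `3` for `MAX_DEV_RETRIES`, and the exact escalation wording are illustrative; the real handlers are `handle_ci_status` and `handle_pr` in `src/webhooks/gitea.py`, whose bodies are not shown in this part of the diff.

```python
from typing import Callable

MAX_DEV_RETRIES = 3  # assumed value; the real constant is not shown here


def on_ci_failure(
    stage: str,
    retry_count: int,
    report_qg_failure: Callable[[], None],
    relaunch_developer: Callable[[], None],
    escalate: Callable[[str], None],
) -> str:
    """Handle a red CI status and return the action taken.

    The stage is never changed here, matching the tests' expectation that
    update_task_stage is not called on the CI-failure path.
    """
    if stage != "development":
        return "ignored"  # assumption: other stages are handled elsewhere
    report_qg_failure()  # reported in both branches
    if retry_count < MAX_DEV_RETRIES:
        relaunch_developer()
        return "retried"
    # Wording is assumed; the tests only check for these two fragments.
    escalate("Max developer retries reached after CI failure")
    return "escalated"


def merge_sets_done(stage: str) -> bool:
    """A merged PR completes the task unless it is still deploying."""
    return stage != "deploy"
```

A short usage example with list-backed callbacks:

```python
log = []
action = on_ci_failure(
    "development", 0,
    lambda: log.append("qg"), lambda: log.append("developer"), log.append,
)
print(action, log)
```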
Author	SHA1	Message	Date
dev-bot	ec9aa74492	fix(tracker): no duplicate Telegram messages on not-modified/transient edits edit_telegram now returns a distinguishable outcome (ok|not_modified|gone|failed) instead of a bare bool. update_task_tracker only sends a NEW message when the original is truly gone; not_modified and transient failures no longer spawn duplicate trackers or orphan the live one. render_task_tracker shows "попытка N" ("attempt N") on an actively re-run stage (>=2 agent runs) so the text changes between review<->development cycles. Finished (✅) lines are unchanged. Tests: edit_telegram classification (ok/not_modified/gone/failed via mocked httpx), update_task_tracker (not_modified/failed -> no send, gone -> send+id), render attempt marker.	2026-06-04 13:20:40 +03:00
Slava	3e5c74ce4f	Merge pull request 'feat(telegram): live editable task tracker (Variant B+)' (#21) from feat/telegram-live-tracker into main	2026-06-04 11:46:21 +03:00
dev-bot	9a0298de9d	feat(telegram): live editable task tracker (Variant B+), replace 15-message spam Replace the ~15 separate Telegram messages per task (agent start/finish, stage transition, QG-pending, tech noise) with ONE live tracker message edited in place (editMessageText) on every stage transition. Only attention-worthy events are still sent as SEPARATE, notifying messages: approve-gate, deploy-fail, agent-fail, task error. - db.py: idempotent ALTERs — tasks.tracker_message_id, tasks.title, tasks.brd_review_started_at/ended_at, agent_runs.model. Helpers for tracker message_id + BRD-review clock. - usage.py: short_model_name() (strip provider/claude- prefix); parse model from result-JSON modelUsage; record_usage persists model. - notifications.py: render_task_tracker(task_id) (stateless render from agent_runs), update_task_tracker (sendMessage->store id->editMessageText with fallback to a new message, silent), edit_telegram(). Per-stage line in↓/out↑·cost·model, ⏸️ Ревью БРД (human time), 💰 totals, finish block (⏱️ wall/agents/yours, 🔗 PR · 📦). notify_* are now tracker-only/log-only except the four alerts. - stage_engine.py: stamp brd_review_ended on analysis->architecture advance. - webhooks/plane.py: persist task title on creation. - tests/test_telegram_tracker.py: render, short_model_name, send/edit/fallback, separate-vs-silent alert behavior.	2026-06-04 11:42:46 +03:00
Slava	2801983d7b	Merge pull request 'fix(observability): merge-gate on deploy, full token input, Plane Done, artifact links' (#20) from fix/observability-and-merge-gate into main	2026-06-04 11:21:50 +03:00
Dev Agent	61e26a8930	fix(observability): merge-gate on deploy, full token input, Plane Done, artifact links 1. BUG 8 (second door): merge webhook no longer fake-completes a task at the deploy stage; done is gated by the deployer verdict (check_deploy_status). Other stages keep merge->done. 2. Token accounting: parse+persist cache_creation_input_tokens (new idempotent agent_runs column). usage_comment / task_summary now show the FULL input (input + cache_read + cache_creation) with a cached breakdown. cost_usd untouched. 3. deploy->done success now forces the Plane issue to terminal Done state. 4. All agents (architect/developer/reviewer/tester/deployer) attach artifact links to their finish comment via gitea_public_url. Tests added for each fix; pytest 244 passed / 9 failed (off-limits HMAC group).	2026-06-04 11:17:58 +03:00
Slava	2629dffe1b	Merge pull request 'fix(deploy): gate deploy->done on deployer verdict, not LLM exit code' (#19) from fix/deploy-verdict-gate into main	2026-06-04 02:46:52 +03:00
dev-agent	e4a9c48395	fix(deploy): gate deploy->done on deployer verdict, not LLM exit code	2026-06-04 02:43:01 +03:00
Slava	a0621b9952	Merge pull request 'fix(ci): bounce task back to developer on red CI (capped retries)' (#18) from fix/ci-fail-retry-developer into main	2026-06-04 01:41:01 +03:00
Dev Agent	3a285de11d	fix(ci): bounce task back to developer on red CI (capped retries)	2026-06-04 01:39:40 +03:00
Slava	7922f6b67b	Merge pull request 'fix(qg): use check_ci_green instead of local tests on development stage' (#17) from fix/drop-local-tests-qg into main	2026-06-04 01:24:14 +03:00
Dev Agent	e15d339b14	fix(qg): use check_ci_green instead of local tests on development stage	2026-06-04 01:22:43 +03:00
Slava	994f73a78e	Merge pull request 'fix(qg): run pytest directly instead of make in check_tests_local' (#16) from fix/qg-pytest-no-make into main	2026-06-04 00:44:40 +03:00
orchestrator-dev	90c9ffe839	fix(qg): run pytest directly instead of make in check_tests_local	2026-06-04 00:43:04 +03:00
Slava	b6aa107f93	Merge pull request 'fix(stage): approved verdict advances analysis->architecture instead of re-running gate' (#15) from fix/approved-advances-stage into main	2026-06-03 23:31:45 +03:00
Dev Agent	0b8013cb06	fix(stage): approved verdict advances analysis->architecture instead of re-running gate	2026-06-03 23:30:08 +03:00
Slava	b01643fcc3	Merge pull request 'feat(config): external gitea_public_url for clickable doc links' (#14) from fix/gitea-public-url into main	2026-06-03 22:59:17 +03:00
Dev Agent	ca63bc26bb	feat(config): external gitea_public_url for clickable doc links	2026-06-03 22:58:18 +03:00
Slava	dce9ac806b	Merge pull request 'fix(pipeline): description+name to analyst, status-only analyst comment with doc links' (#13) from fix/taskmd-description into main	2026-06-03 22:45:17 +03:00
dev-agent	a9cdb17614	feat(plane): analyst comment asks for Approved status + links docs The analyst ready-comment used the obsolete :approved: wording (comment-based approve was removed in PR #12). Rewrite it for the status-only model: ask the stakeholder to move the issue to Approved (reject = reason comment + Rejected), and add clickable Gitea links to the analyst docs that actually exist in the worktree.	2026-06-03 22:42:53 +03:00
dev-agent	96c5e6b2f9	fix(pipeline): fetch issue name from Plane API on status-trigger start issue.updated ships only the changed fields, so name was absent and the branch slug became feature/<id>-untitled. Add fetch_issue_fields (single issue-detail GET returning name+description, reusing the endpoint/token of fetch_issue_description) and pull the name above the slug build. Empty name still falls back to untitled.	2026-06-03 22:42:53 +03:00
dev-agent	b91be74692	fix(pipeline): pass issue description to analyst task file start_pipeline built the analyst .task.md with only the Title, so the analyst received a ~101-byte file and reported the business request as empty even though the description was already fetched. Append the resolved description to task_desc.	2026-06-03 22:42:02 +03:00
Slava	2d392b6fc7	Merge pull request 'fix: status-only verdict — remove comment-based approve + fix bug 3 (echo self-hit)' (#12) from fix/status-only-verdict into main	2026-06-03 22:20:46 +03:00
Dev Agent	857bad314c	feat(webhook): pull reject reason from latest comment handle_verdict(rejected): the reason is now pulled from the issue latest Plane comment (_latest_comment_reason: GET comments, newest by created_at, HTML stripped) instead of a fixed stub. Slava writes the reason in a comment before flipping the status to Rejected. Falls back to a fixed note when there is no comment / the API call fails. tests: add test_status_only_verdict.py (test_inreview_comment_does_not_revert [bug 3 root], test_any_comment_no_pipeline_action, test_approved_status_advances_without_inprogress_reset, test_rejected_status_pulls_reason_from_comment) and test_inprogress_from_needs_input_relaunches_analyst in test_status_trigger.py. Rewrote the comment-based tests (test_verdict_status, test_plane_approved/ rejected in test_webhooks) under the status-only model: comments are no-ops, verdicts come from status changes.	2026-06-03 22:18:24 +03:00
Dev Agent	c4be50ee20	fix(webhook): drop redundant in_progress reset on Approved handle_verdict(approved): removed set_issue_in_progress(work_item_id) before _try_advance_stage. _try_advance_stage -> advance_stage -> plane_notify_stage already PATCHes the issue to the NEXT stage status, so the reset only made the board flicker In Progress before the next stage (part of bug 3).	2026-06-03 22:18:13 +03:00
Dev Agent	6b3e144949	fix(webhook): remove comment-based approve, keep status-only verdict Status-only verdict model: comments NEVER drive the pipeline. Removed the whole comment-based control mechanism from handle_comment (:approved: / :rejected: / answer-to-questions) which caused bug 3 (echo self-hit): the analyst posts its own "waiting for approval" comment, handle_comment catches its own comment and reverts In Review -> In Progress. handle_comment is now a pure logger with no side effects. handle_status_start: a return to In Progress on an EXISTING task (Slava answered the analyst questions in Needs Input) now RELAUNCHES the stage agent instead of being a no-op. Distinguished from a duplicate In Progress webhook via has_active_job_for_task() (new db helper): no active job => agent idle => relaunch; active job => busy => skip (no double launch).	2026-06-03 22:18:02 +03:00
Slava	cd73c75cda	Merge pull request 'fix: pipeline-start bugs (ET-006) — fetch description on status-start + work_item_id collision guard' (#11) from fix/pipeline-start-bugs into main	2026-06-03 21:14:44 +03:00
Dev Agent	c69e11348b	test(pipeline): cover status-start description fetch and work_item_id uniqueness - test_status_start_fetches_description: empty payload description -> pulled from Plane API (mocked) -> QG-0 passes, analyst enqueued. - test_status_start_empty_api_still_blocks: empty API -> honest QG-0 fail. - test_work_item_id_uniqueness: ET-006 taken -> next free id, per-repo isolation. - test_collision_reassigns_in_start_pipeline: end-to-end collision reassignment. - test_worktree_per_task: two tasks never share a worktree path.	2026-06-03 21:12:59 +03:00
Dev Agent	ac9f5a05a6	fix(work-item): prevent work_item_id collision and bind branch per task ET-006 was handed to two different tasks because M-6 derives work_item_id from the Plane sequence_id, which can collide -> the two tasks shared a branch/worktree slug prefix and stepped on each other. 2a: ensure_unique_work_item_id() is a uniqueness-guard LAYERED ON TOP of the M-6 derive (derive is untouched): if the derived ET-NNN already exists in tasks for the repo, it walks forward to the next free number. Applied in start_pipeline after the derive. 2b (defense-in-depth): worktree is keyed by branch; if the resulting branch is already owned by another task in the repo, disambiguate it with the unique work_item_id + plane id so two tasks can never share a worktree.	2026-06-03 21:12:51 +03:00
Dev Agent	fa746105fd	fix(webhook): fetch description from Plane API on status-start Plane issue.updated (status -> In Progress) ships only changed fields, so the webhook payload has no description and QG-0 wrongly blocked issues. start_pipeline now pulls the full description from the Plane issue detail API (reusing the same GET endpoint + shared token as fetch_issue_sequence_id) when the payload field is empty/short, before QG-0 runs. Empty API -> honest QG-0 fail (truly empty ticket).	2026-06-03 21:12:38 +03:00
Slava	4773137b52	Merge pull request 'feat: pipeline UX — status-trigger, verdict statuses, stage visibility, token usage' (#10) from feature/pipeline-ux into main	2026-06-03 18:27:07 +03:00
Dev Agent	7fd6529a35	test(conftest): mute Telegram in all tests to stop prod leakage A pytest run on prod was sending REAL Telegram messages to Slava: some tests (e.g. test_webhook_dedup advancing a stage) reach notify_stage_change -> send_telegram, which read the live .env token/chat_id and actually POSTed. Add an autouse fixture stubbing send_telegram to a no-op for every test. Patch the SOURCE src.notifications.send_telegram (covers all notify_* helpers and the many modules that do a local from .notifications import send_telegram inside functions) AND src.stage_engine.send_telegram (module-level binding, would not be intercepted by the source patch alone). webhooks/plane, launcher, queue_worker are patched defensively with raising=False. Verified: full suite run with FAKE telegram creds + an un-swallowable httpx.post trip-wire (BaseException, so send_telegram except Exception can not hide it) shows ZERO calls to api.telegram.org. Without the fixture the trip-wire fires, proving the guard is real.	2026-06-03 18:23:09 +03:00
Dev Agent	9a702a0216	feat(metrics): per-agent token/cost accounting Feature 4. claude is now launched with --output-format json; the run-log trailing result JSON is parsed (defensively, never fatal) for usage + total_cost_usd. New idempotent ALTERs add input_tokens/output_tokens/cache_read_tokens/cost_usd to agent_runs; the launcher monitor records usage per run, posts a per-agent finish comment under that agent bot (e.g. Developer gotov · 45.2k in / 12.1k out · $0.21), and the deployer posts an end-of-task summary (SUM over agent_runs GROUP BY agent) on done. New src/usage.py holds parse/format/record/summary helpers; test_usage.py covers parsing a real CLI JSON blob, NULL-on-garbage, recording, formatting, and the per-task aggregate.	2026-06-03 18:18:46 +03:00
Dev Agent	38a741d24e	feat(webhook): verdict via Approved/Rejected statuses (variant B) Feature 2. The issue updated dispatch (shipped with the status-trigger handler) also routes Approved -> _try_advance_stage (== :approved: comment) and Rejected -> _rollback_stage (== :rejected: comment). The :rejected: comment branch was refactored into the shared _rollback_stage so both mechanisms behave identically; a status reject passes Reason: (rejected via status, see latest comment) since no inline reason arrives with a status change. Comments stay fully working. This commit adds test_verdict_status.py proving both status and comment paths funnel into the same advance/rollback logic.	2026-06-03 18:18:36 +03:00
Dev Agent	09b1c5e1b9	feat(webhook): start pipeline on In Progress status (not on create) Feature 1. work_item.created no longer starts the pipeline (soft QG-0 log only); the issue stays in the backlog until moved to In Progress. The pipeline-start body is extracted into start_pipeline(); a new issue updated handler routes a state change to In Progress -> handle_status_start, which is idempotent: an existing task for the plane_id is NOT re-created or restarted (protects handle_comment, which also flips issues to In Progress). Real Plane payload: event=issue, action=updated, data.state.id. Existing m6/plane_webhook/dedup tests updated to drive the new trigger; new test_status_trigger.py covers created-no-op / start / idempotent.	2026-06-03 18:18:26 +03:00
Dev Agent	a4668c0303	feat(plane): stage visibility on board + verdict status UUIDs Feature 3 + Feature 2 infra. Extend the global PLANE_STATES with the 6 new enduro status UUIDs (architecture/development/review/testing + approved/rejected), remap STAGE_TO_STATE so the 4 mid-pipeline stages move the issue across its own board column instead of all sitting in In Progress, and add the set_issue_stage_state() helper. Needs Input / In Review / Blocked keep their own explicit setters and stay higher priority. TODO(ORCH-10): statuses are per-project; resolve per project when more projects are onboarded.	2026-06-03 18:18:17 +03:00
Slava	e9fd30528f	Merge pull request 'feat(plane): per-agent bot authorship for comments' (#9) from feature/plane-per-agent-author into main	2026-06-03 10:55:29 +03:00
Dev Agent	d305521067	feat(plane): per-agent bot authorship for comments add_comment now accepts an optional author (agent role) and POSTs under the matching Plane bot token via _headers_for(), so Plane shows the real author (Analyst/Architect/Developer/Reviewer/Tester/Deployer/Stream) instead of a single shared account. Unknown/empty roles or missing tokens fall back to the shared orchestrator token (autonomy preserved). GET/PATCH (find_issue_id, set_state) are unchanged and stay on the shared token. Call sites in stage_engine, launcher, webhooks/plane and the plane_sync notify helpers now pass author by stage role; stage transitions use stream. Adds tests/test_plane_author.py.	2026-06-03 10:53:25 +03:00
Dev Agent	30d6dd0557	feat(config): add per-agent Plane bot token settings Add 7 optional bot-token fields (plane_bot_analyst..stream) read from the ORCH_PLANE_BOT_* env vars, default empty. Required for per-agent comment authorship; empty values fall back to the shared orchestrator token.	2026-06-03 10:53:17 +03:00
Slava	12e2691a24	Merge pull request 'M-6: derive work_item_id from Plane sequence_id' (#8) from feature/ORCH-M6-plane-sequence into main	2026-06-03 10:04:32 +03:00