feat(staging): add isolated orchestrator-staging service (port 8501, separate DB)

- Add orchestrator-staging compose service under profile 'staging' so normal 'docker compose up -d' does NOT start it.
- Port 8501 via command override; network_mode: host (no ports mapping needed).
- DB isolation via separate volume ./data/staging:/app/data, physically separate from prod ./data/orchestrator.db on the host.
- ORCH_DB_PATH=/app/data/orchestrator.db explicit in env (same container path, isolated by volume mount).
- Add .env.staging.example with all required keys and placeholders.
- Update .gitignore: add .env.staging and data/staging/ exclusions.
- Add docs/STAGING.md: how to start staging, architecture table, roadmap.

Refs: ORCH-31 (Stage 1 of 5)
Merge PR #27: isolate webhook tests + add CI workflow (self-hosting gate)
2026-06-05 07:34:48 +03:00 · 2026-06-05 07:29:04 +03:00 · 2026-06-05 00:00:01 +03:00 · 2026-06-04 22:38:09 +03:00 · 2026-06-04 22:15:40 +03:00 · 2026-06-04 17:37:36 +03:00
13 changed files with 991 additions and 70 deletions
--- a/.env.staging.example
+++ b/.env.staging.example
@@ -0,0 +1,52 @@
+# STAGING env for orchestrator-staging (port 8501).
+# Plane/Gitea tokens and sandbox project — configured in ORCH-32.
+# On Stage 1 (ORCH-31) you can copy from prod .env, changing only isolation-related keys.
+#
+# DO NOT COMMIT the real .env.staging — this file is the template only.
+# Create .env.staging on the server and fill in real values before starting staging.
+
+# ── Plane ─────────────────────────────────────────────────────────────────────
+ORCH_PLANE_API_URL=http://localhost:8091
+ORCH_PLANE_API_TOKEN=<plane-api-token>
+ORCH_PLANE_WORKSPACE_SLUG=<workspace-slug>
+ORCH_PLANE_WEBHOOK_SECRET=<webhook-secret>
+
+# Per-agent Plane bot tokens (authorship in Plane comments).
+# Leave empty to use ORCH_PLANE_API_TOKEN fallback.
+ORCH_PLANE_BOT_ANALYST=
+ORCH_PLANE_BOT_ARCHITECT=
+ORCH_PLANE_BOT_DEVELOPER=
+ORCH_PLANE_BOT_REVIEWER=
+ORCH_PLANE_BOT_TESTER=
+ORCH_PLANE_BOT_DEPLOYER=
+ORCH_PLANE_BOT_STREAM=
+
+# ── Gitea ─────────────────────────────────────────────────────────────────────
+ORCH_GITEA_URL=http://localhost:3000
+ORCH_GITEA_PUBLIC_URL=https://git.mva154.duckdns.org
+ORCH_GITEA_TOKEN=<gitea-token>
+ORCH_GITEA_WEBHOOK_SECRET=<gitea-webhook-secret>
+
+# ── Telegram ──────────────────────────────────────────────────────────────────
+ORCH_TELEGRAM_BOT_TOKEN=<telegram-bot-token>
+ORCH_TELEGRAM_CHAT_ID=<telegram-chat-id>
+
+# ── Claude / repos ────────────────────────────────────────────────────────────
+ORCH_CLAUDE_BIN=/usr/bin/claude
+ORCH_REPOS_DIR=/repos
+ORCH_HOST_REPOS_DIR=/home/slin/repos
+
+# ── Database (ISOLATION KEY for staging) ─────────────────────────────────────
+# The staging volume mounts ./data/staging:/app/data, so the DB physically lives
+# at ./data/staging/orchestrator.db on the host — fully isolated from prod.
+# Do NOT change this path; isolation is achieved via the volume mount, not this path.
+ORCH_DB_PATH=/app/data/orchestrator.db
+
+# ── Concurrency / worker ──────────────────────────────────────────────────────
+ORCH_MAX_CONCURRENCY=1
+ORCH_QUEUE_POLL_INTERVAL=2.0
+
+# ── Deploy hook ───────────────────────────────────────────────────────────────
+DEPLOY_SSH_USER=slin
+DEPLOY_SSH_HOST=127.0.0.1
+DEPLOY_HOOK_SCRIPT=/home/slin/bin/enduro-deploy-hook.sh
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -0,0 +1,22 @@
+name: CI
+on:
+  push:
+    branches: ["feature/**", "bugfix/**", "hotfix/**", "fix/**", "ci/**"]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: self-hosted
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install dependencies
+        run: |
+          python3 -m pip install --user --upgrade pip
+          python3 -m pip install --user -r requirements.txt
+      - name: Test
+        env:
+          PYTHONPATH: ${{ github.workspace }}
+        run: |
+          export PATH="$HOME/.local/bin:$PATH"
+          python3 -m pytest tests/ -q
--- a/.gitignore
+++ b/.gitignore
@@ -5,3 +5,7 @@ __pycache__/
 data/
 *.db
 .pytest_cache/
+# ORCH-31: staging env (secrets, not committed — see .env.staging.example)
+.env.staging
+# ORCH-31: staging DB data directory
+data/staging/
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -25,3 +25,39 @@ services:
      - DEPLOY_HOOK_SCRIPT=/home/slin/bin/enduro-deploy-hook.sh
    group_add:
      - "999"
+
+  # ORCH-31: staging instance (port 8501, isolated DB).
+  # Starts ONLY with: docker compose --profile staging up -d orchestrator-staging
+  # Normal "docker compose up -d" does NOT start this service.
+  orchestrator-staging:
+    profiles:
+      - staging
+    build: .
+    container_name: orchestrator-staging
+    restart: unless-stopped
+    init: true
+    network_mode: host
+    command: ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8501"]
+    volumes:
+      - ./data/staging:/app/data
+      - /home/slin/repos:/repos
+      - /var/run/docker.sock:/var/run/docker.sock
+      - /usr/lib/node_modules/@anthropic-ai/claude-code:/opt/claude-code:ro
+      - /usr/bin/node:/usr/bin/node:ro
+      - /home/slin/.claude:/home/slin/.claude
+      - /home/slin/.claude.json:/home/slin/.claude.json:ro
+      - /home/slin/.orchestrator-ssh:/root/.ssh:ro
+    env_file: .env.staging
+    environment:
+      - ORCH_REPOS_DIR=/repos
+      - ORCH_HOST_REPOS_DIR=/home/slin/repos
+      - DEPLOY_SSH_USER=slin
+      - DEPLOY_SSH_HOST=127.0.0.1
+      - DEPLOY_HOOK_SCRIPT=/home/slin/bin/enduro-deploy-hook.sh
+      # Staging DB is isolated via ./data/staging volume mount.
+      # Inside the container the path remains /app/data/orchestrator.db (same default),
+      # but on the host it physically lives at ./data/staging/orchestrator.db — 
+      # completely separate from prod ./data/orchestrator.db.
+      - ORCH_DB_PATH=/app/data/orchestrator.db
+    group_add:
+      - "999"
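
The isolation described in the compose comments comes entirely from the bind mount: both services use `/app/data/orchestrator.db` inside the container, but each container's `/app/data` is backed by a different host directory. The sketch below illustrates that mapping. `host_path_for` is a hypothetical helper (Docker performs this resolution itself, and nothing like it is part of this PR), and the prod mount `./data:/app/data` is an assumption inferred from the docs table, since the prod service's volumes are not shown in this diff.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def host_path_for(container_path: str, volumes: list[str]) -> str | None:
    """Resolve a container path to a host path via 'host:container[:mode]' mounts.

    Hypothetical illustration only; the deepest matching mount wins.
    """
    target = PurePosixPath(container_path)
    best: tuple[int, str] | None = None
    for spec in volumes:
        host, mount = spec.split(":")[:2]
        mount_path = PurePosixPath(mount)
        try:
            rel = target.relative_to(mount_path)
        except ValueError:
            continue  # container_path does not live under this mount
        depth = len(mount_path.parts)
        if best is None or depth > best[0]:
            best = (depth, str(PurePosixPath(host) / rel))
    return best[1] if best else None


# Assumed prod mount (not shown in this diff) vs. the staging mount added here.
PROD_VOLUMES = ["./data:/app/data"]
STAGING_VOLUMES = ["./data/staging:/app/data", "/home/slin/repos:/repos"]
DB_PATH = "/app/data/orchestrator.db"  # same ORCH_DB_PATH in both services

prod_db = host_path_for(DB_PATH, PROD_VOLUMES)
staging_db = host_path_for(DB_PATH, STAGING_VOLUMES)
print(prod_db, staging_db)
```

The same `ORCH_DB_PATH` therefore resolves to two different host files, which is why the template warns against changing the path itself.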
--- a/docs/PRODUCT_VISION.md
+++ b/docs/PRODUCT_VISION.md
@@ -0,0 +1,132 @@
+# Product Vision: Autonomous Development Factory (Orchestrator)
+
+> A multi-agent platform that turns an idea or a bug into a result deployed to production: autonomously, reliably, and cheaply.
+
+**Version:** 1.0 · **Date:** 2026-06-04 · **Status:** development concept
+
+---
+
+## 1. Why this exists (business view)
+
+### Problem
+Classic development puts a human bottleneck at every step: analyst, architect, developer, reviewer, tester, deploy engineer. Every handoff between them costs time, context, and money. A small feature or bug takes days to ship.
+
+### Solution
+**Orchestrator** is a pipeline of AI agents that takes a task through every development stage on its own, from the business request to the production release. A human files the task and accepts the result. Everything in between is autonomous.
+
+### Value
+- ⚡ **Speed:** a feature completes the full cycle (analysis → architecture → code → review → tests → deploy) in ~35 minutes with no manual intervention.
+- 💰 **Cost:** agent work is several times cheaper than a team; adaptive model selection saves money on simple tasks.
+- 🎯 **Autonomy:** 0 manual nudges in a normal run. The human sets tasks and accepts results rather than operating the pipeline.
+- 🛡️ **Reliability:** multi-level quality gates keep unfinished work out of production.
+- 🔁 **Scale:** one platform runs several projects; the platform itself can be replicated to new hosts.
+
+---
+
+## 2. How it works (overview)
+
+### Pipeline
+```
+created → analysis → architecture → development → review → testing → deploy → done
+```
+Every transition has a **quality gate**: an automated check that blocks the task until the stage has genuinely been completed:
+
+| Transition | Gate | What it checks |
+|---|---|---|
+| analysis → architecture | check_analysis_approved | BRD/TRZ/AC ready + human approval |
+| architecture → development | check_architecture_done | Architecture/ADR recorded |
+| development → review | check_ci_green | CI is green (tests pass) |
+| review → testing | check_reviewer_verdict | Reviewer's machine verdict: APPROVED |
+| testing → deploy | check_tests_passed | Tester's machine verdict (cannot be faked) |
+| deploy → done | check_deploy_status | Deploy actually succeeded, log in origin/main |
+
+### Agents
+- **Analyst**: gathers business requirements, writes BRD/TRZ/acceptance criteria.
+- **Architect**: designs the solution, records ADRs.
+- **Developer**: writes code in an isolated git worktree.
+- **Reviewer**: reviews the code and issues a machine verdict.
+- **Tester**: runs the tests and records the result in a report.
+- **Deployer**: merges, tags, deploys to production, writes the deploy log.
+
+### Objects
+- **Project**: a project in the registry (Plane project ↔ git repository ↔ task prefix).
+- **Work-Item**: a task moving through the pipeline; it accumulates artifacts at each stage (00-business-request … 14-deploy-log).
+- **Job**: a unit of work in the queue (atomic claim, retries, restart-safe).
+
+### Integrations
+- **Plane**: task management, statuses as pipeline triggers, webhooks.
+- **Gitea**: repositories, PRs, protection of main (pre-receive hook).
+- **Telegram**: live progress tracker, approvals, notifications.
+- **LLM**: the agents' models (currently Claude; multi-provider support is planned).
+
+---
+
+## 3. What is already done (foundation)
+
+✅ **Autonomous pipeline**: confirmed by a live run in which a task went from issue to Done with no manual intervention (~35 min).
+✅ **Task queue**: atomic claim, max_concurrency, retries, restart-safe.
+✅ **Isolation via git worktree**: each task gets its own tree, with no conflicts in the shared repo.
+✅ **Machine quality gates**: verdicts are read from structured artifacts rather than guessed from text.
+✅ **Multi-repo**: the platform runs several projects (enduro-trails, orchestrator itself).
+✅ **Webhook idempotency**: dedup by delivery-id, protection against duplicates.
+✅ **Observability**: token and cost accounting for every task.
+✅ **Live Telegram tracker**: progress is edited in a single message, without spam.
+
+---
+
+## 4. Where we are heading (roadmap)
+
+Development is grouped into 5 strategic directions.
+
+### 🛡️ Reliability and security
+- **Post-deploy monitoring + auto-rollback**: watch production after a release and roll back on degradation.
+- **Security gate**: secret scanning + dependency audit before merge.
+- **Budget circuit breaker**: a hard per-task cost limit that protects against runaway spending.
+- **Optional human acceptance**: a final human look for critical features.
+
+### 💰 Economics and intelligence
+- **Multi-provider LLM**: Claude, OpenRouter, and other providers to choose from.
+- **Task estimation**: a cost/time forecast before the task starts.
+- **Adaptive model selection**: by complexity, with trivial work on a cheap model and hard work on a strong one.
+- **Bugfix track**: a simplified, cheaper path for bugs (with no loss of quality).
+
+### 🏗️ Platform and scale
+- **Self-hosting**: the orchestrator develops itself through its own pipeline.
+- **Self-improvement**: a lessons loop that catches deviations → records them → proposes improvements.
+- **Project onboarding**: turnkey setup of a new project in the system.
+- **Replication**: deploy the platform on new infrastructure as a turnkey install.
+
+### 💬 Human interaction
+- **UX/UI designer**: interface mockups at the analysis stage.
+- **Interactive analyst**: a live dialogue for clarifying requirements and discussing mockups.
+- **Unified comment artifacts**: all agents attach their results with clickable links.
+- **Direct links in Telegram**: one-click approval, no hunting around.
+
+### 🧩 Expanding capabilities
+- **Heavy data computation**: an optional stage for migrations and large-data processing.
+- **Android development**: the mobile stack through the same pipeline.
+- **Epic decomposition**: large feature → subtasks → assembly.
+- **Dependency management**: task B waits for task A.
+- **Code coverage gate**: protects test coverage from degrading.
+- **Project knowledge base**: persistent context for agents.
+
+---
+
+## 5. Principles (what does not change for us)
+
+1. **Autonomy by default, with a human at the key decision points.** The machine does the work; the human sets the task and accepts the result.
+2. **Quality is not sacrificed for speed or price.** We make analysis cheaper, but the quality gates stay. A lesson learned the hard way: a skipped check means unfinished work in production.
+3. **Machine verdicts, not guesswork.** Gates read structured fields instead of searching for words in text.
+4. **Self-modification only through PR + review + approval.** An agent that changes agents is always under human control.
+5. **Documentation now, not later.** Changed the functionality → updated the docs.
+6. **Production is the source of truth.** "Deploy succeeded" ≠ "it works". We verify the actual result.
+
+---
+
+## 6. The vision in one sentence
+
+> **A self-sufficient development factory that replicates itself, learns from its mistakes, evaluates itself, guards the budget, and does not break production, turning a person's intent into a working product with almost no involvement on their part.**
+
+---
+
+*This document is maintained in the orchestrator repository. The roadmap is sourced from the tasks of the ORCH project in Plane (ORCH-7…ORCH-28).*
--- a/docs/PRODUCT_VISION.pptx
+++ b/docs/PRODUCT_VISION.pptx
--- a/docs/STAGING.md
+++ b/docs/STAGING.md
@@ -0,0 +1,85 @@
+# Staging Environment (ORCH-31)
+
+Orchestrator supports a permanent **staging instance** running on port **8501** with a
+fully-isolated SQLite database. The staging instance shares the same codebase and
+Dockerfile as production but is started under the `staging` Docker Compose profile so it
+**never starts accidentally** during a normal `docker compose up -d`.
+
+## Architecture
+
+| | Production | Staging |
+|---|---|---|
+| Port | 8500 | 8501 |
+| Container name | `orchestrator` | `orchestrator-staging` |
+| DB (host path) | `./data/orchestrator.db` | `./data/staging/orchestrator.db` |
+| DB (container path) | `/app/data/orchestrator.db` | `/app/data/orchestrator.db` |
+| env file | `.env` | `.env.staging` |
+| Compose profile | *(default)* | `staging` |
+
+DB isolation is achieved via a separate volume mount (`./data/staging:/app/data`), not by
+changing `ORCH_DB_PATH` — the container path stays identical while the host path is a
+different directory.
+
+## Prerequisites
+
+1. **`.env.staging`** — create from the template (see below). This file is **not committed**
+   to the repo (it contains secrets). Copy and fill in values before first start.
+2. **`./data/staging/`** directory — created automatically on first container start.
+
+### Create `.env.staging`
+
+```bash
+cd /home/slin/repos/orchestrator
+cp .env.staging.example .env.staging
+# Edit .env.staging — fill in real tokens / secrets.
+# At Stage 1 (ORCH-31) you can reuse prod values; sandbox Plane project
+# and isolated Gitea webhook will be wired in ORCH-32.
+nano .env.staging
+```
+
+## Starting Staging
+
+```bash
+cd /home/slin/repos/orchestrator
+docker compose --profile staging up -d orchestrator-staging
+```
+
+Check it is running:
+
+```bash
+docker ps | grep orchestrator-staging
+curl -s http://localhost:8501/health | python3 -m json.tool
+```
+
+## Stopping Staging
+
+```bash
+docker compose --profile staging stop orchestrator-staging
+# or remove the container entirely:
+docker compose --profile staging down orchestrator-staging
+```
+
+## Normal `up -d` does NOT start staging
+
+```bash
+# This starts ONLY the prod orchestrator (port 8500). Staging is NOT affected.
+docker compose up -d
+```
+
+The `profiles: [staging]` directive in `docker-compose.yml` ensures staging is
+completely invisible to commands that do not pass `--profile staging`.
+
+## Logs
+
+```bash
+docker logs -f orchestrator-staging
+```
+
+## Roadmap
+
+| Task | Description |
+|---|---|
+| **ORCH-31** *(this PR)* | Infra: compose service, .env template, gitignore, docs |
+| **ORCH-32** | Sandbox: isolated Plane project + Gitea repo for staging |
+| **ORCH-33** | Test suite running against staging endpoint |
+| **ORCH-34** | Deploy hook: promote `orchestrator:candidate` image to staging |
--- a/src/notifications.py
+++ b/src/notifications.py
@@ -68,16 +68,43 @@ def send_telegram(text: str, disable_notification: bool = False):
    return None


-def edit_telegram(message_id: int, text: str) -> bool:
-    """Edit an existing Telegram message. Returns True on success, else False.
+# edit_telegram outcome codes -> let update_task_tracker decide what to do:
+#   "ok"           edit applied -> nothing else to do
+#   "not_modified" Telegram says text is identical (400 "message is not
+#                  modified" / "exactly the same") -> success, NO new message
+#   "gone"         original message can't be edited (deleted / too old /
+#                  invalid id) -> caller must fall back to a NEW message
+#   "failed"       transient failure (network / timeout / 5xx / unknown 400)
+#                  -> caller must NOT send a new message (avoid duplicates)
+EDIT_OK = "ok"
+EDIT_NOT_MODIFIED = "not_modified"
+EDIT_GONE = "gone"
+EDIT_FAILED = "failed"

-    Used by the live tracker to refresh the single per-task message in place.
-    Never raises. A False return tells the caller to fall back to a new message
-    (e.g. the message is too old to edit / was deleted / 400).
+# Telegram error descriptions that mean the message is permanently un-editable
+# (it is gone / orphaned) -> fall back to a fresh message.
+_GONE_MARKERS = (
+    "message to edit not found",
+    "message can't be edited",
+    "message_id_invalid",
+)
+# Telegram "nothing changed" -> treat as success, never a duplicate.
+_NOT_MODIFIED_MARKERS = (
+    "message is not modified",
+    "exactly the same",
+)
+
+
+def edit_telegram(message_id: int, text: str) -> str:
+    """Edit an existing Telegram message. Never raises.
+
+    Returns a distinguishable outcome (see EDIT_* constants) so the caller can
+    tell apart "all good" / "nothing changed" / "message gone" / "transient
+    failure" and only fall back to a NEW message when the original is truly gone.
    """
    s = _get_settings()
    if not s.telegram_bot_token or not s.telegram_chat_id:
-        return False
+        return EDIT_FAILED
    try:
        url = f"https://api.telegram.org/bot{s.telegram_bot_token}/editMessageText"
        resp = httpx.post(
@@ -91,9 +118,32 @@ def edit_telegram(message_id: int, text: str) -> bool:
            timeout=5,
        )
        data = resp.json()
-        return bool(data.get("ok"))
-    except Exception:
-        return False
+        if data.get("ok"):
+            return EDIT_OK
+        # ok:false -> inspect the description to classify the 400.
+        desc = str(data.get("description") or "").lower()
+        if any(m in desc for m in _NOT_MODIFIED_MARKERS):
+            # Text is identical between transitions (e.g. repeat review cycle
+            # renders the same line). Nothing to do, NOT a duplicate.
+            logger.debug(
+                f"edit_telegram(mid={message_id}): not modified, skipping"
+            )
+            return EDIT_NOT_MODIFIED
+        if any(m in desc for m in _GONE_MARKERS):
+            logger.warning(
+                f"edit_telegram(mid={message_id}): message gone ({desc!r}), "
+                f"will fall back to a new message"
+            )
+            return EDIT_GONE
+        # Unknown 400 / other non-ok -> transient/unknown, do NOT duplicate.
+        logger.warning(
+            f"edit_telegram(mid={message_id}): edit failed ({desc!r})"
+        )
+        return EDIT_FAILED
+    except Exception as e:
+        # Network / timeout / 5xx -> transient, do NOT duplicate.
+        logger.warning(f"edit_telegram(mid={message_id}): transient error: {e}")
+        return EDIT_FAILED


 def _get_work_item_id(task_id: int) -> str:
@@ -280,11 +330,35 @@ def render_task_tracker(task_id: int) -> str:

    for stage_key, label, agent in _TRACKER_STAGES:
        run = last_done.get(agent)
-        if run is not None:
+        # The stage is "in progress" only when it is the task's current stage AND
+        # there is an unfinished run for its agent (the agent is actually still
+        # working). A finished run with no in-flight run -> show the \u2705 result,
+        # even if the task still sits in that stage (just-finished snapshot).
+        agent_runs = agent_runs_by_agent.get(agent, [])
+        has_inflight = any(ar["finished_at"] is None for ar in agent_runs)
+        is_active_stage = (
+            _STAGE_ACTIVE_AGENT.get(stage) == agent
+            and stage == stage_key
+            and (has_inflight or run is None)
+        )
+        if is_active_stage:
+            # Live "\U0001f504 ... \u0438\u0434\u0451\u0442" line. Count how many times THIS stage's
+            # agent has run for this task; a 2nd+ run means we're re-doing the
+            # stage (e.g. review->development->review), so show "\u043f\u043e\u043f\u044b\u0442\u043a\u0430 N"
+            # to make the text change between cycles and to honestly show Slava
+            # the stage is being re-worked.
+            attempt = len(agent_runs)
+            if attempt >= 2:
+                lines.append(
+                    f"\U0001f504 {label} \u00b7 \u043f\u043e\u043f\u044b\u0442\u043a\u0430 {attempt} "
+                    f"\u2026 \u0438\u0434\u0451\u0442"
+                )
+            else:
+                lines.append(
+                    f"\U0001f504 {label:<13} \u2026   \u00b7 \u0438\u0434\u0451\u0442"
+                )
+        elif run is not None:
            lines.append(_stage_line(label, run))
-        elif _STAGE_ACTIVE_AGENT.get(stage) == agent and stage == stage_key:
-            # This stage is the active one and has no finished run yet.
-            lines.append(f"\U0001f504 {label:<13} \u2026   \u00b7 \u0438\u0434\u0451\u0442")
        # else: not started yet -> not shown.

        # Insert the BRD review line right after Analysis.
@@ -372,19 +446,32 @@ def update_task_tracker(task_id: int):
    """Render + push the live tracker for a task. Never raises.

    First call (no stored tracker_message_id): sendMessage (silent) and store the
-    returned message_id. Subsequent calls: editMessageText the stored message; if
-    the edit fails (too old / deleted / 400), fall back to a NEW message and
-    update the stored id. The tracker is always sent with disable_notification so
-    it never pings — only the dedicated alert helpers ping.
+    returned message_id. Subsequent calls: editMessageText the stored message.
+    A NEW message is sent ONLY when the original is truly gone (deleted / too old
+    / invalid id). On "not modified" (text unchanged) or transient failures
+    (network / timeout / 5xx / unknown 400) we do NOT send a new message — that
+    is exactly what produced duplicate trackers and orphaned (lagging) messages.
+    The tracker is always sent with disable_notification so it never pings —
+    only the dedicated alert helpers ping.
    """
    try:
        from .db import get_tracker_message_id, set_tracker_message_id
        text = render_task_tracker(task_id)
        mid = get_tracker_message_id(task_id)
        if mid is not None:
-            if edit_telegram(mid, text):
+            result = edit_telegram(mid, text)
+            if result in (EDIT_OK, EDIT_NOT_MODIFIED):
+                # Edited in place (or nothing to change) -> done, no duplicate.
                return
-            # Edit failed -> fall back to a fresh message.
+            if result == EDIT_FAILED:
+                # Transient -> don't duplicate; tracker redraws next transition.
+                logger.debug(
+                    f"update_task_tracker({task_id}): edit failed transiently, "
+                    f"keeping message {mid}"
+                )
+                return
+            # result == EDIT_GONE -> the stored message is gone; fall through
+            # to send a fresh one and re-point tracker_message_id at it.
        new_mid = send_telegram(text, disable_notification=True)
        if new_mid is not None:
            set_tracker_message_id(task_id, new_mid)
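
The notifications change boils down to a two-step decision: classify the `editMessageText` response into one of four outcomes, then send a new tracker message only for the "gone" outcome. The sketch below isolates that decision from the HTTP call so it can be read on its own. The marker tuples and outcome strings are copied from the diff. The function names `classify_edit_response` and `should_send_new_message` are hypothetical, and the sample payloads are illustrative rather than captured Telegram responses.

```python
EDIT_OK = "ok"
EDIT_NOT_MODIFIED = "not_modified"
EDIT_GONE = "gone"
EDIT_FAILED = "failed"

_GONE_MARKERS = (
    "message to edit not found",
    "message can't be edited",
    "message_id_invalid",
)
_NOT_MODIFIED_MARKERS = (
    "message is not modified",
    "exactly the same",
)


def classify_edit_response(data: dict) -> str:
    """Map a decoded editMessageText JSON body to an EDIT_* outcome."""
    if data.get("ok"):
        return EDIT_OK
    desc = str(data.get("description") or "").lower()
    # "Not modified" is checked first: identical text is a success, not an error.
    if any(marker in desc for marker in _NOT_MODIFIED_MARKERS):
        return EDIT_NOT_MODIFIED
    if any(marker in desc for marker in _GONE_MARKERS):
        return EDIT_GONE
    # Anything unrecognized is treated as transient to avoid duplicates.
    return EDIT_FAILED


def should_send_new_message(outcome: str) -> bool:
    """Only a permanently un-editable message justifies a fresh tracker."""
    return outcome == EDIT_GONE


# Illustrative payloads (assumed shapes, not recorded responses).
samples = [
    {"ok": True},
    {"ok": False, "description": "Bad Request: message is not modified"},
    {"ok": False, "description": "Bad Request: message to edit not found"},
    {"ok": False, "description": "Too Many Requests: retry after 5"},
]
outcomes = [classify_edit_response(s) for s in samples]
print(outcomes, [should_send_new_message(o) for o in outcomes])
```

Defaulting unknown errors to "failed" trades a possibly stale tracker for the guarantee that no duplicate message is created; the tracker is redrawn on the next stage transition anyway.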
--- a/src/qg/checks.py
+++ b/src/qg/checks.py
@@ -2,6 +2,7 @@

 import os
 import logging
+import subprocess
 import httpx
 from ..config import settings

@@ -137,7 +138,16 @@ def check_review_approved(repo: str, pr_number: int) -> tuple[bool, str]:

 def check_tests_passed(repo: str, work_item_id: str, branch: str | None = None) -> tuple[bool, str]:
    """
-    Check if test report exists and contains PASS indicator.
+    Gate the testing -> deploy transition on the tester's MACHINE-READABLE verdict
+    in 13-test-report.md frontmatter, NOT on a naive substring search of the body.
+
+    ET-013 fix: the previous implementation did `if "PASS" in content`, so a report
+    explicitly marked `verdict: BLOCKED` / `status: blocked` but whose prose mentioned
+    "23 passed" / "✅ PASS" / "All checks passed" was treated as a pass, and an
+    unfinished feature reached Done. This mirrors check_reviewer_verdict (S-5) and
+    check_deploy_status (БАГ 8): read ONLY the YAML frontmatter `verdict:` / `status:`
+    fields, never the body.
+
    File: docs/work-items/<work_item_id>/13-test-report.md
    """
    repo_path = _repo_path(repo, branch)
@@ -149,12 +159,67 @@ def check_tests_passed(repo: str, work_item_id: str, branch: str | None = None)
    try:
        with open(report_path, "r") as f:
            content = f.read()
-        if "PASS" in content or "All tests passed" in content:
-            return True, "Test report indicates PASS"
-        return False, "Test report exists but no PASS indicator found"
    except OSError as e:
        return False, f"Error reading test report: {e}"

+    return _parse_tests_verdict(content)
+
+
+# Positive / negative verdict tokens, derived from REAL tester reports in
+# enduro-trails (ET-001..ET-014). The tester is inconsistent: most write
+# `verdict: PASS`, but ET-006 used `verdict: ready-to-deploy` (with `status: PASSED`),
+# ET-007 `verdict: PASS — ready-to-deploy`, ET-008 `verdict: stage:ready-to-deploy`
+# (with `status: pass`). ET-013 (the bug) used `verdict: BLOCKED` / `status: blocked`.
+# We therefore match known positive/negative TOKENS inside the normalized
+# verdict/status fields, and treat a negative token as authoritative (a BLOCKED/FAILED
+# report never passes, even if another field looks positive).
+_TESTS_NEGATIVE_TOKENS = ("BLOCKED", "FAILED", "FAIL", "REQUEST_CHANGES", "REJECT", "RED")
+_TESTS_POSITIVE_TOKENS = ("PASSED", "PASS", "READY-TO-DEPLOY", "READY_TO_DEPLOY", "GREEN", "APPROVED")
+
+
+def _parse_tests_verdict(content: str) -> tuple[bool, str]:
+    """Map a 13-test-report.md body to a quality-gate verdict by reading ONLY the
+    machine-readable `verdict:` (and corroborating `status:`) YAML frontmatter fields.
+
+    Rules:
+      - No frontmatter / bad YAML / neither field present -> (False, reason).
+      - A negative token (BLOCKED/FAILED/...) in verdict OR status -> (False) and is
+        authoritative (ET-013 main case: verdict BLOCKED wins over any prose PASS).
+      - Otherwise a positive token (PASS/PASSED/READY-TO-DEPLOY/...) in verdict OR
+        status -> (True).
+      - Anything else (unrecognized / empty verdict) -> (False, reason).
+    """
+    import yaml
+
+    if not content.startswith("---"):
+        return False, "No YAML frontmatter in test report (cannot read machine verdict)"
+
+    parts = content.split("---", 2)
+    if len(parts) < 3:
+        return False, "Malformed YAML frontmatter in test report"
+
+    try:
+        fm = yaml.safe_load(parts[1]) or {}
+    except yaml.YAMLError as e:
+        return False, f"Invalid YAML frontmatter in test report: {e}"
+    if not isinstance(fm, dict):
+        return False, "Malformed YAML frontmatter in test report (not a mapping)"
+
+    verdict = str(fm.get("verdict", "") or "").upper().strip()
+    status = str(fm.get("status", "") or "").upper().strip()
+
+    if not verdict and not status:
+        return False, "No machine-readable verdict/status in test report frontmatter"
+
+    fields = f"{verdict} {status}"
+    for neg in _TESTS_NEGATIVE_TOKENS:
+        if neg in fields:
+            return False, f"Test verdict: {verdict or status} ({neg})"
+    for pos in _TESTS_POSITIVE_TOKENS:
+        if pos in fields:
+            return True, f"Test verdict: {verdict or status} (PASS)"
+
+    return False, f"No recognized PASS verdict in frontmatter (verdict={verdict!r}, status={status!r})"


 def check_analysis_approved(repo: str, work_item_id: str, branch: str | None = None) -> tuple[bool, str]:
@@ -281,6 +346,64 @@ def check_tests_local(repo: str, branch: str) -> tuple[bool, str]:
        return False, f"Local test run error: {e}"


+def _parse_deploy_status(content: str) -> tuple[bool, str]:
+    """Parse a 14-deploy-log.md body and map its `deploy_status:` frontmatter to a
+    quality-gate verdict. Reads ONLY the machine-readable YAML field, never prose.
+
+      deploy_status: SUCCESS -> (True,  "Deploy status: SUCCESS")
+      deploy_status: FAILED  -> (False, "Deploy status: FAILED")
+      missing field / no frontmatter / bad YAML -> (False, <reason>)
+    """
+    import yaml
+    status = None
+    if content.startswith("---"):
+        parts = content.split("---", 2)
+        if len(parts) >= 3:
+            try:
+                fm = yaml.safe_load(parts[1]) or {}
+            except yaml.YAMLError as e:
+                return False, f"Invalid YAML frontmatter in deploy log: {e}"
+            status = str(fm.get("deploy_status", "")).upper().strip()
+    if status == "SUCCESS":
+        return True, "Deploy status: SUCCESS"
+    if status == "FAILED":
+        return False, "Deploy status: FAILED"
+    return False, f"No machine-readable deploy_status in frontmatter (got: {status!r})"
+
+
+def _deploy_log_from_main(repo: str, work_item_id: str) -> str | None:
+    """Best-effort read of 14-deploy-log.md from origin/main on the shared clone.
+
+    The deployer writes 14-deploy-log.md and merges the deploy artifacts into main
+    via a separate PR (see ET-013), so the file lands in origin/main, NOT in the
+    feature branch worktree the gate normally reads. This recovers it from main.
+
+    Degrades gracefully: any git failure (no clone, network/fetch error, file
+    absent in main) returns None instead of raising, so the caller falls back to
+    the plain "not found" verdict. Never raises.
+    """
+    repo_clone = os.path.join(settings.repos_dir, repo)
+    if not os.path.isdir(os.path.join(repo_clone, ".git")):
+        return None
+    rel = f"docs/work-items/{work_item_id}/14-deploy-log.md"
+    try:
+        # Refresh origin/main so we see freshly-merged deploy artifacts.
+        subprocess.run(
+            ["git", "-C", repo_clone, "fetch", "origin", "main"],
+            check=False, capture_output=True, timeout=30,
+        )
+        show = subprocess.run(
+            ["git", "-C", repo_clone, "show", f"origin/main:{rel}"],
+            check=False, capture_output=True, text=True, timeout=15,
+        )
+    except (subprocess.SubprocessError, OSError) as e:
+        logger.warning("deploy-log origin/main lookup failed for %s/%s: %s", repo, work_item_id, e)
+        return None
+    if show.returncode != 0:
+        return None
+    return show.stdout
+
+
 def check_deploy_status(repo: str, work_item_id: str, branch: str | None = None) -> tuple[bool, str]:
    """
    БАГ 8 fix: gate the deploy -> done transition on the deployer's machine-readable
@@ -291,32 +414,30 @@ def check_deploy_status(repo: str, work_item_id: str, branch: str | None = None)
    frontmatter. Returns:
      (True, ...)  -> deploy_status: SUCCESS
      (False, ...) -> deploy_status: FAILED, missing field, or no frontmatter
+
+    ET-013 path-sync fix: the deployer writes 14-deploy-log.md and merges the deploy
+    artifacts into main via a SEPARATE PR, so the log lands in origin/main, not in
+    the feature-branch worktree this gate reads via _repo_path(repo, branch). If the
+    file is absent in the worktree we fall back to reading it from origin/main on the
+    shared clone. Lookup order: worktree -> origin/main -> not found.
    """
-    import yaml
    repo_path = _repo_path(repo, branch)
    log_path = os.path.join(repo_path, f"docs/work-items/{work_item_id}/14-deploy-log.md")

-    if not os.path.isfile(log_path):
-        return False, "Deploy log not found (14-deploy-log.md)"
-    try:
-        with open(log_path, "r") as f:
-            content = f.read()
-        status = None
-        if content.startswith("---"):
-            parts = content.split("---", 2)
-            if len(parts) >= 3:
-                try:
-                    fm = yaml.safe_load(parts[1]) or {}
-                except yaml.YAMLError as e:
-                    return False, f"Invalid YAML frontmatter in deploy log: {e}"
-                status = str(fm.get("deploy_status", "")).upper().strip()
-        if status == "SUCCESS":
-            return True, "Deploy status: SUCCESS"
-        if status == "FAILED":
-            return False, "Deploy status: FAILED"
-        return False, f"No machine-readable deploy_status in frontmatter (got: {status!r})"
-    except OSError as e:
-        return False, f"Error reading deploy log: {e}"
+    if os.path.isfile(log_path):
+        try:
+            with open(log_path, "r") as f:
+                content = f.read()
+        except OSError as e:
+            return False, f"Error reading deploy log: {e}"
+        return _parse_deploy_status(content)
+
+    # Not in the feature worktree — the deployer may have merged it into main.
+    main_content = _deploy_log_from_main(repo, work_item_id)
+    if main_content is not None:
+        return _parse_deploy_status(main_content)
+
+    return False, "Deploy log not found (14-deploy-log.md)"


 # Registry for dynamic lookup by name
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -38,3 +38,36 @@ def _no_telegram(monkeypatch):
    monkeypatch.setattr("src.agents.launcher.send_telegram", _noop, raising=False)
    monkeypatch.setattr("src.queue_worker.send_telegram", _noop, raising=False)
    yield
+
+
+@pytest.fixture(autouse=True)
+def _reset_webhook_secrets(monkeypatch):
+    """Isolate settings singleton between test files (CI cross-file isolation).
+
+    settings is a process-wide Pydantic singleton read once at import.  Different
+    test modules set env variables differently at import-time, so those values leak
+    across files when pytest collects them together (as CI does).
+
+    1. webhook secrets: reset to "" so HMAC is disabled by default.  Tests that
+       intentionally test the 401 path (test_webhook_dedup.py:268,278) re-apply
+       their own monkeypatch AFTER this autouse fixture runs, which overrides the
+       reset for the duration of that one test only.
+
+    2. db_path: reset to the value from ORCH_DB_PATH env var (last written by the
+       last imported test module).  Without this, test_webhook_dedup.py (imported
+       first, alphabetically) seeds settings.db_path = dedup.db, while
+       test_webhooks.py's setup_db fixture tries to remove test_orchestrator.db,
+       leaving the DB dirty across tests that share a branch name and causing
+       get_task_by_repo_branch() to return a stale row with the wrong stage.
+       Per-test monkeypatches in test_webhook_dedup.setup_db override this reset.
+    """
+    import os
+    from src.webhooks import gitea as gitea_mod
+    from src.webhooks import plane as plane_mod
+    from src import db as db_mod
+    monkeypatch.setattr(gitea_mod.settings, "gitea_webhook_secret", "", raising=False)
+    monkeypatch.setattr(plane_mod.settings, "plane_webhook_secret", "", raising=False)
+    db_path_env = os.environ.get("ORCH_DB_PATH", "")
+    if db_path_env:
+        monkeypatch.setattr(db_mod.settings, "db_path", db_path_env, raising=False)
+    yield
--- a/tests/test_qg.py
+++ b/tests/test_qg.py
@@ -167,23 +167,110 @@ class TestCheckReviewApproved:


 class TestCheckTestsPassed:
-    def test_report_with_pass(self, setup_work_item_dir):
-        repo_dir = setup_work_item_dir
-        wi_dir = repo_dir / "docs" / "work-items" / "ET-001"
-        wi_dir.mkdir(parents=True)
-        (wi_dir / "13-test-report.md").write_text("# Test Report\n\nResult: PASS\n")
+    """ET-013 fix: testing -> deploy gate reads the tester's MACHINE-READABLE verdict
+    in 13-test-report.md frontmatter (verdict:/status:), NOT a substring of the body.
+    Mirrors check_reviewer_verdict / check_deploy_status. The old `if "PASS" in content`
+    let a `verdict: BLOCKED` report whose prose said "23 passed"/"✅ PASS" pass the gate,
+    shipping an unfinished feature to Done."""

+    def _write(self, repo_dir, content, wi="ET-001"):
+        wi_dir = repo_dir / "docs" / "work-items" / wi
+        wi_dir.mkdir(parents=True)
+        (wi_dir / "13-test-report.md").write_text(content)
+
+    def test_verdict_pass_passes(self, setup_work_item_dir):
+        # Most common real form (ET-001/002/005/009/011/012/014).
+        self._write(
+            setup_work_item_dir,
+            "---\ntype: test-report\nverdict: PASS\nstatus: pass\n---\n\n# Test Report\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is True
+        assert "PASS" in reason
+
+    def test_verdict_pass_ready_to_deploy_passes(self, setup_work_item_dir):
+        # ET-007 real form: "PASS — ready-to-deploy".
+        self._write(
+            setup_work_item_dir,
+            "---\nverdict: PASS — ready-to-deploy\nstatus: PASS\n---\n\nbody\n",
+        )
        passed, reason = check_tests_passed("enduro-trails", "ET-001")
        assert passed is True

-    def test_report_without_pass(self, setup_work_item_dir):
-        repo_dir = setup_work_item_dir
-        wi_dir = repo_dir / "docs" / "work-items" / "ET-001"
-        wi_dir.mkdir(parents=True)
-        (wi_dir / "13-test-report.md").write_text("# Test Report\n\nResult: FAIL\n")
+    def test_verdict_ready_to_deploy_with_status_passed_passes(self, setup_work_item_dir):
+        # ET-006 real form: verdict has no PASS word, but status: PASSED.
+        self._write(
+            setup_work_item_dir,
+            "---\nverdict: ready-to-deploy\nstatus: PASSED\n---\n\nbody\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is True

+    def test_verdict_stage_ready_to_deploy_with_status_pass_passes(self, setup_work_item_dir):
+        # ET-008 real form: verdict: stage:ready-to-deploy, status: pass.
+        self._write(
+            setup_work_item_dir,
+            "---\nverdict: stage:ready-to-deploy\nstatus: pass\n---\n\nbody\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is True
+
+    def test_blocked_verdict_with_pass_in_body_fails(self, setup_work_item_dir):
+        # THE ET-013 BUG: verdict BLOCKED but body is full of "PASS"/"passed".
+        self._write(
+            setup_work_item_dir,
+            "---\ntype: test-report\nstatus: blocked\nverdict: BLOCKED\n---\n\n"
+            "23 passed\n✅ PASS (часть AC-18)\nAll checks passed\n",
+        )
        passed, reason = check_tests_passed("enduro-trails", "ET-001")
        assert passed is False
+        assert "BLOCKED" in reason
+
+    def test_failed_verdict_fails(self, setup_work_item_dir):
+        self._write(
+            setup_work_item_dir,
+            "---\nverdict: FAILED\nstatus: failed\n---\n\nbody\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is False
+        assert "FAILED" in reason
+
+    def test_passed_count_in_body_but_blocked_verdict_fails(self, setup_work_item_dir):
+        # Body says "23 passed" but frontmatter verdict BLOCKED -> substring no longer fools.
+        self._write(
+            setup_work_item_dir,
+            "---\nverdict: BLOCKED\n---\n\nTests: 23 passed, 0 failed.\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is False
+
+    def test_no_frontmatter_fails(self, setup_work_item_dir):
+        # Old format / prose only -> no machine verdict -> fail.
+        self._write(
+            setup_work_item_dir,
+            "# Test Report\n\nResult: PASS\nAll tests passed.\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is False
+
+    def test_no_verdict_field_fails(self, setup_work_item_dir):
+        # Frontmatter present but neither verdict nor status -> fail.
+        self._write(
+            setup_work_item_dir,
+            "---\ntype: test-report\nversion: 1\n---\n\nResult: PASS\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is False
+
+    def test_invalid_yaml_fails_no_exception(self, setup_work_item_dir):
+        # Broken YAML frontmatter -> False with reason, never raises.
+        self._write(
+            setup_work_item_dir,
+            "---\nverdict: [unclosed\n  : : :\n---\n\nbody PASS\n",
+        )
+        passed, reason = check_tests_passed("enduro-trails", "ET-001")
+        assert passed is False
+        assert "YAML" in reason or "frontmatter" in reason.lower()

    def test_no_report(self, setup_work_item_dir):
        passed, reason = check_tests_passed("enduro-trails", "ET-001")
@@ -242,6 +329,65 @@ class TestCheckDeployStatus:
        passed, reason = check_deploy_status("enduro-trails", "ET-011")
        assert passed is False

+    # --- ET-013 path-sync fix: log written to origin/main via separate PR ---
+
+    def test_origin_main_success_passes_when_absent_in_worktree(self, monkeypatch):
+        # Deployer merged 14-deploy-log.md into main via a separate PR; it is NOT
+        # in the feature worktree. Gate must recover it from origin/main -> PASS.
+        # (This is the exact ET-013 regression.)
+        monkeypatch.setattr(
+            "src.qg.checks._deploy_log_from_main",
+            lambda repo, wi: "---\ndeploy_status: SUCCESS\nversion: v0.0.5\n---\n\nLive.\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is True
+        assert "SUCCESS" in reason
+
+    def test_origin_main_failed_fails(self, monkeypatch):
+        # A genuine FAILED log in main must still fail.
+        monkeypatch.setattr(
+            "src.qg.checks._deploy_log_from_main",
+            lambda repo, wi: "---\ndeploy_status: FAILED\nversion: v0.0.5\n---\n\nboom.\n",
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is False
+        assert "FAILED" in reason
+
+    def test_absent_everywhere_fails(self, monkeypatch):
+        # Not in worktree and origin/main lookup yields nothing -> not found.
+        monkeypatch.setattr(
+            "src.qg.checks._deploy_log_from_main", lambda repo, wi: None
+        )
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is False
+        assert "not found" in reason.lower()
+
+    @patch("src.qg.checks.subprocess.run")
+    @patch("src.qg.checks.os.path.isdir", return_value=True)
+    def test_fetch_failure_degrades_no_exception(self, mock_isdir, mock_run):
+        # git fetch/show raising (e.g. network) must degrade to "not found",
+        # never propagate an exception out of the gate.
+        import subprocess as _sp
+        mock_run.side_effect = _sp.TimeoutExpired(cmd="git", timeout=30)
+        passed, reason = check_deploy_status("enduro-trails", "ET-013")
+        assert passed is False
+        assert "not found" in reason.lower()
+
+    def test_worktree_log_short_circuits_main_lookup(self, setup_work_item_dir, monkeypatch):
+        # If the log IS present in the worktree, origin/main must NOT be consulted.
+        self._write_log(
+            setup_work_item_dir,
+            "---\ndeploy_status: SUCCESS\nversion: v0.0.3\n---\n\nDeployed OK.\n",
+        )
+        called = {"n": 0}
+        def _boom(repo, wi):
+            called["n"] += 1
+            return None
+        monkeypatch.setattr("src.qg.checks._deploy_log_from_main", _boom)
+        passed, reason = check_deploy_status("enduro-trails", "ET-011")
+        assert passed is True
+        assert called["n"] == 0
+
    def test_deploy_stage_qg_is_check_deploy_status(self):
        assert get_qg_for_stage("deploy") == "check_deploy_status"

--- a/tests/test_telegram_tracker.py
+++ b/tests/test_telegram_tracker.py
@@ -249,7 +249,7 @@ def test_second_call_edits_existing_message(monkeypatch):

    edited = {}
    monkeypatch.setattr(N, "edit_telegram",
-                        lambda mid, text: edited.update(mid=mid) or True)
+                        lambda mid, text: edited.update(mid=mid) or N.EDIT_OK)
    monkeypatch.setattr(N, "send_telegram",
                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("should not send when edit succeeds")))

@@ -257,20 +257,196 @@ def test_second_call_edits_existing_message(monkeypatch):
    assert edited["mid"] == 777


-def test_fallback_to_new_message_when_edit_fails(monkeypatch):
+def test_fallback_to_new_message_when_edit_gone(monkeypatch):
+    """edit returns 'gone' (message deleted/too old) -> send NEW + update id."""
    tid = _mk_task(stage="development")
    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
            in_tok=10, out_tok=5, cost=0.1)
    from src.db import set_tracker_message_id, get_tracker_message_id
    set_tracker_message_id(tid, 100)

-    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: False)  # edit fails
+    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_GONE)
    monkeypatch.setattr(N, "send_telegram", lambda text, disable_notification=False: 200)

    N.update_task_tracker(tid)
    assert get_tracker_message_id(tid) == 200  # id updated to the new message


+def test_not_modified_does_not_send_new_message(monkeypatch):
+    """edit returns 'not_modified' -> NO new message, id unchanged (no dupe)."""
+    tid = _mk_task(stage="development")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1)
+    from src.db import set_tracker_message_id, get_tracker_message_id
+    set_tracker_message_id(tid, 100)
+
+    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_NOT_MODIFIED)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("must not send on not_modified")))
+
+    N.update_task_tracker(tid)
+    assert get_tracker_message_id(tid) == 100  # unchanged, no duplicate
+
+
+def test_transient_edit_failure_does_not_send_new_message(monkeypatch):
+    """edit returns 'failed' (network/timeout/5xx) -> NO new message (no dupe)."""
+    tid = _mk_task(stage="development")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1)
+    from src.db import set_tracker_message_id, get_tracker_message_id
+    set_tracker_message_id(tid, 100)
+
+    monkeypatch.setattr(N, "edit_telegram", lambda mid, text: N.EDIT_FAILED)
+    monkeypatch.setattr(N, "send_telegram",
+                        lambda *a, **k: (_ for _ in ()).throw(AssertionError("must not send on transient failure")))
+
+    N.update_task_tracker(tid)
+    assert get_tracker_message_id(tid) == 100  # unchanged, no duplicate
+
+
+# --------------------------------------------------------------------------- #
+# edit_telegram outcome classification (httpx mocked)
+# --------------------------------------------------------------------------- #
+def _edit_resp(ok, description=None):
+    resp = MagicMock()
+    body = {"ok": ok}
+    if description is not None:
+        body["description"] = description
+    resp.json.return_value = body
+    return resp
+
+
+def _patch_tg_creds(monkeypatch):
+    monkeypatch.setattr(N._get_settings(), "telegram_bot_token", "T", raising=False)
+    monkeypatch.setattr(N._get_settings(), "telegram_chat_id", "C", raising=False)
+
+
+def test_edit_telegram_ok(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(True)
+        assert N.edit_telegram(1, "x") == N.EDIT_OK
+
+
+def test_edit_telegram_not_modified_is_success(monkeypatch):
+    # 400 "message is not modified" -> success, not gone, no duplicate
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: message is not modified: ...")
+        assert N.edit_telegram(1, "x") == N.EDIT_NOT_MODIFIED
+
+
+def test_edit_telegram_exactly_the_same_is_not_modified(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: specified new message content and reply markup "
+                   "are exactly the same")
+        assert N.edit_telegram(1, "x") == N.EDIT_NOT_MODIFIED
+
+
+def test_edit_telegram_message_not_found_is_gone(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: message to edit not found")
+        assert N.edit_telegram(1, "x") == N.EDIT_GONE
+
+
+def test_edit_telegram_cant_be_edited_is_gone(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: message can't be edited")
+        assert N.edit_telegram(1, "x") == N.EDIT_GONE
+
+
+def test_edit_telegram_unknown_400_is_failed(monkeypatch):
+    # unknown 400 -> failed (NOT gone) -> caller won't duplicate
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(
+            False, "Bad Request: some other unexpected error")
+        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
+
+
+def test_edit_telegram_timeout_is_failed(monkeypatch):
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.side_effect = Exception("read timeout")
+        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
+
+
+def test_edit_telegram_5xx_is_failed(monkeypatch):
+    # Telegram 5xx still returns ok:false w/o gone/not_modified markers
+    _patch_tg_creds(monkeypatch)
+    with patch("src.notifications.httpx") as hx:
+        hx.post.return_value = _edit_resp(False, "Internal Server Error")
+        assert N.edit_telegram(1, "x") == N.EDIT_FAILED
+
+
+# --------------------------------------------------------------------------- #
+# render: repeated stage attempt shows "попытка N" ("attempt N")
+# --------------------------------------------------------------------------- #
+_POPYTKA = "\u043f\u043e\u043f\u044b\u0442\u043a\u0430"  # popytka
+
+
+def test_render_active_stage_shows_attempt_on_second_run():
+    # Two reviewer runs while in review -> active line shows attempt 2.
+    tid = _mk_task(stage="review")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:20:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    # First review run finished (sent back to dev), second review run active.
+    _mk_run(tid, "reviewer", "2026-06-04 09:20:00", "2026-06-04 09:25:00",
+            in_tok=10, out_tok=5, cost=0.1, model="vibecode/claude-sonnet-4.6",
+            exit_code=0)
+    _mk_run(tid, "reviewer", "2026-06-04 09:30:00", None,
+            in_tok=0, out_tok=0, exit_code=None)
+
+    text = N.render_task_tracker(tid)
+    active = [l for l in text.splitlines()
+              if l.startswith("\U0001f504") and "Review" in l][0]
+    assert _POPYTKA in active
+    assert "2" in active
+    assert "\u0438\u0434\u0451\u0442" in active
+
+
+def test_render_active_stage_no_attempt_on_first_run():
+    # Single reviewer run -> active line has NO attempt marker.
+    tid = _mk_task(stage="review")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:20:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "reviewer", "2026-06-04 09:20:00", None,
+            in_tok=0, out_tok=0, exit_code=None)
+
+    text = N.render_task_tracker(tid)
+    active = [l for l in text.splitlines()
+              if l.startswith("\U0001f504") and "Review" in l][0]
+    assert _POPYTKA not in active
+    assert "\u0438\u0434\u0451\u0442" in active
+
+
+def test_render_finished_lines_unaffected_by_attempt_logic():
+    # Completed (checkmark) lines never carry an attempt marker.
+    tid = _mk_task(stage="review")
+    _mk_run(tid, "analyst", "2026-06-04 09:00:00", "2026-06-04 09:10:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    # developer ran twice (retry) but is a FINISHED stage now.
+    _mk_run(tid, "developer", "2026-06-04 09:10:00", "2026-06-04 09:15:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    _mk_run(tid, "developer", "2026-06-04 09:16:00", "2026-06-04 09:20:00",
+            in_tok=10, out_tok=5, cost=0.1, model="tokenator/claude-opus-4-8")
+    text = N.render_task_tracker(tid)
+    for l in text.splitlines():
+        if l.startswith("\u2705"):
+            assert _POPYTKA not in l
+
+
 # --------------------------------------------------------------------------- #
 # which alerts are SEPARATE vs tracker-only
 # --------------------------------------------------------------------------- #
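
The edit-outcome classification covered by the test_edit_telegram_* cases above can be sketched as a pure function over the decoded JSON body. This is a hedged illustration, not the body of edit_telegram in src/notifications.py: `classify_edit_response` and `should_send_new_message` are hypothetical names, the constant values are assumed from the commit message (ok, not_modified, gone, failed), and the marker substrings are taken from the descriptions used in the tests.

```python
EDIT_OK = "ok"
EDIT_NOT_MODIFIED = "not_modified"
EDIT_GONE = "gone"
EDIT_FAILED = "failed"

# Substrings of the Telegram error description, as used in the tests above.
_NOT_MODIFIED_MARKERS = ("message is not modified", "exactly the same")
_GONE_MARKERS = ("message to edit not found", "message can't be edited")


def classify_edit_response(body: dict) -> str:
    """Map a decoded edit response body to one of the four outcomes."""
    if body.get("ok"):
        return EDIT_OK
    description = str(body.get("description") or "").lower()
    if any(marker in description for marker in _NOT_MODIFIED_MARKERS):
        return EDIT_NOT_MODIFIED
    if any(marker in description for marker in _GONE_MARKERS):
        return EDIT_GONE
    # Unknown 400s and 5xx bodies land here, so the caller does not duplicate the tracker.
    return EDIT_FAILED


def should_send_new_message(outcome: str) -> bool:
    """Only a truly gone message justifies sending a replacement tracker."""
    return outcome == EDIT_GONE
```

Treating everything unrecognised as `failed` rather than `gone` is the conservative default: a transient error then costs one missed update instead of a duplicate tracker message.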
--- a/tests/test_webhooks.py
+++ b/tests/test_webhooks.py
@@ -54,13 +54,19 @@ def test_status_endpoint():
    assert "active_tasks" in resp.json()


+@patch("src.plane_sync.add_comment")
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=None)
+@patch("src.plane_sync.fetch_issue_fields", return_value=("Test task", "This is a detailed test description for the task"))
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
-def test_plane_webhook_creates_task(mock_docs, mock_branch):
-    """work_item.created → task in DB with stage=analysis."""
+def test_plane_webhook_creates_task(mock_docs, mock_branch, mock_fetch_fields, mock_fetch_seq, mock_add_comment):
+    """work_item.created (via In Progress status) → task in DB with stage=analysis."""
    resp = client.post("/webhook/plane", json={
-        "event": "work_item.created",
-        "data": {"id": "test-123", "name": "Test task", "project": "proj-1"}
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "test-123", "name": "Test task", "project": "proj-1",
+            "state": {"id": "b873d9eb-993c-48cd-97ac-99a9b1623967", "name": "In Progress", "group": "started"},
+        }
    })
    assert resp.status_code == 200
    assert resp.json()["status"] == "accepted"
@@ -75,17 +81,37 @@ def test_plane_webhook_creates_task(mock_docs, mock_branch):
    assert "feature/" in task["branch"]


+@patch("src.plane_sync.add_comment")
+@patch("src.plane_sync.fetch_issue_sequence_id", return_value=None)
+@patch("src.plane_sync.fetch_issue_fields",
+       side_effect=[
+           ("First task", "This is a detailed description for the first task item"),
+           ("Second task", "This is a detailed description for the second task item"),
+       ])
@patch("src.webhooks.plane._create_gitea_branch", new_callable=AsyncMock)
@patch("src.webhooks.plane._create_initial_docs", new_callable=AsyncMock)
-def test_plane_webhook_generates_sequential_ids(mock_docs, mock_branch):
-    """Multiple work items get sequential IDs."""
+def test_plane_webhook_generates_sequential_ids(
+    mock_docs, mock_branch, mock_fetch_fields, mock_fetch_seq, mock_add_comment
+):
+    """Multiple In Progress transitions get sequential IDs (ET-001, ET-002)."""
+    in_progress_state = {
+        "id": "b873d9eb-993c-48cd-97ac-99a9b1623967",
+        "name": "In Progress",
+        "group": "started",
+    }
    client.post("/webhook/plane", json={
-        "event": "work_item.created",
-        "data": {"id": "item-1", "name": "First task", "project": "proj-1"}
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "item-1", "name": "First task", "project": "proj-1",
+            "state": in_progress_state,
+        }
    })
    client.post("/webhook/plane", json={
-        "event": "work_item.created",
-        "data": {"id": "item-2", "name": "Second task", "project": "proj-1"}
+        "event": "issue", "action": "updated",
+        "data": {
+            "id": "item-2", "name": "Second task", "project": "proj-1",
+            "state": in_progress_state,
+        }
    })

    conn = get_db()
@@ -202,8 +228,9 @@ def test_gitea_webhook_push():
    assert resp.json()["status"] == "accepted"


+@patch("src.webhooks.gitea.plane_notify_stage")
@patch("src.webhooks.gitea.launcher")
-def test_gitea_push_with_adr_advances_stage(mock_launcher):
+def test_gitea_push_with_adr_advances_stage(mock_launcher, mock_plane_notify):
    """Push with ADR files at architecture stage → advance to development."""
    mock_launcher.launch.return_value = 1

@@ -235,7 +262,7 @@ def test_gitea_push_with_adr_advances_stage(mock_launcher):
    task = conn.execute("SELECT * FROM tasks WHERE plane_id = 'push-001'").fetchone()
    conn.close()
    assert task["stage"] == "development"
-    mock_launcher.launch.assert_called_once()
+    mock_plane_notify.assert_called_once()


@patch("src.webhooks.gitea.check_ci_green")
Author	SHA1	Message	Date
Dev Agent	6c1e5fff52	feat(staging): add isolated orchestrator-staging service (port 8501, separate DB) - Add orchestrator-staging compose service under profile 'staging' so normal 'docker compose up -d' does NOT start it. - Port 8501 via command override; network_mode: host (no ports mapping needed). - DB isolation via separate volume ./data/staging:/app/data — physically separate from prod ./data/orchestrator.db on the host. - ORCH_DB_PATH=/app/data/orchestrator.db explicit in env (same container path, isolated by volume mount). - Add .env.staging.example with all required keys and placeholders. - Update .gitignore: add .env.staging and data/staging/ exclusions. - Add docs/STAGING.md: how to start staging, architecture table, roadmap. Refs: ORCH-31 (Stage 1 of 5)	2026-06-05 07:34:48 +03:00
Slava	d0a34249cc	Merge PR #27: isolate webhook tests + add CI workflow (self-hosting gate) Closes the CI quality gate for orchestrator self-hosting (ORCH-7). Full pytest tests/ green (294 passed). Supersedes #26.	2026-06-05 07:29:04 +03:00
Dev Agent	1baae81165	test: reset webhook secret per-test to fix cross-file isolation (CI green) Adds autouse fixture _reset_webhook_secrets to tests/conftest.py that resets the process-wide Pydantic settings singleton before every test: 1. gitea_webhook_secret / plane_webhook_secret → "" (HMAC disabled by default). Tests that deliberately test the 401 path (test_webhook_dedup.py:268,278) override this with their own monkeypatch which runs after autouse fixtures and wins for that test only. 2. db_path → os.environ["ORCH_DB_PATH"] (last written value after all test modules are imported). Without this, test_webhook_dedup.py (imported first alphabetically) seeds settings.db_path = dedup.db, while test_webhooks.py setup_db tries to remove test_orchestrator.db — leaving the DB dirty between tests that share a branch name and causing get_task_by_repo_branch() to return a stale row with the wrong stage. Per-test monkeypatches in test_webhook_dedup.setup_db still override it. Root cause: both leaks come from the same singleton settings being read once at import, before any per-test isolation runs. The autouse fixture is the correct per-test reset point for process-wide singletons. Result: pytest tests/ → 294 passed, 0 failed (was 10 failed/284 passed).	2026-06-05 00:00:01 +03:00
Dev Agent	e856e0940b	test: migrate sequential_ids test to In Progress contract	2026-06-04 22:38:09 +03:00
Dev Agent	7bbab9c38b	test: isolate webhook tests from live Plane API (fix CI)	2026-06-04 22:15:40 +03:00
Slava	a33a971c9c	Merge pull request 'docs: platform Product Vision (MD + PPTX)' (#25) from docs/product-vision into main	2026-06-04 17:37:36 +03:00
Стрим	d0c604bc66	docs: platform Product Vision (MD + PPTX, 8 slides)	2026-06-04 17:37:16 +03:00
Slava	83f5020f94	Merge pull request 'fix(qg): gate testing->deploy on machine-readable test verdict, not substring (ET-013)' (#24) from fix/tests-machine-verdict into main	2026-06-04 16:08:10 +03:00
dev-agent	757745a221	fix(qg): gate testing->deploy on machine-readable test verdict, not substring (ET-013) check_tests_passed did "if PASS in content" over the whole 13-test-report.md body, so a report explicitly marked verdict: BLOCKED / status: blocked whose prose mentioned "23 passed" / "PASS" / "All checks passed" passed the gate. On ET-013 an unfinished feature (P1 AC-19 failed) reached Done. Now mirrors check_reviewer_verdict (S-5) and check_deploy_status: read ONLY the YAML frontmatter verdict:/status: fields. Positive tokens (PASS/PASSED/READY-TO-DEPLOY/GREEN/APPROVED) -> True; negative tokens (BLOCKED/FAILED/...) are authoritative -> False; missing/empty/no-frontmatter/bad-YAML -> False with reason; file missing -> not found. Never raises. Positive token set derived from REAL enduro-trails reports ET-001..ET-014 (inconsistent: PASS, ready-to-deploy+status:PASSED, stage:ready-to-deploy+status:pass, PASS — ready-to-deploy). Validated: all 9 prior passing WIs stay True, ET-013 -> False.	2026-06-04 16:05:52 +03:00
Slava	34894f4684	Merge pull request 'fix(qg): find 14-deploy-log.md in origin/main when absent in feature worktree (false-FAILED deploy)' (#23) from fix/deploy-gate-log-path into main	2026-06-04 13:38:30 +03:00
dev-agent	4e4cc6c724	fix(qg): find 14-deploy-log.md in origin/main when absent in feature worktree ET-013: deployer writes 14-deploy-log.md and merges deploy artifacts into main via a separate PR, so the log lands in origin/main, not the feature branch worktree that check_deploy_status reads via _repo_path(repo, branch). Result: every successful deploy was falsely failed (Deploy log not found) and rolled back deploy->development. Fix: when the log is absent in the worktree, fall back to reading it from origin/main on the shared clone (git fetch origin main + git show origin/main:docs/work-items/<WI>/14-deploy-log.md). Lookup order: worktree -> origin/main -> not found. Fetch/show failures degrade to not found (never raise). Does not touch the merge-gate in gitea.py. Tests: origin/main SUCCESS->PASS (ET-013 case), origin/main FAILED->FAILED, absent everywhere->not found, fetch failure->degrades no exception, worktree log short-circuits main lookup.	2026-06-04 13:35:35 +03:00
Slava	b222d7af27	Merge pull request 'fix(tracker): no duplicate Telegram messages on not-modified/transient edits' (#22) from fix/tracker-edit-not-modified into main	2026-06-04 13:22:46 +03:00
dev-bot	ec9aa74492	fix(tracker): no duplicate Telegram messages on not-modified/transient edits edit_telegram now returns a distinguishable outcome (ok|not_modified|gone|failed) instead of a bare bool. update_task_tracker only sends a NEW message when the original is truly gone; not_modified and transient failures no longer spawn duplicate trackers or orphan the live one. render_task_tracker shows "попытка N" on an actively re-run stage (>=2 agent runs) so the text changes between review<->development cycles. Finished (✅) lines are unchanged. Tests: edit_telegram classification (ok/not_modified/gone/failed via mocked httpx), update_task_tracker (not_modified/failed -> no send, gone -> send+id), render attempt marker.	2026-06-04 13:20:40 +03:00
Slava	3e5c74ce4f	Merge pull request 'feat(telegram): live editable task tracker (Variant B+)' (#21) from feat/telegram-live-tracker into main	2026-06-04 11:46:21 +03:00