orchestrator

Author	SHA1	Message	Date
claude-bot	720c31393a	fix(reaper): Tier-2 finalization grace + claim-before-act (no dup advance) Tier-2 reaped a LIVE, still-finalizing monitor: _monitor_agent writes agent_runs.exit_code FIRST, then does git push / PR / Plane comments before _finalize_job, and the agent pid is already dead in that window — so the old "exit_code recorded -> reap now" had no grace and could race a healthy job. Worse, _reap_known_outcome ran the advance (advance_stage -> enqueue_job) BEFORE the atomic claim, so a reaper that lost the race had already enqueued the next stage (dup advance / dup enqueue), violating ADR-001 Р-1. Fix: - Tier-2 grace: reap only once agent_runs.exit_code has been recorded for >= reaper_finalize_grace_s (new setting, default 300s; > max finalization window). A live finalizing monitor is never reaped (FR-1.3/AC-3). New finished_age_s column computed in get_running_jobs. - claim-before-act for exit0: evaluate the canonical QG READ-ONLY (the reconciler pattern) to choose the terminal status, then atomically claim 'done' FIRST; only the claim winner runs the advance. A loser performs no side effects -> no dup advance / dup enqueue. Docs (golden source) updated in the same change: ADR-001, global adr-0011, README, internals, .env.example, CHANGELOG (also fixes the P3 broken adr-0011 link). New tests cover the grace window, lost-claim no-side-effects, and the already-advanced idempotent path. Refs: ORCH-065 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 16:14:45 +00:00
claude-bot	a6b444c356	fix(merge): wire pr_already_merged guard into deployer merge path (idempotent re-merge) The pr_already_merged guard was defined + unit-tested but consulted by zero production code, while ADR-001 Р-3 / README / CHANGELOG claimed the merge path consults it before a repeat merge (reviewer P1, ORCH-065 attempt 2/3). The actual merge actor is the LLM deployer agent (it merges the feature PR at the start of the `deploy` stage), so on a reaper re-drive of an already-merged PR the deployer would blindly re-merge → Gitea error → false BUG-8 rollback; AC-11 ("no second merge") was not met deterministically. Wire the guard at the real consultation point, the deployer prompt, so it runs merge_gate.pr_already_merged before any (re-)merge and no-ops when the PR is already merged. check_branch_mergeable is left untouched (AC-13: check_* behaviour unchanged; it runs on the first deploy-staging→deploy edge, not on a deploy-stage re-drive where the second-merge risk lives). - .openclaw/agents/deployer.md: idempotent pre-merge guard step + general rule. - src/merge_gate.py: docstring names the deployer-prompt consultation point. - docs/architecture/README.md, CHANGELOG.md: state the consultation point so golden-source matches implementation. - tests/test_merge_gate.py: regression test asserting the deployer prompt wires the guard (so it can't silently become dead code again). pytest tests/ -q: 743 passed. Refs: ORCH-065 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 16:14:45 +00:00
claude-bot	4bebb921ff	feat(reaper): job-reaper + stale merge-lease reclaim + idempotent merge finalization Closes the "zombie jobs" incident class: job status was set only inside the live launcher process, so a process death left jobs.status='running' forever; at max_concurrency=1 one zombie blocked ALL projects' queue (self-hosting risk). Adds a background daemon (src/job_reaper.py) with three-tier liveness (dead-pid streak / known exit_code / max-running backstop) whose only mutating write is an atomic terminal flip guarded by WHERE status='running' (no double-process). For exit0 the canonical QG is the source of truth via gate-driven advance, not "exit0". Also proactively reclaims stale merge-lease (dead pid OR TTL) via file delete only (no git ops), and makes merge finalization idempotent (pr_already_merged guard + up-to-date short-circuit on re-drive). New jobs.pid column via idempotent _ensure_column (no migration); pid stamped in launcher._spawn after Popen. Reaper start/stop in lifespan; "reaper" snapshot in GET /queue. Kill-switches: ORCH_REAPER_ENABLED, ORCH_REAPER_INTERVAL_S, ORCH_REAPER_DEAD_TICKS, ORCH_REAPER_MAX_RUNNING_S, ORCH_LEASE_RECLAIM_ENABLED. Invariants unchanged (AC-13): STAGE_TRANSITIONS, QG_CHECKS registry, check_branch_mergeable signature/behaviour, BUG-8 rollback, hook exit codes. restart-safe, never-raise per unit of background work. Docs: docs/architecture/README.md, CHANGELOG.md, .env.example. Tests: tests/test_job_reaper.py, tests/test_merge_lease_reclaim.py, tests/test_merge_gate.py (TC-16), tests/test_merge_gate_race.py (TC-17), tests/test_queue.py, tests/test_config.py (TC-19/TC-20). 742 passed. Refs: ORCH-065 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 16:14:45 +00:00
claude-bot	2f4c553fd8	feat(post-deploy): post-deploy prod monitoring + degradation reaction (ORCH-021) Extend pipeline responsibility past deploy->done: after the terminal transition for an applicable repo, arm a ~15min observation window that probes prod and reacts to a degradation the restart-time health-check missed ("green deploy, red prod"). - src/post_deploy.py: new leaf module (config + lazy qg/db only). Sentinel-file restart-safe state (.post-deploy-state-<repo>/<wi>/), no DB migration. probe_signals/classify/decide_action/run_rollback, all never-raise. - Reserved-agent job `post-deploy-monitor` (no-LLM, Variant B, calque of deploy-finalizer): self-requeues each tick via enqueue_job. - Deterministic classify: DEGRADED iff >= fail_threshold consecutive health failures OR window 5xx ratio > 5xx_threshold; fail-safe HEALTHY. - Self-hosting invariant (BR-5/AC-8): a tick NEVER restarts the prod orchestrator container -> orchestrator is ALWAYS ALERT_ONLY. - Conditionality (ORCH-35/36/43/58): kill-switch + CSV repos, empty -> self-hosting only. - QG_CHECKS / STAGE_TRANSITIONS / schema unchanged (AC-12). - Docs: CHANGELOG, CLAUDE artefact list (16-post-deploy-log.md), architecture README, .env.example (ORCH_POST_DEPLOY_*). Refs: ORCH-021 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 14:40:06 +00:00
claude-bot	9070489968	fix(staging): tolerate sandbox-infra-only FAILs (C9a/C9b) in deploy-staging verdict The self-hosting orchestrator looped on deploy-staging -> development because scripts/staging_check.py exited 1 on ANY failed check, so two infra-only checks (C9a sandbox branch / C9b analyst-job, caused by SANDBOX bot accounts not being members of the sandbox Plane project, NOT a pipeline regression) forced staging_status: FAILED -> rollback -> loop, burning developer retries and tokens. Direction (b) per ADR-001: classify staging checks as REAL (all pipeline checks, fail-closed) vs SANDBOX_INFRA (narrow allowlist {C9a, C9b}, waivable). New leaf module src/staging_verdict.py (stdlib-only, never-raise): classify_check + compute_staging_verdict fold per-check results into a tolerant-but-fail-closed verdict: any REAL failure -> FAILED/exit1 (safety net holds under any flag); only C9a/C9b failed & tolerant -> SUCCESS/exit0 with waived list; only infra & strict -> FAILED/exit1; any internal error -> FAILED/exit1 (never a false green). staging_check.py now auto-classifies each check (public 3-tuple _items shape kept as an ORCH-048 b6 regression guard), exposes categorized_items(), prints INFRA-WAIVED/VERDICT lines, and exits via the verdict; new --strict flag forces legacy strictness per-run. Kill-switch ORCH_STAGING_INFRA_TOLERANCE_ENABLED (default true) restores legacy strict mode globally. launcher gains action_stage_no_changes_note so "no changes to commit" on action stages is logged as expected, not treated as under-delivery. Contracts unchanged: STAGE_TRANSITIONS, QG_CHECKS registry, staging_status:/ deploy_status: frontmatter, hook exit-code (0/1/2), check_staging_status; no DB migration. Docs: README, STAGING_CHECK.md, deployer.md, .env.example, CHANGELOG. Refs: ORCH-061 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 12:39:00 +00:00
claude-bot	4db8276f98	fix(reconciler): skip escalated / Blocked / Needs-Input tasks in F-1 Reconciler F-1 could not tell "stuck by a lost webhook" from "escalated: max developer retries reached, waiting for a human". With CI green and a reviewer that kept sending REQUEST_CHANGES up to the cap, every tick re-unblocked development -> review -> rollback -> re-unblock (incident ET-013, infinite bounce: wasted agent runs, Telegram spam, parasitic load on the shared self-hosting instance). Add two pre-gate guards in Reconciler._reconcile_gate_task (after the existing analysis/no-gate/active-job/grace guards, before the gate pre-evaluation), each an early silent return (no advance, no unblocked_total increment, no notifications): - Guard 1 (escalated, deterministic, no network, checked first): developer_retry_count(task_id) >= MAX_DEVELOPER_RETRIES. Promote stage_engine._developer_retry_count to public developer_retry_count (single source of truth; private alias kept). Limit from the constant, not a literal 3. - Guard 2 (explicit human Plane gate, Variant A, no DB migration): new never-raise plane_sync.fetch_issue_state + Reconciler._is_blocked_or_needs_input; any error/None/unresolved project -> conservative skip. New sub-flag ORCH_RECONCILE_SKIP_BLOCKED_ENABLED mutes only the networked Guard 2. F-2 unchanged: Blocked/Needs Input are outside {in_progress, approved, rejected} so they are never replayed (regression test added). DB schema, STAGE_TRANSITIONS, QG_CHECKS, never-raise, analysis carve-out and kill-switches untouched. Refs: ORCH-060 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 11:50:02 +00:00
claude-bot	6ddff5583d	fix(ORCH-058): parametrize staging_check in --build-staging + explicit staging target Round-3 review follow-up on `c53d625` (P1/P2): - P1: --build-staging now runs staging_check via parametrized STAGING_CONTAINER / STAGING_CHECK_PATH / STAGING_CHECK_MODE (default orchestrator-staging / bind-mount path / stub) instead of hardcoding $TARGET_SERVICE + the script path. docker exec runs INSIDE the staging container (ORCH-048 canonical: B6 registry isolation), after health, before exit 0. Fail-closed: any non-zero -> exit 1. STAGING only (8501). - P2a: rebuild_staging_image now passes the STAGING target EXPLICITLY (TARGET_SERVICE/TARGET_PORT/COMPOSE_PROFILE/STAGING_CONTAINER) so the self-rebuild can never drift onto prod 8500 if hook defaults change (AC-9). - P2b: TC-09 caller<->hook contract tests assert the ssh command carries GIT_SHA + BUILD_CONTEXT + the staging target and never the prod 8500 one; no-ssh-host fails closed. - P3: consolidated the three duplicate README footers into one. - Docs (golden source): DEPLOY_HOOK.md step 4 + env rows, README footer, CHANGELOG, Dockerfile ARG GIT_SHA="" comment, .env.example freshness block. Validates exactly the artefact later BUILD-ONCE retagged to prod (AC-4, ADR-001 step 3). 632 tests pass, ruff clean, bash -n OK. Refs: ORCH-058 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 09:24:38 +00:00
claude-bot	2ee06ae676	fix(deploy-hook): --build-staging must build from validated worktree, recreate+health 8501 Closes reviewer P0/P1 (ORCH-058 attempt 3): the committed --build-staging hook recomputed GIT_SHA=$(git rev-parse HEAD) in $REPO (prod clone on `main`) and built `docker build ... "$REPO"`, ignoring the caller-supplied BUILD_CONTEXT/GIT_SHA. On the deploy-staging -> deploy edge the PR is not yet merged, so `main` HEAD != the validated SHA -> the staging image got the wrong revision label and Strategy-B's guard fail-closed on EVERY valid self-deploy (AC-6 deadlock). It also only did `docker build` + exit 0, never recreating 8501 nor health-checking, so rebuild_staging_image's rc=0 ("rebuilt and healthy") was a lie (AC-4 unmet). - Hook --build-staging now honours caller BUILD_CONTEXT (validated worktree) and GIT_SHA, recreates orchestrator-staging on the fresh image and runs the 10x6s health-check; build/health failure -> exit 1 (FAILED contract preserved). - image_freshness.rebuild_staging_image: document why COMPOSE_PROFILE/TARGET_SERVICE/ TARGET_PORT are intentionally omitted (hook STAGING defaults -> 8501 only, P2). - tests: assert the caller<->hook contract (builds from $BUILD_CONTEXT, no `git rev-parse HEAD` recompute, recreates + health-checks 8501) so the P0 regression can't pass green again (P1). Refs: ORCH-058 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 08:37:51 +00:00
claude-bot	83397570fe	developer(ET): auto-commit from developer run_id=264	2026-06-07 07:46:19 +00:00
stream	36c1898fac	Merge remote-tracking branch 'origin/main' into feature/ORCH-036-orch-36-deploy-b # Conflicts: # .env.example # CHANGELOG.md # docs/architecture/README.md # docs/operations/INFRA.md # src/config.py	2026-06-07 00:22:19 +03:00
claude-bot	d79defeadd	fix(deploy): clear stale self-deploy markers on rollback; document env Re-deploy after a FAILED prod deploy wedged the task on `deploy`: the sentinel markers (approve-requested/initiated/result) are keyed by the stable work_item_id, so after the BUG-8 rollback (deploy -> development) and a developer fix, Phase B's idempotency-guard saw a STALE `initiated` and became a no-op: the detached hook never re-launched and the finalizer was never enqueued. Add self_deploy.clear_state (never-raise, idempotent) and call it on the check_deploy_status FAILED rollback and at the start of Phase A, so every fresh prod-deploy pass starts clean. Also document the new ORCH_SELF_DEPLOY_* / ORCH_DEPLOY_* descriptors in the canonical .env.example (CLAUDE.md rule #8, technical spec §2.6), modelled on the ORCH-043 merge-gate block (placeholders only, secrets not committed). Contracts untouched: STAGE_TRANSITIONS, QG_CHECKS, _parse_deploy_status, BUG-8, merge-gate. Refs: ORCH-036 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-06 21:07:35 +00:00
claude-bot	63187ff102	developer(ET): auto-commit from developer run_id=192	2026-06-06 21:07:35 +00:00
claude-bot	7d2d77217a	feat(reconciler): sweeper for lost webhooks (reconciliation of stuck stages) The pipeline advances only on incoming webhooks; a lost event (a 502 during a rebuild, no retries from Plane/Gitea, an unresolved sha→branch) leaves the task silently stuck (incident class ORCH-044). A new background daemon thread src/reconciler.py (queue_worker pattern) replays the missed transition through the same standard gates/handlers as the webhook: - F-1 gate-side: for tasks with stage≠done, no active job and age(updated_at) ≥ grace_for_stage(stage), a read-only pre-evaluation of the canonical QG; green → stage_engine.advance_stage(..., finished_agent=None); red → silence (notification spam is structurally impossible). F-1 does not touch analysis (human gate). - F-2 plane-side: polls the Plane API per-project (plane_sync.list_issues_by_state, cursor pagination, never-raise) → replays In Progress/Approved/Rejected through the existing handle_status_start/handle_verdict (async from a sync thread, asyncio.run). - F-3: hardens sha→branch in handle_ci_status with a DB fallback to the single development task of the repo (ambiguity → do not resolve), debug→info. - Anti-duplicate on creation (db.create_task_atomic under a process-wide Lock): a reconcile↔webhook race does not spawn a second task/branch/worktree/analyst-job (AC-4). - F-4 observability: unblock log line + Telegram + reconcile block in /queue. Start/stop in main.lifespan (after worker.start() / before worker.stop()), restart-safe, never-raise per unit of work. Kill-switches ORCH_RECONCILE_ENABLED / ORCH_RECONCILE_PLANE_ENABLED + grace settings. The DB schema and the STAGE_TRANSITIONS/QG_CHECKS registries are unchanged. Tests: test_reconciler.py, test_reconciler_plane.py, test_gitea_sha_resolve.py, test_config.py (33 new, 563 total green). Documentation updated (golden source): architecture/README.md, INFRA.md, README.md, CHANGELOG.md, adr-0007 → accepted. Refs: ORCH-053 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-06 20:55:25 +00:00
claude-bot	00d69d9e27	feat(merge-gate): auto-rebase onto current main + re-test + serialise merges Deterministic (no-LLM) sub-gate on the deploy-staging -> deploy edge that catches a feature branch up to the CURRENT origin/main, re-tests the combined tree, and serialises merges with a per-repo file lease, so two green parallel branches can no longer break main (self-hosting safety for the orchestrator repo). - src/merge_gate.py: branch_is_behind_main, auto_rebase_onto_main (push --force-with-lease ONLY the task branch, NEVER main), retest_branch, and a file merge-lease (atomic O_CREAT|O_EXCL, holder-aware release, stale reclaim). Strict never-raise contract; all git ops in the per-branch worktree. - src/qg/checks.py: check_branch_mergeable composes the primitives under the lease; registered in QG_CHECKS. Conditional rollout (merge_gate_enabled / merge_gate_repos, default self-hosting only). - src/stage_engine.py: sub-gate hook on deploy-staging (not a new stage). PASS -> advance; "merge-lock busy" -> DEFER (re-queue with available_at, anti-deadlock at max_concurrency=1, capped); conflict/red re-test -> rollback to development + developer retry (capped by MAX_DEVELOPER_RETRIES). Lease released on deploy->done / rollback / PR-merged webhook. - src/db.py: enqueue_job(available_at_delay_s=...) for the defer (no schema change). - src/webhooks/gitea.py: holder-aware lease release on PR-merged. - src/config.py + .env.example: ORCH_MERGE_* settings. Docs: README + adr-0006 (architect) already cover the design; CHANGELOG updated. Tests: test_merge_gate.py, test_qg_merge_gate.py, test_merge_gate_race.py, test_stage_engine.py::TestMergeGate, test_config.py, QG-registry snapshot. Full suite: 535 passed. Refs: ORCH-043 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-06 17:32:50 +00:00
claude-bot	05c17135c1	feat(notifications): add bump mode + russify Telegram live-tracker ORCH-042: new ORCH_TRACKER_MODE (Settings.tracker_mode, default edit) selects the live-tracker card behaviour. bump mode re-creates the card at the bottom of the chat on every update (delete_telegram + send silently + repoint message_id), keeping the "one card per task" invariant: <=1 new message per call, repoint only on successful send, delete result never gates the send. New never-raising delete_telegram helper. Anything != "bump" resolves to edit (zero regression). Also russify/cosmetic-fix the card text (both modes): "Подтверждение BRD" label, ✅ after approve-gate, Russian stage labels, "📦 Внедрено". Docs updated in the same PR (CHANGELOG, internals.md, .env.example). Refs: ORCH-042 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-06 10:13:49 +00:00
claude-bot	03c3d77cac	feat(stage-engine): embed verbatim reviewer/tester findings in rollback task_desc On rollbacks to development, task_desc now carries the verbatim must-fix text (the reviewer's P0/P1, the tester's FAIL reason) instead of a single file link, so the developer agent sees the substance of the findings immediately and does not repeat the same mistake, saving the retry budget and tokens of the shared instance. - New defensive module src/review_parse.py (never-raise): extract_review_findings (P0/P1 from 12-review.md ## Findings), extract_test_failures (a fragment of the 13-test-report.md body: pytest output / FAIL lines / Итог (summary)), truncation to a limit. - Two stage_engine rollback branches: embed the text + keep the link to the full file; graceful fallback to the link line on a broken/empty artifact. - The rollback sequence, retry counter, AdvanceResult fields and QG_CHECKS registry are unchanged. - Docs: README (Stage Engine / Rollbacks), CHANGELOG. - Tests: tests/test_review_parse.py, test_stage_engine.py::TestRollbackTaskDescEmbedding. Refs: ORCH-046 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-06 04:42:11 +00:00
claude-bot	51a76e8169	fix(qg): read result: alongside verdict:/status: in tests gate _parse_tests_verdict now accepts three equal-rank machine-readable frontmatter fields in 13-test-report.md: result: (canonical tester output), verdict: and status: (legacy/enduro-trails). Any one non-empty field suffices; a negative token in any field stays authoritative. Fixes the producer/consumer contract mismatch where the tester emits `result: PASS` (per .openclaw/agents/tester.md) but the gate only read verdict:/status:, causing a testing->development rollback loop until MAX_DEVELOPER_RETRIES (observed on ORCH-17). Token sets frozen and gate signature/QG_CHECKS unchanged for full backward compatibility. Refs: ORCH-047 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-05 21:03:32 +00:00
stream	0eff781d13	feat(qg): ORCH-045 - poll check_ci_green with retry to fix CI race (pending->success)	2026-06-05 19:59:06 +00:00
stream	d615747d53	revert(ORCH-017): drop shared check_tests_passed gate change - moved to ORCH-47 (own ADR); keep only approve-ping links	2026-06-05 19:28:27 +00:00
claude-bot	e62d51aa77	fix(qg): testing gate reads documented tester `result:` frontmatter key (ORCH-017) check_tests_passed/_parse_tests_verdict gated the testing -> deploy-staging transition on `verdict:`/`status:` in 13-test-report.md, but the tester agent prompt (.openclaw/agents/tester*) documents `result: PASS | FAIL` as THE machine-readable field. A report that followed the contract literally (ORCH-017: only `result: PASS`, no verdict:/status:) was bounced back to development with a misleading "Tests FAILED". ORCH-016 only passed because its report redundantly carried both `verdict:` and `result:`. Treat `result:` as a first-class machine field alongside verdict/status; a negative token in any field stays authoritative (ET-013 contract preserved). Self-hosting QG fix: unblocks every project whose tester emits only `result:`. Docs updated in-PR: CHANGELOG, architecture README machine-keys note. Tests: test_qg.py::TestCheckTestsPassed::test_result_pass_only_passes / _fail_only_fails. Refs: ORCH-017 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-05 18:34:25 +00:00
claude-bot	69a4aaab99	feat(notifications): direct BRD + Plane links in approve ping (ORCH-017) notify_approve_requested now embeds two HTML <a> links into the single notifying approve-gate message: a Gitea branch-view link to 01-brd.md and a Plane issue browser link. Adds ORCH_PLANE_WEB_URL (external Plane web URL, fallback to plane_api_url) with a loopback-guard that omits the Plane link when the resolved base is localhost/empty (no broken localhost URLs in prod). Each link is built independently and omitted on missing data; the message and the "flip to Approved" call to action are always sent as exactly one ping. The shared send_telegram helper is left untouched (min blast radius for the self-hosting prod container). Dynamic labels are html.escaped; parse_mode=HTML preserved. QG registry / stages / approve handler unchanged. Docs updated in-PR: CHANGELOG, .env.example, INFRA env map. Tests: test_notify_approve_links.py, test_analysis_approve_flow_links.py. Refs: ORCH-017 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-05 17:58:00 +00:00
Slava	401bf66fe0	feat(agents): configurable LLM model + effort per-agent and per-project (ORCH-41) (#36)	2026-06-05 19:45:19 +03:00
Slava	8da571de86	feat(plane): unified status-comment format with duration line (ORCH-016) (#34)	2026-06-05 17:50:47 +03:00
Dev Agent	00325bcab0	fix(plane): resolve issue states per-project instead of hardcoded enduro UUIDs (ORCH-10) ORCH-10 root cause: PLANE_STATES was a global dict hardcoding enduro-trails UUIDs. The webhook comparison only matched ET UUID (b873d9eb) and silently ignored the ORCH in_progress UUID (e331bfb3), blocking pipeline start for all orchestrator-project tasks. Changes: - src/plane_sync.py: * Rename PLANE_STATES -> _DEFAULT_STATES (enduro UUIDs kept as safe fallback). * PLANE_STATES preserved as alias to _DEFAULT_STATES (backward compat). * Add get_project_states(project_id) -> {logical_key: state_uuid}: fetches Plane API GET /projects/<id>/states/, maps by state name, caches per project_id, falls back to _DEFAULT_STATES on API failure. * Add _STATES_CACHE: dict, reload_project_states(project_id=None). * Add _PLANE_NAME_TO_KEY mapping and _STAGE_TO_STATE_KEY for clean lookup. * Add stage_to_state(stage, project_id) using get_project_states(). * update_issue_state() uses stage_to_state() instead of STAGE_TO_STATE dict. * set_issue_{needs_input,in_review,blocked,done,in_progress,stage_state}() all resolve state UUID via get_project_states(project_id) instead of the global PLANE_STATES dict. - src/webhooks/plane.py: * handle_issue_updated: import get_project_states, resolve proj_states per incoming project_id, compare new_state against proj_states["in_progress"], proj_states["approved"], proj_states["rejected"]. * start_pipeline QG-0 blocked path: use get_project_states(plane_project_id) instead of PLANE_STATES["blocked"]. - tests/test_orch10_states.py: 23 new tests covering: * get_project_states returns correct UUIDs for both ET and ORCH projects. * API failure / empty response / None project_id -> _DEFAULT_STATES fallback. * Caching and reload_project_states (per-project and full flush). * stage_to_state() per-project resolution. * Webhook in_progress triggers pipeline for BOTH b873d9eb (ET) and e331bfb3 (ORCH). * Webhook approved/rejected routes correctly per project. * PLANE_STATES alias and _DEFAULT_STATES backward compat.	2026-06-05 14:23:31 +03:00
Dev Agent	e0c14fae5f	fix(pipeline): make deploy-staging gate conditional on self-hosting repo (ORCH-35)	2026-06-05 10:36:46 +03:00
Dev Agent	e0b6e92b09	feat(pipeline): add deploy-staging gate before prod deploy (ORCH-35)	2026-06-05 10:06:06 +03:00
dev-agent	757745a221	fix(qg): gate testing->deploy on machine-readable test verdict, not substring (ET-013) check_tests_passed did "if PASS in content" over the whole 13-test-report.md body, so a report explicitly marked verdict: BLOCKED / status: blocked whose prose mentioned "23 passed" / "PASS" / "All checks passed" passed the gate. On ET-013 an unfinished feature (P1 AC-19 failed) reached Done. Now mirrors check_reviewer_verdict (S-5) and check_deploy_status: read ONLY the YAML frontmatter verdict:/status: fields. Positive tokens (PASS/PASSED/ READY-TO-DEPLOY/GREEN/APPROVED) -> True; negative tokens (BLOCKED/FAILED/...) are authoritative -> False; missing/empty/no-frontmatter/bad-YAML -> False with reason; file missing -> not found. Never raises. Positive token set derived from REAL enduro-trails reports ET-001..ET-014 (inconsistent: PASS, ready-to-deploy+status:PASSED, stage:ready-to-deploy+status:pass, PASS — ready-to-deploy). Validated: all 9 prior passing WIs stay True, ET-013 -> False.	2026-06-04 16:05:52 +03:00
dev-agent	4e4cc6c724	fix(qg): find 14-deploy-log.md in origin/main when absent in feature worktree ET-013: deployer writes 14-deploy-log.md and merges deploy artifacts into main via a separate PR, so the log lands in origin/main, not the feature branch worktree that check_deploy_status reads via _repo_path(repo, branch). Result: every successful deploy was falsely failed (Deploy log not found) and rolled back deploy->development. Fix: when the log is absent in the worktree, fall back to reading it from origin/main on the shared clone (git fetch origin main + git show origin/main:docs/work-items/<WI>/14-deploy-log.md). Lookup order: worktree -> origin/main -> not found. Fetch/show failures degrade to not found (never raise). Does not touch the merge-gate in gitea.py. Tests: origin/main SUCCESS->PASS (ET-013 case), origin/main FAILED->FAILED, absent everywhere->not found, fetch failure->degrades no exception, worktree log short-circuits main lookup.	2026-06-04 13:35:35 +03:00
dev-bot	ec9aa74492	fix(tracker): no duplicate Telegram messages on not-modified/transient edits edit_telegram now returns a distinguishable outcome (ok|not_modified|gone|failed) instead of a bare bool. update_task_tracker only sends a NEW message when the original is truly gone; not_modified and transient failures no longer spawn duplicate trackers or orphan the live one. render_task_tracker shows "попытка N" on an actively re-run stage (>=2 agent runs) so the text changes between review<->development cycles. Finished (✅) lines are unchanged. Tests: edit_telegram classification (ok/not_modified/gone/failed via mocked httpx), update_task_tracker (not_modified/failed -> no send, gone -> send+id), render attempt marker.	2026-06-04 13:20:40 +03:00
dev-bot	9a0298de9d	feat(telegram): live editable task tracker (Variant B+), replace 15-message spam Replace the ~15 separate Telegram messages per task (agent start/finish, stage transition, QG-pending, tech noise) with ONE live tracker message edited in place (editMessageText) on every stage transition. Only attention-worthy events are still sent as SEPARATE, notifying messages: approve-gate, deploy-fail, agent-fail, task error. - db.py: idempotent ALTERs — tasks.tracker_message_id, tasks.title, tasks.brd_review_started_at/ended_at, agent_runs.model. Helpers for tracker message_id + BRD-review clock. - usage.py: short_model_name() (strip provider/claude- prefix); parse model from result-JSON modelUsage; record_usage persists model. - notifications.py: render_task_tracker(task_id) (stateless render from agent_runs), update_task_tracker (sendMessage->store id->editMessageText with fallback to a new message, silent), edit_telegram(). Per-stage line in↓/out↑·cost·model, ⏸️ Ревью БРД (human time), 💰 totals, finish block (⏱️ wall/agents/yours, 🔗 PR · 📦). notify_* are now tracker-only/log-only except the four alerts. - stage_engine.py: stamp brd_review_ended on analysis->architecture advance. - webhooks/plane.py: persist task title on creation. - tests/test_telegram_tracker.py: render, short_model_name, send/edit/fallback, separate-vs-silent alert behavior.	2026-06-04 11:42:46 +03:00
Dev Agent	61e26a8930	fix(observability): merge-gate on deploy, full token input, Plane Done, artifact links 1. BUG 8 (second door): merge webhook no longer fake-completes a task at the deploy stage; done is gated by the deployer verdict (check_deploy_status). Other stages keep merge->done. 2. Token accounting: parse+persist cache_creation_input_tokens (new idempotent agent_runs column). usage_comment / task_summary now show the FULL input (input + cache_read + cache_creation) with a cached breakdown. cost_usd untouched. 3. deploy->done success now forces the Plane issue to terminal Done state. 4. All agents (architect/developer/reviewer/tester/deployer) attach artifact links to their finish comment via gitea_public_url. Tests added for each fix; pytest 244 passed / 9 failed (off-limits HMAC group).	2026-06-04 11:17:58 +03:00
dev-agent	e4a9c48395	fix(deploy): gate deploy->done on deployer verdict, not LLM exit code	2026-06-04 02:43:01 +03:00
Dev Agent	3a285de11d	fix(ci): bounce task back to developer on red CI (capped retries)	2026-06-04 01:39:40 +03:00
Dev Agent	e15d339b14	fix(qg): use check_ci_green instead of local tests on development stage	2026-06-04 01:22:43 +03:00
orchestrator-dev	90c9ffe839	fix(qg): run pytest directly instead of make in check_tests_local	2026-06-04 00:43:04 +03:00
Dev Agent	0b8013cb06	fix(stage): approved verdict advances analysis->architecture instead of re-running gate	2026-06-03 23:30:08 +03:00
Dev Agent	ca63bc26bb	feat(config): external gitea_public_url for clickable doc links	2026-06-03 22:58:18 +03:00
dev-agent	a9cdb17614	feat(plane): analyst comment asks for Approved status + links docs The analyst ready-comment used the obsolete :approved: wording (comment-based approve was removed in PR #12). Rewrite it for the status-only model: ask the stakeholder to move the issue to Approved (reject = reason comment + Rejected), and add clickable Gitea links to the analyst docs that actually exist in the worktree.	2026-06-03 22:42:53 +03:00
dev-agent	96c5e6b2f9	fix(pipeline): fetch issue name from Plane API on status-trigger start issue.updated ships only the changed fields, so name was absent and the branch slug became feature/<id>-untitled. Add fetch_issue_fields (single issue-detail GET returning name+description, reusing the endpoint/token of fetch_issue_description) and pull the name above the slug build. Empty name still falls back to untitled.	2026-06-03 22:42:53 +03:00
dev-agent	b91be74692	fix(pipeline): pass issue description to analyst task file start_pipeline built the analyst .task.md with only the Title, so the analyst received a ~101-byte file and reported the business request as empty even though the description was already fetched. Append the resolved description to task_desc.	2026-06-03 22:42:02 +03:00
Dev Agent	857bad314c	feat(webhook): pull reject reason from latest comment handle_verdict(rejected): the reason is now pulled from the issue's latest Plane comment (_latest_comment_reason: GET comments, newest by created_at, HTML stripped) instead of a fixed stub. Slava writes the reason in a comment before flipping the status to Rejected. Falls back to a fixed note when there is no comment / the API call fails. tests: add test_status_only_verdict.py (test_inreview_comment_does_not_revert [bug 3 root], test_any_comment_no_pipeline_action, test_approved_status_advances_without_inprogress_reset, test_rejected_status_pulls_reason_from_comment) and test_inprogress_from_needs_input_relaunches_analyst in test_status_trigger.py. Rewrote the comment-based tests (test_verdict_status, test_plane_approved/rejected in test_webhooks) under the status-only model: comments are no-ops, verdicts come from status changes.	2026-06-03 22:18:24 +03:00
Dev Agent	c4be50ee20	fix(webhook): drop redundant in_progress reset on Approved handle_verdict(approved): removed set_issue_in_progress(work_item_id) before _try_advance_stage. _try_advance_stage -> advance_stage -> plane_notify_stage already PATCHes the issue to the NEXT stage status, so the reset only made the board flicker In Progress before the next stage (part of bug 3).	2026-06-03 22:18:13 +03:00
Dev Agent	6b3e144949	fix(webhook): remove comment-based approve, keep status-only verdict Status-only verdict model: comments NEVER drive the pipeline. Removed the whole comment-based control mechanism from handle_comment (:approved: / :rejected: / answer-to-questions) which caused bug 3 (echo self-hit): the analyst posts its own "waiting for approval" comment, handle_comment catches its own comment and reverts In Review -> In Progress. handle_comment is now a pure logger with no side effects. handle_status_start: a return to In Progress on an EXISTING task (Slava answered the analyst questions in Needs Input) now RELAUNCHES the stage agent instead of being a no-op. Distinguished from a duplicate In Progress webhook via has_active_job_for_task() (new db helper): no active job => agent idle => relaunch; active job => busy => skip (no double launch).	2026-06-03 22:18:02 +03:00
Dev Agent	ac9f5a05a6	fix(work-item): prevent work_item_id collision and bind branch per task ET-006 was handed to two different tasks because M-6 derives work_item_id from the Plane sequence_id, which can collide -> the two tasks shared a branch/worktree slug prefix and stepped on each other. 2a: ensure_unique_work_item_id() is a uniqueness-guard LAYERED ON TOP of the M-6 derive (derive is untouched): if the derived ET-NNN already exists in tasks for the repo, it walks forward to the next free number. Applied in start_pipeline after the derive. 2b (defense-in-depth): worktree is keyed by branch; if the resulting branch is already owned by another task in the repo, disambiguate it with the unique work_item_id + plane id so two tasks can never share a worktree.	2026-06-03 21:12:51 +03:00
Dev Agent	fa746105fd	fix(webhook): fetch description from Plane API on status-start Plane issue.updated (status -> In Progress) ships only changed fields, so the webhook payload has no description and QG-0 wrongly blocked issues. start_pipeline now pulls the full description from the Plane issue detail API (reusing the same GET endpoint + shared token as fetch_issue_sequence_id) when the payload field is empty/short, before QG-0 runs. Empty API -> honest QG-0 fail (truly empty ticket).	2026-06-03 21:12:38 +03:00
Dev Agent	9a702a0216	feat(metrics): per-agent token/cost accounting Feature 4. claude is now launched with --output-format json; the run-log trailing result JSON is parsed (defensively, never fatal) for usage + total_cost_usd. New idempotent ALTERs add input_tokens/output_tokens/cache_read_tokens/cost_usd to agent_runs; the launcher monitor records usage per run, posts a per-agent finish comment under that agent bot (e.g. Developer gotov · 45.2k in / 12.1k out · $0.21), and the deployer posts an end-of-task summary (SUM over agent_runs GROUP BY agent) on done. New src/usage.py holds parse/format/record/summary helpers; test_usage.py covers parsing a real CLI JSON blob, NULL-on-garbage, recording, formatting, and the per-task aggregate.	2026-06-03 18:18:46 +03:00
Dev Agent	09b1c5e1b9	feat(webhook): start pipeline on In Progress status (not on create) Feature 1. work_item.created no longer starts the pipeline (soft QG-0 log only); the issue stays in the backlog until moved to In Progress. The pipeline-start body is extracted into start_pipeline(); a new issue updated handler routes a state change to In Progress -> handle_status_start, which is idempotent: an existing task for the plane_id is NOT re-created or restarted (protects handle_comment, which also flips issues to In Progress). Real Plane payload: event=issue, action=updated, data.state.id. Existing m6/plane_webhook/dedup tests updated to drive the new trigger; new test_status_trigger.py covers created-no-op / start / idempotent.	2026-06-03 18:18:26 +03:00
Dev Agent	a4668c0303	feat(plane): stage visibility on board + verdict status UUIDs Feature 3 + Feature 2 infra. Extend the global PLANE_STATES with the 6 new enduro status UUIDs (architecture/development/review/testing + approved/rejected), remap STAGE_TO_STATE so the 4 mid-pipeline stages move the issue across its own board column instead of all sitting in In Progress, and add the set_issue_stage_state() helper. Needs Input / In Review / Blocked keep their own explicit setters and stay higher priority. TODO(ORCH-10): statuses are per-project; resolve per project when more projects are onboarded.	2026-06-03 18:18:17 +03:00
Dev Agent	d305521067	feat(plane): per-agent bot authorship for comments add_comment now accepts an optional author (agent role) and POSTs under the matching Plane bot token via _headers_for(), so Plane shows the real author (Analyst/Architect/Developer/Reviewer/Tester/Deployer/Stream) instead of a single shared account. Unknown/empty roles or missing tokens fall back to the shared orchestrator token (autonomy preserved). GET/PATCH (find_issue_id, set_state) are unchanged and stay on the shared token. Call sites in stage_engine, launcher, webhooks/plane and the plane_sync notify helpers now pass author by stage role; stage transitions use stream. Adds tests/test_plane_author.py.	2026-06-03 10:53:25 +03:00
Dev Agent	30d6dd0557	feat(config): add per-agent Plane bot token settings Add 7 optional bot-token fields (plane_bot_analyst..stream) read from the ORCH_PLANE_BOT_* env vars, default empty. Required for per-agent comment authorship; empty values fall back to the shared orchestrator token.	2026-06-03 10:53:17 +03:00

89 Commits