Commit Graph

13 Commits

Author SHA1 Message Date
7d2d77217a feat(reconciler): sweeper потерянных webhook (реконсиляция застрявших стадий)
Конвейер продвигается только входящими webhook; потерянное событие (502 на
ребилде, отсутствие ретраев у Plane/Gitea, неразрезолвленный sha→branch)
оставляет задачу молча застрявшей (класс инцидента ORCH-044). Новый фоновый
daemon-поток src/reconciler.py (паттерн queue_worker) доигрывает пропущенный
переход через те же штатные гейты/обработчики, что и webhook:

- F-1 gate-side: для задач stage≠done, без активного job и age(updated_at) ≥
  grace_for_stage(stage) — read-only пред-оценка канонического QG; зелёный →
  stage_engine.advance_stage(..., finished_agent=None); красный → тишина (спам
  нотификаций структурно невозможен). analysis F-1 не трогает (человеческий гейт).
- F-2 plane-side: опрос Plane API per-project (plane_sync.list_issues_by_state,
  курсорная пагинация, never-raise) → реплей In Progress/Approved/Rejected через
  существующие handle_status_start/handle_verdict (async из sync-потока, asyncio.run).
- F-3: усиление sha→branch в handle_ci_status — БД-fallback по единственной
  development-задаче repo (неоднозначность → не резолвим), debug→info.
- Анти-дубль на создании (db.create_task_atomic под process-wide Lock): гонка
  reconcile↔webhook не плодит второй task/branch/worktree/analyst-job (AC-4).
- F-4 observability: лог-строка разблокировки + Telegram + блок reconcile в /queue.

Старт/стоп в main.lifespan (после worker.start() / перед worker.stop()),
restart-safe, never-raise на единицу работы. Kill-switches ORCH_RECONCILE_ENABLED
/ ORCH_RECONCILE_PLANE_ENABLED + grace-настройки. Схема БД и реестры
STAGE_TRANSITIONS/QG_CHECKS не менялись.

Тесты: test_reconciler.py, test_reconciler_plane.py, test_gitea_sha_resolve.py,
test_config.py (33 новых, 563 всего зелёные). Документация обновлена (golden source):
architecture/README.md, INFRA.md, README.md, CHANGELOG.md, adr-0007 → accepted.

Refs: ORCH-053

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 20:55:25 +00:00
00d69d9e27 feat(merge-gate): auto-rebase onto current main + re-test + serialise merges
All checks were successful
CI / test (push) Successful in 15s
CI / test (pull_request) Successful in 17s
Deterministic (no-LLM) sub-gate on the deploy-staging -> deploy edge that
catches a feature branch up to the CURRENT origin/main, re-tests the combined
tree, and serialises merges with a per-repo file lease — so two green parallel
branches can no longer break main (self-hosting safety for the orchestrator repo).

- src/merge_gate.py: branch_is_behind_main, auto_rebase_onto_main (push
  --force-with-lease ONLY the task branch, NEVER main), retest_branch, and a
  file merge-lease (atomic O_CREAT|O_EXCL, holder-aware release, stale reclaim).
  Strict never-raise contract; all git ops in the per-branch worktree.
- src/qg/checks.py: check_branch_mergeable composes the primitives under the
  lease; registered in QG_CHECKS. Conditional rollout (merge_gate_enabled /
  merge_gate_repos, default self-hosting only).
- src/stage_engine.py: sub-gate hook on deploy-staging (not a new stage). PASS ->
  advance; "merge-lock busy" -> DEFER (re-queue with available_at, anti-deadlock
  at max_concurrency=1, capped); conflict/red re-test -> rollback to development
  + developer retry (capped by MAX_DEVELOPER_RETRIES). Lease released on
  deploy->done / rollback / PR-merged webhook.
- src/db.py: enqueue_job(available_at_delay_s=...) for the defer (no schema change).
- src/webhooks/gitea.py: holder-aware lease release on PR-merged.
- src/config.py + .env.example: ORCH_MERGE_* settings.

Docs: README + adr-0006 (architect) already cover the design; CHANGELOG updated.
Tests: test_merge_gate.py, test_qg_merge_gate.py, test_merge_gate_race.py,
test_stage_engine.py::TestMergeGate, test_config.py, QG-registry snapshot.
Full suite: 535 passed.

Refs: ORCH-043

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-06 17:32:50 +00:00
Dev Agent
61e26a8930 fix(observability): merge-gate on deploy, full token input, Plane Done, artifact links
1. BUG 8 (second door): merge webhook no longer fake-completes a task at the
   deploy stage; done is gated by the deployer verdict (check_deploy_status).
   Other stages keep merge->done.
2. Token accounting: parse+persist cache_creation_input_tokens (new
   idempotent agent_runs column). usage_comment / task_summary now show the
   FULL input (input + cache_read + cache_creation) with a cached breakdown.
   cost_usd untouched.
3. deploy->done success now forces the Plane issue to terminal Done state.
4. All agents (architect/developer/reviewer/tester/deployer) attach artifact
   links to their finish comment via gitea_public_url.

Tests added for each fix; pytest 244 passed / 9 failed (off-limits HMAC group).
2026-06-04 11:17:58 +03:00
Dev Agent
3a285de11d fix(ci): bounce task back to developer on red CI (capped retries) 2026-06-04 01:39:40 +03:00
Dev Agent
e15d339b14 fix(qg): use check_ci_green instead of local tests on development stage 2026-06-04 01:22:43 +03:00
Dev Agent
e6a7c6de8d feat(webhook): dedup deliveries by delivery_id (M-7) 2026-06-03 09:18:02 +03:00
Dev Agent
20d6556e22 refactor(webhooks): enqueue_job instead of in-process launch (ORCH-1)
All 8 webhook launch points (plane x4, gitea x4) now enqueue a job and return
immediately instead of synchronously spawning claude in the uvicorn process.
2026-06-02 23:58:44 +03:00
Dev Agent
a6f6a43c1c fix(webhooks/gitea): ignore pushes/events for repos outside the registry
ORCH-6: get_project_by_repo None -> ignored, so events for unknown repos
do not trigger the pipeline.
2026-06-02 22:30:42 +03:00
Dev Agent
1ebe8afc23 feat(worktree): git worktree per task to isolate shared /repos (ORCH-2 / S-4)
- add src/git_worktree.py: ensure/remove/get_worktree_path
- config: worktrees_dir=/repos/_wt
- launcher: agent runs in per-branch worktree; task-file + commit/push in worktree; no shared checkout
- qg/checks: read artifacts + run make test from worktree (branch arg, backward-compatible)
- webhooks/plane: pass branch into QG dispatch; review fallback from worktree
- webhooks/gitea: keep read-only branch --contains in main clone (documented)
- tests: test_git_worktree.py (isolation) + update test_launcher write-task-file
- docs: ARCHITECTURE worktree section + BUGFIXES_2026-06-02_ORCH2

Preserves B-1/B-2/S-1/S-5 fixes (paths now point at worktree).
2026-06-02 21:12:06 +03:00
Dev Agent
b585701c62 fix(webhooks): dispatch new QGs; stop false Gitea CI alerts (S-1)
- plane._try_advance_stage handles check_tests_local + check_reviewer_verdict
- gitea.handle_ci_status: failure -> debug log only (CI not authoritative)
2026-06-02 20:12:29 +03:00
Dev Agent
0ad56e1f0a fix: tini entrypoint, event routing wildcard, orphan recovery 2026-05-22 13:52:46 +03:00
Dev Agent
b545665e2d feat: full pipeline fixes - CI status branch lookup, review webhook routing, auto-advance, plane sync
- handle_ci_status: fallback git branch -r --contains when branches[] empty
- webhook router: handle pull_request_approved event type
- handle_pr: map review.type to review.state for new Gitea format
- launcher: auto-advance stage after agent completion (_try_advance_stage)
- plane_sync: notify Plane on stage changes
- stages.py: stage machine with QG definitions
- notifications.py: stage change notifications
- safe.directory fix for container git operations
2026-05-22 01:57:02 +03:00
Dev Agent
daf8cdad9e feat: orchestrator MVP — webhooks, agent launcher, QG checks 2026-05-19 15:57:00 +03:00