src/frontmatter.py grows from a single-key reader into the full machine
contract: reader (read_frontmatter_value, unchanged), one parse primitive
(parse_frontmatter), writer (render/write_frontmatter), schema validator
(validate_schema/REQUIRED_FIELDS, warning-only by default) and a shared
strip_frontmatter helper. The five verdict gates (check_reviewer_verdict,
_parse_tests_verdict, _parse_deploy_status, _parse_staging_status,
parse_security_status) now read through the single parse_frontmatter point
instead of duplicated ad-hoc YAML logic; review_parse._strip_frontmatter and
security_gate.extract_security_findings reuse the shared helper.
Strictly backward compatible + never-raise: STAGE_TRANSITIONS, the QG_CHECKS
composition, verdict semantics (incl. ORCH-047 three-field tester + negative
token priority), reason-strings and worktree->origin/main fallback are 1:1.
The schema validator never influences a gate verdict by default; hard-fail is
reserved behind the frontmatter_validation_strict kill-switch (default False).
New formal handoff spec docs/_standards/HANDOFF_PROTOCOL.md ("stage -> required
output" + required frontmatter schema), aligned 1:1 with PIPELINE_DOCS.md.
Tests: test_frontmatter.py (TC-01..07), test_qg_verdicts.py (TC-08..15),
test_security_gate.py (TC-12), test_stages_invariants.py (TC-16). Full
tests/ green (1212).
Refs: ORCH-076
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lift the two HUMAN gates that block an autonomous batch run (epic ORCH-088):
the BRD gate (analysis: manual Approved) and the prod-deploy gate (deploy
Phase A: manual Confirm Deploy, ORCH-059). Selective (a Plane label on the
issue), declarative, reversible, and WITHOUT touching a single technical check.
Additive, mirroring the conditional sub-gates (ORCH-035/043/058/088): leaf
src/labels.py (never-raise) + two point insertions + config flags.
STAGE_TRANSITIONS / QG_CHECKS / check_* / DB schema are NOT touched.
- autoApprove: врезка in _handle_analysis_approved_flow (files_ok branch) ->
set_issue_approved + log/Telegram/Plane-comment + advance_stage(
finished_agent=None) — the SAME path a human Approved takes (approved-via-
status -> analysis->architecture + mark_brd_review_ended). No duplicated
transition logic; re-entrancy safe.
- autoDeploy: врезка in _handle_self_deploy_phase_a after advance to deploy +
clear_state -> log/Telegram/Plane-comment + _handle_self_deploy_phase_b
(INITIATED marker, Deploying, finalizer). Only the indicative human steps are
skipped. BR-5 holds structurally: Phase A is reached only after the green edge
sub-gates, so autoDeploy can never deploy a broken build.
- plane_sync: fetch_issue_labels (None on error != []), get_project_labels
({normalized_name->uuid}, TTL cache, ambiguity sentinel), set_issue_approved.
- config flags: auto_label_enabled (kill-switch), auto_approve_label/
auto_deploy_label, auto_label_repos (empty -> self-hosting only),
auto_label_states_ttl_s. applies() (local) checked FIRST; has_label (network)
only when applies==True -> zero network / zero regression when disabled (AC-8).
- Fail-safe (never auto on doubt), transparency via log+Telegram+Plane+card,
read-only auto_labels block in GET /queue.
- Tests TC-01..TC-26 across 7 modules; docs (CLAUDE.md, architecture README,
CHANGELOG) updated in the same PR.
Refs: ORCH-089
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Этап 1 (serial e2e) пакетного автономного режима. Новая задача репо не входит
в analysis (analyst-job не выбирается, ветка не режется), пока в репо есть более
ранняя незавершённая задача (FIFO, t2.id < jobs.task_id) ИЛИ репо заморожен.
- src/serial_gate.py — новый leaf (never-raise): build_claim_clause (fail-OPEN),
is_repo_frozen (fail-CLOSED), set/clear_repo_freeze, serial_gate_applies, snapshot.
- src/db.py — идемпотентная миграция repo_freeze + serial_gate-фрагмент в claim_next_job.
- src/webhooks/plane.py + src/agents/launcher.py — отложенный срез ветки: start_pipeline
не создаёт Gitea-ветку/docs для применимого репо; релокация в _materialize_deferred_branch
на момент claim analyst-job (база = свежий origin/main с кодом предшественника, AC-6).
- src/stage_engine.py — post-deploy DEGRADED → durable per-repo freeze + Telegram-алерт.
- src/main.py — блок serial_gate в GET /queue + POST /serial-gate/unfreeze.
- src/config.py — serial_gate_enabled / serial_gate_repos / serial_gate_freeze_enabled.
FIFO-уточнение реализации (FR-2): ADR-001 D1 фиксировал t2.id != jobs.task_id; при !=
пакет одновременно созданных свежих задач взаимно блокировался бы (дедлок). t2.id <
jobs.task_id допускает самую раннюю задачу и сериализует остальные, сохраняя AC-1/R-7.
STAGE_TRANSITIONS / QG_CHECKS / check_* — без изменений. Аддитивно, под kill-switch,
never-raise, restart-safe; при выключенном флаге — нулевая регрессия (enduro не затронут).
Тесты: TC-01..TC-22 (test_serial_gate*.py + test_queue_endpoint.py); полный прогон 1114 зелёных.
Docs: README (serial gate / /queue / API / БД), CLAUDE.md, CHANGELOG.md, .env.example.
Refs: ORCH-088
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add "disable_web_page_preview": True to the JSON payload of both
low-level Telegram primitives — send_telegram (POST /sendMessage) and
edit_telegram (POST /editMessageText). Telegram no longer expands the
Plane "Modern project management" link-preview banner under every
tracker card (bump/edit) and notify/alert message, which the default
bump mode (ORCH-067) was duplicating on each transition.
Single-point fix at the primitive level — all consumers
(update_task_tracker, notify_approve_requested, notify_error, stage
alerts from launcher/stage_engine) inherit it without code changes.
parse_mode: HTML is preserved so the ORCH-NNN issue link stays
clickable; disable_notification, bump/edit logic, the one-card-per-task
invariant, return contracts and never-raise are untouched. Unconditional,
no kill-switch (ADR-001).
Tests: tests/test_link_preview_disabled.py (TC-01..06). Docs: CHANGELOG,
CLAUDE.md, docs/architecture/README.md (Notifications component).
Refs: ORCH-080
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
G1: remove the dead `model:` line from all 6 .openclaw/agents/*.md prompts —
launcher never read it; config (agent_model_*) is the single source of truth.
G2: add is_valid_model helper (format check ^claude-…$) applied inside
resolve_agent_model's resolution cascade and at the inline --fallback-model
read in _spawn. An invalid name is logged and skipped to the next valid level
(in the limit: no --model flag), never passed to the CLI, never raises. Format
check chosen over an allowlist for forward-compatibility (ADR-001).
G3 (routing) and G4 (fallback) intentionally NOT enabled — all agents stay on
claude-opus-4-8; agent_fallback_model stays "".
Docs (golden source) updated in the same change: README model/effort table +
validation, CLAUDE.md, .env.example (ORCH_AGENT_MODEL_*/EFFORT_*/FALLBACK_MODEL),
CHANGELOG. Tests: test_agent_frontmatter_no_model.py (G1), extended
test_resolve_agent_model.py (G2 never-break).
Refs: ORCH-074
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Level A — merge/deploy serialization within one repo: reuse the existing
ORCH-043/065 merge-lease (no new mechanism); the only new logic is an
unconditional pre-merge rebase in check_branch_mergeable — under the held
lease, auto_rebase_onto_main is ALWAYS called when premerge_rebase_always
(default True), not just when the branch is behind. No-op on an up-to-date
branch (rebase keeps HEAD, force-with-lease -> "Everything up-to-date", CI
not triggered). Kill-switch off -> ORCH-043 behaviour 1:1.
Level B — declarative task dependencies: additive job_deps table
(CREATE ... IF NOT EXISTS, no live-DB migration); claim_next_job gate
(NOT EXISTS) defers a job whose depends-on tasks are not yet 'done' without
occupying a max_concurrency slot; inert on empty job_deps -> zero regression.
New leaf src/task_deps.py (never-raise): is_task_ready (fail-open), DFS cycle
detection + Blocked/alert, declare/ingest_plane_relations (db source never
hits the network on the hot path), snapshot. Telegram waiting-line, /queue
observability, reconciler skip + cycle backstop, reaper untouched.
Invariants unchanged: STAGE_TRANSITIONS, QG_CHECKS registry (dep gate is a
claim_next_job врезка, not a registered QG), DB schema of existing tables,
HTTP endpoints; non-self repos remain a no-op on empty deps/scope.
Flags: ORCH_PREMERGE_REBASE_ALWAYS, ORCH_TASK_DEPS_ENABLED, ORCH_TASK_DEPS_SOURCE.
Docs: docs/architecture/README.md, CLAUDE.md, .env.example, CHANGELOG.md,
adr-0015. Tests: tests/test_orch026_*.py (64 tests); full suite 991 green.
Refs: ORCH-026
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Закрывает P0/P1 ревью (attempt 2/3): документация = golden source.
- CHANGELOG.md: запись ORCH-067 в [Unreleased] (bump-дефолт, статус-строка
карточки по модели ORCH-066, кликабельный номер задачи, новые флаги).
- CLAUDE.md: раздел «Нотификации / Telegram live-tracker» (ТЗ §5).
- .env.example: ORCH_TRACKER_MODE=bump (синхрон с новым дефолтом) +
ORCH_TRACKER_LIVE_STATUS / _TTL_S / _TIMEOUT_S.
Refs: ORCH-067
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Приводит статусы доски Plane к смыслу стадий конвейера, сохраняя
инвариант «статус — индикация, а не управление». Меняется только слой B
(отображение: src/plane_sync.py + точки выставления статуса в
stage_engine.py/webhooks/plane.py/reconciler.py); слой A — машина стадий
src/stages.py::STAGE_TRANSITIONS — остаётся байт-в-байт неизменным (AC-21).
- 6 новых логических ключей статуса (to_analyse, analysis, code_review,
awaiting_deploy, deploying, monitoring) + сеттеры и диспетчер
set_issue_stage_state.
- Project-relative alias-fallback (BR-12): новый ключ деградирует на
базовый UUID того же проекта → нулевая регрессия для enduro-trails.
- Самодеплой (ORCH-036) индицирует фазы: Awaiting Deploy / Deploying;
terminal-sync для self-hosting → Monitoring after Deploy, для прочих →
терминальный Done.
- Post-deploy монитор (ORCH-021): HEALTHY → Done, DEGRADED → Blocked
(только индикация; self-hosting ALERT_ONLY, прод не трогается, BR-5).
- Reconciler: триггер старта/резюма на To Analyse; Guard 2 учитывает
новые активные ожидания без расширения skip-set на алиасах.
- never-raise контракт сеттеров и резолвера состояний сохранён.
- Раскатка — созданием статусов в Plane оператором, без kill-switch.
Инварианты не менялись: STAGE_TRANSITIONS, QG_CHECKS (12 чеков),
check_deploy_status, exit-код-контракт хука, merge-gate, схема БД.
ADR: docs/work-items/ORCH-066/06-adr/ADR-001-plane-status-model.md
Тесты: test_plane_status_model, test_plane_to_analyse_resume,
test_plane_status_failclosed + TC в существующих наборах. 774 passed.
Refs: ORCH-066
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Split the overloaded `Approved` Plane status: it served BOTH as the human BRD
gate on `analysis` AND as the silent Phase B prod-deploy trigger on `deploy`
(ORCH-036), so a routine approve could launch a self-hosting prod restart.
ORCH-059 introduces a dedicated logical status `confirm_deploy` ("Confirm
Deploy") that triggers ONLY Phase B on `deploy`; `Approved` stays purely a
pipeline gate.
- plane_sync: map "Confirm Deploy" -> "confirm_deploy" in _PLANE_NAME_TO_KEY;
intentionally absent from _DEFAULT_STATES => fail-closed (no UUID -> .get
yields None, no KeyError, no blind deploy).
- webhooks/plane: handle_issue_updated routes "Confirm Deploy" (fail-closed
.get) to new handle_confirm_deploy (guarded to stage=="deploy") ->
_try_advance_stage(confirm_deploy=True).
- stage_engine: advance_stage gains keyword-only confirm_deploy=False; Phase B
block returns early for deploy+finished_agent is None but only initiates the
deploy when confirm_deploy=True; a plain Approved is a deterministic no-op
(returns before check_deploy_status -> no false БАГ-8 rollback).
- Phase A CTA now asks the operator for "Confirm Deploy", not "Approved".
Contracts unchanged: STAGE_TRANSITIONS, QG_CHECKS, check_deploy_status, hook
exit codes, Phases A/C, merge-gate, DB schema. Conditional like ORCH-35/36
(self-hosting only). Docs updated (CLAUDE.md, architecture/README.md, CHANGELOG).
Refs: ORCH-059
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a deterministic (no-LLM) security sub-gate on the deploy-staging -> deploy
edge, run FIRST (before merge-gate ORCH-043 and image-freshness ORCH-058) so it
fails cheaply before any expensive rebase/rebuild, and scans origin/main..HEAD
before rebase so a task is never blamed for a CVE introduced by an updated main.
Why: the autonomous pipeline merged branches into main with no check for a leaked
secret or a vulnerable dependency. For the self-hosting orchestrator (one shared
prod instance serving every project from a shared DB) a single leak/CVE landed in
the prod of all projects (CLAUDE.md self-hosting, section 8).
- New leaf src/security_gate.py (never-raise): gitleaks (offline, fail-closed on
tool error => secrets guarantee is unconditional) + pip-audit (best-effort;
unreachable CVE feed degrades fail-open + loud warning by default, strict via
security_dep_audit_fail_closed). Verdict lives ONLY in 17-security-report.md
YAML frontmatter (write -> read-back single source of truth); FAIL is
authoritative; missing/broken frontmatter => fail-closed.
- check_security_gate thin wrapper registered in QG_CHECKS (lazy import, no cycle).
- _handle_security_gate wired FIRST in advance_stage deploy-staging block: FAIL ->
rollback to development + developer-retry (cap MAX_DEVELOPER_RETRIES); task_desc
carries verbatim findings (ORCH-046 pattern). No merge-lease release (runs before
lease acquire). Self-hosting safe: only reads/scans/writes, never deploys.
- Conditional rollout (security_gate_enabled + security_gate_repos; empty scope ->
self-hosting only). 6 new ORCH_SECURITY_* settings.
- Infra: pinned gitleaks Go binary in Dockerfile (+curl/ca-certificates), pip-audit
in requirements.txt, versioned .gitleaks.toml at repo root.
- STAGE_TRANSITIONS and DB schema unchanged.
Docs: docs/architecture/README.md (marked realized), CLAUDE.md (artifact 17),
CHANGELOG.md. Tests: test_security_gate.py, test_qg_security.py,
test_stage_engine_security_gate.py + updated registry/edge snapshots.
Refs: ORCH-022
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Extend pipeline responsibility past deploy->done: after the terminal
transition for an applicable repo, arm a ~15min observation window that
probes prod and reacts to a degradation the restart-time health-check
missed ("green deploy, red prod").
- src/post_deploy.py: new leaf module (config + lazy qg/db only).
Sentinel-file restart-safe state (.post-deploy-state-<repo>/<wi>/),
no DB migration. probe_signals/classify/decide_action/run_rollback,
all never-raise.
- Reserved-agent job `post-deploy-monitor` (no-LLM, Variant B, calque of
deploy-finalizer): self-requeues each tick via enqueue_job.
- Deterministic classify: DEGRADED iff >= fail_threshold consecutive
health failures OR window 5xx ratio > 5xx_threshold; fail-safe HEALTHY.
- Self-hosting invariant (BR-5/AC-8): a tick NEVER restarts the prod
orchestrator container -> orchestrator is ALWAYS ALERT_ONLY.
- Conditionality (ORCH-35/36/43/58): kill-switch + CSV repos, empty ->
self-hosting only.
- QG_CHECKS / STAGE_TRANSITIONS / schema unchanged (AC-12).
- Docs: CHANGELOG, CLAUDE artefact list (16-post-deploy-log.md),
architecture README, .env.example (ORCH_POST_DEPLOY_*).
Refs: ORCH-021
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>