ORCH-044 closes two blind spots that let a single de-authenticated agent
stall the shared queue for all projects:
P1 — preflight auth gate. `claude --version` answers even when logged out,
so version-only preflight was blind to auth. Adds a token-free, network-free
check of <AGENT_HOME>/.claude/.credentials.json: missing/unreadable/no-oauth
or an expired `claudeAiOauth.expiresAt` (epoch ms, vs now + skew) => preflight
FAIL; absent expiry => OK (no false positives). Result is cached on the same
preflight_cache_ttl. Post-factum safety net: launcher detects auth markers
("not logged in" / "/login" / "unauthorized" / 401) in the run log and resets
the preflight cache so the next tick re-evaluates auth. Auth failure is a gate,
not a transient — it does not spin the circuit breaker. Emergency toggle
ORCH_PREFLIGHT_CHECK_AUTH=false restores version-only behaviour.
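A minimal sketch of the credentials check described in P1, not the actual preflight code. Only the `claudeAiOauth.expiresAt` field (epoch ms) and the pass/fail rules come from this message; the function names `evaluate_oauth` and `check_auth`, the 60-second default skew, and the treatment of a non-numeric expiry as "absent" are assumptions made for illustration.

```python
import json
import time
from pathlib import Path
from typing import Any, Optional, Tuple


def evaluate_oauth(data: Any, now_ms: int, skew_seconds: int = 60) -> Tuple[bool, str]:
    """Apply the P1 rules to an already-parsed credentials document."""
    oauth = data.get("claudeAiOauth") if isinstance(data, dict) else None
    if not isinstance(oauth, dict):
        return False, "no oauth section"
    expires_at = oauth.get("expiresAt")
    # Absent expiry => OK. Assumption: a non-numeric value is treated the
    # same way, to avoid false positives.
    if isinstance(expires_at, bool) or not isinstance(expires_at, (int, float)):
        return True, "no usable expiry"
    # expiresAt is epoch milliseconds; compare against now + skew.
    if expires_at <= now_ms + skew_seconds * 1000:
        return False, "token expired"
    return True, "ok"


def check_auth(credentials_path: Path, skew_seconds: int = 60,
               now_ms: Optional[int] = None) -> Tuple[bool, str]:
    """Token-free, network-free check: only reads a local file."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    try:
        data = json.loads(credentials_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON all fail the gate.
        return False, "credentials missing or unreadable"
    return evaluate_oauth(data, now_ms, skew_seconds)
```

Splitting the pure rule evaluation from the file read keeps the rules testable without a real credentials file.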
P3 — empty log / no result-JSON => job failed. exit_code==0 with an empty or
JSON-less run log no longer counts as success: a separate result_ok flag gates
stage advance + usage comments, fires a Telegram alert, and routes the job
through the normal transient/permanent failure path (exit_code integrity in
agent_runs preserved).
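The P3 rule can be sketched as follows, under the assumption that a result-JSON is a log line that parses as a JSON object; the real detection logic is not shown in this message, and `has_result_json` and `result_ok` as written here are illustrative helpers only.

```python
import json


def has_result_json(log_text: str) -> bool:
    """Assumption: a result-JSON is any line that parses as a JSON object."""
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            if isinstance(json.loads(line), dict):
                return True
        except ValueError:
            continue
    return False


def result_ok(exit_code: int, log_text: str) -> bool:
    """exit_code == 0 alone is not success; the log must carry a result."""
    return exit_code == 0 and has_result_json(log_text)
```

Keeping `result_ok` separate from `exit_code` means the stored exit code stays untouched while stage advance is gated on the stricter flag.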
Scope: P2 (--effort) is intentionally excluded and tracked in ORCH-50.
New settings: ORCH_PREFLIGHT_CHECK_AUTH, ORCH_CLAUDE_CREDENTIALS_PATH,
ORCH_AUTH_EXPIRY_SKEW_SECONDS. Docs updated (INFRA.md, internals.md, CHANGELOG).
Refs: ORCH-044
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Staging suite run inside orchestrator-staging via docker exec (canonical,
ADR-001). All 10 checks pass, exit 0. B6 now reads the registry from the
running staging instance's own process-env -> sandbox present, prod ET/ORCH
absent, no false FAIL / spurious rollback.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Staging instance (8501) still runs a pre-ORCH-048 image without GET /projects,
so B6 deterministically FAILs (endpoint unavailable → no false PASS). Branch
code is correct; remediation is a host-side `--profile staging up -d --build`
of orchestrator-staging before re-running the gate.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
On rollbacks to development, task_desc now carries the verbatim must-fix text
(the reviewer's P0/P1 findings, the tester's FAIL reason) instead of a bare
file link, so the developer agent sees the substance of the complaints
immediately and does not repeat the same mistake, saving retry budget and
tokens on the shared instance.
- New defensive module src/review_parse.py (never-raise): extract_review_findings
(P0/P1 from the ## Findings section of 12-review.md), extract_test_failures
(a fragment of the 13-test-report.md body: pytest output / FAIL lines / the
"Итог" summary), truncation to a length limit.
- Two rollback branches in stage_engine: they embed the text and keep the link
to the full file; graceful fallback to the link-only string when the artifact
is broken or empty.
- The rollback sequence, retry counter, AdvanceResult fields, and QG_CHECKS
registry are unchanged.
- Docs: README (Stage Engine / Откаты, i.e. Rollbacks), CHANGELOG.
- Tests: tests/test_review_parse.py, test_stage_engine.py::TestRollbackTaskDescEmbedding.
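A rough sketch of the never-raise extraction and the link fallback listed above. The `## Findings` heading and the P0/P1 severities come from this message; the one-finding-per-line layout, the 1500-character default limit, and the `build_task_desc` helper are assumptions made for illustration, not the contents of src/review_parse.py.

```python
import re


def extract_review_findings(markdown: str, limit: int = 1500) -> str:
    """Return P0/P1 lines from the '## Findings' section, or '' on any problem."""
    try:
        match = re.search(
            r"^## Findings[ \t]*\n(.*?)(?=^## |\Z)",
            markdown,
            flags=re.MULTILINE | re.DOTALL,
        )
        if not match:
            return ""
        # Assumption: each finding sits on one line and names its severity.
        lines = [
            line.strip()
            for line in match.group(1).splitlines()
            if re.search(r"\bP[01]\b", line)
        ]
        return "\n".join(lines)[:limit]
    except Exception:
        # Never raise: a broken artifact degrades to "nothing extracted".
        return ""


def build_task_desc(link: str, findings: str) -> str:
    """Hypothetical helper: embed the text, or fall back to the bare link."""
    if not findings:
        return link
    return f"{findings}\n\nFull report: {link}"
```

The broad `except Exception` is deliberate in this sketch: the parser sits on the rollback path, so a malformed report should never turn into a second failure.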
Refs: ORCH-046
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
check_tests_passed/_parse_tests_verdict gated the testing -> deploy-staging
transition on `verdict:`/`status:` in 13-test-report.md, but the tester agent
prompt (.openclaw/agents/tester*) documents `result: PASS | FAIL` as THE
machine-readable field. A report that followed the contract literally
(ORCH-017: only `result: PASS`, no verdict:/status:) was bounced back to
development with a misleading "Tests FAILED". ORCH-016 only passed because its
report redundantly carried both `verdict:` and `result:`.
Treat `result:` as a first-class machine field alongside verdict/status; a
negative token in any field stays authoritative (ET-013 contract preserved).
Self-hosting QG fix: unblocks every project whose tester emits only `result:`.
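The field handling described above might look roughly like this. The three field names and the PASS/FAIL tokens come from this message; the extra PASSED/FAILED spellings, the regex, and the standalone function name `parse_tests_verdict` are assumptions and do not reproduce the real `_parse_tests_verdict`.

```python
import re

MACHINE_FIELDS = ("verdict", "status", "result")
NEGATIVE_TOKENS = {"FAIL", "FAILED"}  # assumption beyond FAIL
POSITIVE_TOKENS = {"PASS", "PASSED"}  # assumption beyond PASS

_FIELD_RE = re.compile(
    r"^[ \t]*(?:" + "|".join(MACHINE_FIELDS) + r")[ \t]*:[ \t]*(\S+)",
    flags=re.IGNORECASE | re.MULTILINE,
)


def parse_tests_verdict(report: str) -> bool:
    """True only if some machine field says PASS and none says FAIL."""
    values = [m.group(1).upper() for m in _FIELD_RE.finditer(report)]
    # A negative token in any field stays authoritative.
    if any(value in NEGATIVE_TOKENS for value in values):
        return False
    return any(value in POSITIVE_TOKENS for value in values)
```

Checking negatives first is what keeps a report with `verdict: PASS` and `result: FAIL` from slipping through.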
Docs updated in-PR: CHANGELOG, architecture README machine-keys note.
Tests: test_qg.py::TestCheckTestsPassed::test_result_pass_only_passes / _fail_only_fails.
Refs: ORCH-017
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
notify_approve_requested now embeds two HTML <a> links into the single
approve-gate notification: a Gitea branch-view link to 01-brd.md and a
Plane issue browser link. Adds ORCH_PLANE_WEB_URL (external Plane web URL,
fallback to plane_api_url) with a loopback-guard that omits the Plane link
when the resolved base is localhost/empty (no broken localhost URLs in prod).
Each link is built independently and omitted on missing data; the message and
the "flip to Approved" call to action are always sent as exactly one ping. The
shared send_telegram helper is left untouched (min blast radius for the
self-hosting prod container). Dynamic labels are html.escaped; parse_mode=HTML
preserved. QG registry / stages / approve handler unchanged.
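A sketch of the loopback guard and independent link building, under stated assumptions: the function names, the message wording, and the set of loopback hosts are illustrative, and the Plane issue URL is passed in ready-made because its path format is not given here.

```python
import html
from typing import Optional
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}  # assumed guard list


def resolve_plane_base(plane_web_url: str, plane_api_url: str) -> Optional[str]:
    """Prefer the external web URL, fall back to the API URL, drop loopback."""
    base = (plane_web_url or plane_api_url or "").strip().rstrip("/")
    if not base:
        return None
    host = urlparse(base).hostname
    if host is None or host in LOOPBACK_HOSTS:
        return None
    return base


def html_link(url: Optional[str], label: str) -> Optional[str]:
    """Build one <a> link, or None when the URL is missing."""
    if not url:
        return None
    return f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'


def build_approve_message(title: str, brd_url: Optional[str],
                          plane_issue_url: Optional[str]) -> str:
    """Always returns exactly one message; links are optional extras."""
    parts = [f"Approval requested: {html.escape(title)}"]
    links = [
        link
        for link in (html_link(brd_url, "01-brd.md"),
                     html_link(plane_issue_url, "Plane issue"))
        if link
    ]
    if links:
        parts.append(" | ".join(links))
    parts.append("Flip to Approved to continue.")
    return "\n".join(parts)
```

Because each link resolves to `None` on its own, a missing Gitea or Plane URL only drops that link and never suppresses the ping.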
Docs updated in-PR: CHANGELOG, .env.example, INFRA env map.
Tests: test_notify_approve_links.py, test_analysis_approve_flow_links.py.
Refs: ORCH-017
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>