feat(preflight): catch logged-out auth + treat empty result as failure (ORCH-044) #50
Closed
admin wants to merge 10 commits from feature/ORCH-044-preflight-auth-effort into main
Summary
ORCH-044 closes two blind spots that let a single de-authenticated agent stall the shared queue for all projects.

P1: preflight auth gate. `claude --version` answers even when logged out, so version-only preflight was blind to auth. Adds a token-free, network-free check of `<AGENT_HOME>/.claude/.credentials.json`: missing/unreadable/no-oauth credentials or an expired `claudeAiOauth.expiresAt` (epoch ms, compared against now + skew) ⇒ preflight FAIL; absent expiry ⇒ OK (no false positives). The result is cached on `preflight_cache_ttl`. Post-factum safety net: the launcher detects auth markers (`not logged in` / `/login` / `unauthorized` / `401`) in the run log and resets the preflight cache. Auth failure is a gate, not a transient: it does not spin the circuit breaker. Emergency toggle `ORCH_PREFLIGHT_CHECK_AUTH=false` restores version-only behaviour.

P3: empty log / no result-JSON ⇒ job failed. `exit_code==0` with an empty or JSON-less run log no longer counts as success: a separate `result_ok` flag gates stage advance + usage comments, fires a Telegram alert, and routes the job through the normal transient/permanent failure path (`agent_runs.exit_code` integrity preserved).

Scope: P2 (`--effort`) is intentionally excluded per Owner correction and tracked in ORCH-50. No effort code/config/docs touched.

New settings: `ORCH_PREFLIGHT_CHECK_AUTH`, `ORCH_CLAUDE_CREDENTIALS_PATH`, `ORCH_AUTH_EXPIRY_SKEW_SECONDS`.

Docs updated in same PR: `docs/operations/INFRA.md`, `docs/architecture/internals.md`, `CHANGELOG.md`.

Test plan
- `tests/test_preflight_auth.py`: missing/expired/valid/no-expiry creds, broken JSON (no raise), no-oauth, no-network, caching, AGENT_HOME path resolution, explicit-path override, worker claim gate, toggle-off, `is_auth_failure_text` markers
- `tests/test_empty_log_failure.py`: `_validate_result`, `_finalize_job` result_ok transitions (done/failed+alert/requeue), monitor gating (advance/comment/alert suppression), auth-marker handling
- `pytest tests/ -q` → 504 passed

🤖 Generated with Claude Code
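For readers who have not seen the diff, the two gates described in the summary could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this branch: the function signatures, the 60-second default skew, the regex used for the auth markers, and the rule "a result is present if some log line parses as a JSON object" are all hypothetical. Only the file layout (`claudeAiOauth.expiresAt` in epoch milliseconds), the marker strings, and the pass/fail rules come from the description.

```python
import json
import re
import time
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical default; the PR exposes the real value as ORCH_AUTH_EXPIRY_SKEW_SECONDS.
DEFAULT_SKEW_SECONDS = 60

# Marker strings are taken from the PR text; the regex itself is an assumption.
_AUTH_MARKERS = re.compile(r"not logged in|/login|unauthorized|\b401\b", re.IGNORECASE)


def check_credentials(path: Path, skew_seconds: int = DEFAULT_SKEW_SECONDS,
                      now: Optional[float] = None) -> Tuple[bool, str]:
    """Return (ok, reason) by reading the file only: no network, no token use."""
    now = time.time() if now is None else now
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:  # missing, unreadable, or broken JSON
        return False, f"credentials missing or unreadable ({type(exc).__name__})"
    oauth = data.get("claudeAiOauth") if isinstance(data, dict) else None
    if not isinstance(oauth, dict):
        return False, "no claudeAiOauth section"
    expires_at_ms = oauth.get("expiresAt")
    if expires_at_ms is None:
        return True, "no expiry recorded"  # absent expiry => OK
    try:
        expires_at_s = float(expires_at_ms) / 1000.0  # epoch ms -> epoch s
    except (TypeError, ValueError):
        # Assumption: an unparseable expiry is treated like an absent one.
        return True, "unparseable expiry treated as absent"
    if expires_at_s <= now + skew_seconds:
        return False, "token expired or expiring within skew"
    return True, "credentials valid"


def is_auth_failure_text(text: str) -> bool:
    """True if the run log contains one of the auth-failure markers."""
    return bool(_AUTH_MARKERS.search(text or ""))


def validate_result(exit_code: int, log_text: str) -> bool:
    """Hypothetical result_ok: exit code 0 AND at least one JSON-object line."""
    if exit_code != 0:
        return False
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            if isinstance(json.loads(line), dict):
                return True
        except ValueError:
            continue
    return False
```

In this sketch a worker would call `check_credentials` before claiming a job and skip the claim on `(False, reason)`. After a run it would treat `validate_result(0, "")` returning `False` as a failed job even though the process exited cleanly. Passing `now` explicitly keeps the expiry comparison deterministic in tests.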
ORCH-044 closes two blind spots that let a single de-authenticated agent stall the shared queue for all projects:

P1: preflight auth gate. `claude --version` answers even when logged out, so version-only preflight was blind to auth. Adds a token-free, network-free check of <AGENT_HOME>/.claude/.credentials.json: missing/unreadable/no-oauth or an expired `claudeAiOauth.expiresAt` (epoch ms, vs now + skew) => preflight FAIL; absent expiry => OK (no false positives). Result is cached on the same preflight_cache_ttl. Post-factum safety net: launcher detects auth markers ("not logged in" / "/login" / "unauthorized" / 401) in the run log and resets the preflight cache so the next tick re-evaluates auth. Auth failure is a gate, not a transient: it does not spin the circuit breaker. Emergency toggle ORCH_PREFLIGHT_CHECK_AUTH=false restores version-only behaviour.

P3: empty log / no result-JSON => job failed. exit_code==0 with an empty or JSON-less run log no longer counts as success: a separate result_ok flag gates stage advance + usage comments, fires a Telegram alert, and routes the job through the normal transient/permanent failure path (exit_code integrity in agent_runs preserved).

Scope: P2 (--effort) is intentionally excluded and tracked in ORCH-50.

New settings: ORCH_PREFLIGHT_CHECK_AUTH, ORCH_CLAUDE_CREDENTIALS_PATH, ORCH_AUTH_EXPIRY_SKEW_SECONDS.

Docs updated (INFRA.md, internals.md, CHANGELOG).

Refs: ORCH-044
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Superseded: ORCH-044 (preflight) already in main via ORCH-1/PR#51. Duplicate branch, closing without merge.
Pull request closed