Compare commits

..

9 Commits

Author SHA1 Message Date
7481ce334e deployer(ORCH-016): deploy stage SUCCESS (artifact-only verdict)
All checks were successful
CI / test (push) Successful in 11s
CI / test (pull_request) Successful in 11s
Pre-conditions met:
- 12-review.md: APPROVED
- 13-test-report.md: PASS
- 15-staging-log.md: staging_status=SUCCESS (10/10)

Self-hosting policy honored: prod-container orchestrator:8500
NOT restarted in stage scope (CLAUDE.md). Real prod docker
pull/restart is delegated to scripts/orchestrator-deploy-hook.sh
(ORCH-36) on PR merge to main.

Quality Gate field: deploy_status: SUCCESS (uppercase) in
14-deploy-log.md frontmatter.
2026-06-05 12:52:06 +00:00
d4b02ef728 deployer(ORCH-016): staging gate SUCCESS (10/10 checks PASS)
All checks were successful
CI / test (push) Successful in 11s
CI / test (pull_request) Successful in 11s
Ran scripts/staging_check.py --base-url http://localhost:8501 --mode stub
against the live orchestrator-staging instance. All blocks green:
  - SMOKE (A1-A3): health, queue, ORCH_STAGING=true
  - ACCESS (B4-B6): Plane sandbox, Gitea sandbox, registry isolation
  - E2E   (C7-C9b): webhook -> branch -> analyst job enqueued
  - CLEANUP: branch + Plane issue deleted, no leftover task rows

Verdict (machine-readable, YAML frontmatter): staging_status: SUCCESS

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-05 12:49:39 +00:00
2fc3206f83 tester(ET): auto-commit from tester run_id=98
All checks were successful
CI / test (push) Successful in 11s
CI / test (pull_request) Successful in 12s
2026-06-05 12:46:42 +00:00
1778d8f8b8 reviewer(ET): auto-commit from reviewer run_id=97
All checks were successful
CI / test (push) Successful in 12s
CI / test (pull_request) Successful in 11s
2026-06-05 12:44:21 +00:00
0663da6e4c feat(plane): unified status-comment format with duration line (ORCH-016)
All checks were successful
CI / test (push) Successful in 11s
CI / test (pull_request) Successful in 12s
Все агенты (analyst..deployer) теперь пишут финальный коммент через единый
хелпер usage.build_status_comment(...) — заголовок «{icon} {Role} — {описание}»,
опциональная строка Verdict/Status из YAML-frontmatter, строка
«Длительность: 4m 12s» (явный duration_s от launcher, fallback из agent_runs
для аналитика), HTML-блок Документы, тех-хвост <sub>tokens · cost</sub>.

- Новые публичные функции в src/usage.py: build_status_comment, fmt_duration,
  get_agent_duration. usage_comment(...) → тонкая deprecated-обёртка (legacy
  тесты в tests/test_usage.py продолжают работать). artifact_links(...)
  переписан на HTML <li><a>…</a></li> (breaking change для внутреннего API,
  но единственный внешний клиент — _post_usage_comments — мигрирован).
- Новый модуль src/frontmatter.py: defensive YAML reader, никогда не raise.
- stage_engine._build_analyst_ready_comment(...) теперь тонкая обёртка над
  build_status_comment(agent="analyst", ...); task_id пробрасывается из
  _handle_analysis_approved_flow для DB-фоллбэка длительности (AC-14).
- launcher._post_usage_comments(...) принимает duration_s, резолвит stage из
  tasks для deployer и worktree_root для AC-8 graceful skipping.

Тесты (16 файлов, 56 новых тестовых функций, покрывают TC-01..TC-25):
fmt_duration table, build_status_comment по всем агентам, DB-фоллбэк,
authorship под per-agent ботами, дедуп-инвариант, regression на
status-only verdict аналитика и финальный notify_done, snapshot
QG_CHECKS + STAGE_TRANSITIONS.

Документация: docs/architecture/README.md (раздел Plane Sync),
CHANGELOG.md (Unreleased Added/Changed).

Refs: ORCH-016

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-05 12:39:06 +00:00
1150cd9144 architect(ET): auto-commit from architect run_id=95
All checks were successful
CI / test (push) Successful in 12s
2026-06-05 12:15:40 +00:00
57a3f6c9f7 analyst(ET): auto-commit from analyst run_id=94
All checks were successful
CI / test (push) Successful in 13s
2026-06-05 12:05:17 +00:00
0f4d8714dd analyst(ET): auto-commit from analyst run_id=93
All checks were successful
CI / test (push) Successful in 11s
2026-06-05 11:48:54 +00:00
3cb10be03f docs: init ORCH-016 business request
All checks were successful
CI / test (push) Successful in 11s
2026-06-05 14:44:25 +03:00
13 changed files with 14 additions and 554 deletions

View File

@@ -12,17 +12,11 @@ jobs:
- uses: actions/checkout@v4
- name: Install dependencies
run: |
set -euo pipefail
python3 -m pip install --user --upgrade pip
python3 -m pip install --user -r requirements.txt
- name: Test
env:
PYTHONPATH: ${{ github.workspace }}
run: |
# ORCH-39: fail the job on ANY failure. Run the WHOLE suite from the
# repo root. --strict-markers + pytest-asyncio (asyncio_mode=auto, see
# pytest.ini) make async tests actually run instead of silently
# skipping (the hole that hid red tests behind a green CI).
set -euo pipefail
export PATH="$HOME/.local/bin:$PATH"
python3 -m pytest tests/ -q -p no:cacheprovider --strict-markers
python3 -m pytest tests/ -q

View File

@@ -5,7 +5,6 @@
## [Unreleased]
### Added
- **Конфигурируемые модель LLM и режим работы (`--effort`) агентов** (ORCH-41): модель/effort каждого агента вынесены из хардкода `launcher.py` в конфиг — глобально per-agent (`ORCH_AGENT_MODEL_<AGENT>` / `ORCH_AGENT_EFFORT_<AGENT>`, дефолты `ORCH_AGENT_MODEL_DEFAULT=claude-opus-4-8`, `ORCH_AGENT_EFFORT_DEFAULT=high`) и per-project (`agent_models` / `agent_efforts` в `ORCH_PROJECTS_JSON`). Резолверы `resolve_agent_model` / `resolve_agent_effort` (приоритет project > per-agent env > default > пусто), валидация effort `{low,medium,high,xhigh,max}`, опц. `ORCH_AGENT_FALLBACK_MODEL` (`--fallback-model`). Хардкод `"model":"opus"` (architect/reviewer) удалён. Тесты: `test_resolve_agent_model.py`, `test_resolve_agent_effort.py`.
- **Единый status-коммент агентов в Plane** (ORCH-016): `usage.build_status_comment(...)` — один хелпер для ВСЕХ ролей (analyst..deployer). HTML-формат: header `{icon} {Role} — {описание}`, опциональная строка `Verdict/Status: …` из YAML-frontmatter артефакта, **строка `Длительность: 4m 12s`** (явный `duration_s` от launcher, fallback из `agent_runs` для аналитика), `<b>Документы:</b><ul><li><a>…</a></li></ul>`, тех-хвост `<sub>tokens · cost</sub>`. Утилитки: `usage.fmt_duration`, `usage.get_agent_duration`, новый модуль `src/frontmatter.py` (defensive YAML reader). ADR `docs/work-items/ORCH-016/06-adr/ADR-001-unified-status-comment.md`.
- **Документация по канону** (ORCH-9): `CLAUDE.md` (паспорт проекта), структура `docs/` (`architecture/` + `adr/`, `operations/`, `work-items/`, `history/`), `docs/operations/INFRA.md` (RUNBOOK с инфра-изоляцией и self-hosting рисками).
- **ADR**: adr-0001 (multi-repo registry), adr-0002 (job queue), adr-0003 (условный staging-гейт).

View File

@@ -326,10 +326,6 @@ jobs со статусом `running` (воркер умёр на рестарт
- `ORCH_MAX_CONCURRENCY` (default 1) — лимит параллельных jobs.
- `ORCH_QUEUE_POLL_INTERVAL` (default 2.0) — период опроса.
- `ORCH_AGENT_MODEL_DEFAULT` / `ORCH_AGENT_MODEL_<AGENT>` (ORCH-41) — модель агентов; дефолт `claude-opus-4-8`.
- `ORCH_AGENT_EFFORT_DEFAULT` / `ORCH_AGENT_EFFORT_<AGENT>` (ORCH-41) — режим `--effort` (low|medium|high|xhigh|max).
- `ORCH_AGENT_FALLBACK_MODEL` (ORCH-41) — опц. `--fallback-model` при overloaded.
- per-project override: `agent_models` / `agent_efforts` в `ORCH_PROJECTS_JSON`; резолверы `resolve_agent_model` / `resolve_agent_effort` (project > per-agent env > default > пусто).
Наблюдаемость: `GET /queue` — counts по статусам + последние 10 jobs.

View File

@@ -48,11 +48,6 @@
| `ORCH_REPOS_DIR` / `ORCH_HOST_REPOS_DIR` | каталог репозиториев (в контейнере / на хосте) |
| `ORCH_DB_PATH` | путь к SQLite БД |
| `ORCH_PROJECTS_JSON` | реестр проектов (Plane id → repo + prefix); пусто → дефолт из `src/projects.py` |
| `ORCH_AGENT_MODEL_DEFAULT` | LLM-модель агентов по умолчанию (ORCH-41); дефолт `claude-opus-4-8` |
| `ORCH_AGENT_MODEL_<AGENT>` | per-agent модель (ANALYST/ARCHITECT/DEVELOPER/REVIEWER/TESTER/DEPLOYER); пусто → default |
| `ORCH_AGENT_EFFORT_DEFAULT` | режим работы `--effort` по умолчанию (ORCH-41): low\|medium\|high\|xhigh\|max; дефолт `high` |
| `ORCH_AGENT_EFFORT_<AGENT>` | per-agent effort; дефолт: думающие → high, tester/deployer → medium |
| `ORCH_AGENT_FALLBACK_MODEL` | опц. фолбэк-модель при overloaded (`--fallback-model`); пусто → без флага |
| `DEPLOY_SSH_USER` / `_HOST` / `DEPLOY_HOOK_SCRIPT` | параметры деплой-хука |
**Секреты — только в `.env` / `.env.staging` на хосте, в гит НЕ коммитятся.** Канон — `.env.example`, `.env.staging.example`.
@@ -60,26 +55,6 @@
## Реестр проектов (`src/projects.py`, ORCH-6)
Связывает Plane project id → gitea repo + work-item prefix. Источник: `ORCH_PROJECTS_JSON`, fallback — встроенный дефолт. Прод видит: `enduro-trails` (ET), `orchestrator` (ORCH). Staging видит ТОЛЬКО `orchestrator-sandbox` (SANDBOX) — изоляция.
## Модель и effort агентов (`src/config.py` + `src/agents/launcher.py`, ORCH-41)
Модель LLM и режим работы (`--effort`) каждого агента **конфигурируемы** — глобально per-agent (env) и per-project (через `ORCH_PROJECTS_JSON`).
**Приоритет резолвинга** (`resolve_agent_model` / `resolve_agent_effort`):
1. per-project override — `agent_models` / `agent_efforts` в записи `ORCH_PROJECTS_JSON`;
2. per-agent env — `ORCH_AGENT_MODEL_<AGENT>` / `ORCH_AGENT_EFFORT_<AGENT>` (если непусто);
3. глобальный дефолт — `ORCH_AGENT_MODEL_DEFAULT` (`claude-opus-4-8`) / `ORCH_AGENT_EFFORT_DEFAULT` (`high`);
4. пусто → флаг не передаётся, действует дефолт CLI.
**Значения effort:** `low` < `medium` < `high` < `xhigh` < `max` — рычаг «качество vs стоимость/время». Дефолтная раскладка: думающие агенты (analyst/architect/developer/reviewer) → `high`, механические (tester/deployer) → `medium`. Невалидное значение → лог-warning, флаг опускается.
**Per-project override в `ORCH_PROJECTS_JSON`** (поля `agent_models` / `agent_efforts` опциональны, старые записи работают):
```json
{"plane_project_id":"...","repo":"orchestrator","work_item_prefix":"ORCH",
"agent_models":{"developer":"claude-opus-4-8","reviewer":"claude-sonnet-4-6"},
"agent_efforts":{"developer":"xhigh","tester":"low"}}
```
> ⚠️ Бюджет (ORCH-38): `claude-opus-4-8` дефолт в коде; реальное переключение прод-env делается отдельно после согласования.
## ⚠️ Self-hosting — оркестратор дорабатывает САМ СЕБЯ
**Факт:** прод-инстанс `orchestrator` (8500) — ОДИН на ВСЕ прод-проекты (enduro-trails + orchestrator), с ОБЩЕЙ БД `./data/orchestrator.db` и общей очередью задач (ORCH-1).

View File

@@ -1,13 +0,0 @@
[pytest]
# ORCH-39: make the async webhook/state tests (test_orch10_states.py) actually
# run in every environment. Without pytest-asyncio + asyncio_mode=auto these
# @pytest.mark.asyncio tests were silently SKIPPED, so a broken async path
# could pass CI. asyncio_mode=auto runs `async def test_*` natively.
asyncio_mode = auto
# Fail loudly on unknown markers so a typo'd @pytest.mark.* can't silently
# disable a test.
markers =
asyncio: mark a coroutine test to be run by pytest-asyncio.
testpaths = tests

View File

@@ -3,4 +3,3 @@ uvicorn[standard]==0.30.0
pydantic-settings==2.5.0
httpx==0.27.0
pytest==8.3.3
pytest-asyncio==0.23.8

View File

@@ -15,82 +15,6 @@ from ..plane_sync import notify_stage_change as plane_notify_stage, add_comment
logger = logging.getLogger("orchestrator.launcher")
# ORCH-41: valid --effort values accepted by the Claude CLI. An effort that is
# not in this set is treated as misconfiguration: logged and dropped (no flag),
# never passed through to the CLI.
VALID_EFFORTS = frozenset({"low", "medium", "high", "xhigh", "max"})
def _resolve_agent_attr(agent, project_id, project_map_attr, env_attr_prefix,
default_attr):
"""ORCH-41 shared resolver with priority:
1. ProjectConfig.<project_map_attr>[agent] (per-project override)
2. settings.<env_attr_prefix><agent> (per-agent env, if non-empty)
3. settings.<default_attr> (global default)
4. "" (no flag -> CLI default)
project_id is the Plane project uuid. It is resolved to a ProjectConfig via
the registry; an unknown / empty id simply skips level 1. A missing per-agent
settings attribute (e.g. unknown agent name) skips level 2.
"""
# Level 1: per-project override.
if project_id:
from ..projects import get_project_by_plane_id
proj = get_project_by_plane_id(project_id)
if proj is not None:
override = getattr(proj, project_map_attr, {}).get(agent)
if override:
return override
# Level 2: per-agent env (settings.<prefix><agent>), if defined & non-empty.
per_agent = getattr(settings, f"{env_attr_prefix}{agent}", "")
if per_agent:
return per_agent
# Level 3: global default.
default = getattr(settings, default_attr, "")
if default:
return default
# Level 4: nothing -> CLI default.
return ""
def resolve_agent_model(agent: str, project_id: str = None) -> str:
"""ORCH-41: resolve the LLM model for an agent (optionally per-project).
Returns "" when no model is configured at any level -> caller omits --model
and the CLI default applies. See _resolve_agent_attr for the priority order.
"""
return _resolve_agent_attr(
agent, project_id,
project_map_attr="agent_models",
env_attr_prefix="agent_model_",
default_attr="agent_model_default",
)
def resolve_agent_effort(agent: str, project_id: str = None) -> str:
"""ORCH-41: resolve the --effort level for an agent (optionally per-project).
Same priority as resolve_agent_model. The resolved value is validated against
VALID_EFFORTS; an invalid value is logged and dropped (returns "") so a typo
in env/projects_json can never pass a bad flag to the CLI.
"""
value = _resolve_agent_attr(
agent, project_id,
project_map_attr="agent_efforts",
env_attr_prefix="agent_effort_",
default_attr="agent_effort_default",
)
if value and value not in VALID_EFFORTS:
logger.warning(
f"Invalid effort '{value}' for agent '{agent}' "
f"(allowed: {sorted(VALID_EFFORTS)}); omitting --effort"
)
return ""
return value
def prune_run_logs(runs_dir, keep_days=30, keep_max=500, active_paths=None):
"""L-2: best-effort rotation of per-run logs (<runs_dir>/*.log).
@@ -161,6 +85,7 @@ class AgentLauncher:
"system_prompt": ".openclaw/agents/architect.md",
"task_file": ".task-arch.md",
"allowed_tools": "Read,Write,Edit,Bash",
"model": "opus",
},
"developer": {
"system_prompt": ".openclaw/agents/developer.md",
@@ -171,6 +96,7 @@ class AgentLauncher:
"system_prompt": ".openclaw/agents/reviewer.md",
"task_file": ".task-review.md",
"allowed_tools": "Read,Write,Edit,Bash",
"model": "opus",
},
"tester": {
"system_prompt": ".openclaw/agents/tester.md",
@@ -245,12 +171,6 @@ class AgentLauncher:
_br_row = get_db().execute("SELECT branch FROM tasks WHERE id=?", (task_id,)).fetchone() if task_id else None
agent_branch = _br_row[0] if _br_row else "main"
# ORCH-41: resolve the Plane project uuid for this repo so per-project
# model/effort overrides apply. Unknown repo -> None (env/default only).
from ..projects import get_project_by_repo
_proj = get_project_by_repo(repo)
project_id = _proj.plane_project_id if _proj else None
# Ensure the per-branch worktree exists and is on the right branch.
work_path = ensure_worktree(repo, agent_branch)
@@ -284,14 +204,8 @@ class AgentLauncher:
system_prompt = config["system_prompt"]
allowed_tools = config["allowed_tools"]
# ORCH-41: model + effort + optional fallback are resolved from config
# (project-override > per-agent env > default), not hardcoded in AGENT_CONFIGS.
model = resolve_agent_model(agent, project_id)
effort = resolve_agent_effort(agent, project_id)
model = config.get("model", "")
model_flag = f"--model {model} " if model else ""
effort_flag = f"--effort {effort} " if effort else ""
fb = settings.agent_fallback_model
fb_flag = f"--fallback-model {fb} " if fb else ""
# No git fetch/checkout here: ensure_worktree() already put the worktree on
# the right branch. The agent simply runs inside its isolated work_path.
@@ -304,7 +218,7 @@ class AgentLauncher:
f'cd {work_path} && '
f'{self.CLAUDE_BIN} --print '
f'--output-format json '
f'{model_flag}{effort_flag}{fb_flag}'
f'{model_flag}'
f'"$(cat {task_file})" '
f'--system-prompt "$(cat {system_prompt})" '
f'--allowedTools {allowed_tools}'

View File

@@ -78,34 +78,6 @@ class Settings(BaseSettings):
agent_kill_grace_seconds: int = 20
agent_timeout_overrides_json: str = ""
# ORCH-41: per-agent LLM model. Empty -> agent_model_default. Resolution order:
# project-override (projects_json agent_models) > ORCH_AGENT_MODEL_<AGENT> >
# agent_model_default > CLI default (no --model flag). Default is 4-8 because
# 4-7 == 4-8 in price (Slava 05.06); do NOT hardcode the version anywhere else.
agent_model_default: str = "claude-opus-4-8"
agent_model_analyst: str = ""
agent_model_architect: str = ""
agent_model_developer: str = ""
agent_model_reviewer: str = ""
agent_model_tester: str = ""
agent_model_deployer: str = ""
# ORCH-41: per-agent effort / reasoning level: low|medium|high|xhigh|max.
# Empty -> agent_effort_default. Same resolution order as model. Default split:
# thinking agents (analyst/architect/developer/reviewer) -> high; mechanical
# agents (tester/deployer) -> medium.
agent_effort_default: str = "high"
agent_effort_analyst: str = "high"
agent_effort_architect: str = "high"
agent_effort_developer: str = "high"
agent_effort_reviewer: str = "high"
agent_effort_tester: str = "medium"
agent_effort_deployer: str = "medium"
# ORCH-41: optional per-agent fallback model used when the primary is
# overloaded (--fallback-model, works with --print). Empty -> no flag.
agent_fallback_model: str = ""
# L-2: run-log rotation. Old per-run logs in <data>/runs/*.log are pruned at
# app startup (best-effort). A *.log is removed if it is older than
# log_keep_days OR not within the log_keep_max most-recent logs (whichever

View File

@@ -17,7 +17,7 @@ registry is used so the system works out of the box.
import json
import logging
from dataclasses import dataclass, field
from dataclasses import dataclass
from .config import settings
@@ -30,11 +30,6 @@ class ProjectConfig:
repo: str # gitea repo name (== folder under /repos)
work_item_prefix: str # ET / ORCH
name: str # human-readable label
# ORCH-41: optional per-project agent->model / agent->effort overrides parsed
# from projects_json. frozen dataclass + mutable default -> field(default_factory=dict)
# (a bare {} default raises ValueError). Empty dict = no override (old records work).
agent_models: dict = field(default_factory=dict)
agent_efforts: dict = field(default_factory=dict)
# Built-in default registry (used when ORCH_PROJECTS_JSON is empty/invalid).
@@ -55,23 +50,6 @@ _DEFAULT_PROJECTS = [
]
def _coerce_str_map(value, idx, field_name) -> dict:
"""ORCH-41: coerce an optional projects_json sub-object into a {str: str} dict.
Missing / null -> {} (no override). A non-object value is logged and dropped so
one malformed entry can never brick the whole registry; non-string keys/values
are stringified for safety.
"""
if value is None:
return {}
if not isinstance(value, dict):
logger.error(
f"ORCH_PROJECTS_JSON[{idx}].{field_name} is not an object, ignoring"
)
return {}
return {str(k): str(v) for k, v in value.items()}
def _parse_projects_json(raw: str) -> list[ProjectConfig] | None:
"""Parse ORCH_PROJECTS_JSON. Returns None if empty/invalid (-> use default)."""
if not raw or not raw.strip():
@@ -97,8 +75,6 @@ def _parse_projects_json(raw: str) -> list[ProjectConfig] | None:
repo=str(item["repo"]),
work_item_prefix=str(item["work_item_prefix"]),
name=str(item.get("name", item["repo"])),
agent_models=_coerce_str_map(item.get("agent_models"), i, "agent_models"),
agent_efforts=_coerce_str_map(item.get("agent_efforts"), i, "agent_efforts"),
)
)
except KeyError as e:

View File

@@ -34,27 +34,6 @@ import src.plane_sync as plane_sync # noqa: E402
ORCH_PLANE_ID = "8da6aa25-a60e-44d6-a1e2-d8ae59aa7d6a"
ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
# ORCH-39: after ORCH-10 the webhook resolves Plane state UUIDs per-project via
# get_project_states(project_id). Mock it deterministically (no network) and
# send each request with the UUID that matches its own project.
_PROJECT_STATES = {
ENDURO_PLANE_ID: {
"in_progress": "b873d9eb-993c-48cd-97ac-99a9b1623967",
"approved": "a519a341-dada-4a91-8910-7604f82b79c5",
"rejected": "ba958f3c-5db5-461d-8f82-89425e413b97",
},
ORCH_PLANE_ID: {
"in_progress": "e331bfb3-e17e-4699-ba48-4abb89c21b7b",
"approved": "63f2c8fe-dcda-4ace-952f-dd88bd0118ff",
"rejected": "4c769e90-bf80-4a52-b97a-e1c84904bfc3",
},
}
def _fake_get_project_states(project_id):
return _PROJECT_STATES.get(project_id, _PROJECT_STATES[ENDURO_PLANE_ID])
client = TestClient(app)
@@ -69,10 +48,6 @@ def setup(monkeypatch):
monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
# ORCH-39: deterministic per-project Plane states, clean cache per test.
plane_sync.reload_project_states()
monkeypatch.setattr(plane_sync, "get_project_states", _fake_get_project_states)
registry_json = (
f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
f' "work_item_prefix": "ET", "name": "enduro-trails"}},'
@@ -85,7 +60,6 @@ def setup(monkeypatch):
yield
reload_projects()
plane_sync.reload_project_states()
if os.path.exists(_test_db):
os.unlink(_test_db)
@@ -129,9 +103,10 @@ def test_fetch_sequence_id_missing_field_returns_none():
# ---------------------------------------------------------------------------
# Feature 1: pipeline starts on a status change to In Progress, not on creation.
# ORCH-39: in_progress UUID is project-specific; derive it from the project.
_IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
def _post(plane_id, plane_project_id=ORCH_PLANE_ID, name="A valid work item title"):
in_progress = _fake_get_project_states(plane_project_id)["in_progress"]
return client.post(
"/webhook/plane",
json={
@@ -142,7 +117,7 @@ def _post(plane_id, plane_project_id=ORCH_PLANE_ID, name="A valid work item titl
"name": name,
"description_stripped": "This is a sufficiently long description.",
"project": plane_project_id,
"state": {"id": in_progress, "name": "In Progress", "group": "started"},
"state": {"id": _IN_PROGRESS, "name": "In Progress", "group": "started"},
},
},
)

View File

@@ -33,36 +33,11 @@ from src.main import app # noqa: E402
from src.db import init_db, get_db # noqa: E402
from src import projects as P # noqa: E402
from src.projects import reload_projects # noqa: E402
import src.plane_sync as plane_sync # noqa: E402
ORCH_PLANE_ID = "8da6aa25-a60e-44d6-a1e2-d8ae59aa7d6a"
ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
UNKNOWN_PLANE_ID = "deadbeef-0000-0000-0000-000000000000"
# ORCH-39: after ORCH-10 the webhook resolves Plane state UUIDs per-project via
# get_project_states(project_id). Hardcoding the enduro in_progress UUID for an
# ORCH-project payload no longer matches, so the pipeline never starts. We mock
# get_project_states with a deterministic per-project map (no network) and send
# each request with the UUID that matches its own project.
_PROJECT_STATES = {
ENDURO_PLANE_ID: {
"in_progress": "b873d9eb-993c-48cd-97ac-99a9b1623967",
"approved": "a519a341-dada-4a91-8910-7604f82b79c5",
"rejected": "ba958f3c-5db5-461d-8f82-89425e413b97",
},
ORCH_PLANE_ID: {
"in_progress": "e331bfb3-e17e-4699-ba48-4abb89c21b7b",
"approved": "63f2c8fe-dcda-4ace-952f-dd88bd0118ff",
"rejected": "4c769e90-bf80-4a52-b97a-e1c84904bfc3",
},
}
def _fake_get_project_states(project_id):
"""Deterministic per-project state map; mirrors get_project_states' fallback
for unknown projects so the webhook still behaves sensibly."""
return _PROJECT_STATES.get(project_id, _PROJECT_STATES[ENDURO_PLANE_ID])
client = TestClient(app)
@@ -82,13 +57,6 @@ def setup(monkeypatch):
# focuses on the project filter, so bypass signature verification.
monkeypatch.setattr("src.webhooks.plane.verify_plane_signature", lambda body, sig: True)
# ORCH-39: resolve Plane states deterministically per-project (no network)
# and start from a clean per-project cache so suites don't leak into each
# other. plane.py imports get_project_states locally from ..plane_sync, so
# patch it at the src.plane_sync source.
plane_sync.reload_project_states()
monkeypatch.setattr(plane_sync, "get_project_states", _fake_get_project_states)
registry_json = (
f'[{{"plane_project_id": "{ENDURO_PLANE_ID}", "repo": "enduro-trails",'
f' "work_item_prefix": "ET", "name": "enduro-trails"}},'
@@ -101,7 +69,6 @@ def setup(monkeypatch):
yield
reload_projects() # restore from env
plane_sync.reload_project_states()
if os.path.exists(_test_db):
os.unlink(_test_db)
@@ -109,10 +76,10 @@ def setup(monkeypatch):
# Feature 1: the pipeline now starts on a status change to In Progress (not on
# creation). _post_created drives that status-change event so these ORCH-6
# routing tests still exercise task creation through the new trigger.
# ORCH-39: the in_progress UUID is now project-specific, so derive it from the
# project being posted to (matches get_project_states resolution above).
_IN_PROGRESS = "b873d9eb-993c-48cd-97ac-99a9b1623967"
def _post_created(plane_project_id, plane_id="wi-1", name="A valid work item title"):
in_progress = _fake_get_project_states(plane_project_id)["in_progress"]
return client.post(
"/webhook/plane",
json={
@@ -123,7 +90,7 @@ def _post_created(plane_project_id, plane_id="wi-1", name="A valid work item tit
"name": name,
"description_stripped": "This is a sufficiently long description.",
"project": plane_project_id,
"state": {"id": in_progress, "name": "In Progress", "group": "started"},
"state": {"id": _IN_PROGRESS, "name": "In Progress", "group": "started"},
},
},
)

View File

@@ -1,138 +0,0 @@
"""ORCH-41: tests for resolve_agent_effort + effort validation + flag assembly.
Mirrors test_resolve_agent_model's 4-level priority for the --effort lever, and
adds:
- validation: a value outside {low,medium,high,xhigh,max} is dropped -> ""
- flag assembly: --model / --effort / --fallback-model are present/absent in
the built command exactly when the resolved value is non-empty.
"""
import os
import tempfile
import pytest
os.environ.setdefault("ORCH_DB_PATH",
os.path.join(tempfile.gettempdir(), "test_orch41_effort.db"))
os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
from src.agents.launcher import (
resolve_agent_effort, resolve_agent_model, VALID_EFFORTS,
)
from src.config import settings
from src import projects as P
from src.projects import ProjectConfig, reload_projects
ORCH_PLANE_ID = "8da6aa25-a60e-44d6-a1e2-d8ae59aa7d6a"
@pytest.fixture(autouse=True)
def _clean_settings(monkeypatch):
monkeypatch.setattr(settings, "agent_effort_default", "high")
for a in ("analyst", "architect", "developer", "reviewer"):
monkeypatch.setattr(settings, f"agent_effort_{a}", "high")
for a in ("tester", "deployer"):
monkeypatch.setattr(settings, f"agent_effort_{a}", "medium")
monkeypatch.setattr(P.settings, "projects_json", "")
reload_projects()
yield
reload_projects()
def _install_registry(monkeypatch, agent_efforts):
reg = [ProjectConfig(
plane_project_id=ORCH_PLANE_ID, repo="orchestrator",
work_item_prefix="ORCH", name="orchestrator",
agent_efforts=agent_efforts,
)]
monkeypatch.setattr(P, "PROJECTS", reg)
monkeypatch.setattr(P, "_BY_PLANE_ID", {p.plane_project_id: p for p in reg})
monkeypatch.setattr(P, "_BY_REPO", {p.repo: p for p in reg})
# ---- default split ----------------------------------------------------------
def test_default_split():
assert resolve_agent_effort("developer") == "high"
assert resolve_agent_effort("architect") == "high"
assert resolve_agent_effort("tester") == "medium"
assert resolve_agent_effort("deployer") == "medium"
# ---- level 4: nothing -> "" -------------------------------------------------
def test_no_config_returns_empty(monkeypatch):
monkeypatch.setattr(settings, "agent_effort_default", "")
monkeypatch.setattr(settings, "agent_effort_tester", "")
assert resolve_agent_effort("tester") == ""
# ---- level 2: per-agent env beats default -----------------------------------
def test_per_agent_env(monkeypatch):
monkeypatch.setattr(settings, "agent_effort_tester", "low")
assert resolve_agent_effort("tester") == "low"
# ---- level 1: project override wins -----------------------------------------
def test_project_override(monkeypatch):
monkeypatch.setattr(settings, "agent_effort_developer", "high")
_install_registry(monkeypatch, {"developer": "xhigh"})
assert resolve_agent_effort("developer", ORCH_PLANE_ID) == "xhigh"
assert resolve_agent_effort("developer") == "high"
# ---- validation: invalid value dropped --------------------------------------
def test_invalid_default_dropped(monkeypatch):
monkeypatch.setattr(settings, "agent_effort_developer", "")
monkeypatch.setattr(settings, "agent_effort_default", "turbo")
assert resolve_agent_effort("developer") == ""
def test_invalid_env_dropped(monkeypatch):
monkeypatch.setattr(settings, "agent_effort_reviewer", "ultra")
assert resolve_agent_effort("reviewer") == ""
def test_invalid_project_override_dropped(monkeypatch):
_install_registry(monkeypatch, {"developer": "bogus"})
assert resolve_agent_effort("developer", ORCH_PLANE_ID) == ""
def test_all_valid_efforts_pass(monkeypatch):
monkeypatch.setattr(settings, "agent_effort_developer", "")
for e in VALID_EFFORTS:
monkeypatch.setattr(settings, "agent_effort_default", e)
assert resolve_agent_effort("developer") == e
# ---- flag assembly (mirror of launcher cmd construction) --------------------
def _build_flags(model, effort, fb):
model_flag = f"--model {model} " if model else ""
effort_flag = f"--effort {effort} " if effort else ""
fb_flag = f"--fallback-model {fb} " if fb else ""
return f"{model_flag}{effort_flag}{fb_flag}"
def test_flags_present_when_configured(monkeypatch):
monkeypatch.setattr(settings, "agent_fallback_model", "claude-sonnet-4-6")
model = resolve_agent_model("developer")
effort = resolve_agent_effort("developer")
fb = settings.agent_fallback_model
flags = _build_flags(model, effort, fb)
assert "--model claude-opus-4-8 " in flags
assert "--effort high " in flags
assert "--fallback-model claude-sonnet-4-6 " in flags
def test_flags_absent_when_empty(monkeypatch):
monkeypatch.setattr(settings, "agent_model_default", "")
monkeypatch.setattr(settings, "agent_model_developer", "")
monkeypatch.setattr(settings, "agent_effort_default", "")
monkeypatch.setattr(settings, "agent_effort_developer", "")
monkeypatch.setattr(settings, "agent_fallback_model", "")
model = resolve_agent_model("developer")
effort = resolve_agent_effort("developer")
fb = settings.agent_fallback_model
flags = _build_flags(model, effort, fb)
assert flags == ""
assert "--model" not in flags
assert "--effort" not in flags
assert "--fallback-model" not in flags

View File

@@ -1,156 +0,0 @@
"""ORCH-41: tests for resolve_agent_model (per-agent + per-project LLM model).
Covers the 4-level resolution priority:
1. ProjectConfig.agent_models[agent] (per-project override, from projects_json)
2. settings.agent_model_<agent> (per-agent env, when non-empty)
3. settings.agent_model_default (global default)
4. "" (no override anywhere -> CLI default)
plus: unknown project_id / no project_id skips level 1, unknown agent skips
level 2, and the frozen ProjectConfig still accepts agent_models (default {}).
We never mutate the module-global registry permanently: tests that need a
custom registry install one via monkeypatch + reload_projects and restore the
default afterwards (autouse fixture).
"""
import os
import tempfile
import pytest
os.environ.setdefault("ORCH_DB_PATH",
os.path.join(tempfile.gettempdir(), "test_orch41_model.db"))
os.environ.setdefault("ORCH_GITEA_TOKEN", "test-token")
os.environ.setdefault("ORCH_PLANE_API_TOKEN", "test-token")
from src.agents.launcher import resolve_agent_model
from src.config import settings
from src import projects as P
from src.projects import ProjectConfig, reload_projects, _parse_projects_json
ORCH_PLANE_ID = "8da6aa25-a60e-44d6-a1e2-d8ae59aa7d6a"
ENDURO_PLANE_ID = "7a79f0a9-5278-49cd-9007-9a338f238f9c"
@pytest.fixture(autouse=True)
def _clean_settings(monkeypatch):
"""Reset all per-agent/default model settings to a known baseline so tests
are order-independent regardless of what other modules set in the env."""
monkeypatch.setattr(settings, "agent_model_default", "claude-opus-4-8")
for a in ("analyst", "architect", "developer", "reviewer", "tester", "deployer"):
monkeypatch.setattr(settings, f"agent_model_{a}", "")
# default registry (no per-project overrides)
monkeypatch.setattr(P.settings, "projects_json", "")
reload_projects()
yield
reload_projects()
def _install_registry(monkeypatch, agent_models):
"""Install a single-project registry for ORCH with the given agent_models."""
reg = [ProjectConfig(
plane_project_id=ORCH_PLANE_ID, repo="orchestrator",
work_item_prefix="ORCH", name="orchestrator",
agent_models=agent_models,
)]
monkeypatch.setattr(P, "PROJECTS", reg)
monkeypatch.setattr(P, "_BY_PLANE_ID", {p.plane_project_id: p for p in reg})
monkeypatch.setattr(P, "_BY_REPO", {p.repo: p for p in reg})
# ---- Level 4: nothing configured -> "" --------------------------------------
def test_no_config_returns_empty(monkeypatch):
monkeypatch.setattr(settings, "agent_model_default", "")
assert resolve_agent_model("developer") == ""
assert resolve_agent_model("developer", ORCH_PLANE_ID) == ""
# ---- Level 3: global default ------------------------------------------------
def test_global_default():
assert resolve_agent_model("developer") == "claude-opus-4-8"
assert resolve_agent_model("architect") == "claude-opus-4-8"
# ---- Level 2: per-agent env beats default -----------------------------------
def test_per_agent_env_overrides_default(monkeypatch):
monkeypatch.setattr(settings, "agent_model_reviewer", "claude-sonnet-4-6")
assert resolve_agent_model("reviewer") == "claude-sonnet-4-6"
# other agents still fall through to default
assert resolve_agent_model("developer") == "claude-opus-4-8"
# ---- Level 1: per-project override beats per-agent env and default ----------
def test_project_override_beats_env_and_default(monkeypatch):
monkeypatch.setattr(settings, "agent_model_developer", "claude-sonnet-4-6")
_install_registry(monkeypatch, {"developer": "claude-opus-4-8"})
assert resolve_agent_model("developer", ORCH_PLANE_ID) == "claude-opus-4-8"
# without project_id, falls back to per-agent env
assert resolve_agent_model("developer") == "claude-sonnet-4-6"
def test_project_override_only_for_listed_agent(monkeypatch):
_install_registry(monkeypatch, {"developer": "claude-opus-4-8"})
# reviewer not in agent_models -> falls back to default
assert resolve_agent_model("reviewer", ORCH_PLANE_ID) == "claude-opus-4-8"
monkeypatch.setattr(settings, "agent_model_reviewer", "claude-sonnet-4-6")
assert resolve_agent_model("reviewer", ORCH_PLANE_ID) == "claude-sonnet-4-6"
# ---- unknown / empty project id skips level 1 -------------------------------
def test_unknown_project_id_skips_override(monkeypatch):
_install_registry(monkeypatch, {"developer": "x-model"})
assert resolve_agent_model("developer", "no-such-uuid") == "claude-opus-4-8"
assert resolve_agent_model("developer", None) == "claude-opus-4-8"
# ---- unknown agent skips per-agent env, still gets default ------------------
def test_unknown_agent_falls_to_default():
assert resolve_agent_model("nonexistent") == "claude-opus-4-8"
# ---- frozen ProjectConfig accepts agent_models ------------------------------
def test_projectconfig_frozen_with_agent_models():
pc = ProjectConfig(
plane_project_id="x", repo="r", work_item_prefix="P", name="n",
agent_models={"developer": "m"},
)
assert pc.agent_models == {"developer": "m"}
# default is an empty dict, not shared/mutable across instances
pc2 = ProjectConfig(plane_project_id="y", repo="r2",
work_item_prefix="P2", name="n2")
assert pc2.agent_models == {}
assert pc2.agent_models is not pc.agent_models
with pytest.raises(Exception):
pc.repo = "changed" # frozen
# ---- projects_json parsing of agent_models / agent_efforts ------------------
def test_parse_projects_json_with_overrides():
raw = (
'[{"plane_project_id":"p1","repo":"orchestrator",'
'"work_item_prefix":"ORCH",'
'"agent_models":{"developer":"claude-opus-4-8","reviewer":"claude-sonnet-4-6"},'
'"agent_efforts":{"developer":"xhigh","tester":"low"}}]'
)
parsed = _parse_projects_json(raw)
assert parsed is not None and len(parsed) == 1
pc = parsed[0]
assert pc.agent_models == {"developer": "claude-opus-4-8",
"reviewer": "claude-sonnet-4-6"}
assert pc.agent_efforts == {"developer": "xhigh", "tester": "low"}
def test_parse_projects_json_omitted_overrides_default_empty():
raw = ('[{"plane_project_id":"p1","repo":"r","work_item_prefix":"P"}]')
parsed = _parse_projects_json(raw)
assert parsed is not None and len(parsed) == 1
assert parsed[0].agent_models == {}
assert parsed[0].agent_efforts == {}
def test_parse_projects_json_malformed_override_ignored():
# agent_models is not an object -> dropped to {}, entry still valid
raw = ('[{"plane_project_id":"p1","repo":"r","work_item_prefix":"P",'
'"agent_models":"oops"}]')
parsed = _parse_projects_json(raw)
assert parsed is not None and parsed[0].agent_models == {}