wiki/tasks/orchestrator/reports/dev-2026-06-05-orch34-deploy-hook.md
2026-06-05 09:30:13 +03:00


Dev Report: ORCH-34 - orchestrator deploy hook with health check and auto-rollback

Date: 2026-06-05 Status: DONE

Task

Write scripts/orchestrator-deploy-hook.sh, a deploy hook for orchestrator that provides:

  • Capture of PREV_IMG before the restart (same pattern as enduro-deploy-hook.sh)
  • git pull + docker compose up
  • A health-check loop of 10 attempts × 6 s, up to 60 s in total (passes on HTTP 200 + "status":"ok")
  • Auto-rollback on failure (followed by a short health check of 5 attempts × 3 s)
  • Parameterization through env variables, defaulting to STAGING (NOT prod)
  • A separate PREV_IMAGE_FILE for staging
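
The health-check requirement above (a fixed number of attempts, a pause between them, success only on HTTP 200 with "status":"ok" in the body) could be sketched roughly as follows. This is an illustration, not the contents of scripts/orchestrator-deploy-hook.sh: the function names fetch_health, is_healthy, and wait_healthy and the globals HTTP_CODE and HTTP_BODY are assumptions made for the example.

```bash
# Illustrative sketch only; all function and variable names are hypothetical.

# Fetch the URL once; store the status code and body in HTTP_CODE / HTTP_BODY.
fetch_health() {
  local url="$1" tmp
  tmp=$(mktemp)
  # -s: silent, -m 5: 5 s timeout, -o: body to file, -w: print the status code.
  HTTP_CODE=$(curl -s -m 5 -o "$tmp" -w '%{http_code}' "$url") || HTTP_CODE=000
  HTTP_BODY=$(cat "$tmp")
  rm -f "$tmp"
}

# Healthy means HTTP 200 AND a body containing "status":"ok".
is_healthy() {
  [ "$1" = "200" ] && printf '%s' "$2" | grep -q '"status":"ok"'
}

# wait_healthy URL ATTEMPTS INTERVAL_SECONDS -> 0 if healthy, 1 otherwise.
wait_healthy() {
  local url="$1" attempts="$2" interval="$3" i=1
  while [ "$i" -le "$attempts" ]; do
    echo "deploy-health: attempt $i/$attempts - GET $url"
    fetch_health "$url"
    if is_healthy "$HTTP_CODE" "$HTTP_BODY"; then
      echo "deploy-health: OK (HTTP $HTTP_CODE, body=$HTTP_BODY)"
      return 0
    fi
    echo "deploy-health: not ready yet (HTTP $HTTP_CODE, body=$HTTP_BODY)"
    # Wait before the next attempt, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then sleep "$interval"; fi
    i=$((i + 1))
  done
  echo "deploy-health: FAILED after $attempts attempts"
  return 1
}
```

Under these assumptions, wait_healthy http://localhost:8501/health 10 6 would correspond to the 10 × 6 s budget, and the post-rollback check would pass 5 and 3. The fallback code is assigned rather than echoed, so the value stays a clean three digits when curl cannot connect (curl already prints 000 through -w in that case).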

Done

  • Studied the reference script /home/slin/bin/enduro-deploy-hook.sh
  • Created branch feature/ORCH-34-deploy-hook from a fresh origin/main
  • Wrote scripts/orchestrator-deploy-hook.sh (176 lines)
  • Wrote docs/DEPLOY_HOOK.md
  • Happy-path test: PASS
  • Rollback-path test: PASS (auto-rollback)
  • Proof that prod was not affected
  • git push + PR #30

Changed files

  • scripts/orchestrator-deploy-hook.sh: created (176 lines, chmod +x)
  • docs/DEPLOY_HOOK.md: created (documentation: variables, examples, exit codes)
  • Not touched: src/, tests/, docker-compose.yml, .env, .env.staging, enduro-deploy-hook.sh

Result

Branch

feature/ORCH-34-deploy-hook

Commit

a6cbacb feat(staging): add orchestrator deploy hook with health-check and auto-rollback (ORCH-34)

PR

PR #30: admin/orchestrator#30

Branch check

$ git log --oneline origin/main..origin/feature/ORCH-34-deploy-hook
a6cbacb feat(staging): add orchestrator deploy hook with health-check and auto-rollback (ORCH-34)

Test log

1. Happy path (--deploy against staging)

[2026-06-05T06:23:54Z] Deploy hook called: target=orchestrator-staging port=8501 args=--deploy
[2026-06-05T06:23:54Z] Saved previous image: sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895 -> /home/slin/repos/orchestrator/.deploy-prev-image-staging
[2026-06-05T06:23:54Z] git pull origin main
[2026-06-05T06:23:54Z] Starting orchestrator-staging (profile=staging)
[2026-06-05T06:23:54Z] orchestrator-staging restarted
[2026-06-05T06:23:54Z] Starting health-check: 10 attempts x 6s (max 60s)
[2026-06-05T06:23:54Z] deploy-health: attempt 1/10 - GET http://localhost:8501/health
[2026-06-05T06:23:54Z] deploy-health: OK (HTTP 200, body={"status":"ok","service":"orchestrator"})
[2026-06-05T06:23:54Z] Deploy SUCCESS: orchestrator-staging healthy on port 8501
Exit code: 0

After the test, curl localhost:8501/health returned {"status":"ok","service":"orchestrator"}

2. Rollback path (simulated broken deploy)

Simulation method: an image orchestrator-orchestrator-staging:broken was built (busybox, CMD ["sh", "-c", "exit 1"]) and retagged as :latest. On --deploy the hook:

  1. Captured the SHA of the good image in PREV_IMAGE_FILE
  2. Brought up the "broken" busybox image via docker compose up -d
  3. Ran 10 health-check attempts, all of which failed (container in the Restarting state)
  4. Performed the auto-rollback: retagged the good SHA as latest and restarted
  5. Passed the post-rollback health check on attempt 2/5

[2026-06-05T06:24:54Z] Deploy hook called: target=orchestrator-staging port=8501 args=--deploy
[2026-06-05T06:24:54Z] Saved previous image: sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895 -> /home/slin/repos/orchestrator/.deploy-prev-image-staging
[2026-06-05T06:24:54Z] git pull origin main
[2026-06-05T06:24:54Z] Starting orchestrator-staging (profile=staging)
[2026-06-05T06:24:54Z] orchestrator-staging restarted
[2026-06-05T06:24:54Z] Starting health-check: 10 attempts x 6s (max 60s)
[2026-06-05T06:24:54Z] deploy-health: attempt 1/10 - GET http://localhost:8501/health
[2026-06-05T06:24:54Z] deploy-health: not ready yet (HTTP 000000, body=)
... (attempts 2-9 skipped for brevity, all HTTP 000000) ...
[2026-06-05T06:25:49Z] deploy-health: attempt 10/10 - GET http://localhost:8501/health
[2026-06-05T06:25:49Z] deploy-health: not ready yet (HTTP 000000, body=)
[2026-06-05T06:25:49Z] deploy-health: FAILED after 10 attempts
[2026-06-05T06:25:49Z] deploy FAILED: health not ok after 60s - initiating AUTO ROLLBACK
[2026-06-05T06:25:49Z] ROLLBACK: checking /home/slin/repos/orchestrator/.deploy-prev-image-staging
[2026-06-05T06:25:49Z] ROLLBACK: retagging sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895 -> orchestrator-orchestrator-staging
[2026-06-05T06:25:49Z] ROLLBACK: restarting orchestrator-staging on previous image
[2026-06-05T06:25:49Z] ROLLBACK: container restarted, running post-rollback health check (5x3s)
[2026-06-05T06:25:49Z] ROLLBACK-health: attempt 1/5 - GET http://localhost:8501/health
[2026-06-05T06:25:49Z] ROLLBACK-health: not ready yet (HTTP 000000, body=)
[2026-06-05T06:25:52Z] ROLLBACK-health: attempt 2/5 - GET http://localhost:8501/health
[2026-06-05T06:25:52Z] ROLLBACK-health: OK (HTTP 200, body={"status":"ok","service":"orchestrator"})
[2026-06-05T06:25:52Z] ROLLBACK: service is healthy on previous image (sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895)
[2026-06-05T06:25:52Z] deploy FAILED, rolled back to previous image successfully - exit 1
Exit code: 1 (expected)

After the rollback, curl localhost:8501/health returned {"status":"ok","service":"orchestrator"}
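
The simulation and rollback steps logged above could be reproduced with a sketch along these lines. It is not the actual hook: the function names and argument order are hypothetical, and the use of docker inspect to capture the image ID and of --profile to select the staging service are assumptions based on the log lines ("Saved previous image", "profile=staging").

```bash
# Illustrative sketch only; function names and argument order are hypothetical.

# Record the image ID the running container was started from.
save_prev_image() {
  local container="$1" file="$2" sha
  sha=$(docker inspect --format '{{.Image}}' "$container") || return 1
  [ -n "$sha" ] || return 1
  printf '%s\n' "$sha" > "$file"
}

# Simulate a bad deploy: an image whose container exits immediately.
build_broken_image() {
  local image="$1"
  # "docker build -" reads the Dockerfile from stdin, with no build context.
  docker build -t "$image:broken" - <<'EOF' || return 1
FROM busybox
CMD ["sh", "-c", "exit 1"]
EOF
  docker tag "$image:broken" "$image:latest"
}

# Point :latest back at the recorded image and restart the service.
rollback() {
  local image="$1" service="$2" profile="$3" file="$4" sha
  if [ ! -s "$file" ]; then
    echo "ROLLBACK: no previous image recorded in $file" >&2
    return 2
  fi
  sha=$(cat "$file")
  echo "ROLLBACK: retagging $sha -> $image"
  docker tag "$sha" "$image:latest" || return 1
  echo "ROLLBACK: restarting $service on previous image"
  docker compose --profile "$profile" up -d "$service"
}
```

The key design point is ordering: the image ID has to be saved before the new container starts, because once :latest points at the broken image the tag no longer identifies the good one.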

3. Prod not affected

Prod StartedAt: 2026-06-04T13:08:13.208485681Z  (unchanged)
curl localhost:8500/health: {"status":"ok","service":"orchestrator"}

docker ps:
orchestrator-staging   Up 14 seconds   orchestrator-orchestrator-staging
orchestrator           Up 17 hours     orchestrator-orchestrator
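
A check like the one above can be automated by recording the prod container's start time before the test and comparing it afterwards. The sketch below assumes the prod container is named orchestrator, as in the docker ps output; the function names are hypothetical.

```bash
# Illustrative sketch only; function names are hypothetical.

# Print the container's start time as reported by Docker.
started_at() {
  docker inspect --format '{{.State.StartedAt}}' "$1"
}

# Succeed only if the container's start time still equals the recorded value.
assert_not_restarted() {
  local container="$1" before="$2" after
  after=$(started_at "$container") || return 2
  if [ "$before" != "$after" ]; then
    echo "$container was restarted: $before -> $after" >&2
    return 1
  fi
}
```

Typical use would be before=$(started_at orchestrator) ahead of the staging tests and assert_not_restarted orchestrator "$before" once they finish.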

4. git log

$ git log --oneline origin/main..origin/feature/ORCH-34-deploy-hook
a6cbacb feat(staging): add orchestrator deploy hook with health-check and auto-rollback (ORCH-34)

Problems and solutions

Problem: the heredoc swallowed the quotes

When the script was first written via ssh '...' << 'HEREDOC', the quotes inside the grep and curl commands were lost. The script ended up running grep -q status:ok instead of grep -q '"status":"ok"', and -w %{http_code} instead of -w '%{http_code}'. As a result, every health check failed.

Solution: write the script locally to /tmp/, then transfer it with ssh ... 'cat >' < /tmp/file. Verification: grep -n "grep" on the remote file.
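
The quoting failure and the workaround can be illustrated with a small sketch. The function names, host argument, and destination path are hypothetical; the first function only mimics locally what nested single quotes do to the grep line.

```bash
# Illustrative sketch only; function names, host, and paths are hypothetical.

# Nested single quotes: the inner ' closes the outer string, so the local
# shell strips the double quotes before anything is sent to the remote side.
mangled_line() {
  echo 'grep -q '"status":"ok"''
}

# Write the script locally; a quoted delimiter keeps the body literal.
write_local() {
  cat > "$1" <<'EOF'
grep -q '"status":"ok"'
curl -s -o /dev/null -w '%{http_code}' "$URL"
EOF
}

# Stream the file over ssh; its bytes never pass through shell parsing.
push_script() {
  local src="$1" host="$2" dest="$3"
  ssh "$host" "cat > '$dest'" < "$src"
}

# List the grep lines of the remote copy to confirm the quotes survived.
verify_remote() {
  local host="$1" dest="$2"
  ssh "$host" "grep -n grep '$dest'"
}
```

Because the script body travels on stdin rather than inside a command-line argument, no shell on either side gets a chance to reinterpret its quotes.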

Next step (Stage 5)

Switch prod over: set DEPLOY_HOOK_SCRIPT=/home/slin/repos/orchestrator/scripts/orchestrator-deploy-hook.sh in .env, with env overrides for the prod parameters. This is a separate task.
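
For illustration, staging-by-default parameterization with env overrides could look like the sketch below. The variable names DEPLOY_TARGET, DEPLOY_PORT, and HEALTH_URL are hypothetical and may differ from those documented in docs/DEPLOY_HOOK.md; the default values are taken from the staging log above.

```bash
# Illustrative sketch only; the variable names are hypothetical and may
# differ from those documented in docs/DEPLOY_HOOK.md.

load_config() {
  # Every default points at staging, so a bare invocation cannot touch prod.
  DEPLOY_TARGET="${DEPLOY_TARGET:-orchestrator-staging}"
  DEPLOY_PORT="${DEPLOY_PORT:-8501}"
  PREV_IMAGE_FILE="${PREV_IMAGE_FILE:-/home/slin/repos/orchestrator/.deploy-prev-image-staging}"
  HEALTH_URL="http://localhost:${DEPLOY_PORT}/health"
}
```

With this shape, a prod run would have to opt in explicitly, for example by exporting DEPLOY_TARGET=orchestrator and DEPLOY_PORT=8500 (the prod container name and port seen in the "Prod not affected" check) together with a prod-specific PREV_IMAGE_FILE.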