auto-sync: 2026-06-05 09:30:01

Stream
2026-06-05 09:30:13 +03:00
parent 96bae96796
commit f4474f35bf

@@ -0,0 +1,135 @@
# Dev Report: ORCH-34 - orchestrator deploy hook with health check and auto-rollback
Date: 2026-06-05
Status: DONE
## Task
Write `scripts/orchestrator-deploy-hook.sh`, a deploy hook for orchestrator that:
- Captures PREV_IMG before the restart (the pattern from enduro-deploy-hook.sh)
- Runs git pull + docker compose up
- Runs a health loop of 10×6 s = up to 60 s (HTTP 200 + `"status":"ok"`)
- Rolls back automatically on failure (plus a short 5×3 s health check)
- Is parameterized via env, with STAGING as the default (NOT prod)
- Uses a separate PREV_IMAGE_FILE for staging
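The health loop from the list above can be sketched roughly as follows. This is a minimal POSIX sh illustration, not the code from the actual hook: the function names `health_ok` and `wait_healthy`, the `wh_` variable prefix, and the 5-second curl timeout are assumptions made for the example. Only the success criterion (HTTP 200 plus `"status":"ok"` in the body) and the attempt counts come from the report.

```shell
#!/bin/sh
# Hypothetical sketch of the health loop; not the code from the actual hook.

# health_ok CODE BODY: true only for HTTP 200 with "status":"ok" in the body.
health_ok() {
  [ "$1" = "200" ] && printf '%s' "$2" | grep -q '"status":"ok"'
}

# wait_healthy URL ATTEMPTS DELAY: poll URL until health_ok passes or attempts run out.
wait_healthy() {
  wh_i=1
  while [ "$wh_i" -le "$2" ]; do
    # -w prints the status code on a final line of its own;
    # curl reports 000 there when no HTTP response arrived.
    wh_resp=$(curl -s --max-time 5 -w '\n%{http_code}' "$1") || true
    wh_code=$(printf '%s\n' "$wh_resp" | tail -n 1)
    wh_body=$(printf '%s\n' "$wh_resp" | sed '$d')
    if health_ok "$wh_code" "$wh_body"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$wh_i" -lt "$2" ]; then
      sleep "$3"
    fi
    wh_i=$((wh_i + 1))
  done
  return 1
}
```

Under these assumptions, the deploy check would be `wait_healthy http://localhost:8501/health 10 6` and the shorter post-rollback check `wait_healthy http://localhost:8501/health 5 3`.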
## Done
- [x] Studied the reference script `/home/slin/bin/enduro-deploy-hook.sh`
- [x] Created branch `feature/ORCH-34-deploy-hook` from a fresh origin/main
- [x] Wrote `scripts/orchestrator-deploy-hook.sh` (176 lines)
- [x] Wrote `docs/DEPLOY_HOOK.md`
- [x] Happy-path test: PASS
- [x] Rollback-path test: PASS (auto-rollback)
- [x] Proof that prod was not affected
- [x] git push + PR #30
## Changed files
- `scripts/orchestrator-deploy-hook.sh`: created (176 lines, `chmod +x`)
- `docs/DEPLOY_HOOK.md`: created (documentation: variables, examples, exit codes)
- Not touched: src/, tests/, docker-compose.yml, .env, .env.staging, enduro-deploy-hook.sh
## Result
### Branch
`feature/ORCH-34-deploy-hook`
### Commit
```
a6cbacb feat(staging): add orchestrator deploy hook with health-check and auto-rollback (ORCH-34)
```
### PR
PR #30: https://git.mva154.duckdns.org/admin/orchestrator/pulls/30
### Branch verification
```
$ git log --oneline origin/main..origin/feature/ORCH-34-deploy-hook
a6cbacb feat(staging): add orchestrator deploy hook with health-check and auto-rollback (ORCH-34)
```
---
## Test log
### 1. Happy path (`--deploy` against staging)
```
[2026-06-05T06:23:54Z] Deploy hook called: target=orchestrator-staging port=8501 args=--deploy
[2026-06-05T06:23:54Z] Saved previous image: sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895 -> /home/slin/repos/orchestrator/.deploy-prev-image-staging
[2026-06-05T06:23:54Z] git pull origin main
[2026-06-05T06:23:54Z] Starting orchestrator-staging (profile=staging)
[2026-06-05T06:23:54Z] orchestrator-staging restarted
[2026-06-05T06:23:54Z] Starting health-check: 10 attempts x 6s (max 60s)
[2026-06-05T06:23:54Z] deploy-health: attempt 1/10 - GET http://localhost:8501/health
[2026-06-05T06:23:54Z] deploy-health: OK (HTTP 200, body={"status":"ok","service":"orchestrator"})
[2026-06-05T06:23:54Z] Deploy SUCCESS: orchestrator-staging healthy on port 8501
Exit code: 0
```
After the test: `curl localhost:8501/health` = `{"status":"ok","service":"orchestrator"}`
### 2. Rollback path (simulated broken deploy)
**Simulation method:** an image `orchestrator-orchestrator-staging:broken` was created (busybox, `CMD ["sh", "-c", "exit 1"]`) and retagged as `:latest`. On `--deploy` the hook:
1. Captured the SHA of the good image in PREV_IMAGE_FILE
2. `docker compose up -d` started the "broken" busybox image
3. All 10 health-check attempts failed (container in the Restarting state)
4. Auto-rollback: retagged the good SHA → latest, then restarted
5. Post-rollback health check, attempt 2/5: OK
```
[2026-06-05T06:24:54Z] Deploy hook called: target=orchestrator-staging port=8501 args=--deploy
[2026-06-05T06:24:54Z] Saved previous image: sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895 -> /home/slin/repos/orchestrator/.deploy-prev-image-staging
[2026-06-05T06:24:54Z] git pull origin main
[2026-06-05T06:24:54Z] Starting orchestrator-staging (profile=staging)
[2026-06-05T06:24:54Z] orchestrator-staging restarted
[2026-06-05T06:24:54Z] Starting health-check: 10 attempts x 6s (max 60s)
[2026-06-05T06:24:54Z] deploy-health: attempt 1/10 - GET http://localhost:8501/health
[2026-06-05T06:24:54Z] deploy-health: not ready yet (HTTP 000000, body=)
... (attempts 2-9 skipped for brevity, all HTTP 000000) ...
[2026-06-05T06:25:49Z] deploy-health: attempt 10/10 - GET http://localhost:8501/health
[2026-06-05T06:25:49Z] deploy-health: not ready yet (HTTP 000000, body=)
[2026-06-05T06:25:49Z] deploy-health: FAILED after 10 attempts
[2026-06-05T06:25:49Z] deploy FAILED: health not ok after 60s - initiating AUTO ROLLBACK
[2026-06-05T06:25:49Z] ROLLBACK: checking /home/slin/repos/orchestrator/.deploy-prev-image-staging
[2026-06-05T06:25:49Z] ROLLBACK: retagging sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895 -> orchestrator-orchestrator-staging
[2026-06-05T06:25:49Z] ROLLBACK: restarting orchestrator-staging on previous image
[2026-06-05T06:25:49Z] ROLLBACK: container restarted, running post-rollback health check (5x3s)
[2026-06-05T06:25:49Z] ROLLBACK-health: attempt 1/5 - GET http://localhost:8501/health
[2026-06-05T06:25:49Z] ROLLBACK-health: not ready yet (HTTP 000000, body=)
[2026-06-05T06:25:52Z] ROLLBACK-health: attempt 2/5 - GET http://localhost:8501/health
[2026-06-05T06:25:52Z] ROLLBACK-health: OK (HTTP 200, body={"status":"ok","service":"orchestrator"})
[2026-06-05T06:25:52Z] ROLLBACK: service is healthy on previous image (sha256:9d1b64767f6cac77f2faa2adece2d952c227e6f84bd99c1eb1ed63bad8c9d895)
[2026-06-05T06:25:52Z] deploy FAILED, rolled back to previous image successfully - exit 1
Exit code: 1 (expected)
```
After the rollback: `curl localhost:8501/health` = `{"status":"ok","service":"orchestrator"}`
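The capture-and-rollback sequence seen in this log can be sketched as two small functions. This is an illustration under assumptions, not the hook's actual code: the function names, the argument order, and the idea that the compose service is called `orchestrator-staging` are hypothetical. The `docker inspect`, `docker tag`, and `docker compose up -d` calls are standard Docker CLI commands.

```shell
#!/bin/sh
# Hypothetical sketch of capture + rollback; names and arguments are illustrative.

# save_prev_image CONTAINER FILE: record the image ID the container runs right now.
save_prev_image() {
  spi_img=$(docker inspect --format '{{.Image}}' "$1" 2>/dev/null) || return 1
  [ -n "$spi_img" ] || return 1
  printf '%s\n' "$spi_img" > "$2"
}

# rollback_image FILE IMAGE_NAME PROFILE SERVICE:
# point IMAGE_NAME:latest back at the saved ID and restart the service.
rollback_image() {
  if [ ! -s "$1" ]; then
    echo "ROLLBACK: no saved image in $1" >&2
    return 2
  fi
  rb_img=$(cat "$1")
  docker tag "$rb_img" "$2:latest" || return 1
  docker compose --profile "$3" up -d "$4"
}
```

Saving the image ID before the restart is what makes the rollback possible: once `:latest` points at the broken image, the tag alone no longer identifies the last good build.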
### 3. Prod untouched
```
Prod StartedAt: 2026-06-04T13:08:13.208485681Z (unchanged)
curl localhost:8500/health: {"status":"ok","service":"orchestrator"}
docker ps:
orchestrator-staging Up 14 seconds orchestrator-orchestrator-staging
orchestrator Up 17 hours orchestrator-orchestrator
```
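One way to produce this kind of proof is to read the container's `State.StartedAt` before and after the staging run and compare the two values. The sketch below is an assumption about how such a check could be scripted. The function names are hypothetical, and the report does not say how the timestamp was actually collected.

```shell
#!/bin/sh
# Hypothetical sketch: show a container was not restarted by comparing State.StartedAt.

# started_at CONTAINER: print the container's start timestamp.
started_at() {
  docker inspect --format '{{.State.StartedAt}}' "$1"
}

# assert_not_restarted CONTAINER BEFORE: fail if the timestamp differs from BEFORE.
assert_not_restarted() {
  anr_now=$(started_at "$1") || return 1
  if [ "$anr_now" = "$2" ]; then
    echo "$1 StartedAt: $anr_now (unchanged)"
  else
    echo "$1 was restarted: $2 -> $anr_now" >&2
    return 1
  fi
}
```

Typical use would be `before=$(started_at orchestrator)`, then the staging deploy, then `assert_not_restarted orchestrator "$before"`.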
### 4. git log
```
$ git log --oneline origin/main..origin/feature/ORCH-34-deploy-hook
a6cbacb feat(staging): add orchestrator deploy hook with health-check and auto-rollback (ORCH-34)
```
---
## Problems and solutions
### Problem: the heredoc ate the quotes
When the script was first written via `ssh '...' << 'HEREDOC'`, the quotes inside the grep and curl commands were lost. The script ran `grep -q status:ok` instead of `grep -q '"status":"ok"'` and `-w %{http_code}` instead of `-w '%{http_code}'`. As a result, every health check failed.
**Solution:** write the script locally to `/tmp/`, then transfer it via `ssh ... 'cat >'< /tmp/file`. Verification: `grep -n "grep"` on the remote file.
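The effect is easy to reproduce locally. The report does not show the exact mechanism of the original incident, so the sketch below only demonstrates the general cause: text that passes through one extra round of shell parsing loses a layer of quotes. A local `sh -c` stands in for the remote shell here, and all variable names are illustrative.

```shell
#!/bin/sh
# Illustration only: a local `sh -c` stands in for the extra shell layer.

body='{"status":"ok","service":"orchestrator"}'
pattern='"status":"ok"'

# Unprotected: the inner shell parses the double quotes away.
lost=$(sh -c "printf '%s' $pattern")

# Protected: the pattern travels as a positional argument and is never re-parsed.
kept=$(sh -c 'printf "%s" "$1"' sh "$pattern")

printf 'lost=%s\nkept=%s\n' "$lost" "$kept"

# Only the intact pattern matches the health response.
if printf '%s' "$body" | grep -q "$lost"; then echo "lost: match"; else echo "lost: no match"; fi
if printf '%s' "$body" | grep -q "$kept"; then echo "kept: match"; else echo "kept: no match"; fi
```

The stripped pattern `status:ok` never occurs in the response body, because the body has `":"` between the two words. A health check that greps for it fails even against a healthy service.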
## Next step (Stage 5)
Switch prod over: set `DEPLOY_HOOK_SCRIPT=/home/slin/repos/orchestrator/scripts/orchestrator-deploy-hook.sh` in `.env`, with env overrides for the prod parameters. This is a separate task.
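An env override with staging as the default can be sketched with the shell's `${VAR:=default}` expansion. The variable names `DEPLOY_TARGET`, `DEPLOY_PORT`, and `HEALTH_URL` and the function `load_deploy_params` are hypothetical; the real names are documented in `docs/DEPLOY_HOOK.md`, which is not part of this report. The default values are taken from the staging logs above.

```shell
#!/bin/sh
# Hypothetical sketch of staging-by-default parameters; variable names are illustrative.

load_deploy_params() {
  # := assigns the default only when the variable is unset or empty,
  # so anything exported by the caller wins.
  : "${DEPLOY_TARGET:=orchestrator-staging}"
  : "${DEPLOY_PORT:=8501}"
  : "${PREV_IMAGE_FILE:=/home/slin/repos/orchestrator/.deploy-prev-image-staging}"
  HEALTH_URL="http://localhost:${DEPLOY_PORT}/health"
}

load_deploy_params
echo "target=${DEPLOY_TARGET} port=${DEPLOY_PORT} health=${HEALTH_URL}"
```

With this pattern, a prod invocation would have to set every prod-specific value explicitly (for example `DEPLOY_TARGET=orchestrator DEPLOY_PORT=8500` plus its own PREV_IMAGE_FILE), while a bare call can only ever touch staging.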