wiki/tasks/flightradar24/ingest/preprocess/README.md
2026-04-26 13:20:01 +03:00

# Preprocess Service
Reads unprocessed `raw_packets` from PostgreSQL and builds aircraft, flights, tracks, and track_points.
## What it does
1. Waits for PostgreSQL to be ready
2. Reads `fr24.processing_state` to find the last processed `raw_packet_id`
3. Fetches the next batch of unprocessed packets (BATCH_SIZE, default 500)
4. For each packet:
   - Decodes the base64 SBS-1 payload
   - Parses the MSG type: MSG1 = callsign, MSG3 = position, MSG4 = speed/heading/vertical rate
   - Caches MSG4 velocity per icao24 (30 s TTL) and uses it to enrich MSG3 points
   - Skips ground-stationary aircraft (`onground=1`)
   - **M4 outlier filter:** if the computed speed between consecutive points exceeds `MAX_SPEED_MS` (default 350 m/s), the point is dropped and a warning is logged
   - Upserts `aircraft`, creates or reuses a `flight` (a gap of more than 30 min starts a new flight), and appends a `track_point`
5. Advances the cursor in `processing_state`
6. Touches `/tmp/preprocess-ready` for Docker healthcheck
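
The per-packet decode, parse, and enrich steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the function name `parse_packet`, the returned dict layout, and the in-memory `_velocity_cache` are hypothetical, and the field positions assume the standard 22-column SBS-1 (BaseStation) CSV line.

```python
import base64

# Column positions in a 22-field SBS-1 (BaseStation) CSV line (assumed layout).
F_TYPE, F_ICAO, F_CALLSIGN, F_ALT = 1, 4, 10, 11
F_SPEED, F_TRACK, F_LAT, F_LON, F_VRATE, F_GROUND = 12, 13, 14, 15, 16, 21

VELOCITY_TTL_S = 30.0   # MSG4 cache TTL stated in this README
_velocity_cache = {}    # hypothetical: icao24 -> (timestamp, speed, track, vrate)


def _num(value):
    """Return float(value), or None when the CSV field is empty."""
    return float(value) if value.strip() else None


def parse_packet(payload_b64, now):
    """Decode one base64 SBS-1 line; return a dict, or None if it is skipped."""
    line = base64.b64decode(payload_b64).decode("ascii", errors="replace").strip()
    f = line.split(",")
    if len(f) < 22 or f[0] != "MSG":
        return None
    msg_type, icao24 = f[F_TYPE], f[F_ICAO].upper()

    if msg_type == "1":  # identification: callsign only
        return {"kind": "callsign", "icao24": icao24,
                "callsign": f[F_CALLSIGN].strip()}

    if msg_type == "4":  # velocity: remember it for later MSG3 points
        _velocity_cache[icao24] = (
            now, _num(f[F_SPEED]), _num(f[F_TRACK]), _num(f[F_VRATE]))
        return {"kind": "velocity", "icao24": icao24}

    if msg_type == "3":  # position
        if f[F_GROUND].strip() == "1":  # on the ground: skipped
            return None
        lat, lon = _num(f[F_LAT]), _num(f[F_LON])
        if lat is None or lon is None:
            return None
        point = {"kind": "position", "icao24": icao24, "lat": lat, "lon": lon,
                 "alt": _num(f[F_ALT]), "speed": None, "track": None, "vrate": None}
        cached = _velocity_cache.get(icao24)
        if cached and now - cached[0] <= VELOCITY_TTL_S:
            # Enrich the position with the most recent, still-fresh MSG4 values.
            point["speed"], point["track"], point["vrate"] = cached[1:]
        return point

    return None  # other MSG types are ignored in this sketch
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the TTL check easy to test; the real service may well use the packet's own timestamp instead.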
## M4 outlier filter (added 2026-04-26)
Filters coordinate outliers caused by corrupted dump1090 decodes. Algorithm:
- Caches `last_lat/lon/last_point_ts` per icao24 in `aircraft_state`
- On each MSG3 point: computes `haversine_m(last, current) / dt`
- If `dt ≤ 30s` and computed speed > `MAX_SPEED_MS` → skip point, log WARNING
- If `dt > 30 min` (gap/new flight) → reset last_point cache before checking
- Backup of original: `/home/fr24/backups/openclaw/20260426-100236_main.py.bak`
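
One way to read the bullets above is sketched below. It is an illustration under stated assumptions rather than the service's actual code: `accept_point` and the plain-dict `state` are hypothetical names, the Earth radius is the usual mean value of 6,371 km, and points with `dt` between 30 s and 30 min are accepted without a speed check because the bullets do not say otherwise.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (assumption)
MAX_SPEED_MS = 350.0          # default threshold from this README
CHECK_WINDOW_S = 30.0         # speed is only checked when dt <= 30 s
FLIGHT_GAP_S = 30 * 60        # a gap above 30 min resets the cache


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def accept_point(state, icao24, lat, lon, ts):
    """Return True if the point is kept, False if it is dropped as an outlier.

    `state` maps icao24 -> (last_lat, last_lon, last_point_ts).
    """
    last = state.get(icao24)
    if last is not None and ts - last[2] > FLIGHT_GAP_S:
        last = None  # long gap: treat this as the first point of a new flight
    if last is not None:
        dt = ts - last[2]
        if 0 < dt <= CHECK_WINDOW_S:
            speed = haversine_m(last[0], last[1], lat, lon) / dt
            if speed > MAX_SPEED_MS:
                return False  # outlier: keep the previous good point cached
    state[icao24] = (lat, lon, ts)
    return True
```

Leaving the cache untouched when a point is rejected means the next point is compared against the last good position, so a single corrupted decode does not poison the following checks.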
## Dependencies
- `psycopg2-binary`
- Otherwise standard library only (`math`, `os`, `base64`, `json`, etc.); no other packages
## Environment variables
| Variable | Default | Description |
|---|---|---|
| `POSTGRES_HOST` | required | DB host |
| `POSTGRES_PORT` | `5432` | DB port |
| `POSTGRES_DB` | required | DB name |
| `POSTGRES_USER` | required | DB user |
| `POSTGRES_PASSWORD` | required | DB password |
| `POLL_INTERVAL_SECONDS` | `5.0` | How often to poll for new packets |
| `BATCH_SIZE` | `500` | Packets per processing batch |
| `MAX_SPEED_MS` | `350.0` | M4 outlier threshold in m/s (roughly Mach 1 at sea level) |
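
As a rough sketch of how the table above could be read at startup: the `Settings` dataclass and `load_settings` helper below are hypothetical names, not taken from `main.py`, and failing fast on missing required variables is an assumption about the desired behaviour.

```python
import os
from dataclasses import dataclass

REQUIRED = ("POSTGRES_HOST", "POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD")


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    db: str
    user: str
    password: str
    poll_interval_seconds: float
    batch_size: int
    max_speed_ms: float


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        host=env["POSTGRES_HOST"],
        port=int(env.get("POSTGRES_PORT", "5432")),
        db=env["POSTGRES_DB"],
        user=env["POSTGRES_USER"],
        password=env["POSTGRES_PASSWORD"],
        poll_interval_seconds=float(env.get("POLL_INTERVAL_SECONDS", "5.0")),
        batch_size=int(env.get("BATCH_SIZE", "500")),
        max_speed_ms=float(env.get("MAX_SPEED_MS", "350.0")),
    )
```

Accepting an optional mapping makes the loader testable without touching the real process environment.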
## Run locally
```bash
pip install -r requirements.txt
export POSTGRES_HOST=localhost POSTGRES_DB=fr24 POSTGRES_USER=fr24 POSTGRES_PASSWORD=change-me
python main.py
```
## Build & run via Docker
```bash
cd /home/fr24/projects/fr24/compose
sg docker -c 'docker compose build preprocess && docker compose up -d preprocess'
```
## Logs
Normal operation:
```
2026-04-26T10:00:00 [preprocess] INFO Processed 46 packets (cursor→5384624, total=4002247)
```
M4 outlier detected:
```
2026-04-26T10:00:01 [preprocess] WARNING M4 outlier icao24=ABC123 speed=1240 m/s — skipping
```