HydraIssues

Stream session termination: reliability map and open gaps (2026-05-22)
open unclassified Project: hydracluster Reporter: 22 May 2026 06:12

Description

## What works (as of 2026-05-22)

### Versions deployed
- hydracluster: v2.0.95
- hydrabody: v2.0.60 (all 4 active bodies)
- hydraheadipad: v0.2.150 (CI building)

### Termination paths and event coverage

| Path | Trigger | Event emitted | Body idle in |
|---|---|---|---|
| User taps Exit (iPad) | `notifyStreamStopped` → TerminatePending | `terminate_pending` (amber, Success:true) | ≤5s |
| Admin stops body | `DELETE /api/v1/nodes/{id}/stream` | `stream_stop` | ≤15s |
| Session watchdog | body stale >60s OR head-only stale >120s | `terminate_pending` (red, Success:false) | ≤125s |
| Sunshine self-closes (clean Moonlight disconnect) | body tick: streaming→idle | **none — silent** | ≤5s after Sunshine closes |
| Body goes offline while streaming | offline detection loop | **none — silent** | n/a (body is gone) |
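
The coverage map above can also be kept as data, so a check can list the paths that currently emit nothing. The sketch below is illustrative only: the `terminationPath` type, its fields, and `silentPaths` are hypothetical names, not part of hydracluster.

```go
package main

import "fmt"

// terminationPath mirrors one row of the termination table.
// The type and field names are illustrative, not taken from hydracluster.
type terminationPath struct {
	Name    string
	Event   string // empty means no event is emitted
	MaxIdle string // bound on time until the body is idle, as stated in the table
}

var paths = []terminationPath{
	{"User taps Exit (iPad)", "terminate_pending", "5s"},
	{"Admin stops body", "stream_stop", "15s"},
	{"Session watchdog", "terminate_pending", "125s"},
	{"Sunshine self-closes", "", "5s after Sunshine closes"},
	{"Body goes offline while streaming", "", "n/a"},
}

// silentPaths returns the names of paths that emit no event.
func silentPaths(ps []terminationPath) []string {
	var out []string
	for _, p := range ps {
		if p.Event == "" {
			out = append(out, p.Name)
		}
	}
	return out
}

func main() {
	for _, name := range silentPaths(paths) {
		fmt.Println("silent:", name)
	}
}
```

Once Gap 1 + 2 is fixed, both empty `Event` entries would become `stream_stop` and `silentPaths` would return nothing.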

## Open gaps

### Gap 1 + 2: Two `closeSession` calls emit no event

In `pkg/api/handlers_body.go`:
- Line 346: `closeSession(bodyID, "body_idle")` — fires when body self-reports streaming→idle (Sunshine closed after clean Moonlight disconnect). No event.
- Line 1020: `closeSession(bodyID, "body_offline")` — fires when offline detection loop sees a streaming body go offline. No event.

**Fix**: emit `stream_stop` at both call sites.

```go
// after closeSession(nc.node.ID, "body_idle") (line 346)
s.emitEvent(EventRecord{
	Type:     EventStreamStop,
	BodyID:   nc.node.ID,
	BodyName: nc.node.Name,
	Message:  "Sunshine session ended — body self-reported idle",
	Success:  true,
})

// after closeSession(node.ID, "body_offline") (line 1020)
s.emitEvent(EventRecord{
	Type:     EventStreamStop,
	BodyID:   node.ID,
	BodyName: node.Name,
	Message:  "body went offline while streaming",
	Success:  false,
})
```
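
One possible way to keep the two call sites from drifting apart again is to pair the close and the event in a single helper. This is a self-contained sketch under assumptions: `server`, `closeSessionWithEvent`, and the reduced `EventRecord` are stand-ins written for illustration, and the real `closeSession` and `emitEvent` signatures in `handlers_body.go` may differ.

```go
package main

import "fmt"

// EventRecord and EventStreamStop are reduced stand-ins for the real
// hydracluster types; only the fields used in this issue are modelled.
type EventRecord struct {
	Type     string
	BodyID   string
	BodyName string
	Message  string
	Success  bool
}

const EventStreamStop = "stream_stop"

// server is a hypothetical stand-in that records what would be emitted.
type server struct {
	closed []string
	events []EventRecord
}

func (s *server) closeSession(bodyID, reason string) {
	s.closed = append(s.closed, bodyID+":"+reason)
}

func (s *server) emitEvent(e EventRecord) {
	s.events = append(s.events, e)
}

// closeSessionWithEvent (hypothetical) closes the session and always emits
// a stream_stop event, so a call site cannot skip the second step.
func (s *server) closeSessionWithEvent(bodyID, bodyName, reason, msg string, ok bool) {
	s.closeSession(bodyID, reason)
	s.emitEvent(EventRecord{
		Type:     EventStreamStop,
		BodyID:   bodyID,
		BodyName: bodyName,
		Message:  msg,
		Success:  ok,
	})
}

func main() {
	s := &server{}
	s.closeSessionWithEvent("b1", "body-1", "body_idle", "body self-reported idle", true)
	s.closeSessionWithEvent("b2", "body-2", "body_offline", "body went offline while streaming", false)
	fmt.Println(len(s.closed), len(s.events))
}
```

The recording `server` also doubles as a test fake: a regression test can assert that every close produces exactly one event.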

### Gap 3: No body-side orphan detection

The TCP port watchdog (polling ports 47984, 47989, and 48010) was removed in hydrabody v2.0.59 because of NAT table false positives. As a result, nothing on the body side now detects an orphaned stream, where Sunshine is still running but no Moonlight client is connected. Detection depends entirely on the hydracluster 120s head watchdog as a fallback.
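
To make the fallback concrete, here is a minimal sketch of the staleness rule as it appears in the termination table (body stale >60s OR head stale >120s). Only the two thresholds come from this issue. The `session` type, its fields, the reason strings, and the reading of "head-only stale" as "head heartbeat older than 120s while the body is still reporting" are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// Thresholds from the termination table; everything else here is assumed.
const (
	bodyStaleAfter = 60 * time.Second
	headStaleAfter = 120 * time.Second
)

// session is a hypothetical view of what the watchdog tracks.
type session struct {
	lastBodyReport    time.Time
	lastHeadHeartbeat time.Time
}

// shouldTerminate reports whether the watchdog would end the session at now,
// and which side went stale.
func shouldTerminate(s session, now time.Time) (bool, string) {
	switch {
	case now.Sub(s.lastBodyReport) > bodyStaleAfter:
		return true, "body_stale"
	case now.Sub(s.lastHeadHeartbeat) > headStaleAfter:
		return true, "head_stale"
	}
	return false, ""
}

func main() {
	now := time.Now()
	// Orphaned stream: the body still reports, the head went quiet 121s ago.
	orphan := session{
		lastBodyReport:    now.Add(-2 * time.Second),
		lastHeadHeartbeat: now.Add(-121 * time.Second),
	}
	kill, reason := shouldTerminate(orphan, now)
	fmt.Println(kill, reason)
}
```

In the orphan case the body keeps reporting, so only the head branch can fire, which is why the 120s threshold sets the detection time for this gap.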

### Gap 4: Heartbeat failure root cause unknown

hydraheadipad v0.2.150 adds `Heartbeat failed: <error>` logging. Once that build is deployed to the fleet and a stream has run, the logs should show whether silent heartbeat failures are causing premature watchdog kills. After any session that ends unexpectedly, check the log viewer (step 5c of the testbook).
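
If the logs can be exported as text, a small filter can pull out the failure reasons for a session. The sketch below matches only the `Heartbeat failed: ` marker named above. The surrounding line format is unknown, the sample lines are invented, and `heartbeatErrors` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// heartbeatPrefix is the log text named in this issue. The rest of the line
// format (timestamps, levels) is not known, so only the marker is matched.
const heartbeatPrefix = "Heartbeat failed: "

// heartbeatErrors returns the error text that follows each failure marker.
func heartbeatErrors(lines []string) []string {
	var errs []string
	for _, line := range lines {
		if _, after, found := strings.Cut(line, heartbeatPrefix); found {
			errs = append(errs, strings.TrimSpace(after))
		}
	}
	return errs
}

func main() {
	// Invented sample lines, for illustration only.
	logs := []string{
		"stream started",
		"Heartbeat failed: request timed out",
		"stream ended",
	}
	fmt.Println(heartbeatErrors(logs))
}
```

An empty result for a session that the watchdog killed would point away from heartbeat failures as the cause.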