Stack monitoring
Every provisioned stack gets two complementary layers of health monitoring out of the box: an off-box port:22 probe that detects when the host stops answering, and an opt-in on-box agent that reports disk, RAM, and Docker telemetry. Both surface in the stack detail page; the off-box probe is what alerts you when the host is genuinely down.
The two layers
| Off-box port monitor | On-box Heartbeat agent | |
|---|---|---|
| Answers | Is the host reachable right now? | What's my disk %, RAM, container count, load? |
| Where it runs | OwnStack's monitoring service, externally | On the stack itself |
| Works when the host is dead | Yes — that's the point. | No — the agent dies with the host. |
| Detect time on a crash | ~60–180 s | Whenever the agent times out (~5 min) |
| Install required | No (auto-provisioned) | Yes — install via Stack → Add-ons → Heartbeat |
They're not competing tools — they're two layers of the same product. The off-box probe is liveness; the on-box agent is telemetry.
Off-box port monitor
When a stack reaches provisioned, OwnStack creates a port-22 monitor in the Heartbeat service. It probes tcp://<instance_ip>:22 every 60 seconds. Two consecutive failures (≈ 180 s) marks the host as down; the next success marks it recovered.
On a down event:
- Heartbeat POSTs an HMAC-signed webhook to the control plane.
- The control plane writes a
host_unreachablealert against the stack and stampsunreachable_since. - Account users with
notify_host_unreachable=trueget an email. - The stack list card switches to a red dot; the stack detail's Host status panel flips to Unreachable for X.
On recovery the same flow runs in reverse: alert resolved, last_reachable_at stamped, recovery email sent.
On-box agent (Heartbeat add-on)
Opt-in per stack from Stack → Add-ons → Heartbeat → Install agent. The agent installs over SSH, runs under systemd, and pushes disk/RAM/docker/load every 60 s. The data drives the Health panel on the stack page (disk-use gauge, container counts, alerts at 80% / 90% disk).
The agent doesn't replace the off-box probe — when the host dies the agent dies too. Keep both.
Host actions
The Host status panel exposes provider-level controls for AWS stacks:
| Action | What it runs | When the button shows |
|---|---|---|
| Start | StartInstances | Provider state is stopped / shutting-down |
| Stop | StopInstances | Provider state is running (confirms with EBS-billing warning) |
| Restart | RebootInstances | Provider state is running, or the host is unreachable |
| Console | GetConsoleOutput | Always (for AWS stacks with an instance) |
Each action is recorded as a StackHostAction row with the before/after provider state, user, and result. The Timeline section on the stack detail lists them alongside alert events.
Auto-recovery (opt-in)
Turn the toggle on in the Host status panel and pick a threshold (default 5 min). If the off-box probe reports the host unreachable for at least that long and AWS still says the instance is running, OwnStack fires a restart on its own, recorded with triggered_by=auto_recovery. It's rate-limited to one auto-recovery per 30 minutes per stack to avoid loops.
A reboot fixes a hung kernel and a wedged systemd, but it also drops in-flight requests and forces every container to restart. Use it for hosts you'd rather lose 60 s of traffic on than leave dark for hours; leave it off for stacks where a graceful shutdown matters.
The serial console
For AWS stacks, the Host status panel's "View log" link calls GetConsoleOutput against EC2. That returns the most recent ~64 KB of hypervisor-captured serial output — the boot log, kernel messages, OOM dumps, panic traces. It's what you reach for when the OS is too sick to SSH into. The output is a snapshot, not a stream.
From the CLI
$ ownstack stack health <stack>
=== node-hub-538 ===
IP: 44.237.78.27
Provider state: running
Host reachability: ● Active (port:22, 60s probe)
On-box agent: ○ Not installed
Active alerts (1):
• [critical] host_unreachable — TCP timeout (2026-05-10T18:13:25Z)
Read next
- Stacks overview
- App resource limits — pair host monitoring with per-app memory caps so one runaway can't take the host down.
- CLI: ownstack stack