# Monitoring VPS Health Over MAVLink

The VPS reports its health on the same MAVLink link you already use for telemetry, so you can monitor it from QGroundControl, Mission Planner, MAVProxy, a Lua script on the autopilot, or any other MAVLink consumer.

Two messages do the work:

1. A standard [`HEARTBEAT`](https://mavlink.io/en/messages/common.html#HEARTBEAT) that tells you the VPS lifecycle state (booting, ready, active, failed, etc.).
2. A [`NAMED_VALUE_INT`](https://mavlink.io/en/messages/common.html#NAMED_VALUE_INT) named `VPS_ERR` that carries a numeric error code explaining *why* the VPS is in its current state.

You only need the heartbeat for a quick "is it alive and active?" check. Add `VPS_ERR` when you want to know what to do about it.

## At a Glance

### HEARTBEAT

Sent at 1 Hz with [`type = MAV_TYPE_ONBOARD_CONTROLLER` (18)](https://mavlink.io/en/messages/common.html#MAV_TYPE_ONBOARD_CONTROLLER). The field you care about is `system_status`, a standard [`MAV_STATE`](https://mavlink.io/en/messages/common.html#MAV_STATE) value. The most common ones to watch for:

* **`MAV_STATE_STANDBY` (3)** — VPS is ready.
* **`MAV_STATE_ACTIVE` (4)** — VPS is operating normally.
* **`MAV_STATE_CRITICAL` (5)** — VPS is running but degraded.
* **`MAV_STATE_EMERGENCY` (6)** — VPS has failed.

If the heartbeat stops entirely, treat that as "VPS not running" — the heartbeat only flows while the VPS service is alive.
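A missing heartbeat can be detected with a small watchdog that remembers when the last beat arrived. The sketch below is illustrative only and is not part of the VPS software: the class name `HeartbeatWatchdog` and the 3-second timeout (roughly three missed beats at 1 Hz) are assumptions you should tune for your link.

```python
import time


class HeartbeatWatchdog:
    """Tracks how long ago the last VPS HEARTBEAT arrived."""

    def __init__(self, timeout_s=3.0, clock=time.monotonic):
        # Assumed threshold: about three missed beats at the 1 Hz rate.
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen = None  # None until the first heartbeat arrives

    def on_heartbeat(self):
        """Call this each time a VPS HEARTBEAT is received."""
        self._last_seen = self._clock()

    def is_alive(self):
        """Return True if a heartbeat arrived within the timeout window."""
        if self._last_seen is None:
            return False
        return (self._clock() - self._last_seen) <= self.timeout_s
```

Call `on_heartbeat()` whenever a `HEARTBEAT` with `type` 18 arrives, and poll `is_alive()` from your monitoring loop. The injectable `clock` argument exists only so the timing logic can be exercised without waiting in real time.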

### VPS_ERR

Sent at 1 Hz as a `NAMED_VALUE_INT` with `name = "VPS_ERR"`. The integer value is the error code. **`0` means everything is healthy.** Any non-zero value identifies a specific condition.

`VPS_ERR` is published by a separate monitoring service that runs independently of the VPS itself, so it keeps reporting even when the VPS has crashed or has been stopped. This is the channel to listen on if you want to know *why* the heartbeat went away.
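Because the two channels are independent, they can be combined to explain a silent heartbeat. The following is a minimal sketch, assuming you already track heartbeat liveness and the most recent `VPS_ERR` value yourself. The function name `explain_vps_state` and the returned strings are hypothetical. The code ranges (20 for a stopped service, 30–50 for crashes) come from the reference tables in this guide.

```python
def explain_vps_state(heartbeat_alive, vps_err):
    """Combine heartbeat liveness with the latest VPS_ERR value (or None)."""
    if heartbeat_alive:
        return "VPS service is running"
    if vps_err is None:
        # Assumption: if both channels are silent, suspect the link itself.
        return "no heartbeat and no VPS_ERR: check the MAVLink link"
    if vps_err == 20:
        return "VPS service is stopped"
    if 30 <= vps_err <= 50:
        return f"VPS service crashed (code {vps_err})"
    return f"no heartbeat, VPS_ERR={vps_err}"
```

Treating "both channels silent" as a link problem is an interpretation, not documented behavior; adjust it to match how your system is wired.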

## How to Read These Messages

If you are using QGroundControl or Mission Planner, both messages are visible in the standard MAVLink Inspector. `VPS_ERR` appears alongside any other `NAMED_VALUE_INT` from the vehicle.

From pymavlink:

```python
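from pymavlink import mavutil

# Listen for MAVLink traffic on UDP port 14550; adjust to match your link.
conn = mavutil.mavlink_connection('udpin:0.0.0.0:14550')

while True:
    # Only wake up for the two message types the health check needs.
    msg = conn.recv_match(type=['HEARTBEAT', 'NAMED_VALUE_INT'], blocking=True)
    if msg is None:
        continue
    # Other components also send heartbeats; keep only the onboard controller (type 18).
    if msg.get_type() == 'HEARTBEAT' and msg.type == mavutil.mavlink.MAV_TYPE_ONBOARD_CONTROLLER:
        print(f"VPS lifecycle: MAV_STATE={msg.system_status}")
    elif msg.get_type() == 'NAMED_VALUE_INT' and msg.name.startswith('VPS_ERR'):
        print(f"VPS_ERR={msg.value}")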
```

From an ArduPilot Lua script, use `mavlink:receive_chan()` and filter on message ID 252 (`NAMED_VALUE_INT`) with the name `VPS_ERR`.

## Common Conditions

These are the codes you are most likely to see during normal use. The full table is at the end of this guide.

| `VPS_ERR` | What it means                                           | What to check                                                             |
| --------- | ------------------------------------------------------- | ------------------------------------------------------------------------- |
| `0`       | Healthy                                                 | Nothing — all systems nominal                                             |
| `1`       | No autopilot heartbeat received yet                     | Serial / UART connection between the VPS and the flight controller        |
| `2`       | Time sync between VPS and autopilot is still converging | Wait — this clears on its own once enough samples arrive                  |
| `3`       | No camera detected                                      | USB / CSI camera connection                                               |
| `4`       | Camera detected but not streaming                       | Power-cycle the camera or restart the camera service                      |
| `5`       | Missing autopilot odometry streams                      | `ATTITUDE` and `LOCAL_POSITION_NED` stream rates on the flight controller |
| `20`      | VPS service is stopped (clean shutdown)                 | Start the service                                                         |
| `30–50`   | VPS service crashed                                     | See the crash codes table below for the specific reason                   |

Codes 1–5 are normal during startup and clear themselves as each subsystem comes up. Any code of 20 or above means operator attention is needed.
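The code ranges used in this guide can be turned into a coarse severity check. This is a minimal sketch; the function name `classify_vps_err` and the category labels are illustrative choices, while the numeric ranges follow the reference tables at the end of this guide.

```python
def classify_vps_err(code):
    """Map a VPS_ERR value to a coarse category based on its range."""
    if code == 0:
        return "healthy"
    if 1 <= code <= 9:
        return "startup"   # boot and waiting states, normally self-clearing
    if 20 <= code <= 29:
        return "service"   # VPS process is not running
    if 30 <= code <= 50:
        return "crash"     # VPS service exited with an error
    return "unknown"       # outside every documented range
```

For example, `classify_vps_err(2)` falls in the startup range, while `classify_vps_err(31)` is reported as a crash.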

## What to Expect During a Healthy Boot

When the system powers on, you will typically see `VPS_ERR` step through these values as each subsystem initializes:

```
1  -> waiting for autopilot heartbeat
2  -> heartbeat received, time sync converging
3  -> time sync done, waiting for camera
0  -> all subsystems up, VPS healthy
```

In parallel, `system_status` in the heartbeat moves from `BOOT` (1) through `CALIBRATING` (2) and `STANDBY` (3) to `ACTIVE` (4).

If `VPS_ERR` gets stuck on a non-zero value for more than about 30 seconds, that subsystem has a problem worth investigating.
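The 30-second rule of thumb can be automated by tracking how long the same non-zero code has been reported. The sketch below is one possible approach, not a prescribed implementation: the class name `StuckCodeDetector` is hypothetical, and the default limit simply mirrors the "about 30 seconds" guidance above.

```python
import time


class StuckCodeDetector:
    """Flags a non-zero VPS_ERR value that has not changed for too long."""

    def __init__(self, limit_s=30.0, clock=time.monotonic):
        self.limit_s = limit_s  # mirrors the ~30 s guidance; tune as needed
        self._clock = clock
        self._code = None       # last code seen
        self._since = None      # time at which that code first appeared

    def update(self, code):
        """Record the latest VPS_ERR value; return True if it looks stuck."""
        now = self._clock()
        if code != self._code:
            # The code changed, so restart the timer.
            self._code = code
            self._since = now
        return code != 0 and (now - self._since) > self.limit_s
```

Feed every received `VPS_ERR` value into `update()`. A healthy value of `0` never counts as stuck, and any change of code resets the timer.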

## Reference Tables

### `MAV_STATE` Mapping

The `system_status` field in the heartbeat reflects the VPS lifecycle:

| `MAV_STATE`             | Value | Meaning                                                              |
| ----------------------- | ----- | -------------------------------------------------------------------- |
| `MAV_STATE_UNINIT`      | 0     | Not yet initialized                                                  |
| `MAV_STATE_BOOT`        | 1     | Starting up — loading config, waiting for autopilot, or syncing time |
| `MAV_STATE_CALIBRATING` | 2     | Waiting for the camera to start streaming                            |
| `MAV_STATE_STANDBY`     | 3     | Ready, waiting to be armed                                           |
| `MAV_STATE_ACTIVE`      | 4     | Operationally active                                                 |
| `MAV_STATE_CRITICAL`    | 5     | Running with issues                                                  |
| `MAV_STATE_EMERGENCY`   | 6     | Failed                                                               |
| `MAV_STATE_POWEROFF`    | 7     | Stopped                                                              |
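When logging, it is often more readable to print the state name than the raw number. The lookup below is a small sketch that covers only the values listed in the table above; the names `MAV_STATE_NAMES` and `mav_state_label` are illustrative.

```python
# Values taken from the MAV_STATE table in this guide.
MAV_STATE_NAMES = {
    0: "MAV_STATE_UNINIT",
    1: "MAV_STATE_BOOT",
    2: "MAV_STATE_CALIBRATING",
    3: "MAV_STATE_STANDBY",
    4: "MAV_STATE_ACTIVE",
    5: "MAV_STATE_CRITICAL",
    6: "MAV_STATE_EMERGENCY",
    7: "MAV_STATE_POWEROFF",
}


def mav_state_label(system_status):
    """Return the MAV_STATE name for a heartbeat system_status value."""
    return MAV_STATE_NAMES.get(system_status, f"UNKNOWN({system_status})")
```

Pass the `system_status` field of a received `HEARTBEAT` to `mav_state_label()` to get a printable name; values not in the table are returned as `UNKNOWN(n)` rather than raising an error.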

### `VPS_ERR` Codes

#### Boot and waiting states (1–9)

Code `0` is the healthy state. Codes 1–9 are normal during startup and clear themselves once the corresponding subsystem comes up.

| Code | Name                      | Message                                                                                                   |
| ---- | ------------------------- | --------------------------------------------------------------------------------------------------------- |
| `0`  | `NONE`                    | System healthy                                                                                            |
| `1`  | `WAITING_MAV_HEARTBEAT`   | No autopilot detected. Check serial connection.                                                           |
| `2`  | `WAITING_TIMESYNC`        | MAV connected, time sync converging.                                                                      |
| `3`  | `WAITING_CAMERA_IMAGES`   | Camera not publishing images. Check USB/CSI.                                                              |
| `4`  | `CAMERA_OPEN_FAILED`      | Camera detected but not streaming. Restart usb-camera service.                                            |
| `5`  | `WAITING_MAV_EKF_STREAMS` | Not receiving `ATTITUDE` / `LOCAL_POSITION_NED` from the autopilot. Check stream rates and FC connection. |

#### Service-level errors (20–29)

These are reported when the VPS process itself is not running.

| Code | Name                  | Message                  |
| ---- | --------------------- | ------------------------ |
| `20` | `SERVICE_NOT_RUNNING` | VPS service not running. |

#### Crash reasons (30–50)

When the VPS service has exited with an error, the code identifies the failure category.

| Code | Name               | Message                                                      |
| ---- | ------------------ | ------------------------------------------------------------ |
| `30` | `CRASH_UNKNOWN`    | VPS crashed. Check logs.                                     |
| `31` | `CRASH_LICENSE`    | VPS crashed: license validation failed.                      |
| `32` | `CRASH_CONFIG`     | VPS crashed: invalid configuration file.                     |
| `33` | `CRASH_MAVLINK`    | VPS crashed: MAVLink init failed. Check serial connection.   |
| `34` | `CRASH_TIMESYNC`   | VPS crashed: time sync init failed.                          |
| `35` | `CRASH_ECAL`       | VPS crashed: internal messaging init failed.                 |
| `36` | `CRASH_MAP`        | VPS crashed: map matcher failed. Check GPS fix and map data. |
| `37` | `CRASH_VIO`        | VPS crashed: VIO init failed.                                |
| `38` | `CRASH_EKF`        | VPS crashed: EKF init failed.                                |
| `39` | `CRASH_CAMERA`     | VPS crashed: camera pipeline failed.                         |
| `40` | `CRASH_GEOSPATIAL` | VPS crashed: geospatial data error.                          |
| `41` | `CRASH_MODEL`      | VPS crashed: inference model failed to load.                 |
| `42` | `CRASH_FILESYSTEM` | VPS crashed: file I/O error.                                 |
| `50` | `CRASH_SIGNAL`     | VPS killed by signal (segfault/abort).                       |

A crash code is informational — it tells you *what* failed so you know where to look. For full diagnostics, collect the system logs and contact support.
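For log output or alerting, the codes in the tables above can be mapped to their names. This is a minimal sketch that transcribes the tables in this guide; the identifiers `VPS_ERR_NAMES` and `vps_err_name` are hypothetical, and codes not listed here are reported as unlisted rather than guessed.

```python
# Transcribed from the VPS_ERR tables in this guide.
VPS_ERR_NAMES = {
    0: "NONE",
    1: "WAITING_MAV_HEARTBEAT",
    2: "WAITING_TIMESYNC",
    3: "WAITING_CAMERA_IMAGES",
    4: "CAMERA_OPEN_FAILED",
    5: "WAITING_MAV_EKF_STREAMS",
    20: "SERVICE_NOT_RUNNING",
    30: "CRASH_UNKNOWN",
    31: "CRASH_LICENSE",
    32: "CRASH_CONFIG",
    33: "CRASH_MAVLINK",
    34: "CRASH_TIMESYNC",
    35: "CRASH_ECAL",
    36: "CRASH_MAP",
    37: "CRASH_VIO",
    38: "CRASH_EKF",
    39: "CRASH_CAMERA",
    40: "CRASH_GEOSPATIAL",
    41: "CRASH_MODEL",
    42: "CRASH_FILESYSTEM",
    50: "CRASH_SIGNAL",
}


def vps_err_name(code):
    """Return the documented name for a VPS_ERR code, or a placeholder."""
    return VPS_ERR_NAMES.get(code, f"UNLISTED_{code}")
```

Including the name next to the numeric code in your logs makes it easier to match an event against these tables when collecting diagnostics.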


