Running a node

An AttestMesh node is a dstack CVM running two containers: your application and the AttestMesh sidecar (cluster-mesh-agent). The sidecar owns everything mesh-related; the application consumes a simple gRPC API over a unix socket.

Boot phases

The sidecar’s health endpoint (:9090/healthz) reports a phase you can watch through the dstack gateway:

Phase	Meaning
`booting`	Deriving keys from the TEE, discovering the member contract and cluster.
`registering`	Building the attestation proof; submitting the sponsored `dstack_register` UserOperation. Retries until the operator’s allowlist entries land.
`subscribing`	Registration confirmed; starting mesh bring-up and indexer discovery.
`waiting-peers`	Registered, but no other members on chain yet.
`wg-configuring`	Peers found; configuring WireGuard over the gateway transport.
`pulling-csk`	Mesh up; acquiring the Cluster Shared Key.
`heartbeating`	Signed heartbeats flowing; waiting for convergence.
`healthy`	Converged and CSK held. The application gate opens.

The endpoint returns HTTP 200 only when healthy:

{"csk_acquired":true,"first_converged":true,"live_peers":1,"phase":"healthy"}

first_converged latches once and never resets: brief peer flaps after first convergence lower live_peers but do not close the gate.
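A readiness check against this endpoint can be sketched as a small polling loop. This is a minimal sketch, not the sidecar's own tooling: the URL (sidecar reachable on localhost), the function names, and the retry counts are assumptions. The loop relies only on the documented behaviour that HTTP 200 means healthy, and it treats any other status or a connection error as "not yet".

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple

# Assumption: the sidecar's health port is reachable on localhost.
HEALTH_URL = "http://127.0.0.1:9090/healthz"


def fetch_health(url: str = HEALTH_URL, timeout: float = 2.0) -> Tuple[int, dict]:
    """Return (HTTP status, parsed JSON body); the body is {} if it is not a JSON object."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, raw = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise HTTPError; the body may still carry the phase.
        status, raw = err.code, err.read()
    try:
        body = json.loads(raw)
    except ValueError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    return status, body


def wait_until_healthy(
    fetch: Callable[[], Tuple[int, dict]] = fetch_health,
    attempts: int = 60,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the endpoint answers 200, printing each phase change along the way."""
    last_phase = None
    for _ in range(attempts):
        try:
            status, body = fetch()
        except OSError:
            # Connection refused or timed out: the sidecar is not serving yet.
            status, body = 0, {}
        phase = body.get("phase")
        if phase != last_phase:
            print(f"phase: {phase}")
            last_phase = phase
        if status == 200:
            return body
        sleep(delay)
    raise TimeoutError("sidecar did not report healthy in time")
```

Calling `wait_until_healthy()` with the defaults blocks for up to roughly two minutes; the `fetch` and `sleep` parameters exist so the loop can be exercised without a running sidecar.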

What the sidecar maintains

WireGuard mesh — one attestmesh0 interface; one peer entry per cluster member, keys and mesh IPs pinned from chain state. Transport bootstraps over the dstack gateway (see Mesh networking).
Heartbeats — Ed25519-signed liveness packets every 2 seconds inside the mesh. A peer is live only when its heartbeats verify against the key learned from its encrypted endpoint envelope.
Event intake — a verified subscription to the shared Indexer for low-latency pushes, with direct chain-log polling always running underneath. Either path alone is sufficient.
CSK custody — held in memory, zeroized on drop, re-acquirable on every restart without operator involvement.
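The heartbeat rule above (a peer counts as live only while verified heartbeats keep arriving) can be illustrated with a small tracker. This is a sketch under stated assumptions, not the sidecar's implementation: the 2-second interval comes from the text, but the tolerance of three missed beats, the class and method names, and the injected `verify` callable (standing in for Ed25519 verification against the peer's learned key) are all hypothetical.

```python
import time
from typing import Callable, Dict

HEARTBEAT_INTERVAL_S = 2.0  # from the text: one heartbeat every 2 seconds
MISSED_BEATS_ALLOWED = 3    # assumption: the real liveness window is not documented


class LivenessTracker:
    """Tracks the last verified heartbeat per peer (illustrative only)."""

    def __init__(
        self,
        verify: Callable[[str, bytes, bytes], bool],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # verify(peer_id, payload, signature) stands in for Ed25519 verification
        # against the key learned from the peer's endpoint envelope.
        self._verify = verify
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def on_heartbeat(self, peer_id: str, payload: bytes, signature: bytes) -> bool:
        """Record the heartbeat only if its signature verifies."""
        if not self._verify(peer_id, payload, signature):
            return False
        self._last_seen[peer_id] = self._clock()
        return True

    def is_live(self, peer_id: str) -> bool:
        """A peer is live if a verified heartbeat arrived within the window."""
        seen = self._last_seen.get(peer_id)
        if seen is None:
            return False
        return self._clock() - seen <= HEARTBEAT_INTERVAL_S * MISSED_BEATS_ALLOWED

    def live_peers(self) -> int:
        return sum(1 for peer_id in self._last_seen if self.is_live(peer_id))
```

Unverified packets never touch the tracker's state, so a forged heartbeat cannot keep a dead peer alive or make an unknown peer appear live.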

The application API

The sidecar serves gRPC on a unix socket (default /var/run/attestmesh/agent.sock). Your application container never touches keys, ciphertexts, the chain, or the Indexer — only this surface:

RPC	Purpose
`GetMeshStatus`	Phase, gates, live peer count — the same data as `/healthz`.
`ListPeers`	Member ids, mesh IPs, liveness.
`SendMessage`	Encrypts to the recipient’s on-chain key and submits a sponsored `MessageFacet.send`.
`SubscribeMessages`	Stream of messages addressed to this node, already decrypted and sender-authenticated.
`SubscribePeerEvents`	Joins and liveness transitions.
`GetClusterSharedKey`	The CSK — only after the health gates pass.

Gate your application’s startup on the sidecar’s health endpoint (compose depends_on + healthcheck), then talk to peers over their mesh IPs.
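A Compose fragment for this gating might look like the following. It is a sketch only: the service names, image references, shared volume name, and timing values are hypothetical, and it assumes that `curl` is available inside the sidecar image and that the endpoint answers with an HTTP error status while the node is not yet healthy (the text only guarantees 200 when healthy).

```yaml
services:
  mesh-agent:
    image: example/cluster-mesh-agent:latest   # hypothetical image reference
    volumes:
      - attestmesh-run:/var/run/attestmesh      # exposes agent.sock to the app
    healthcheck:
      # curl -f exits non-zero on HTTP error statuses, so the check passes
      # only once /healthz returns 200. Assumes curl exists in the image.
      test: ["CMD", "curl", "-fsS", "http://127.0.0.1:9090/healthz"]
      interval: 5s
      timeout: 3s
      retries: 60
      start_period: 30s

  app:
    image: example/my-app:latest                # hypothetical image reference
    depends_on:
      mesh-agent:
        condition: service_healthy              # start only after the gate opens
    volumes:
      - attestmesh-run:/var/run/attestmesh

volumes:
  attestmesh-run:
```

Keep `retries` generous: if the sidecar is marked unhealthy before it finishes registering and converging, the dependent application container will not start.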

Node identity and restarts

Everything that defines a node is derived from its TEE identity:

Keys are deterministic per app identity: a restart re-derives the same X25519, Ed25519, and WireGuard keys and the same EIP-4337 owner key. There is no key material to back up.
Registration is restart-safe — the sidecar detects its existing member record and skips straight to mesh bring-up.
The CSK is recovered, not stored: the originator re-derives it and checks it against the on-chain commitment; everyone else re-pulls it from a live peer over the mesh.

A node can be restarted, re-imaged (with an allowlisted compose), or moved without any state hand-off — the chain plus the TEE seed reconstruct everything. A two-node cluster typically returns to healthy in under two minutes after a simultaneous restart of both nodes.
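The reason no state hand-off is needed can be shown with a toy derivation. This is an illustration under assumptions, not the sidecar's actual scheme: the text does not say which KDF, labels, or commitment format AttestMesh uses, so HKDF-SHA256, the label strings, and a plain SHA-256 commitment are stand-ins, and the outputs are raw 32-byte secrets rather than fully formed keys.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical labels: one per key the text says is re-derived on restart.
KEY_LABELS = ("x25519", "ed25519", "wireguard", "eip4337-owner")


def hkdf_sha256(seed: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and the default all-zero salt."""
    prk = hmac.new(b"\x00" * 32, seed, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step: T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_node_keys(tee_seed: bytes) -> Dict[str, bytes]:
    """Same seed in, same secrets out: nothing needs to be persisted."""
    return {label: hkdf_sha256(tee_seed, label.encode()) for label in KEY_LABELS}


def csk_matches_commitment(candidate_csk: bytes, commitment: bytes) -> bool:
    """Assumed commitment format: SHA-256 of the CSK, compared in constant time."""
    return hmac.compare_digest(hashlib.sha256(candidate_csk).digest(), commitment)
```

Because every secret is a pure function of the seed and a label, calling `derive_node_keys` after a restart yields byte-for-byte the same values, and a re-derived CSK can be checked against a public commitment before it is trusted.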