
Mesh networking

Confidential VMs are hostile networking environments: no public IP, no inbound UDP, and the only way in is TCP through the platform’s gateway. AttestMesh forms a full wireguard mesh anyway — with the chain and the gateway as the only infrastructure. No STUN servers, no rendezvous service, no exchanged config.

For each peer, the sidecar reads from the cluster contract:

  • the wireguard public key (pinned at registration),
  • the mesh IP (derived deterministically from the member id inside the cluster’s CIDR, 10.13.0.0/16 in v1), and
  • the member contract address — which on dstack is the CVM’s app id, and the app id determines the node’s gateway hostname:
<app_id>-<port>s.<gateway-domain>

So a node can compute every peer’s reachable endpoint from on-chain data plus one config value (GATEWAY_DOMAIN). Registration and discovery are fully peer-to-peer; there is nothing to coordinate off chain.
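
As a rough illustration of that computation, the sketch below builds a peer's gateway hostname from the pattern above and picks a mesh IP inside the cluster CIDR. The hostname pattern comes from this page; the app id normalisation and the `mesh_ip` derivation (treating the member id as a small integer offset) are assumptions made for the example, not the actual AttestMesh rules.

```python
import ipaddress

WG_INGRESS_PORT = 51900  # TCP ingress port named in this document


def gateway_hostname(app_id: str, port: int, gateway_domain: str) -> str:
    """Build <app_id>-<port>s.<gateway-domain> for a peer."""
    # Assumption: the app id is used lowercase and without a "0x" prefix.
    app = app_id.lower().removeprefix("0x")
    return f"{app}-{port}s.{gateway_domain}"


def mesh_ip(member_id: int, cidr: str = "10.13.0.0/16") -> ipaddress.IPv4Address:
    """HYPOTHETICAL derivation: use the member id as an offset into the CIDR."""
    net = ipaddress.ip_network(cidr)
    usable = net.num_addresses - 2  # skip the network and broadcast addresses
    return net.network_address + 1 + (member_id % usable)


if __name__ == "__main__":
    # "gw.example.com" stands in for the GATEWAY_DOMAIN config value.
    print(gateway_hostname("0xABCdef", WG_INGRESS_PORT, "gw.example.com"))
    print(mesh_ip(0))
```

Whatever the real derivation is, the property that matters is the same: both functions are pure, so every node arrives at identical answers from identical chain state.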

Wireguard speaks UDP; the gateway speaks TCP. The sidecar bridges the two with a small transport layer (the mullvad udp-over-tcp framing — each datagram length-prefixed on a TCP stream):

node A                                    node B
┌────────────────────────┐                ┌────────────────────────┐
│ kernel wireguard       │                │ kernel wireguard       │
│ peer endpoint =        │                │ listen :51821          │
│ 127.0.0.1:<bridge>     │                │        ▲               │
│        │               │   TLS (SNI)    │        │ UDP loopback  │
│ per-peer bridge ───────┼───────────────►│ TCP ingress :51900     │
│ (UDP⇄framed TCP)       │  via gateway   │ (TLS-terminating)      │
└────────────────────────┘                └────────────────────────┘

  • Each node runs a TLS-terminating TCP ingress (port 51900) that every peer can reach through the gateway’s TLS-passthrough route.
  • For each peer, the node spawns a bridge: a loopback UDP socket that kernel wireguard uses as the peer’s endpoint, pumped over a TLS connection to the peer’s ingress.
  • Both sides initiate; wireguard’s endpoint roaming converges on whichever leg is live, and its keepalives re-exercise a fresh TCP leg after any drop.
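
The framing that each bridge applies can be sketched as follows. This assumes a 2-byte big-endian length prefix per datagram, which is how the udp-over-tcp format is commonly described; the function and class names are illustrative only and are not taken from the sidecar.

```python
import struct

MAX_DATAGRAM = 0xFFFF  # largest payload a 2-byte length can describe


def encode_frame(datagram: bytes) -> bytes:
    """Prefix one UDP datagram with its length for transport over TCP."""
    if len(datagram) > MAX_DATAGRAM:
        raise ValueError("datagram too large for a 16-bit length prefix")
    return struct.pack("!H", len(datagram)) + datagram


class FrameDecoder:
    """Reassemble datagrams from arbitrary TCP read chunks."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add received bytes and return every datagram that is now complete."""
        self._buf += chunk
        out = []
        while len(self._buf) >= 2:
            (length,) = struct.unpack_from("!H", self._buf, 0)
            if len(self._buf) < 2 + length:
                break  # wait for the rest of this datagram
            out.append(bytes(self._buf[2:2 + length]))
            del self._buf[:2 + length]
        return out
```

The decoder keeps state between reads because TCP preserves byte order but not datagram boundaries: a single read may return half a frame or several frames at once.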

Two deliberate choices:

  • The TLS on this path is routing, not security. The gateway routes by SNI, so a TLS handshake must happen — but the ingress uses a throwaway self-signed certificate and the bridge verifies nothing. All confidentiality and authenticity come from wireguard itself, whose peer keys come from the chain.
  • Ports are split: the wireguard kernel socket listens on 51821, while in-mesh heartbeats use 51820 inside the tunnel. (Kernel wireguard owns its UDP port exclusively — they cannot share.)
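
To make the first choice concrete, here is a minimal sketch of a client-side TLS setup that sends SNI but verifies nothing, using Python's standard `ssl` module. The function names and the gateway port (443) are assumptions for illustration; the sidecar's own implementation may look quite different.

```python
import socket
import ssl


def routing_only_tls_context() -> ssl.SSLContext:
    """TLS context used purely so the gateway can route by SNI."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False       # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE  # accept the throwaway self-signed cert
    return ctx


def open_leg(peer_hostname: str, gateway_port: int = 443) -> ssl.SSLSocket:
    """Open one TCP leg to a peer's ingress through the gateway.

    Assumption: the gateway accepts TLS on port 443 and routes on the SNI
    value, which is the peer's gateway hostname.
    """
    raw = socket.create_connection((peer_hostname, gateway_port), timeout=10)
    ctx = routing_only_tls_context()
    # server_hostname sets SNI even though hostname checking is off.
    return ctx.wrap_socket(raw, server_hostname=peer_hostname)
```

Skipping verification is only safe here because nothing trusts this layer: a party that intercepts the TLS leg sees wireguard ciphertext and cannot complete a wireguard handshake without a key registered on chain.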

Wireguard needs only chain state, but heartbeat verification needs each peer’s Ed25519 key. Nodes exchange these as PeerEndpoint envelopes: sealed-box encrypted to the recipient’s on-chain X25519 key and sent through MessageFacet.send, so the sender is authenticated by the chain itself and the payload is opaque to everyone else. Each node picks up envelopes addressed to it from MessageSent logs, or sooner when an indexer push wakes it.

Heartbeats carry each sender’s view of who it considers connected (itself included). The cluster is converged when every live member’s view matches the full live set. First convergence latches a permanent gate — the cluster has proven it can fully form — and feeds the node’s healthy state.
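
The convergence rule and its one-way latch can be sketched in a few lines. This assumes the set of live members is already known and that "matches" means set equality; the names `is_converged` and `ConvergenceGate` are hypothetical.

```python
from dataclasses import dataclass


def is_converged(views: dict[str, set[str]], live: set[str]) -> bool:
    """True when every live member reports exactly the full live set."""
    return bool(live) and all(views.get(member) == live for member in live)


@dataclass
class ConvergenceGate:
    """Latches on first convergence and never resets."""

    formed: bool = False

    def observe(self, views: dict[str, set[str]], live: set[str]) -> bool:
        """Record the latest heartbeat views and return the latched state."""
        if is_converged(views, live):
            self.formed = True
        return self.formed


if __name__ == "__main__":
    live = {"a", "b", "c"}
    gate = ConvergenceGate()
    # Member "b" has not yet seen "c", so the cluster is not converged.
    print(gate.observe({"a": live, "b": {"a", "b"}, "c": live}, live))
    # Every view now matches the live set, so the gate latches.
    print(gate.observe({m: set(live) for m in live}, live))
```

Once latched, the gate stays set even if a later round of heartbeats disagrees, which mirrors the "permanent gate" described above.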

The gateway TCP path is the bootstrap, chosen because it is deterministic. Live probes on the target platform confirmed that a later upgrade to direct UDP is viable:

  • outbound UDP works, and the NAT’s mapping is endpoint-independent;
  • a two-sided simultaneous hole-punch succeeds, including hairpin between same-host CVMs (a one-sided attempt fails, because the NAT’s filtering is restricted);
  • a node can learn its own egress IP by resolving its gateway hostname — matching what a STUN query reports, with no STUN involved.

Because every member pair already shares an authenticated channel (the TCP-bootstrapped mesh), punch coordination can happen in-band, keeping the upgrade STUN-free too. Until then, TCP carries the mesh — measurably fine for coordination-sized traffic.