Swarm SDK session establishment: X3DH prekey bundles and the initial drone-to-drone handshake
The problem: asynchronous session establishment
The Double Ratchet provides per-message forward secrecy and post-compromise recovery once a session is established — but it requires both parties to agree on an initial shared secret first. For drone swarms, this initial key agreement has an awkward constraint: the target drone might not be online when the session is initiated. A drone that just entered radio range cannot wait for a full round-trip handshake before sending the first targeting message.
The Swarm SDK solves this with Extended Triple Diffie-Hellman (X3DH), the same asynchronous key agreement protocol used by Signal. X3DH allows Drone A to establish a shared secret with Drone B and send an encrypted message in a single transmission, without Drone B being online. When Drone B comes back online, it derives the same shared secret, decrypts the message, and the Double Ratchet session is running.
Key material: the prekey bundle
Before a drone can receive X3DH-initiated sessions, it must publish a prekey bundle to the swarm gossip mesh. The bundle contains four components:
/// Published to the gossip mesh; stored by peers for async session init.
#[derive(Serialize, Deserialize)]
pub struct PrekeyBundle {
/// Long-term identity key — X25519 public key (or ML-KEM-768 encapsulation key).
/// Signed by the device's Ed25519 signing key.
pub identity_key_pub: IdentityKeyPublic,
/// Signed prekey — rotated every 7 days.
/// Includes a timestamp and an Ed25519 signature by identity_key_pub.
pub signed_prekey: SignedPreKey,
/// One-time prekeys — consumed once and never reused.
/// The peer picks one from this list on session init; the drone removes it.
pub one_time_prekeys: Vec<OneTimePreKey>,
/// Device certificate signed by the Fleet CA.
/// Allows the initiator to verify this bundle belongs to a legitimate fleet member.
pub device_cert: DeviceCertificate,
}
#[derive(Serialize, Deserialize)]
pub struct SignedPreKey {
pub key_id: u32,
pub public_key: X25519PublicKey,
pub signature: Ed25519Signature, // over key_id ++ public_key ++ timestamp
pub timestamp: u64, // Unix epoch seconds
}
#[derive(Serialize, Deserialize)]
pub struct OneTimePreKey {
pub key_id: u32,
pub public_key: X25519PublicKey, // No signature — identity_key covers the bundle
}
The signed prekey (SignedPreKey) is rotated on a 7-day schedule via the gossip mesh; the rotation process is described in the key management article. One-time prekeys (OTPs) are single-use: each is consumed by exactly one session initiation and never reused. This prevents replay attacks on the initial handshake.
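The comment on SignedPreKey.signature says the signature covers key_id ++ public_key ++ timestamp, and verify_bundle calls a signing_message() helper without showing it. A minimal sketch of what such a helper might look like, assuming fixed-width big-endian integers (the byte order and the free-function form are assumptions; the article only gives the field order):

```rust
/// Hypothetical layout for the bytes covered by a signed prekey's signature:
/// key_id ++ public_key ++ timestamp.
/// Assumption: integers are encoded big-endian at fixed width.
pub fn signing_message(key_id: u32, public_key: &[u8; 32], timestamp: u64) -> Vec<u8> {
    let mut msg = Vec::with_capacity(4 + 32 + 8);
    msg.extend_from_slice(&key_id.to_be_bytes()); // 4 bytes
    msg.extend_from_slice(public_key); // 32 bytes (X25519 public key)
    msg.extend_from_slice(&timestamp.to_be_bytes()); // 8 bytes, Unix epoch seconds
    msg
}
```

Because every field has a fixed width, the 44-byte encoding is unambiguous. Including the timestamp in the signed bytes means a peer relaying the bundle cannot re-date an old signed prekey without invalidating the signature.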
Prekey provisioning: setup and auto-refresh
The Swarm SDK generates prekey bundles during drone provisioning and automatically replenishes them when the supply runs low:
use swarm_sdk::{SwarmAgent, PrekeyConfig};
// Provision 20 one-time prekeys at startup
let mut drone = SwarmAgent::new(identity, transport)?;
drone.setup_prekeys(PrekeyConfig {
initial_count: 20,
refresh_threshold: 3, // request more when < 3 remain
batch_size: 10, // replenish 10 at a time
})?;
// The SDK auto-refreshes when count drops below threshold.
// Replenishment happens asynchronously over the gossip mesh —
// the drone announces new OTPs via PrekeysAvailable gossip message.
If all OTPs are exhausted (high-churn environment), the SDK falls back to the signed prekey alone. This weakens the protocol's forward secrecy guarantees slightly: with no OTP in the mix, an attacker who later compromises the drone's identity key and signed prekey could decrypt sessions established during the fallback. In exchange, session establishment continues without blocking. OTP exhaustion is logged and reported to the fleet ground station for operational awareness.
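The threshold and fallback behaviour described above can be sketched as simple bookkeeping over key ids. This is an illustration only: OtpPool, PoolConfig, and PrekeyChoice are hypothetical names, no key material is generated, and the real SDK replenishes asynchronously over the gossip mesh rather than through a synchronous call.

```rust
use std::collections::VecDeque;

/// Mirrors the three PrekeyConfig fields used in the article.
#[derive(Clone, Copy)]
pub struct PoolConfig {
    pub initial_count: usize,
    pub refresh_threshold: usize,
    pub batch_size: usize,
}

/// What a new session ends up using.
#[derive(Debug, PartialEq)]
pub enum PrekeyChoice {
    OneTime(u32),     // key_id of the consumed OTP
    SignedPrekeyOnly, // pool exhausted: fallback with weaker forward secrecy
}

/// Hypothetical bookkeeping for one drone's OTP key ids.
pub struct OtpPool {
    cfg: PoolConfig,
    available: VecDeque<u32>,
    next_id: u32,
}

impl OtpPool {
    pub fn new(cfg: PoolConfig) -> Self {
        let mut pool = OtpPool { cfg, available: VecDeque::new(), next_id: 0 };
        pool.add(cfg.initial_count);
        pool
    }

    fn add(&mut self, n: usize) {
        for _ in 0..n {
            self.available.push_back(self.next_id);
            self.next_id += 1;
        }
    }

    pub fn remaining(&self) -> usize {
        self.available.len()
    }

    /// Hand out one OTP id, or report the signed-prekey-only fallback.
    pub fn take(&mut self) -> PrekeyChoice {
        match self.available.pop_front() {
            Some(id) => PrekeyChoice::OneTime(id),
            None => PrekeyChoice::SignedPrekeyOnly,
        }
    }

    /// True when fewer than `refresh_threshold` OTPs remain.
    pub fn needs_refresh(&self) -> bool {
        self.available.len() < self.cfg.refresh_threshold
    }

    /// Add one batch if below the threshold; returns how many ids were added.
    pub fn refresh_if_needed(&mut self) -> usize {
        if self.needs_refresh() {
            self.add(self.cfg.batch_size);
            self.cfg.batch_size
        } else {
            0
        }
    }
}
```

With the article's settings (20 initial, threshold 3, batch 10), the pool stays quiet until the 18th session is initiated, at which point 2 OTPs remain and a batch of 10 is due.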
The X3DH handshake: four DH operations
When Drone A wants to establish a session with Drone B (using B's prekey bundle), it performs four Diffie-Hellman operations (three if no one-time prekey is available) and combines the results with HKDF-SHA-256:
/// X3DH session initiation on the sender (Drone A) side.
pub fn initiate_x3dh_session(
sender_identity: &DeviceIdentity, // A's long-term keys
recipient_bundle: &PrekeyBundle, // B's published prekey bundle
plaintext: &[u8], // first message payload, encrypted in step 4
) -> Result<(X3dhInitMessage, SharedSecret)> {
// 1. Generate a fresh ephemeral keypair for this session only
let ephemeral = X25519KeyPair::generate();
// 2. Select one of B's one-time prekeys (or fall back to the signed prekey alone)
let otpk = recipient_bundle.one_time_prekeys.first();
// 3. Four DH operations (Signal X3DH spec, Section 3.3):
//
// DH1 = DH(A_ik, B_spk) sender identity × recipient signed prekey
// DH2 = DH(A_ek, B_ik) sender ephemeral × recipient identity
// DH3 = DH(A_ek, B_spk) sender ephemeral × recipient signed prekey
// DH4 = DH(A_ek, B_otpk) sender ephemeral × recipient one-time prekey
//
// Combined secret: HKDF(DH1 || DH2 || DH3 || DH4)
let dh1 = x25519_dh(&sender_identity.ik_x25519.private,
&recipient_bundle.signed_prekey.public_key);
let dh2 = x25519_dh(&ephemeral.private,
&recipient_bundle.identity_key_pub.x25519);
let dh3 = x25519_dh(&ephemeral.private,
&recipient_bundle.signed_prekey.public_key);
let dh4 = otpk.map(|k| x25519_dh(&ephemeral.private, &k.public_key));
let ikm = chain(&dh1, &dh2, &dh3, dh4.as_deref());
let shared_secret = hkdf_sha256(ikm, INFO_X3DH)?;
// 4. Package the init message — recipient needs the ephemeral public key
// and the consumed OTP key_id to reconstruct DH4 on their side.
let init_msg = X3dhInitMessage {
sender_ik_pub: sender_identity.ik_x25519.public.clone(),
ephemeral_pub: ephemeral.public.clone(),
otpk_id: otpk.map(|k| k.key_id),
// First message payload is encrypted with shared_secret immediately:
ciphertext: aead_encrypt(&shared_secret, plaintext)?,
};
Ok((init_msg, shared_secret))
}
Recipient processing: reconstructing the shared secret
When Drone B receives an X3dhInitMessage, it reconstructs the same four DH operations to derive the shared secret:
pub fn receive_x3dh_session(
recipient_identity: &DeviceIdentity,
signed_prekey: &SignedPreKeyPair, // B's SPK private key
one_time_prekeys: &mut PreKeyStore, // B's OTP private keys
msg: &X3dhInitMessage,
) -> Result<(DoubleRatchetSession, Vec<u8>)> {
// Look up and consume the OTP private key (removes it from the store)
let otpk_pair = msg.otpk_id
.and_then(|id| one_time_prekeys.consume(id));
// Mirror the four DH operations (reversed roles)
let dh1 = x25519_dh(&signed_prekey.private, &msg.sender_ik_pub);
let dh2 = x25519_dh(&recipient_identity.ik_x25519.private, &msg.ephemeral_pub);
let dh3 = x25519_dh(&signed_prekey.private, &msg.ephemeral_pub);
let dh4 = otpk_pair.as_ref()
.map(|k| x25519_dh(&k.private, &msg.ephemeral_pub));
let ikm = chain(&dh1, &dh2, &dh3, dh4.as_deref());
let shared_secret = hkdf_sha256(ikm, INFO_X3DH)?;
// Decrypt the first message payload
let plaintext = aead_decrypt(&shared_secret, &msg.ciphertext)?;
// Bootstrap a Double Ratchet session from the shared secret
// The sender's ephemeral key becomes the initial DH ratchet public key
let session = DoubleRatchetSession::from_x3dh(
shared_secret,
&msg.ephemeral_pub, // initial ratchet state
);
Ok((session, plaintext))
}
The critical property: once consume(id) removes the OTP private key from the store, that DH operation can never be reproduced by anyone, not even the recipient. This is forward secrecy for the initial handshake itself, not just the subsequent Double Ratchet messages.
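To make the role mirroring and the single-use property concrete, here is a toy model that swaps X25519 for modular exponentiation over a small prime. It is deliberately insecure and exists only to show that both sides compute the same four values and that a consumed OTP cannot be used twice. ToyPreKeyStore, sender_dh, and recipient_dh are hypothetical names, not SDK APIs.

```rust
use std::collections::HashMap;

// Toy group parameters for illustration only: NOT secure, NOT X25519.
const P: u64 = 2_305_843_009_213_693_951; // 2^61 - 1
const G: u64 = 3;

/// Square-and-multiply modular exponentiation mod P.
fn mod_pow(base: u64, mut exp: u64) -> u64 {
    let m = P as u128;
    let mut b = base as u128 % m;
    let mut acc: u128 = 1;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * b % m;
        }
        b = b * b % m;
        exp >>= 1;
    }
    acc as u64
}

pub fn public_key(private: u64) -> u64 {
    mod_pow(G, private)
}

pub fn dh(private: u64, peer_public: u64) -> u64 {
    mod_pow(peer_public, private)
}

/// Hypothetical stand-in for the article's PreKeyStore: key_id -> OTP private key.
pub struct ToyPreKeyStore {
    keys: HashMap<u32, u64>,
}

impl ToyPreKeyStore {
    pub fn new() -> Self {
        ToyPreKeyStore { keys: HashMap::new() }
    }

    pub fn insert(&mut self, id: u32, private: u64) {
        self.keys.insert(id, private);
    }

    /// Removes and returns the private key; a second call with the same id yields None.
    pub fn consume(&mut self, id: u32) -> Option<u64> {
        self.keys.remove(&id)
    }
}

/// Sender side: DH1..DH3, plus DH4 when an OTP public key is available.
pub fn sender_dh(a_ik: u64, a_ek: u64, b_ik_pub: u64, b_spk_pub: u64, b_otpk_pub: Option<u64>) -> Vec<u64> {
    let mut out = vec![dh(a_ik, b_spk_pub), dh(a_ek, b_ik_pub), dh(a_ek, b_spk_pub)];
    if let Some(otpk) = b_otpk_pub {
        out.push(dh(a_ek, otpk));
    }
    out
}

/// Recipient side: the same values from B's private keys and A's public keys.
pub fn recipient_dh(b_ik: u64, b_spk: u64, b_otpk: Option<u64>, a_ik_pub: u64, a_ek_pub: u64) -> Vec<u64> {
    let mut out = vec![dh(b_spk, a_ik_pub), dh(b_ik, a_ek_pub), dh(b_spk, a_ek_pub)];
    if let Some(otpk) = b_otpk {
        out.push(dh(otpk, a_ek_pub));
    }
    out
}
```

If the same init message is processed a second time, consume returns None, the recipient derives only three values, and the resulting key no longer matches the sender's, so AEAD decryption of the replayed payload would fail.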
Post-quantum adaptation: replacing DH1 with ML-KEM-768
Classical X25519 X3DH is vulnerable to harvest-now-decrypt-later: an adversary who records an X3dhInitMessage today and later has a quantum computer can break DH1–DH4 and decrypt the session. The Swarm SDK's post-quantum mode replaces the classical identity key exchange (DH1) with ML-KEM-768:
// PQ prekey bundle — identity key uses ML-KEM-768 encapsulation key
// instead of X25519 for DH1 (all others remain X25519)
pub struct PqPrekeyBundle {
pub ik_kem_pub: MlKem768EncapKey, // replaces X25519 for DH1
pub ik_x25519_pub: X25519PublicKey, // still used in DH2 (classical fallback)
pub signed_prekey: SignedPreKey,
pub one_time_prekeys: Vec<OneTimePreKey>,
pub device_cert: DeviceCertificate,
}
// Sender side: DH1 becomes KEM encapsulation
let (ciphertext_kem, shared_secret_kem) =
ml_kem_768_encap(&recipient_bundle.ik_kem_pub)?;
// DH2, DH3, DH4 remain X25519 as before
let dh2 = x25519_dh(&ephemeral.private, &recipient_bundle.ik_x25519_pub);
let dh3 = x25519_dh(&ephemeral.private, &recipient_bundle.signed_prekey.public_key);
let dh4 = otpk.map(|k| x25519_dh(&ephemeral.private, &k.public_key));
// Combined key: HKDF over KEM output + classical DH outputs
let shared_secret = hkdf_sha256(
chain(&shared_secret_kem, &dh2, &dh3, dh4.as_deref()),
INFO_X3DH_PQ,
)?;
// The init message carries the KEM ciphertext so the recipient can decapsulate
let init_msg = PqX3dhInitMessage {
kem_ciphertext: ciphertext_kem, // recipient decapsulates to get shared_secret_kem
ephemeral_pub: ephemeral.public,
sender_ik_x25519_pub: sender_identity.ik_x25519.public,
otpk_id: otpk.map(|k| k.key_id),
ciphertext: aead_encrypt(&shared_secret, plaintext)?,
};
This is the same hybrid approach used throughout the Swarm SDK: classical X25519 for the DH operations that don't need KEM encapsulation, ML-KEM-768 for the one operation where quantum resistance matters most (the long-term identity key exchange). Breaking the session now requires breaking both ML-KEM-768 and X25519 simultaneously.
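Both the classical and the post-quantum snippets feed HKDF through a chain helper that is called but never shown. A minimal sketch, assuming it is plain concatenation in a fixed order (which matches the "DH1 || DH2 || DH3 || DH4" comment in the sender code); the signature below is a guess, not the SDK's:

```rust
/// Hypothetical `chain`: concatenates key-agreement outputs in a fixed order
/// to form the HKDF input keying material. In PQ mode the first input is the
/// ML-KEM-768 shared secret instead of DH1. The fourth input is skipped when
/// no one-time prekey was used.
pub fn chain(first: &[u8], dh2: &[u8], dh3: &[u8], dh4: Option<&[u8]>) -> Vec<u8> {
    let extra = dh4.map_or(0, |d| d.len());
    let mut ikm = Vec::with_capacity(first.len() + dh2.len() + dh3.len() + extra);
    ikm.extend_from_slice(first);
    ikm.extend_from_slice(dh2);
    ikm.extend_from_slice(dh3);
    if let Some(d) = dh4 {
        ikm.extend_from_slice(d);
    }
    ikm
}
```

X25519 outputs and ML-KEM-768 shared secrets are both 32 bytes, so the concatenation has no ambiguous boundaries: 128 bytes with an OTP, 96 without. Since all inputs pass through a single HKDF call, recovering the session key requires every one of them, which is where the hybrid guarantee comes from.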
The SDK API surface
From an application perspective, X3DH session establishment is one API call:
use swarm_sdk::SwarmAgent;
// Establish a session and send the first message — Drone B need not be online
let drone_b_bundle = swarm.get_prekey_bundle(drone_b_did).await?;
drone_a.init_session_x3dh(
drone_b_did,
drone_b_bundle,
b"initial targeting data", // first message, encrypted in the init packet
).await?;
// Later, when Drone B comes online:
while let Ok(msg) = drone_b.recv().await {
if msg.session_init {
// SDK automatically decrypts the init message and bootstraps
// a Double Ratchet session. Subsequent sends/recvs use the ratchet.
println!("Session from {}: {:?}", msg.sender_did, msg.plaintext);
}
}
Fleet CA verification of prekey bundles
A plain X3DH implementation has a TOFU (trust on first use) problem — how does the initiator know the prekey bundle wasn't injected by a rogue node on the gossip mesh? The Swarm SDK solves this by including the DeviceCertificate in every prekey bundle. Before using a bundle, the initiator verifies the cert chain:
pub fn verify_bundle(bundle: &PrekeyBundle, fleet_ca: &FleetCaCert) -> Result<()> {
// 1. Verify device certificate is signed by the fleet CA
fleet_ca.verify(&bundle.device_cert)?;
// 2. Verify the signed prekey is signed by the identity key in the cert
let ik = &bundle.device_cert.identity_key_pub;
verify_ed25519(ik, &bundle.signed_prekey.signature,
&bundle.signed_prekey.signing_message())?;
// 3. Verify the bundle's identity_key_pub matches the cert
if bundle.identity_key_pub != bundle.device_cert.identity_key_pub {
return Err(Error::BundleMismatch);
}
// 4. Check cert validity window
let now = unix_timestamp();
if now < bundle.device_cert.valid_from || now > bundle.device_cert.valid_until {
return Err(Error::ExpiredCert);
}
Ok(())
}
This anchors the X3DH handshake to the fleet CA hierarchy: a rogue node that injects a fabricated prekey bundle cannot pass the cert chain verification unless it has compromised the Fleet CA private key.
X3DH → Double Ratchet: the handoff
X3DH establishes the initial shared secret, but it's only used to encrypt one message: the X3dhInitMessage payload. All subsequent messages use the Double Ratchet initialized from that shared secret and the sender's ephemeral public key:
// Inside DoubleRatchetSession::from_x3dh():
//
// The X3DH shared secret seeds the root chain:
//     root_key = HKDF(x3dh_shared_secret, "root_key")
//
// The sender's ephemeral key becomes the initial DH ratchet state:
//     ratchet_pub = ephemeral_pub
//     ratchet_priv = None (recipient doesn't have the ephemeral private key)
//
// On the first reply, the recipient advances the DH ratchet:
//     new_ratchet_pair = generate()
//     (root_key, chain_key) = kdf_rk(root_key, DH(ratchet_priv, sender_ratchet_pub))
//
// This is the standard Signal X3DH-to-Double-Ratchet bootstrap.
// Once the DH ratchet advances, the X3DH shared secret is gone from memory.
After the first DH ratchet step, the initial X3DH shared secret is replaced by a new root key derived from a fresh DH exchange. Past messages encrypted with the X3DH secret cannot be decrypted even if the ratchet state is later compromised — the post-X3DH forward secrecy property.
Benchmarks: STM32H7 and Jetson Nano
X3DH session establishment is more expensive than individual message encryption. On the embedded platforms the Swarm SDK targets:
| Operation | STM32H7 (480 MHz) | Jetson Nano (Cortex-A57) |
|---|---|---|
| Classical X3DH init (sender) | 18ms p50 / 24ms p99 | 2.1ms p50 / 3.0ms p99 |
| Classical X3DH receive | 16ms p50 / 22ms p99 | 1.9ms p50 / 2.7ms p99 |
| PQ X3DH init (ML-KEM-768 + X25519) | 62ms p50 / 78ms p99 | 6.8ms p50 / 9.1ms p99 |
| PQ X3DH receive | 58ms p50 / 74ms p99 | 6.2ms p50 / 8.4ms p99 |
| OTP consumption (in-memory store) | 0.1ms p50 | 0.01ms p50 |
| Bundle verification (cert chain) | 8ms p50 / 11ms p99 | 0.9ms p50 / 1.3ms p99 |
PQ X3DH costs about 3.4× as much as classical X3DH on STM32H7 (62ms vs. 18ms p50), dominated by the ML-KEM-768 encapsulation operation. Session establishment happens once per drone pair per mission, so the 62ms cost is acceptable in the mission setup window. For comparison, each subsequent Double Ratchet message costs 1.8ms on STM32H7.
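The table's p50 figures can be turned into a rough setup budget. The sketch below is back-of-the-envelope arithmetic, assuming sessions are initiated one after another on a single STM32H7, one bundle verification per session, and no radio time; the function names and the 10-peer figure in the note that follows are illustrative, not measurements.

```rust
/// Per-session p50 timings in milliseconds, STM32H7 column of the table.
pub const CLASSICAL_INIT_MS: f64 = 18.0;
pub const PQ_INIT_MS: f64 = 62.0;
pub const BUNDLE_VERIFY_MS: f64 = 8.0;

/// Ratio of PQ to classical session initiation cost.
pub fn pq_overhead_ratio() -> f64 {
    PQ_INIT_MS / CLASSICAL_INIT_MS
}

/// Sequential CPU time to verify bundles and initiate sessions with `peers` drones.
/// Assumption: one bundle verification per session, no parallelism, no radio time.
pub fn setup_budget_ms(peers: u32, init_ms: f64) -> f64 {
    f64::from(peers) * (BUNDLE_VERIFY_MS + init_ms)
}
```

Under these assumptions, a drone initiating PQ sessions with 10 peers spends 10 × (8 + 62) = 700 ms of CPU time, against 260 ms in classical mode.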
Operational considerations
A few practical notes for integrators:
- Prekey bundle size: A bundle with 20 OTPs is approximately 1.8 KB. Over a LoRa link at 250 bytes/packet, that's 8 packets — acceptable for mission setup, not for in-flight real-time exchange. Prekey bundle synchronization should happen before takeoff on a higher-bandwidth link.
- OTP exhaustion: If a drone is popular (many peers try to establish sessions simultaneously), OTPs can exhaust before the drone can replenish. The fallback to signed-prekey-only mode is logged but silent to the application. Set initial_count to at least 2× the expected simultaneous sessions.
- Stale bundles: Prekey bundles cached from the gossip mesh may be hours old. The SDK checks the signed prekey's timestamp and rejects bundles where the SPK is older than 8 days (1 day beyond the normal 7-day rotation window).
- Revocation: If a drone's DeviceCertificate is revoked via RevocationMessage (see key management), peers will reject its prekey bundle during cert chain verification. The revocation propagates across the gossip mesh in approximately 30 seconds.
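The sizing rules in the notes above reduce to a few lines of arithmetic. The helpers below are a sketch with hypothetical names; they take 1.8 KB as 1800 bytes and treat a signed prekey timestamp in the future as not stale, which a production check would probably tighten with a clock-skew bound.

```rust
pub const SECONDS_PER_DAY: u64 = 86_400;
/// 7-day rotation window plus 1 day of grace, per the article.
pub const MAX_SPK_AGE_DAYS: u64 = 8;

/// Packets needed to move `bundle_bytes` at `payload_bytes` per packet (ceiling division).
pub fn packets_needed(bundle_bytes: usize, payload_bytes: usize) -> usize {
    (bundle_bytes + payload_bytes - 1) / payload_bytes
}

/// Suggested initial_count: twice the expected simultaneous session initiations.
pub fn suggested_initial_count(expected_simultaneous: usize) -> usize {
    2 * expected_simultaneous
}

/// True when the signed prekey timestamp is more than 8 days behind `now`.
/// Assumption: timestamps ahead of `now` are treated as fresh.
pub fn spk_is_stale(spk_timestamp: u64, now: u64) -> bool {
    now.saturating_sub(spk_timestamp) > MAX_SPK_AGE_DAYS * SECONDS_PER_DAY
}
```

For the bundle in the first note, packets_needed(1800, 250) gives the 8 LoRa packets quoted there.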
Related technical articles:
For what happens after X3DH completes — per-message keys via Double Ratchet, header encryption, and out-of-order handling: The Swarm SDK double ratchet: forward secrecy and post-compromise security in drone mesh networks →
For how device certificates, signed prekey rotation, and the Fleet CA hierarchy work: Swarm SDK key management: device provisioning, certificate rotation, and revocation →
For the ML-KEM-768 + X25519 hybrid key exchange design and CNSA 2.0 compliance mapping: Post-quantum mesh cryptography for drone swarms: the Swarm SDK design →
For how prekey bundles are propagated through the gossip mesh and the bounded fanout routing that delivers them across the swarm: Swarm SDK gossip mesh: bounded fanout routing, message deduplication, and network partition handling →
For the full OneTimePreKey lifecycle — batch generation on STM32H7, gossip distribution, optimistic consumption tracking, exhaustion fallback, and SignedPreKey rotation: Swarm SDK prekey bundle management: generating, distributing, and consuming OneTimePreKeys across a drone fleet →