feat(cluster): authenticate replica peers with PSK + BLAKE3 handshake #3425
Codecov Report
❌ Patch coverage is
@@ Coverage Diff @@
## master #3425 +/- ##
============================================
+ Coverage 74.29% 74.31% +0.02%
Complexity 943 943
============================================
Files 1245 1247 +2
Lines 121687 122232 +545
Branches 97959 98530 +571
============================================
+ Hits 90403 90841 +438
- Misses 28328 28389 +61
- Partials 2956 3002 +46
97f4d4d to 67dc98d
The server-ng replica port (tcp_replica) was plaintext: any TCP peer that learned the cluster id could inject VSR frames or register as an arbitrary replica. mTLS does not fit: the replica connection is a dup'd plaintext fd round-robined across shards, and rustls has no way to carry its session state across that.

Authenticate with a pre-shared cluster key (PSK) and a 3-message mutual BLAKE3 keyed-MAC handshake (ReplicaHello / ReplicaChallenge / ReplicaFinish) carried in the reserved GenericHeader bytes: no new typed header, and the stream stays a dupable plaintext fd. These are dedicated Command2 variants rather than a squat on Ping/Pong, which stay free for a future VSR liveness ping. ReplicaChallenge carries a status field, so a reject is the same frame with a nonzero status, and the finish is identified by command, not position.

The MAC proves PSK possession, which means cluster membership, not per-replica identity: the registry still trusts the announced id, so keep the port on a trusted boundary. No secret (PSK, encryption key, or JWT keys) is serialized to the on-disk config snapshot.

Configured under [cluster.auth] (enabled, shared_secret). Enabling it, and adding the consensus-plane commands, is a coordinated-restart change (cluster_id is derived from the cluster name, single-node included).
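To make the message flow concrete, here is a minimal sketch of a 3-message mutual keyed-MAC exchange of the kind described above. It is not the code from this PR: the `Frame` layout, the 16-byte nonces, the `u128` cluster id, the transcript labels, and the `Mac` function-pointer type are all assumptions made for illustration, and the announced replica id is left out because the MAC does not authenticate it. Nonces are passed in by the caller and would need to come from a CSPRNG.

```rust
/// Keyed MAC: 32-byte key + message -> 32-byte tag (hypothetical alias).
/// With the `blake3` crate this could be
/// `|k, m| *blake3::keyed_hash(k, m).as_bytes()`.
pub type Mac = fn(&[u8; 32], &[u8]) -> [u8; 32];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command { ReplicaHello, ReplicaChallenge, ReplicaFinish }

/// Assumed frame shape; the real fields live in reserved GenericHeader bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Frame {
    pub command: Command,
    pub status: u8, // only meaningful on ReplicaChallenge; 0 = accepted
    pub nonce: [u8; 16],
    pub tag: [u8; 32],
}

#[derive(Debug, PartialEq, Eq)]
pub enum AuthError { UnexpectedCommand, Rejected(u8), BadTag }

/// Label-prefixed transcript, so a challenge tag cannot double as a finish tag.
fn transcript(label: &[u8], cluster_id: u128, hello: &[u8; 16], chal: &[u8; 16]) -> Vec<u8> {
    let mut t = Vec::with_capacity(label.len() + 48);
    t.extend_from_slice(label);
    t.extend_from_slice(&cluster_id.to_le_bytes());
    t.extend_from_slice(hello);
    t.extend_from_slice(chal);
    t
}

/// Compare tags without an early exit on the first mismatching byte.
fn ct_eq(a: &[u8; 32], b: &[u8; 32]) -> bool {
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Message 1 (initiator): announce a fresh nonce; no tag yet.
pub fn hello(nonce: [u8; 16]) -> Frame {
    Frame { command: Command::ReplicaHello, status: 0, nonce, tag: [0; 32] }
}

/// Message 2 (responder): prove PSK possession over both nonces.
pub fn challenge(mac: Mac, psk: &[u8; 32], cluster_id: u128,
                 hello_frame: &Frame, nonce: [u8; 16]) -> Result<Frame, AuthError> {
    if hello_frame.command != Command::ReplicaHello {
        return Err(AuthError::UnexpectedCommand);
    }
    let tag = mac(psk, &transcript(b"challenge", cluster_id, &hello_frame.nonce, &nonce));
    Ok(Frame { command: Command::ReplicaChallenge, status: 0, nonce, tag })
}

/// Message 3 (initiator): check the responder's tag, then prove our own.
pub fn finish(mac: Mac, psk: &[u8; 32], cluster_id: u128,
              hello_frame: &Frame, chal: &Frame) -> Result<Frame, AuthError> {
    if chal.command != Command::ReplicaChallenge {
        return Err(AuthError::UnexpectedCommand);
    }
    if chal.status != 0 {
        return Err(AuthError::Rejected(chal.status)); // same frame, nonzero status
    }
    let expect = mac(psk, &transcript(b"challenge", cluster_id, &hello_frame.nonce, &chal.nonce));
    if !ct_eq(&expect, &chal.tag) {
        return Err(AuthError::BadTag);
    }
    let tag = mac(psk, &transcript(b"finish", cluster_id, &hello_frame.nonce, &chal.nonce));
    Ok(Frame { command: Command::ReplicaFinish, status: 0, nonce: [0; 16], tag })
}

/// Responder side: the finish is recognised by its command, not its position.
pub fn accept_finish(mac: Mac, psk: &[u8; 32], cluster_id: u128,
                     hello_frame: &Frame, chal: &Frame, fin: &Frame) -> Result<(), AuthError> {
    if fin.command != Command::ReplicaFinish {
        return Err(AuthError::UnexpectedCommand);
    }
    let expect = mac(psk, &transcript(b"finish", cluster_id, &hello_frame.nonce, &chal.nonce));
    if ct_eq(&expect, &fin.tag) { Ok(()) } else { Err(AuthError::BadTag) }
}
```

Both tags cover the cluster id and both nonces, so neither side can replay a tag from an earlier session, and the distinct labels keep the two directions from being interchangeable. Keeping the MAC behind a function pointer is only a convenience for this sketch; it lets the flow be exercised without pulling in a crate.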
/ready
OK, the idea is sound and looks good. There is one piece that I think is worth mentioning: currently we supply the replica_id through a CLI argument. That is fine for "seed" cluster values (e.g. a perfect-information first bootstrap), but it falls flat in cases where we would like dynamic cluster counts. One way to address this issue would be to keep the initial seed, but on the first bootstrap (when the metadata log is empty, so the cluster is fresh), write that seed configuration into the metadata log with op=0. Make the
The server-ng replica port (tcp_replica) was plaintext: any TCP peer
that learned the cluster id could inject VSR frames or register as an
arbitrary replica. mTLS does not fit - the replica conn is a dup'd
plaintext fd round-robined across shards, state rustls cannot carry.
Authenticate with a pre-shared cluster key and a 3-message mutual
BLAKE3 keyed-MAC handshake (ReplicaHello / ReplicaChallenge /
ReplicaFinish) over the reserved GenericHeader bytes: no new typed
header, the stream stays a dupable plaintext fd. These are dedicated
Command2 variants, not a Ping/Pong squat (left free for a future VSR
liveness ping). ReplicaChallenge carries a status field, so a reject is
the same frame with a nonzero status and the finish is identified by
command, not position. The MAC proves PSK possession (cluster
membership, not per-replica identity - the registry still trusts the
announced id, so keep the port on a trusted boundary). The PSK is never
serialized to disk.
Configured under [cluster.auth] (enabled, shared_secret). Enabling it,
and adding the consensus-plane commands, is a coordinated-restart change
(cluster_id is derived from the cluster name).
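
As an illustration of the "never serialized to disk" property, the sketch below models a `[cluster.auth]` section whose secret cannot leak through `Debug` output or a rendered snapshot. Only the section name and the `enabled` / `shared_secret` keys come from the description above; the `Secret` wrapper, `ClusterAuthConfig`, `snapshot`, and `validate` are hypothetical names, and the real config types may be structured quite differently (for example with serde's `skip_serializing` field attribute).

```rust
use std::fmt;

/// Hypothetical wrapper that holds the PSK but never prints it.
pub struct Secret(String);

impl Secret {
    pub fn new(value: impl Into<String>) -> Self {
        Secret(value.into())
    }

    pub fn is_empty(&self) -> bool {
        self.0.is_empty()
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Redact unconditionally so logs and panics cannot expose the key.
        f.write_str("<redacted>")
    }
}

/// Assumed in-memory shape of the [cluster.auth] section.
#[derive(Debug)]
pub struct ClusterAuthConfig {
    pub enabled: bool,
    pub shared_secret: Secret,
}

impl ClusterAuthConfig {
    /// Render the section for an on-disk snapshot; the secret is omitted entirely.
    pub fn snapshot(&self) -> String {
        format!("[cluster.auth]\nenabled = {}\n", self.enabled)
    }

    /// Enabling auth without a secret is treated as a configuration error here.
    pub fn validate(&self) -> Result<(), String> {
        if self.enabled && self.shared_secret.is_empty() {
            return Err("cluster.auth.enabled requires shared_secret".to_string());
        }
        Ok(())
    }
}
```

Omitting the field from the snapshot, instead of writing a placeholder, means a snapshot can never be mistaken for a complete config that carries a usable key.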