Fanout, Rooms, and Horizontal Realtime Scaling
A chat-room message reaches every live connection subscribed to that room.
One application event arrives. The server finds the group it belongs to, and that group expands into a list of connection IDs. Each connection then has a separate write path and a separate queue policy, and any one of them can fail without the others knowing. That expansion is what fanout means.
event: message.created
room: project:42
recipients: c18, c44, c91
writes: c18 <- payload, c44 <- payload, c91 <- payload
Fanout is the operation where one event becomes many outbound sends. A direct send targets one connection. Fanout targets a whole set, and that set comes from application state. The grouping key might be a room, a channel, a user, or any other label your API exposes. The transport underneath can be WebSocket, SSE, or long polling without changing the mechanical problem. You pick the recipients and enqueue the message for each one, working through the per-connection send path from Subchapter 03.
Servers tend to go wrong on that last part, because fanout has to pass through backpressure. Write directly to every socket from one global loop and you have already chosen a policy, even if nothing in the code says so. The policy you chose lets memory grow without a bound. It also leaves cleanup inconsistent and ties every client's experience to the slowest consumer in the loop.
All of this starts inside one process. A single Node process keeps two kinds of local state for realtime work. The first is a connection registry, which turns a connection ID into the live connection object behind it. The second is a subscription registry, which goes the other way and tracks the connection IDs that asked for a given stream of events.
Both registries are plain application state held in V8 memory. They change constantly, every time a connection opens, authenticates, joins or leaves a group, fails a heartbeat check, reconnects, or closes. Node does not track any of this for you across processes, so how the state gets stored is your decision.
Local Registries
Take the connection registry first. It maps an application connection ID to the live connection state for that client. That state usually carries the transport object, the authenticated identity from Subchapter 04, and the per-connection send queue from Subchapter 03, along with a few liveness fields and the cleanup hooks for that connection.
The subscription registry maps a routing key to the connections that should receive events for that key. Most servers store both directions at once, room to connection IDs for sending and connection ID to rooms for cleanup. Teams forget the reverse direction more often than the forward one, and that omission is the usual source of leaks.
const connections = new Map();
const roomMembers = new Map();
const connectionRooms = new Map();
Those three maps hold everything the local process knows. connections stores live connection state. roomMembers tells you which connection IDs should receive a given room event, and connectionRooms tells you which rooms to clean up when a connection closes.
A room is a group of realtime subscribers that your application defines, and its membership is the set of connection IDs currently tied to that group. Strings like project:42, conversation:abc, or document:9 all work as room IDs. Until your API contract gives one meaning, a room name is only a routing key.
function joinRoom(conn, roomId) {
  const ids = roomMembers.get(roomId) ?? new Set();
  ids.add(conn.id);
  roomMembers.set(roomId, ids);
}
That code records one direction only. You can send to the room with it, but once a connection disappears, close cleanup needs the reverse direction too, so a real server records both.
function rememberRoom(conn, roomId) {
  const rooms = connectionRooms.get(conn.id) ?? new Set();
  rooms.add(roomId);
  connectionRooms.set(conn.id, rooms);
}
With both maps in place, close cleanup has a small, bounded job. Read the rooms for that connection, remove the connection ID from each room set, then delete the connection entry. Without the reverse map, cleanup turns into a scan of every room looking for the connection, and that scan slows down as the room count grows.
The full join path updates both maps in a single operation. Treat it as one registry mutation, even though the data is stored across two Map objects.
function join(conn, roomId) {
  joinRoom(conn, roomId);
  rememberRoom(conn, roomId);
  conn.send({ type: 'joined', roomId });
}
That acknowledgement is optional, but its position after the two mutations carries meaning. Membership has to exist before the server confirms the join. If a send fails after membership already exists, the close path can still clean it up later. Reverse that order and you get a real bug. When the acknowledgement goes out before membership exists, a client can think it joined while the very next room event passes it by.
Leaving runs the same way, in reverse.
function leave(conn, roomId) {
  roomMembers.get(roomId)?.delete(conn.id);
  connectionRooms.get(conn.id)?.delete(roomId);
  conn.send({ type: 'left', roomId });
}
The leave path should tolerate membership that is already gone. A reconnect or a duplicate tab can send a leave message after cleanup has already run, and a heartbeat timeout can close the socket while a final client message is still being parsed. When the registry mutations are idempotent, none of those orderings cause trouble.
Empty sets deserve attention as well. Once a room drops to zero local members, delete it from roomMembers, and once a connection holds zero rooms during close, remove it from connectionRooms. If you leave the empty sets around, metrics and lazy backplane subscriptions get harder to read, because the registry keeps reporting rooms that have no live recipients.
function deleteIfEmpty(map, key) {
  if (map.get(key)?.size === 0) map.delete(key);
}
One invariant ties this together. Every connection ID in a room set should map to a real connection entry, and every room listed under a connection should contain that connection's ID. Servers break this for brief windows during close, reconnect, and heartbeat timeouts, and the cleanup code's job is to bring the state back into agreement quickly.
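The invariant is mechanical enough to check in a test or a debug endpoint. The following is a minimal sketch, assuming the three maps defined earlier; findRegistryViolations is a hypothetical helper name, and it reports mismatches as strings instead of throwing so a caller can count or log them.

```javascript
// Sketch only: findRegistryViolations is a hypothetical helper name.
const connections = new Map();      // connId -> connection state
const roomMembers = new Map();      // roomId -> Set of connIds
const connectionRooms = new Map();  // connId -> Set of roomIds

function findRegistryViolations() {
  const problems = [];
  // Forward direction: every room member must be a known connection
  // that also lists the room in its reverse entry.
  for (const [roomId, ids] of roomMembers) {
    for (const id of ids) {
      if (!connections.has(id)) problems.push(`${roomId} lists unknown ${id}`);
      if (!connectionRooms.get(id)?.has(roomId)) {
        problems.push(`${id} has no reverse entry for ${roomId}`);
      }
    }
  }
  // Reverse direction: every room under a connection must contain it.
  for (const [id, rooms] of connectionRooms) {
    for (const roomId of rooms) {
      if (!roomMembers.get(roomId)?.has(id)) {
        problems.push(`${roomId} is missing member ${id}`);
      }
    }
  }
  return problems;
}
```

An empty result means the two directions agree. A non-empty result outside the brief close and reconnect windows usually points at a cleanup path that skipped one of the maps.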
You can draw the local graph using only local registry data.
room project:42 -> c1, c2, c7
room project:99 -> c2
c1 -> project:42
c2 -> project:42, project:99
c7 -> project:42
Connection c2 belongs to two rooms, so closing c2 removes it from both room sets. A send to project:42 expands to c1, c2, and c7, and each of those connections then decides on its own whether it can accept the payload. The room graph chooses who is in the send, and each connection queue still controls whether the message actually goes through.
Join and leave should both be idempotent at the registry level. Adding a connection ID that is already in a Set still leaves one membership record, and removing an ID that was never there should finish without complaint. Repeated lifecycle events are normal here, coming from reconnects, duplicate tabs, token refreshes, and transport retries, so the registry API has to absorb the repeats.
Duplicate connections call for a policy of their own. Subchapter 04 introduced duplicate connection policy for presence, and fanout inherits the same idea. A single user might have three devices open on the same room. A chat or collaboration feature usually wants all three to receive the event, while a single-session dashboard might want only the newest connection. The registry should be able to express that choice directly.
One common local arrangement keeps user identity separate from connection identity.
const connectionsByUser = new Map();
function attachUser(userId, conn) {
  const ids = connectionsByUser.get(userId) ?? new Set();
  ids.add(conn.id);
  connectionsByUser.set(userId, ids);
}
A connection identity refers to one transport instance. A user identity refers to the authenticated subject sitting behind that connection, which may have several transports at once. Fanout code has to decide which of the two it is working with. A direct notification to a user goes through connectionsByUser, while a plain room event goes through roomMembers. A presence update can touch both.
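A user-level lookup is also where a duplicate connection policy can be expressed. The sketch below assumes two policy names, 'all' and 'newest', and an openedAt timestamp stored on each connection entry. None of those names come from the chapter; they are placeholders for whatever the application defines.

```javascript
// Sketch only: the policy names and the openedAt field are assumptions.
const connections = new Map();        // connId -> { id, openedAt, ... }
const connectionsByUser = new Map();  // userId -> Set of connIds

function connectionsForUser(userId, policy = 'all') {
  const conns = [];
  for (const id of connectionsByUser.get(userId) ?? []) {
    const conn = connections.get(id);
    if (conn) conns.push(conn); // skip stale IDs with no live entry
  }
  if (policy === 'all' || conns.length === 0) return conns;
  // 'newest': keep only the most recently opened connection.
  return [conns.reduce((a, b) => (b.openedAt > a.openedAt ? b : a))];
}
```

A chat feature would call this with 'all', while a single-session dashboard would pass 'newest' and send to the one connection it gets back.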
When a connection closes, the cleanup path has to update every map it appears in.
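A close path that walks every map might look like the sketch below. It assumes the four maps used so far and a userId field on the connection entry; closeConnection and removeFromSet are hypothetical names. The function is written to be safe to call twice, since close events and heartbeat timeouts can both reach it for the same connection.

```javascript
// Sketch only: closeConnection, removeFromSet, and userId are assumptions.
const connections = new Map();        // connId -> { id, userId, ... }
const roomMembers = new Map();        // roomId -> Set of connIds
const connectionRooms = new Map();    // connId -> Set of roomIds
const connectionsByUser = new Map();  // userId -> Set of connIds

function removeFromSet(map, key, value) {
  const set = map.get(key);
  if (!set) return;
  set.delete(value);
  if (set.size === 0) map.delete(key); // do not leave empty sets behind
}

function closeConnection(connId) {
  const conn = connections.get(connId);
  // The reverse map tells us exactly which rooms to touch.
  for (const roomId of connectionRooms.get(connId) ?? []) {
    removeFromSet(roomMembers, roomId, connId);
  }
  connectionRooms.delete(connId);
  if (conn?.userId !== undefined) {
    removeFromSet(connectionsByUser, conn.userId, connId);
  }
  connections.delete(connId);
}
```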
Subscriptions, Rooms, Channels, and Topics
A subscription records that a connection wants future events of some kind. In this server it is a routing record attached to a live connection. It can show up as a WebSocket message like { "type": "join", "room": "project:42" }, as an SSE endpoint path like /events/projects/42, or as a long-poll request carrying a cursor for one stream.
The subscription registry stores those routing records inside the process. The key can be a room ID, a channel name, a user ID, a tenant ID, or a compound of several of those. That key format is part of the realtime API contract, since clients and server handlers both have to build it the same way.
Rooms, channels, and topics overlap in many codebases, but each word carries a different emphasis.
Start with the room. A room points at a group of connections, and that group shifts as clients join and leave. Chat rooms, document rooms, and project rooms all behave this way. The question a room answers is which live connections are in the group right now.
A channel leans on a different idea. It groups updates that belong to the same category, with names like presence, notifications, or comments. A channel still expands to a set of connections in the end, but the name draws attention to the kind of event rather than to who happens to be a member.
Topic is the word brokers and backplanes reach for when they route delivery. In an application server a topic might read as tenant.7.orders or room.project:42. The part that counts is the routing contract. Publishers write events to a name, and whoever subscribed to that name receives them.
The implementation can store all three with the same data structure.
const subscriptions = new Map();
function subscribe(key, connId) {
  const ids = subscriptions.get(key) ?? new Set();
  ids.add(connId);
  subscriptions.set(key, ids);
}
The name you give the API still does work. Call a handler joinRoom() and future readers understand that membership follows the connection lifecycle. Call it subscribeTopic() and they read a hint that the routing key may cross a backplane later. A bare Map cannot say any of that on its own.
Keep group keys simple. Add tenant or organization scope when the API is multi-tenant, and normalize any user-controlled strings before they become registry keys. Don't derive room names from untrusted payloads in the middle of the send loop. Validation belongs at join time, before membership exists at all.
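Join-time validation can be a small pure function. The sketch below is one possible shape, not a prescribed format: the allowed character set, the 64-character cap, the tenant prefix layout, and the name roomKeyFor are all assumptions made for illustration.

```javascript
// Sketch only: the character set, length cap, and key layout are assumptions.
const ROOM_ID = /^[a-z0-9][a-z0-9:_-]{0,63}$/;

function roomKeyFor(tenantId, rawRoomId) {
  if (typeof rawRoomId !== 'string') return null;
  // Normalize before validating so equivalent inputs map to one key.
  const id = rawRoomId.normalize('NFKC').trim().toLowerCase();
  if (!ROOM_ID.test(id)) return null;
  return `tenant:${tenantId}:room:${id}`;
}
```

A join handler would reject the request when this returns null and only touch the registries when it gets a key back, so the send loop never sees an unvalidated string.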
Security has a small role here too, and Chapter 24 covers authorization in full. The mechanical rule is enough for now. A connection should enter a room only after the server has accepted that membership. After that, fanout code reads the membership it already established rather than re-deciding access on every write, unless the application has a clear reason to re-check.
Namespace the keys when different grouping concepts share one registry.
const roomKey = id => `room:${id}`;
const userKey = id => `user:${id}`;
const topicKey = id => `topic:${id}`;
Those prefixes stop 42 the user from colliding with 42 the room, and they make debug output easier to read. When a dump shows room:project:42, an operator can see the routing dimension without opening any application code.
The registry should hold stable identifiers. A request object carries headers, a body stream, framework references, and other state tied to the initial request lifecycle, and a realtime connection outlives that lifecycle. Store the connection identity, the auth context fields you actually need, and the connection ID. Keep the raw request object out of any long-lived membership map.
Room metadata follows the same rule. The room registry only routes to members, while the domain storage owns the actual room records. When the display name for project:42 changes, event creation should load the current domain data first, then send the normalized event through the fanout layer.
Subscriptions also carry a property that simplifies the send path. They can look transport-specific at the edge while staying transport-neutral inside the registry. A WebSocket client sends a join message, an SSE client connects to /projects/42/events, and a long-poll client includes ?room=project:42&cursor=.... After validation, all three become the same subscription key, so the send path runs on one model even when the accept path differs by transport.
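The three accept paths can be sketched as three small parsers that converge on one key. The message shape, the path layout, and the query parameter follow the examples in the paragraph above; the function names are hypothetical, and a real server would still run its join-time validation and authorization before using the key.

```javascript
// Sketch only: function names are hypothetical; inputs follow the
// examples given in the text.
const roomKey = id => `room:${id}`;

function keyFromWebSocketMessage(msg) {
  // e.g. { type: 'join', room: 'project:42' }
  return msg?.type === 'join' && typeof msg.room === 'string'
    ? roomKey(msg.room)
    : null;
}

function keyFromSsePath(pathname) {
  // e.g. /projects/42/events
  const match = /^\/projects\/(\d+)\/events$/.exec(pathname);
  return match ? roomKey(`project:${match[1]}`) : null;
}

function keyFromLongPoll(requestUrl) {
  // e.g. /poll?room=project:42&cursor=17
  // The base URL only exists so relative request URLs can be parsed.
  const url = new URL(requestUrl, 'http://localhost');
  const room = url.searchParams.get('room');
  return room ? roomKey(room) : null;
}
```

All three return room:project:42 for the example inputs, which is the point: everything after this step can ignore which transport the client used.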
The Local Fanout Loop
Local fanout covers the case where every recipient connection sits inside the current Node process. That one process owns the connection registry, the subscription registry, and all the per-connection queues. Nothing has to cross a network between choosing recipients and enqueueing the writes.
A minimal local loop is short.
function sendToRoom(roomId, event) {
  const ids = [...(roomMembers.get(roomId) ?? [])];
  const payload = JSON.stringify(event);
  for (const id of ids) {
    const conn = connections.get(id);
    if (conn?.open) conn.queue.push(payload);
  }
}
The code takes a snapshot of the recipients, serializes the payload one time, and pushes the same bytes into each live connection's queue. From there the send queue decides when those bytes actually reach the WebSocket, the SSE response, or whatever long-poll response is pending. One slow consumer stays contained inside its own queue policy and does not hold up the rest.
The snapshot gives the loop a stable input. Membership can change while fanout is running. A client leaves, another joins, a socket closes, all while the loop is partway through. JavaScript runs one callback at a time, yet the send loop can still call application code, queue microtasks, or touch objects that later cleanup mutates. The snapshot stops the current pass from seeing membership in a half-changed state.
The snapshot also draws a clear delivery cutoff. Whoever is present at snapshot time becomes a candidate for this event. A connection that joins after the snapshot picks up later events instead, and one that leaves mid-pass may still be skipped if the registries already show it as closed or gone from the room.
That last membership check is cheap to add.
function stillInRoom(roomId, connId) {
  return roomMembers.get(roomId)?.has(connId) === true;
}
A loop that snapshots the IDs and then rechecks membership right before enqueueing will skip a connection that left after the snapshot. Some systems prefer pure snapshot semantics, where everyone captured at the start gets the event regardless of what happens next. Either approach works. The trouble comes from leaving the rule implicit, so pick one and write a test that pins it down.
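Both rules fit in one loop when the choice is made explicit. In the sketch below, the recheck option is an assumed name: true means a member that left after the snapshot is skipped, false means pure snapshot semantics. The queue objects are assumed to expose push(), as in the earlier loop.

```javascript
// Sketch only: the recheck option name is an assumption.
const connections = new Map();  // connId -> { open, queue }
const roomMembers = new Map();  // roomId -> Set of connIds

function sendToRoom(roomId, event, { recheck = true } = {}) {
  const ids = [...(roomMembers.get(roomId) ?? [])]; // snapshot
  const payload = JSON.stringify(event);
  let enqueued = 0;
  for (const id of ids) {
    // With recheck on, membership is read again right before enqueue.
    if (recheck && !roomMembers.get(roomId)?.has(id)) continue;
    const conn = connections.get(id);
    if (!conn?.open) continue;
    conn.queue.push(payload);
    enqueued += 1;
  }
  return enqueued;
}
```

Returning the enqueue count gives a test something to pin the rule down with: remove a member partway through a pass and assert on the number that comes back under each setting.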
Serialization is a local decision too. If every recipient gets the same JSON, one JSON.stringify() call covers the whole room. Once recipients need personalized fields, that single call becomes one call per recipient. Personalization tends to come from authorization rules, per-user read state, localization, or capability negotiation with the connection. Beyond the extra CPU, the payloads now vary, which makes coalescing them harder.
Compression shifts the cost once more. WebSocket permessage-deflate came up in Subchapter 01, and Subchapter 03 already went through the CPU side of broadcast pressure. For fanout code the question is where the compression happens. Compress the payload a single time and reuse the bytes across every connection, and you risk colliding with the per-connection compression state a WebSocket library maintains. The protocol-safe alternative is to compress separately for each connection, which respects the rules but runs the CPU work once per recipient. Whichever you pick, count compression as real work inside the send path.
The loop should call a connection method rather than writing raw sockets inline.
function enqueue(conn, payload) {
  if (conn.queue.full()) return conn.close(1013);
  conn.queue.push(payload);
  conn.flush();
}
That small method puts the policy where it belongs. Queue limits, send deadlines, overflow behavior, coalescing, and close codes all live in the connection send path. The room loop chooses recipients and hands each message to a connection that decides for itself whether it can take more.
Fanout also runs into sockets that have already closed. Close handlers normally remove registry entries, but the timing is racy. A socket can close between the snapshot and the enqueue. A heartbeat timeout can mark a connection dead mid-pass. A send attempt can discover the underlying transport is already gone. When the loop hits a missing or closed connection, it should treat that as routine cleanup, not an error.
function dropClosed(roomId, connId) {
  roomMembers.get(roomId)?.delete(connId);
  connectionRooms.get(connId)?.delete(roomId);
}
Fanout amplification is what happens when one application event turns into many writes. Each recipient adds queued bytes, a serialization step, maybe a compression step, and a failure check. Send a 2 KB event to 10,000 live recipients and you have queued or written on the order of 20 MB, before protocol overhead, object overhead, and per-connection metadata get counted. Your event rate can stay flat while the write count and queued bytes climb fast behind it.
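The arithmetic is simple enough to keep next to a load test. The sketch below uses a hypothetical helper, estimateFanout, and counts payload bytes only, so it is a lower bound. The rate of 5 events per second is an illustrative assumption, not a figure from the text.

```javascript
// Sketch only: estimateFanout is a hypothetical helper. It counts payload
// bytes alone and ignores framing, object overhead, and metadata.
function estimateFanout({ payloadBytes, recipients, eventsPerSecond }) {
  return {
    bytesPerEvent: payloadBytes * recipients,
    writesPerSecond: recipients * eventsPerSecond,
  };
}

const estimate = estimateFanout({
  payloadBytes: 2 * 1024, // the 2 KB event from the text
  recipients: 10_000,
  eventsPerSecond: 5,     // assumed rate, for illustration
});
console.log(estimate.bytesPerEvent);   // 20480000, roughly 20 MB
console.log(estimate.writesPerSecond); // 50000
```

The inbound rate in that example is a modest 5 events per second, yet the process is asked to perform 50,000 enqueues per second.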
Backend teams often measure the inbound event rate and never look at the recipient count. The numbers a realtime server actually wants are per-room recipient counts, fanout duration, enqueue failures, queue overflows, and how many closed connections got removed during a send. Metrics systems are a Chapter 29 topic, but the counters themselves have to live right where these decisions get made.
The local algorithm should stay small.
read room membership
snapshot connection IDs
serialize shared payload
enqueue per live connection
apply per-connection pressure policy
clean stale membership
Every line in that short list can fail. Missing membership gives you zero recipients. Very large membership gives you amplification instead. Serialization can throw on an unexpected value. Enqueue can reject when the queue is full. Flush can run into a transport that already closed. Cleanup can even drop state that some other feature counted on. The loop reads as short because most of the difficulty sits in the state it operates on.
Measure room size before the loop and once more after cleanup. The before count is the recipient set you were asked to reach. Accepted tells you how many queues actually took the event, while dropped and closed together tell you whether send policy is holding the process up. Those four numbers account for most of what a local fanout pass does.
room=project:42 members=128
accepted=121 dropped=3 closed=4
bytes=18432 duration_ms=2.7
That single log entry carries far more than a bare "broadcast sent" log ever could. It puts amplification and cleanup in one place, and it gives a load test something concrete to assert against as rooms get bigger.
Skip per-recipient logs in the hot path unless you are chasing a bug in one specific room. A 10,000-member room writes 10,000 log lines for a single event, which is a second fanout aimed straight at your logging pipeline. Count outcomes inside the loop, and only reach for per-recipient detail when something forces you to.
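Counting outcomes inside the loop might look like the following sketch. It assumes each connection exposes a send() method that returns an outcome string such as 'accepted', 'dropped', or 'closed'. The function name and the field names in the returned object are hypothetical.

```javascript
// Sketch only: conn.send() returning an outcome string is an assumption
// about the connection layer; the stats field names are illustrative.
function fanoutWithStats(roomId, ids, connections, event) {
  const payload = JSON.stringify(event);
  const size = Buffer.byteLength(payload);
  const stats = {
    room: roomId, members: ids.length,
    accepted: 0, dropped: 0, closed: 0, bytes: 0, durationMs: 0,
  };
  const started = performance.now();
  for (const id of ids) {
    const conn = connections.get(id);
    // A missing entry is counted as closed: the membership was stale.
    const result = conn ? conn.send(payload) : 'closed';
    if (result === 'accepted') {
      stats.accepted += 1;
      stats.bytes += size;
    } else if (result === 'dropped') {
      stats.dropped += 1;
    } else {
      stats.closed += 1;
    }
  }
  stats.durationMs = performance.now() - started;
  return stats;
}
```

The caller emits one summary record per event from the returned object, which keeps the log volume flat no matter how large the room is.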
The Loop Under Load
The difficult part of fanout is the interaction between registry state, JavaScript execution, and transport queues while a send is still going. From the top of the stack a room send reads as synchronous. You grab the IDs, run the loop, call enqueue. The actual delivery reaches well past that frame. It plays out across the current callback, later microtasks, stream flushes, kernel send buffers, the WebSocket frame writer, the SSE response write, and the speed at which each client reads.
Follow one event from the start. A handler receives message.created for room:project:42. That event could have arrived from an HTTP route, a WebSocket message handler, a database change listener, or a backplane adapter. Whatever the source, the local fanout function should get a normalized event object. Normalized means the loop can read the event type, the room ID, the payload, the ordering fields, and any metadata it needs as plain fields on that object.
The first lookup is plain registry work. roomMembers.get(roomId) returns a Set of connection IDs when the room exists in this process. Storing IDs in those sets instead of transport objects keeps a stale room entry from pinning a closed connection in memory. The connection registry holds transport state, and room membership holds only routing state. They stay separate, and that is the reason cleanup is straightforward to reason about.
The snapshot copies the current set into an array. That allocation is not free, but it gives you a stable recipient list and stops iterator behavior from shifting if the set mutates mid-loop. In a small room the cost disappears into the noise. In a very large room the snapshot allocation becomes a real line in the fanout budget, and rooms that big may call for chunked fanout, shard ownership, or an adapter that streams recipient IDs out of another structure. Those larger designs show up later in this chapter and in the architecture chapters. The local model still opens with a snapshot, since correctness has to come before any of that optimization.
Payload preparation comes next. Build the shared payload once per event. A workable structure is a small envelope carrying the event type, a sequence number or cursor when the transport supports recovery, the room ID or channel name, and the domain payload. That envelope is part of the realtime API contract, and clients rely on its fields being stable from one event to the next.
Per-recipient customization happens after that, not before. A server might add readByMe, hide certain fields from some members, or use a different event name for older clients. None of that belongs in the shared payload path. The usual approach has two stages. Build the shared domain event once, then let each connection wrapper attach its own connection-specific metadata just before enqueue. The room loop never has to grow into a second authorization system that way.
From here the loop works one connection at a time, and connections.get(id) can come back empty. An empty result means the membership registry is holding stale state, so drop that entry. A low rate of these is nothing to worry about, since abrupt disconnects and reconnect races leave stale entries behind as a matter of course. A high rate is the one to investigate, because it usually traces back to cleanup bugs or process lifecycle problems.
An open connection can still turn the payload down. The queue might be full. The send deadline might have passed. The WebSocket bufferedAmount might be over its configured budget. The SSE response might have returned false from write() and be waiting on drain. For any of these, the connection layer should hand the loop back a small result, one of accepted, dropped, coalesced, closed, or stale. The room-level code counts those outcomes, and the transport detail stays down in the connection layer.
const result = conn.send(payload);
if (result === 'closed') dropClosed(roomId, conn.id);
Keep the contract small. A room send only needs to know two things, whether the connection stayed in the room and whether the event made it into that connection's policy. Kernel send-buffer details stay down in the connection layer.
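On the connection side, the contract can be as small as a method that returns one of those outcome strings. The class below is a hedged sketch with hypothetical names. It covers only 'accepted', 'dropped', and 'closed'; coalescing and stale detection are left out to keep it short.

```javascript
// Sketch only: class and field names are hypothetical, and only three of
// the outcomes named in the text are modeled.
class SketchConnection {
  constructor(id, maxBytes) {
    this.id = id;
    this.maxBytes = maxBytes;
    this.bytes = 0;
    this.items = [];
    this.open = true;
  }

  send(payload) {
    if (!this.open) return 'closed';
    const size = Buffer.byteLength(payload);
    // Check the budget before pushing so a rejection allocates nothing.
    if (this.bytes + size > this.maxBytes) return 'dropped';
    this.items.push(payload);
    this.bytes += size;
    return 'accepted';
  }
}
```

The room loop only ever sees the returned string, so transport details such as buffered amounts or drain state can change inside the class without touching fanout code.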
Ordering happens inside each connection queue. A fanout event is appended after whatever was already queued for that connection, so two room events sent in order from one process reach each recipient's queue in that same order. Cross-process fanout makes this messier, since remote events can come off a backplane on their own schedule. Delivery semantics across processes are a Chapter 21 subject. The guarantee here is smaller and concrete. One process keeps enqueue order per connection for the events it sends in sequence.
Fairness starts to show once rooms get large. A synchronous loop over 100,000 recipients can hold the current event-loop turn long enough to delay unrelated timers, pings, HTTP handlers, and close cleanup. I/O stays non-blocking, but the CPU work and the JavaScript allocation sit on the thread for the whole loop. The fix in practice is to chunk the work.
A synchronous loop over a very large room runs to completion before the event loop gets another turn. Heartbeat pings and timers all queue up behind it for the duration. If it stalls long enough, your own liveness checks fire late, the server decides healthy clients are dead, and it disconnects them in the middle of the broadcast. That kicks off a reconnect storm while the process is already saturated. Cap the work per turn with chunked fanout so liveness checks and I/O keep running.
function scheduleFanout(ids, i = 0) {
  if (i >= ids.length) return;
  setImmediate(() => {
    fanoutChunk(ids.slice(i, i + 500));
    scheduleFanout(ids, i + 500);
  });
}
That example splits the recipient array into chunks and schedules one chunk per turn. A setImmediate() queued from inside the running callback runs on a later event-loop iteration, which lets pending timers, I/O callbacks, close callbacks, and other room work run in between the fanout batches. queueMicrotask() does not behave the same way. Node drains queued microtasks before timers and I/O in the same turn, so it breaks the function body into callbacks while the full fanout pass can still hold that one turn. Which one you choose depends on your latency goals and on how much other work has to run between batches. The thing you are really enforcing is a budget, a cap on how much fanout runs per turn.
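The difference between the two schedulers is easy to observe. The sketch below runs the same two-chunk job under each one, with an unrelated setImmediate() callback standing in for other work. runChunks, compare, and the log labels are hypothetical names, and the 20 ms timer exists only to collect the results.

```javascript
// Sketch only: runChunks, compare, and the labels are hypothetical.
function runChunks(schedule, count, log, i = 0) {
  if (i >= count) return;
  schedule(() => {
    log.push(`chunk${i}`);
    runChunks(schedule, count, log, i + 1); // queue the next chunk
  });
}

function compare(done) {
  const immediateLog = [];
  const microtaskLog = [];

  // Chunks scheduled with setImmediate: each follow-up chunk is queued
  // from inside an immediate callback, so it waits for the next iteration.
  runChunks(setImmediate, 2, immediateLog);
  setImmediate(() => immediateLog.push('other'));

  // Chunks scheduled as microtasks: the whole chain drains before the
  // event loop reaches any immediate callback.
  runChunks(queueMicrotask, 2, microtaskLog);
  setImmediate(() => microtaskLog.push('other'));

  setTimeout(() => done(immediateLog, microtaskLog), 20);
}

compare((immediateLog, microtaskLog) => {
  console.log(immediateLog.join(' ')); // chunk0 other chunk1
  console.log(microtaskLog.join(' ')); // chunk0 chunk1 other
});
```

With setImmediate() the unrelated callback runs between the chunks. With queueMicrotask() it waits until every chunk has finished, which is the starvation the chunking was meant to avoid.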
Chunking shifts delivery timing, because some recipients now get the event a little before others. Most features people interact with directly can live with that spread. A handful of coordination protocols cannot, so the API contract has to state which group it belongs to. A collaborative cursor update can soak up a few milliseconds of spread and nobody notices. A distributed lock notification is a different problem for a later chapter.
Payload bytes are not the only source of memory pressure. A single queued message can allocate an envelope object, a string, a Buffer, a closure, a timer, and some tracking metadata, and once you multiply that by room size, slow consumers can hold those objects long enough to age them into the old generation. Heap behavior is covered in the V8 chapters. The fanout code's part is narrower, which is to keep its queue bounds honest. A per-connection array with no limit stays unlimited no matter how tidy the room registry around it looks.
Failure handling stays local and small at this level. When one connection rejects the event because its queue is full, the loop moves on to the next recipient and leaves that one alone. A serialization failure before the loop starts is different, since nobody receives the malformed event and the source handler gets the error back instead. Once a backplane adapter is involved, it can also deliver the same event back to the origin process, and loop-prevention metadata is what blocks that duplicate local delivery. During shutdown, room sends stop taking new work while the existing connection queues drain or close under the shutdown policy from the deployment chapters.
A broker has no place in the single-process case. That case already holds enough state to design carefully on its own.
Rooms also contend for the same process turn. A busy room can take up most of a turn when each event expands to many recipients, which leaves smaller rooms and direct notifications waiting behind it. One global queue shared by every connection ties unrelated clients together and hides per-connection pressure. A better layout keeps the per-connection queues and adds a scheduler above fanout that caps how much room work runs in a single turn.
room event queue
  -> fanout batch for project:42
  -> fanout batch for project:99
  -> direct notification batch
That scheduler is still a local thing. It lives in one process and stops any single room from taking over JavaScript execution. Choose a batch size that is small enough to keep timers and I/O moving, and not so small that scheduling overhead starts to dominate.
Batching also changes how errors surface. A serialization failure at normalization time takes out every batch before any enqueue happens. When one batch lands on many closed connections, the cleanup for them spreads across several turns. A crash partway through the batches leaves delivery partial. A lossy realtime stream can usually accept a partial result like that. Durable delivery cannot, and it needs the design the messaging chapters get into.
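The per-turn scheduler described above can be sketched as a round-robin over pending fanout jobs. Everything named here is an assumption for illustration: createFanoutScheduler, the deliver callback, budgetPerTurn (the cap on recipients handled in one turn), and slice (the most one job may take before the next job gets a go).

```javascript
// Sketch only: all names and default numbers here are assumptions.
function createFanoutScheduler(deliver, { budgetPerTurn = 500, slice = 100 } = {}) {
  const jobs = [];       // each job: { ids, payload, next }
  let scheduled = false;

  function schedule() {
    if (scheduled || jobs.length === 0) return;
    scheduled = true;
    setImmediate(runTurn);
  }

  function runTurn() {
    scheduled = false;
    let budget = budgetPerTurn;
    while (budget > 0 && jobs.length > 0) {
      const job = jobs.shift();
      const take = Math.min(slice, budget);
      const end = Math.min(job.next + take, job.ids.length);
      for (let i = job.next; i < end; i++) deliver(job.ids[i], job.payload);
      budget -= end - job.next;
      job.next = end;
      // Unfinished jobs go to the back so other rooms get their slice.
      if (job.next < job.ids.length) jobs.push(job);
    }
    schedule(); // continue on a later turn if work remains
  }

  return function enqueueFanout(ids, payload) {
    jobs.push({ ids: [...ids], payload, next: 0 }); // snapshot the IDs
    schedule();
  };
}
```

The deliver callback would normally be the per-connection enqueue, so queue limits still apply to every recipient. Because a crash between turns leaves the remaining slices undelivered, this shape suits lossy streams and not durable delivery.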
The backpressure budget should run before the event ever enters a connection queue. Accept the event first and check the queue size afterward, and the memory is already spent. Check size, age, and byte budget ahead of the push, and the queue can reject cleanly with nothing allocated. The fanout loop counts the rejection either way and carries on.
function payloadBytes(payload) {
  return typeof payload === 'string'
    ? Buffer.byteLength(payload)
    : payload.byteLength;
}
function canAccept(conn, payload) {
  return conn.queue.bytes + payloadBytes(payload) <= conn.queue.maxBytes;
}
That check stays local to the connection on purpose. A string payload needs encoded-byte accounting, because JSON strings report UTF-16 code units through .length while the queue budget is counted in bytes. Buffer.byteLength() returns the encoded size for a string, and Buffers and typed arrays already expose .byteLength. Consider a room with 1,000 recipients where 990 queues are healthy and 10 are slow. Penalizing all 1,000 because 10 of them lag turns per-connection pressure into a room-level failure. Some applications really do want that once too many recipients fall behind. When that is the goal, make it a separate threshold driven by aggregate outcomes, and write it down as room policy.
A JavaScript string's .length counts UTF-16 code units rather than bytes. A multi-byte character, such as an emoji or non-Latin text, makes the real encoded size larger than .length reports. A byte budget built from .length under-counts as a result, and the queue runs past its limit. Measure string payloads with Buffer.byteLength(). Buffers and typed arrays already expose the correct .byteLength.
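A few concrete strings make the gap visible. The example below uses escape sequences so the characters are unambiguous, and relies on Buffer.byteLength() defaulting to UTF-8.

```javascript
const ascii = 'hello';
const accented = 'h\u00e9llo';  // contains U+00E9, 2 bytes in UTF-8
const emoji = '\u{1F600}';      // one code point, 2 UTF-16 units, 4 bytes

console.log(ascii.length, Buffer.byteLength(ascii));       // 5 5
console.log(accented.length, Buffer.byteLength(accented)); // 5 6
console.log(emoji.length, Buffer.byteLength(emoji));       // 2 4
```

A queue budget computed from .length would count the emoji as 2 when it occupies 4 bytes on the wire, and the error grows with every non-ASCII character in the payload.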
Process Boundaries Split the Room
Cross-process fanout begins as soon as one event's recipients are spread over more than one Node process. Each process owns only the sockets and registries that are local to it. A room of 300 members might be split across four workers, or three hosts, or several regions, and the process that receives the event sees only the members attached to itself.
The failure shows up right away.
process A: room project:42 -> c1, c2
process B: room project:42 -> c8, c9
event enters process A
local fanout reaches c1, c2
c8 and c9 receive nothing
Nothing here is malfunctioning. The local data is correct for process A. The recipients it misses are held in another registry over on process B.
Each process sees only the sockets it accepted. roomMembers in process A holds c1 and c2 because those connections upgraded on A. It has no entry for c8 and c9, which sit in process B's heap. For the connections this process owns, the send loop is doing everything it should. Once a room spans more than one process, local-only fanout reaches only a subset, and nothing in the local code path tells you about the gap.
So the real problem is reaching recipient connections that several processes own at once. The system needs a way to move the event from the process that first sees it to every other process that might hold subscribers. The component that does this is the realtime backplane.
A realtime backplane is the distributed delivery piece that realtime servers use to pass fanout events between separate processes. It can sit on Redis or Valkey pub/sub, a message broker, a database notification feature, a platform event bus, or a service you write yourself. Its internal data structures and delivery semantics come up in later chapters. What belongs here is the contract at the edge of it. Local realtime code publishes room events to the backplane and listens for backplane events arriving from other processes.
Once a backplane is in place, every process has two sources of events.
local application handler -> local fanout + publish
backplane message -> local fanout
A local handler publishes outward, since other processes may be holding subscribers for this room. A backplane handler does the opposite and delivers inward, since this process may hold subscribers too. Once the event is normalized, both of them call the same local fanout function. That shared path keeps connection queues, slow-consumer policy, and stale cleanup behaving the same no matter where the event came from.
Cross-process state should stay modest at this layer. A process rarely needs every remote connection ID. It needs to know which remote events to take in, or it can receive all room events and filter them locally. Whichever subscription strategy you pick changes the cost, but the local registry still owns the live sockets in both cases. Avoid building a fake global connection registry in every process unless the architecture chapter has chosen that tradeoff on purpose.
Global registries go stale, they add write amplification, and they open up split-brain cleanup paths. Presence does sometimes need distributed state, but presence is derived and driven by timeouts, the way Subchapter 04 described it. For most cases a local fanout server runs on local connection state plus a backplane event stream, and nothing else.
A two-process trace shows that combination at work.
1. c1 joins project:42 on process A
2. c8 joins project:42 on process B
3. c1 sends message m7 to process A
4. process A sends m7 to local project:42 members
5. process A publishes m7 to the backplane
6. process B receives m7 and sends to local members
Step 4 reaches c1 and any other local member on process A. Step 6 does the same for c8 and the local members on process B. The local fanout function is identical in both, and the only thing that changes is where the event entered.
Add a process C that has zero members in project:42. When C subscribes to every room event, it receives this one, checks local membership, and sends to nobody. That spends CPU and inbound bandwidth for no delivery, though the result is still correct. The other option is for C to subscribe lazily, only to rooms where it actually has local members, so it never sees this topic at all. That removes the wasted work and trades it for subscription churn.
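The trace, including the idle process C, can be replayed in a single file. This is a sketch under heavy assumptions: createHub is an in-memory stand-in for a real backplane, createProcess models one Node process, and "delivery" is just a record in an array instead of a per-connection queue. None of these names come from a real library.

```javascript
// In-memory stand-in for an eager backplane: every subscriber sees every message.
function createHub() {
  const handlers = [];
  return {
    subscribe(handler) { handlers.push(handler); },
    publish(msg) { for (const h of handlers) h(msg); },
  };
}

function createProcess(nodeId, hub) {
  const roomMembers = new Map(); // roomId -> Set of local connection IDs
  const delivered = [];          // "connId:eventId" records, for inspection
  let remoteSeen = 0;            // backplane messages taken in from other processes

  // Single local fanout entry point for local and remote events alike.
  function sendLocal(roomId, event) {
    for (const connId of roomMembers.get(roomId) ?? []) {
      delivered.push(`${connId}:${event.id}`);
    }
  }

  hub.subscribe((msg) => {
    if (msg.origin === nodeId) return; // already delivered locally
    remoteSeen += 1;
    sendLocal(msg.roomId, msg.event);
  });

  return {
    delivered,
    get remoteSeen() { return remoteSeen; },
    join(roomId, connId) {
      if (!roomMembers.has(roomId)) roomMembers.set(roomId, new Set());
      roomMembers.get(roomId).add(connId);
    },
    emitRoom(roomId, event) {
      sendLocal(roomId, event);                       // step 4
      hub.publish({ origin: nodeId, roomId, event }); // step 5
    },
  };
}

const hub = createHub();
const A = createProcess('A', hub);
const B = createProcess('B', hub);
const C = createProcess('C', hub); // eager subscriber with no members

A.join('project:42', 'c1');
B.join('project:42', 'c8');
A.emitRoom('project:42', { id: 'm7' });

console.log(A.delivered);               // [ 'c1:m7' ]
console.log(B.delivered);               // [ 'c8:m7' ]
console.log(C.delivered, C.remoteSeen); // [] 1
```

Process C takes in one backplane message and delivers nothing, which is the wasted work that lazy subscription would remove.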
Cross-process fanout amplifies along two axes at once. One is recipients per process, the local factor from before. The other is how many processes receive the event at all. Eager backplane mode can put one room event onto every process even when only two hold members. Lazy mode keeps that process count closer to where the members actually are. Sharded mode can cap it by shard ownership.
total local writes = sum(local recipients per process)
backplane deliveries = number of subscribed processes
Both numbers carry weight. Ten recipients spread across ten processes is cheap to deliver locally but generates traffic on the backplane. Flip that around, with ten thousand recipients all on one process, and the backplane stays quiet while the local cost is high. Capacity planning is a later topic, but the fanout layer should already expose both counts.
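Both counts can be computed from a per-process member tally. The sketch below is illustrative only: fanoutCost is a hypothetical helper, and it simplifies lazy mode to "a process is subscribed exactly when it has at least one local member".

```javascript
// localMembers maps a process name to its local recipient count for one room.
function fanoutCost(localMembers, mode) {
  const counts = Object.values(localMembers);
  const totalLocalWrites = counts.reduce((sum, n) => sum + n, 0);
  // Eager: every process is subscribed. Lazy: only processes with members.
  const backplaneDeliveries =
    mode === 'eager' ? counts.length : counts.filter((n) => n > 0).length;
  return { totalLocalWrites, backplaneDeliveries };
}

const spread = { p1: 1, p2: 1, p3: 0, p4: 0 };
console.log(fanoutCost(spread, 'eager')); // { totalLocalWrites: 2, backplaneDeliveries: 4 }
console.log(fanoutCost(spread, 'lazy'));  // { totalLocalWrites: 2, backplaneDeliveries: 2 }
```

The local write count is the same in both modes. Only the number of processes that have to take in the event changes.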
The cross-process event envelope needs origin metadata.
import { randomUUID } from 'node:crypto';
// One ID per process lifetime, stamped on every event this process publishes.
const nodeId = randomUUID();
adapter.onMessage(msg => {
  // Skip events this process published itself.
  if (msg.origin === nodeId) return;
  sendLocal(msg.roomId, msg.payload);
});
The origin field keeps a process from picking up its own published event and sending it a second time. A production envelope usually carries more than that, things like an event ID, a tenant ID, a room ID, a timestamp, and a sequence or trace field. Keep this duplication control apart from the business payload. Adapter metadata should stay inside the server unless the API contract sets out to expose it.
Duplicates can still slip through. A backplane reconnect might replay a message. Two local handlers might publish the same domain event. A process might crash after local fanout but before it publishes outward, or publish outward and then crash before its own local delivery, and a retry of the source operation then repeats whichever half already ran. Delivery semantics, retries, and idempotency at the messaging layer are Chapter 21. Realtime API code should still attach an application event ID anywhere clients can de-duplicate safely.
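One way a receiver can use an application event ID is a bounded window of recently seen IDs. The following is a sketch, not a delivery guarantee: createDedupe and its maxEntries parameter are hypothetical, and an event replayed after its ID has been evicted would pass through again.

```javascript
// Bounded de-duplication window keyed by application event ID.
function createDedupe(maxEntries) {
  const seen = new Set(); // insertion-ordered, so the first value is the oldest
  return function firstTime(eventId) {
    if (seen.has(eventId)) return false;
    seen.add(eventId);
    if (seen.size > maxEntries) {
      // Evict the oldest ID so memory stays bounded.
      seen.delete(seen.values().next().value);
    }
    return true;
  };
}

const firstTime = createDedupe(1000);
console.log(firstTime('m7')); // true
console.log(firstTime('m7')); // false
console.log(firstTime('m8')); // true
```

The same function could sit in a client or in front of local fanout. Where it belongs depends on which duplicates the application can tolerate.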
The local rule that helps most is a single entry point. Every event reaches local fanout through one function regardless of its origin, and anything source-specific stays in the adapters.
Remote events should go through the same validation as local normalized events, in a slightly narrower form. The adapter rejects malformed envelopes before they reach room membership at all, checking the tenant fields, the room ID format, the event type, and the payload size. For messages arriving from another component, that validation is basic hygiene.
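A validation step of that kind might look like the sketch below. Every specific here is an assumption made for illustration: the field names, the allowed event types, the room ID pattern, and the 16 KiB payload limit.

```javascript
// All names, patterns, and limits below are illustrative assumptions.
const ALLOWED_TYPES = new Set(['message.created', 'presence.changed']);
const ROOM_ID = /^[a-z]+:[A-Za-z0-9_-]{1,64}$/;
const MAX_PAYLOAD_BYTES = 16 * 1024;

// Returns null for a valid envelope, or a short reason string otherwise.
function validateEnvelope(msg) {
  if (msg === null || typeof msg !== 'object') return 'not an object';
  if (typeof msg.origin !== 'string' || msg.origin === '') return 'missing origin';
  if (typeof msg.tenantId !== 'string' || msg.tenantId === '') return 'missing tenantId';
  if (typeof msg.roomId !== 'string' || !ROOM_ID.test(msg.roomId)) return 'bad roomId';
  if (!ALLOWED_TYPES.has(msg.type)) return 'unknown type';
  if (typeof msg.payload !== 'string') return 'payload must be a string';
  if (Buffer.byteLength(msg.payload) > MAX_PAYLOAD_BYTES) return 'payload too large';
  return null;
}

const good = {
  origin: 'node-a',
  tenantId: 't7',
  roomId: 'project:42',
  type: 'message.created',
  payload: '{"text":"hi"}',
};
console.log(validateEnvelope(good));                              // null
console.log(validateEnvelope({ ...good, roomId: '../etc' }));     // bad roomId
```

Returning a reason string keeps the adapter able to log or count rejections without throwing inside the backplane callback.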
Remote events need a defined behavior during shutdown as well. While the process drains, it may hold existing sockets open for a short period while turning away new connections, and backplane events for those existing rooms still need local delivery in that window. Once it starts closing realtime connections, the adapter can stop taking room messages. Make that transition explicit. A process that unsubscribes too early will drop events for connections it still holds. Wait too long to unsubscribe and it burns work on connections that have already closed.
The Adapter Layer
The adapter layer sits between transport-local realtime code and the distributed fanout mechanism. On the local side are the WebSocket, SSE, and long-poll responses, the connection registries, room membership, and the per-connection queues. On the far side is the backplane client.
The layer keeps the realtime server from spreading broker-specific code through message handlers.
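const adapter = {
  // Publish a normalized room event to the other processes.
  publishRoom(roomId, event) {},
  // Register the handler for room events arriving from other processes.
  onRoomMessage(handler) {},
  // Stop receiving and release backplane connections during shutdown.
  close() {},
};
The interface stays narrow. It publishes a room event, receives a room event, and closes itself during shutdown. A production adapter grows more over time, with error events, reconnect state, and health reporting, but the application send path should still face this small a surface.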
With the adapter in place, the local send path is two steps.
async function emitRoom(roomId, event) {
  // Local-first: local subscribers are served before the backplane hop.
  sendLocal(roomId, event);
  await adapter.publishRoom(roomId, event);
}
That ordering is a policy choice. Local-first delivers to nearby clients with low latency and then pushes the event out to the other processes. Going publish-first instead can make the cross-process order match the backplane more closely. Some systems publish once and let every process, the origin included, deliver only from backplane messages. That gives one delivery route for everyone, and local clients pay an extra hop for it.
None of these orderings wins by default.
For most application realtime features, local-first is the easiest to follow and the fastest. The process accepts a message, sends to its local subscribers, publishes to the backplane, and the other processes deliver once they receive it. When cross-process ordering counts for more than local latency, move to backplane-first or backplane-only and write tests that lock in that contract.
Loop prevention belongs in the adapter envelope. The adapter marks outbound messages with the origin process ID and skips inbound messages that match it. Even when the backplane itself provides sender exclusion, keep the application-level origin field if duplicate sends are expensive. Infrastructure behavior changes over time, and an event envelope is easier to inspect during debugging than broker internals.
A process can subscribe to the backplane eagerly or lazily. Eager subscription means every process listens to all room events and filters them locally. Lazy subscription means a process listens to a room topic only while it actually has local members in that room.
Eager subscription is the simpler one to operate, since every process receives the same stream. What it wastes is inbound events on processes that hold zero local recipients.
Lazy subscription removes those wasted events and brings churn in exchange. The process joins a backplane topic when the first local member enters the room and leaves when the last one exits, which means handling the races that happen while membership is changing. It can also open small timing gaps. If a member joins just after the adapter has unsubscribed from the topic, a remote event published before the new subscription takes effect never reaches this process.
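The first-member and last-member rule reduces to a per-room count. This sketch assumes joins and leaves are already idempotent per connection before they reach it, and it treats subscribe and unsubscribe as synchronous callbacks. Both names are placeholders for whatever the backplane client exposes.

```javascript
// Hypothetical lazy-subscription tracker.
function createLazySubscriptions({ subscribe, unsubscribe }) {
  const localCounts = new Map(); // roomId -> number of local members

  return {
    memberJoined(roomId) {
      const next = (localCounts.get(roomId) ?? 0) + 1;
      localCounts.set(roomId, next);
      if (next === 1) subscribe(roomId); // first local member
    },
    memberLeft(roomId) {
      const current = localCounts.get(roomId) ?? 0;
      if (current === 0) return; // leave without a join: ignore
      if (current === 1) {
        localCounts.delete(roomId);
        unsubscribe(roomId); // last local member
      } else {
        localCounts.set(roomId, current - 1);
      }
    },
  };
}

const calls = [];
const subs = createLazySubscriptions({
  subscribe: (roomId) => calls.push(`sub ${roomId}`),
  unsubscribe: (roomId) => calls.push(`unsub ${roomId}`),
});

subs.memberJoined('project:42');
subs.memberJoined('project:42');
subs.memberLeft('project:42');
subs.memberLeft('project:42');
console.log(calls); // [ 'sub project:42', 'unsub project:42' ]
```

With a real client the subscribe call is asynchronous, and the interval before it completes is where the timing gap lives. The counting logic stays the same; the hard part is deciding what a join does while a subscribe or unsubscribe is still in flight.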
Many Node systems begin eager at low scale and switch to lazy or sharded later on. That path is fine as long as the adapter layer is there from day one. Application code should see the same publishRoom() result whether the adapter turns it into one broker topic, many topics, a cluster IPC message, or a platform event bus.
Error handling at the adapter layer needs a defined behavior. Say local fanout succeeds but the backplane publish fails. Local clients saw the event and remote clients did not. In another case local fanout waits on the publish and the publish is slow, so local clients are now stuck behind the backplane. And if the adapter disconnects outright, a new room send has to do something deliberate, whether that is failing, degrading to local-only, or closing the affected realtime connections.
Redis and Valkey pub/sub deliver at-most-once. A message reaches whoever is subscribed at that instant and is then dropped, with no replay. A process that is restarting, or that subscribed a moment too late, gets nothing. If publishRoom() fails or the backplane loses the frame, local clients saw the event and remote clients never will. Classify each event by how much loss it can tolerate, and route anything that cannot be lost through a durable path rather than raw pub/sub.
Pick one behavior per event class. A typing indicator can drop to local-only and nobody is harmed. A financial approval notification is the opposite, where the right move is usually to fail the source operation or fall back to durable delivery. Durable messaging itself is a Chapter 21 topic. The fanout layer's job is to sort its events by how much loss, delay, and duplication each one can survive.
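That classification can be written down as a small lookup the send path consults when a publish fails. The event type names and policy labels below are assumptions for illustration, and the choice for each class is an application decision, not a recommendation.

```javascript
// Illustrative mapping from event class to behavior when publishRoom() fails.
const PUBLISH_FAILURE_POLICY = new Map([
  ['typing.indicator', 'local-only'],       // loss is harmless
  ['message.created', 'durable-fallback'],  // assumed: route via a durable path
  ['approval.granted', 'fail-source'],      // fail the originating operation
]);

function onPublishFailure(eventType) {
  // Unclassified events fail the source instead of silently losing remote delivery.
  return PUBLISH_FAILURE_POLICY.get(eventType) ?? 'fail-source';
}

console.log(onPublishFailure('typing.indicator')); // local-only
console.log(onPublishFailure('approval.granted')); // fail-source
console.log(onPublishFailure('something.new'));    // fail-source
```

Defaulting unknown classes to the strictest behavior means a newly added event type has to be classified before it is allowed to degrade.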
Adapter lifecycle is tied to process lifecycle. A graceful shutdown runs through a sequence. New connections are refused, new room joins are refused, the per-connection queues flush or close, any required leave or presence events go out, the backplane subscription ends, and the adapter connections close. The full drain protocol is a deployment-chapter subject. What the realtime server owes it is the set of hooks to drive that sequence.
Adapter APIs should stay clear of transport terms. Name a method broadcastWebSocket() and you pull the distributed layer toward one transport. Keep it as publishRoom() and the interface stays at the application routing level, where WebSocket, SSE, and long-poll clients can all subscribe to the same room key when the server supports them.
The adapter should also be the only thing that knows broker topic names. Application code passes roomId and nothing else. Inside, the adapter can turn that into realtime.room.project:42, tenant.7.room.project:42, a hash bucket, or a shard topic. That mapping changes as scale grows, and keeping it inside the adapter means a topic migration never has to touch every handler.
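A topic mapping kept private to the adapter could be as small as the sketch below. The prefix, the bucket count of 16, and the FNV-1a style hash over UTF-16 code units are all illustrative choices, not a scheme the chapter prescribes.

```javascript
// Hypothetical topic mapping, internal to the adapter.
const BUCKETS = 16;

// 32-bit FNV-1a style hash over the string's UTF-16 code units.
function hashString(s) {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // keep the value unsigned 32-bit
  }
  return h;
}

function topicFor(roomId, strategy = 'per-room') {
  if (strategy === 'per-room') return `realtime.room.${roomId}`;
  // 'bucket': many rooms share one of BUCKETS topics.
  return `realtime.bucket.${hashString(roomId) % BUCKETS}`;
}

console.log(topicFor('project:42')); // realtime.room.project:42
console.log(topicFor('project:42', 'bucket') === topicFor('project:42', 'bucket')); // true
```

Handlers keep calling publishRoom(roomId, event) either way, so switching the strategy touches only this function.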
Backplane health should flow upward in a small status object.
const status = adapter.status();
if (status.writable === false) rejectRoomSend();
Writable here means the process can publish outward, and readable means it can receive remote events. The two can come apart during a reconnect. A process that can read but not publish will still deliver remote room events, while it has to reject local-origin sends that depend on cross-process delivery. The reverse is worse. When a process can publish but not read, local users keep producing events while remote events simply vanish from what they see. In both states, the accept path needs to be able to read the current policy.
Keep adapter retries out of the connection handlers. When publishRoom() retries once or reconnects on its own, the handler should still only ever see success, failure, or a bounded timeout. An adapter call with no bound can hold an HTTP response open, stall a WebSocket message acknowledgement, or leave application state half-updated. Retry policy in general comes up later. The point for now is that the fanout layer's calls have to be bounded.
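One way to give a handler exactly those three outcomes is to wrap the publish call in a timer. The wrapper below is a sketch: publishBounded is a hypothetical name, and the shape of the outcome object is an assumption. It does not cancel the underlying publish, it only stops waiting for it.

```javascript
// Resolves to { ok: true }, { ok: false, reason: 'error' }, or
// { ok: false, reason: 'timeout' }. It never rejects and never hangs.
function publishBounded(publish, timeoutMs) {
  return new Promise((resolve) => {
    const timer = setTimeout(
      () => resolve({ ok: false, reason: 'timeout' }),
      timeoutMs,
    );
    Promise.resolve()
      .then(publish) // also captures a synchronous throw from publish()
      .then(
        () => resolve({ ok: true }),
        (error) => resolve({ ok: false, reason: 'error', error }),
      )
      .finally(() => clearTimeout(timer));
  });
}

// A publish that never settles still produces a bounded outcome.
const never = () => new Promise(() => {});
publishBounded(never, 20).then((result) => console.log(result.reason)); // timeout
```

In a handler this would wrap the adapter call, for example publishBounded(() => adapter.publishRoom(roomId, event), 250), with the timeout chosen per event class.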
Sticky Sessions and Affinity
Long-lived realtime connections change how a load balancer behaves, because the connection state stays in whichever process was chosen. A stateless HTTP API can send one request to process A and the next to process B without any trouble. A WebSocket connection cannot move like that. It stays attached to the process that accepted the upgrade until it closes, and an SSE response stays attached to the process writing it. A long-poll request is shorter-lived, but its cursor and pending state still sit somewhere for as long as the request hangs.
A sticky session is a routing policy that keeps a client or session pinned to the same backend target across multiple requests or reconnects. Connection affinity is the wider idea behind it, where related connections or requests end up on the backend that already holds the relevant state.
Affinity can key off a cookie, a source-IP hash, connection metadata, gateway state, or platform-specific routing. The algorithms and platform routing details are Chapter 35. What you need here is the local effect. When affinity is working, a reconnect usually returns to the same process it left.
A raw WebSocket stays pinned to the process that accepted its upgrade. Any transport that starts as HTTP polling and then upgrades, which is the default for libraries such as Socket.IO, makes several separate HTTP requests during the handshake. Without sticky sessions those requests can land on different processes, and the handshake fails or flaps with session-not-found errors before a socket ever opens. To run a polling-capable transport behind more than one process, enable affinity at the load balancer first.
That property does a lot for reconnects. When a client drops for a moment and comes back on the same process, its resume window, recent event cursor, duplicate connection policy, and pending presence state can all still be in local memory. When it comes back on a different process instead, none of that is local, so the new process has to pull the state from shared storage, the backplane, or a source-of-truth HTTP API.
This reconnect failure is common, and the steps are easy to lay out.
t0: c44 connects to process A and joins room project:42
t1: c44 disconnects during a network switch
t2: c44 reconnects through the load balancer
t3: new connection lands on process B
t4: resume state for old c44 is unavailable on process B
Process B can still authenticate the client and rejoin its rooms, provided the client sends enough information and the server can validate it. What process B cannot reach is process A's heap. If the resume state has to outlast the move, store enough of it outside any single process, or route reconnects with affinity for the length of the resume window.
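The lookup on the receiving process can try local memory first and fall back to a store that outlives the process. In this sketch both stores are plain Maps so it runs in one file. A real shared store would be asynchronous, and the token format, field names, and expiry handling are assumptions.

```javascript
// localResume is this process's own Map. sharedResume stands in for a store
// that survives the client moving between processes.
function findResumeState(resumeToken, localResume, sharedResume, now = Date.now()) {
  const state = localResume.get(resumeToken) ?? sharedResume.get(resumeToken);
  if (!state) return { resumed: false, reason: 'unknown-token' };
  if (state.expiresAt <= now) return { resumed: false, reason: 'expired' };
  return { resumed: true, rooms: state.rooms, cursor: state.cursor };
}

const shared = new Map([
  ['tok-c44', { rooms: ['project:42'], cursor: 'e120', expiresAt: 2000 }],
]);
const processBLocal = new Map(); // c44 was never connected to process B

console.log(findResumeState('tok-c44', processBLocal, shared, 1000).resumed); // true
console.log(findResumeState('tok-c44', processBLocal, shared, 3000).reason);  // expired
console.log(findResumeState('tok-zzz', processBLocal, shared, 1000).reason);  // unknown-token
```

When the token is unknown or expired, the client falls back to a fresh join, which is the path a reconnect onto a new process has to support anyway.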
Sticky sessions cut down on movement without removing spread. Members of one room can still be scattered across processes by different affinity keys, and cross-process fanout has to cope with that. A room event accepted by process A still has to reach members on process B and process C.
Affinity interacts with deploys, too. When a backend target drains, its existing connections either close or move through a reconnect, and the next connection can land elsewhere because the old process is leaving service. A realtime server should treat reconnect-to-a-new-process as normal even when sticky routing is turned on.
SSE and long polling each add a complication. SSE reconnects on its own through EventSource, and the browser sends Last-Event-ID when it has one. A sticky policy can keep the client near the same process, but the event cursor still has to carry meaning for the times the process changes anyway. Long polling makes repeated HTTP requests, so each poll is a fresh load-balancer decision unless affinity ties the sequence together. For long polling, the cursor is the part that travels.
Affinity gives you lower latency and the reuse of local state. Getting cross-process delivery correct is a different job, and it depends on a backplane and shared state. Affinity does not substitute for either of those.
Affinity can also mask a missing backplane while you are testing locally. One developer, one browser, one process, one room, and everything appears to work. Putting a load balancer with sticky routing in front does not expose anything either, because that single developer still lands on one process.
Because that single-process setup hides the gap, the bug can pass straight through staging without ever showing itself. It appears once two clients in the same room land on different processes, and not before. Write that test on purpose. Two clients, two processes, one of them sends and the other has to receive it.
client A -> process 1 -> room project:42
client B -> process 2 -> room project:42
client A sends event
client B must receive event
That test is the one that proves cross-process fanout. A reconnect test is a different test, and it proves affinity and resume behavior. Keep them apart, because each one is checking a mechanism the other does not touch.
Connection affinity creates skew as well. When one tenant or one classroom opens many connections and the affinity key sends all of them to the same backend, that backend can end up with a lopsided share of the sockets. The algorithms behind this are in the load-balancer chapter. Realtime code should still track, per process, the local connection count, the room count, and how room members are distributed. The same affinity that improves locality can pile load onto one process.
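Those per-process numbers fall straight out of the two local registries. The snapshot function below is a sketch: localStats and its field names are hypothetical, and it assumes the registries are a Map of connections and a Map from room ID to a Set of connection IDs.

```javascript
// Hypothetical snapshot of what this process currently owns.
function localStats(connections, roomMembers) {
  let largestRoom = 0;
  let memberships = 0;
  for (const members of roomMembers.values()) {
    memberships += members.size;
    if (members.size > largestRoom) largestRoom = members.size;
  }
  return {
    connections: connections.size,
    rooms: roomMembers.size,
    memberships,
    largestRoom,
  };
}

const connections = new Map([['c1', {}], ['c2', {}], ['c3', {}]]);
const roomMembers = new Map([
  ['project:42', new Set(['c1', 'c2', 'c3'])],
  ['project:7', new Set(['c1'])],
]);

console.log(localStats(connections, roomMembers));
// { connections: 3, rooms: 2, memberships: 4, largestRoom: 3 }
```

Comparing these snapshots across processes is what makes skew visible, since no single process can see it from its own numbers.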
Every open connection is a file descriptor, and the OS caps how many a process can hold. On Linux the default ulimit -n is often 1024, which a single busy realtime process exceeds quickly. Once the limit is reached, accept() begins failing with EMFILE, and new connections are refused while the existing ones still look healthy. Raise the soft and hard fd limits for the process, and treat connections-per-process as a real capacity number rather than an unlimited one.
When skew does show up, changing the affinity key is one fix. Sharding by room, shrinking room size, splitting transports, or changing how presence events get delivered are others. All of those are architecture decisions. The fanout layer's contribution is narrower, which is accurate local counts and a clear statement of what each process owns.
Realtime Shards
A realtime shard is a partition of realtime responsibility. The partition can be a subset of rooms, tenants, users, regions, or topics handed to a particular process group. Sharding becomes relevant when every process receiving every event grows too expensive, or when very large rooms need ownership that is more deliberate.
At small scale every process can subscribe to the same backplane stream and filter locally, and the wasted events do not hurt. Once the scale grows, processes start subscribing only to the rooms they currently host. Past that, the system can assign room ownership to a shard, route writes through that shard, and keep membership state nearer to the connections that need it.
Room ownership means one selected shard coordinates fanout for a room. That shard can take charge of sequencing, membership lookup, and backplane publication for it. The exact assignment algorithm is in the sharding and architecture chapters. The distinction to hold onto here is smaller. A room moves from a model where any process can publish and every process filters, to one where a single selected shard handles that room's realtime traffic.
Shards remove fanout amplification from the places where it produces nothing. With every process receiving every room event, the backplane cost climbs with the process count even when only one process holds recipients. Assign rooms, or make subscriptions lazy, and fewer processes receive events they do not care about. That cost does not vanish, though. It moves into shard routing, membership movement, and rebalancing.
Shards also make failure domains easier to see. When one shard is overloaded by a giant room, the other shards keep serving their rooms without noticing. When a single process holds a very hot room, the shard may have to split that room, move some of its connections elsewhere, or change how the feature delivers updates. Multi-region fanout adds a further layer beyond this, and it comes later.
The shard concept should reach the API carefully, if it reaches it at all. Choosing shards is server-side work. Clients join rooms, subscribe to channels, reconnect with resume tokens, and send cursors, and the server routes each of those to the current owner. Once you expose shard IDs to clients, your internal routing decisions become part of the contract and get hard to change.
Presence calls for the same restraint.
Presence fanout is how presence changes get distributed to the subscribers watching for them. A user might come online, go idle, open a second connection, lose a heartbeat, or time out. The server recomputes the derived presence state and sends events to the watchers. Inside one process this is local room fanout over a presence room or channel. Across processes it needs the backplane or shared presence state.
Presence fanout amplifies fast, because every presence change has two audiences at once, the user whose connections changed and everyone watching that user or room. A team of 1,000 members is much better off with coalesced state transitions than with every small heartbeat change broadcast to all of them. Subchapter 04 described presence as derived state with timeouts. Fanout adds the delivery policy over that, which is to coalesce changes, drop the noise, and send only the transitions a client can use.
connection close -> presence state recalculated
presence changed -> publish presence event
watchers receive -> local fanout through presence channel
The recalculation step carries most of the weight here. When a connection closes, recompute the user's presence from the connections that are still live. The duplicate connection policy from Subchapter 04 feeds that recomputation before any fanout runs. What you then send out is the derived result rather than the raw socket event.
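Recompute-then-publish can be sketched for a single process as follows. The names are hypothetical, the only derived states are online and offline, and idle detection and timeouts are left out. The tracker sees only this process's connections, so across processes its output would still have to be merged through shared state or the backplane.

```javascript
// publish stands in for the adapter call that distributes derived presence.
function createPresence(publish) {
  const liveByUser = new Map(); // userId -> Set of live connection IDs

  function derived(userId) {
    return liveByUser.has(userId) ? 'online' : 'offline';
  }

  return {
    connectionOpened(userId, connId) {
      const before = derived(userId);
      if (!liveByUser.has(userId)) liveByUser.set(userId, new Set());
      liveByUser.get(userId).add(connId);
      if (before !== derived(userId)) publish({ userId, state: derived(userId) });
    },
    connectionClosed(userId, connId) {
      const before = derived(userId);
      const conns = liveByUser.get(userId);
      if (!conns) return;
      conns.delete(connId);
      if (conns.size === 0) liveByUser.delete(userId);
      // Publish only when the derived state changed, not on every socket event.
      if (before !== derived(userId)) publish({ userId, state: derived(userId) });
    },
  };
}

const events = [];
const presence = createPresence((e) => events.push(e));
presence.connectionOpened('u7', 'c1'); // offline -> online, published
presence.connectionOpened('u7', 'c2'); // still online, nothing published
presence.connectionClosed('u7', 'c1'); // still online, nothing published
presence.connectionClosed('u7', 'c2'); // online -> offline, published
console.log(events.map((e) => e.state)); // [ 'online', 'offline' ]
```

Four socket events produce two presence events, which is the coalescing the watchers benefit from.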
In cross-process presence, a process can only read its own local connections. It can tell whether user u7 has a live connection on this process. Whether u7 is online anywhere at all is a different question, and answering it takes shared state, a backplane, or shard ownership. Redis-style sets, leases, and timeout counters come up in Chapter 20, and message broker semantics in Chapter 21. The fanout layer's part is to expose the split between the two. A local presence change turns into a distributed presence event, and a distributed presence event arriving from elsewhere turns into local sends to the watchers.
The same adapter layer works for this. A publishPresence() method can carry derived presence events while keeping Redis commands and broker topics out of the connection handlers. Local watchers still receive through their own per-connection queues.
Choosing How to Scale
The decision tree starts with one process.
Begin with the connection and subscription registries. Make join, leave, close, and heartbeat cleanup idempotent. Send to rooms through one local fanout function, and push into per-connection queues instead of raw sockets. Count recipients and enqueue outcomes, and treat stale membership as routine cleanup.
Then add a second process in a test environment and watch the room split. As soon as a room event accepted on one process misses subscribers on another, you need cross-process fanout. Add an adapter layer before committing to any specific backplane. Publish normalized events, receive remote events through that same local fanout function, and carry origin metadata and event IDs anywhere duplicate sends would cause harm.
Use sticky sessions for connection locality, for lower reconnect friction, and for better cache hit rates across short resume windows. Treat a reconnect that lands on another process as a normal case rather than a failure. Affinity is only a routing aid. The distributed state still has to live in a backplane, a shared store, or a shard-owned service.
Move toward shards once process-wide backplane traffic or room size becomes the limiting cost. Start from room or tenant keys when those are already in the API. Keep the shard choice on the server side, and let clients keep speaking in rooms, channels, topics, cursors, and resume tokens.
The hard bugs all trace back to treating local state as if it were global. Take a Map in one process. It can be entirely correct about the connections it knows and still be missing half the room. Sticky routing has the same trap, sending reconnects to a sensible place while the room's members remain scattered across workers. The backplane introduces a different failure, since moving events between hosts opens the door to duplicates and reordering under its own delivery rules. Presence has yet another, where a map that looks accurate on one process is already stale on the rest.
Name each layer in the code for what it is, the local registry, local fanout, the adapter, the backplane, affinity, and the shard. Each of those names tells you where the state lives and which component owns the next hop. Realtime scaling gets easier once you can say, for any send step, exactly which connections it is able to see.