Node.js HTTP Keep-Alive, Agents, and Connection Pools

Ishtmeet Singh (@ishtms) / June 10, 2026 / 36 min read
#nodejs #http #keep-alive #agents #connection-pooling

Two HTTP requests to the same host can reuse one TCP connection. That reuse only becomes possible after the first request and response have both finished cleanly. The client needs to know the response is complete. The server needs to know the request is complete. The socket also needs to stay open after that exchange, so both sides can safely put another HTTP message on the same TCP connection.

Start with a tiny server -

js
import http from 'node:http';

const server = http.createServer((req, res) => {
  res.end(`ok ${req.url}\n`);
});

server.listen(3000);

This server sends a complete response for every request. Calling res.end() finishes the HTTP response from Node's side. After that, Node looks at the HTTP connection state and decides whether the accepted socket can stay around for another request.

Add a TCP connection log -

js
server.on('connection', socket => {
  console.log('tcp', socket.remoteAddress, socket.remotePort);
});

That log runs when Node accepts a new TCP connection. If two sequential HTTP requests reuse one socket, the server prints one tcp line and sends two responses. If reuse does not happen, the server prints two tcp lines.

That is the first mental model to keep - request count and TCP connection count are separate numbers.

Now create a client-side agent -

js
import http from 'node:http';

const agent = new http.Agent({ keepAlive: true });

The agent owns reusable client sockets for core node:http. With keepAlive: true, a socket can return to the agent after a response finishes. The agent can then assign that same socket to another compatible request.

Here is a helper that makes one request and consumes the response body -

js
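function once(path) {
  return new Promise((resolve, reject) => {
    const req = http.get({ port: 3000, path, agent }, res => {
      // false on a fresh socket, true when the agent handed over a pooled one
      console.log(path, req.reusedSocket);
      res.resume(); // drain the body so the response can reach 'end'
      res.on('end', resolve);
    });
    req.on('error', reject); // a failed connection rejects instead of hanging
  });
}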

The res.resume() call is doing real work. It drains the response body. Without consuming the body, Node cannot finish the response stream, so the agent cannot safely return the socket to the free pool.

Run two requests one after another -

js
await once('/a');
await once('/b');
agent.destroy();

The first request usually logs false. The second usually logs true, assuming the server kept the connection open and the first response body reached 'end'.

A lot happened behind that one boolean. The first request made the agent create a socket. The response finished. The agent detached that socket from the completed request and kept it as a free socket. The second request targeted the same origin. The agent found a compatible free socket, attached it to the new request, and marked it active again.

That is connection pooling in core node:http. No framework is involved. The agent keeps sockets, groups them by origin, and moves them between active and free states.
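The pieces above can be put together in one file to see both numbers at once. This is a minimal sketch, not the chapter's own listing: it listens on port 0 (any free port) instead of 3000, and `fetchOnce` and `tcpConnections` are names chosen for the illustration.

```javascript
import http from 'node:http';
import { once } from 'node:events';

// Server side: count accepted TCP connections separately from requests.
let tcpConnections = 0;
const server = http.createServer((req, res) => res.end(`ok ${req.url}\n`));
server.on('connection', () => { tcpConnections += 1; });

server.listen(0, '127.0.0.1'); // port 0 = any free port (assumption for this sketch)
await once(server, 'listening');
const { port } = server.address();

// Client side: one keep-alive agent shared by both requests.
const agent = new http.Agent({ keepAlive: true });

function fetchOnce(path) {
  return new Promise((resolve, reject) => {
    const req = http.get({ host: '127.0.0.1', port, path, agent }, res => {
      res.resume(); // drain so the socket can go back to the agent
      res.on('end', () => resolve(req.reusedSocket));
    });
    req.on('error', reject);
  });
}

// Sequential requests: the second starts only after the first reached 'end'.
const reused = [await fetchOnce('/a'), await fetchOnce('/b')];
console.log({ requests: reused.length, tcpConnections, reused });

agent.destroy();
server.close();
```

On a quiet loopback connection the log should show two requests against one TCP connection. Swapping in `new http.Agent({ keepAlive: false })` is a quick way to watch the connection count follow the request count instead.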

HTTP Keep-Alive Is Protocol Reuse

HTTP keep-alive means one TCP connection can carry more than one HTTP request-response exchange.

A persistent connection is the TCP connection that stays open after an HTTP exchange finishes. Connection reuse is what happens when a later HTTP exchange uses that same connection.

Chapter 9 covered TCP keep-alive. Keep that separate from HTTP keep-alive.

TCP keep-alive is an operating-system feature that probes an idle TCP connection. HTTP keep-alive is HTTP-level reuse after a complete request and response. A socket can have TCP keep-alive probes enabled and still close after one HTTP response. A socket can also have no TCP probes and still carry many HTTP exchanges if the HTTP layer keeps it reusable.

Client requests expose the lower TCP probe switch too -

js
const req = http.get(url, { agent }, res => {
  res.resume();
});

req.setSocketKeepAlive(true, 30_000);

That call configures TCP keep-alive probing on the underlying socket when available. The HTTP agent still owns HTTP reuse.

For HTTP/1.1, persistent connections are the normal case. A peer can still retire a connection by sending Connection: close, using response framing that ends only when the socket closes, hitting a server limit, timing out, or running into an error.

The Connection header is the HTTP signal. In HTTP/1.1, the connection can stay open across messages unless the framing or headers say otherwise. Connection: close says the current exchange is the final exchange on that connection. The peer can finish the current message and then close.
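The header rule can be written as a small predicate. This is a sketch of the persistence rule as HTTP/1.1 states it, not Node's internal implementation; `hasConnectionToken` and `isPersistent` are hypothetical helper names. The HTTP/1.0 branch is included only for contrast, since 1.0 peers have to opt in with `keep-alive`.

```javascript
// True when a comma-separated Connection header value contains `token`.
function hasConnectionToken(headerValue, token) {
  if (!headerValue) return false;
  return headerValue
    .split(',')
    .some(part => part.trim().toLowerCase() === token);
}

// httpVersion is a string such as '1.1', as exposed by message.httpVersion.
function isPersistent(httpVersion, connectionHeader) {
  // An explicit close always wins.
  if (hasConnectionToken(connectionHeader, 'close')) return false;
  // HTTP/1.1 connections persist by default.
  if (httpVersion === '1.1') return true;
  // HTTP/1.0 persists only when the peer asked for it.
  return httpVersion === '1.0' && hasConnectionToken(connectionHeader, 'keep-alive');
}

console.log(isPersistent('1.1', undefined));    // default for HTTP/1.1
console.log(isPersistent('1.1', 'close'));      // peer retires the connection
console.log(isPersistent('1.0', 'keep-alive')); // HTTP/1.0 opt-in
```

The predicate only answers what the headers allow. Framing, timeouts, and errors can still retire the socket.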

Core http.Agent also writes the connection intent on client requests. Node's docs describe the constructor behavior this way - an agent sends Connection: keep-alive unless the caller supplied a Connection header or unless the agent is using the no-keep-alive default configuration with unlimited sockets, where Connection: close is used. In normal application code, set the agent options and let the agent manage the header.

A reusable HTTP/1.1 connection needs this sequence -

text
request headers and body complete
response headers and body complete
neither side asked for close
socket remains open
next HTTP exchange starts

Keep-alive saves connection setup work. The process does not need to create a fresh socket, complete another TCP handshake, allocate another descriptor, and rebuild connection-level kernel state for every request to the same origin.

That saved work has a cost. Idle sockets are still open sockets. They consume descriptors and memory while they wait in the pool.

At low request rates, keep-alive can reduce latency for calls spaced a few seconds apart, but the peer may close the socket while it sits idle. At high request rates, keep-alive avoids repeated handshakes and keeps every pooled socket recently used, so idle expiry is rare, but the process may hold many active and idle descriptors. During bursts, the agent may open several sockets, finish the burst, and keep some free sockets until limits or timeouts shrink the pool.

So HTTP keep-alive is a retention policy. The HTTP layer keeps lower connection state around because the completed message and headers allow another exchange. TCP keep-alive probes can run underneath. HTTP reuse begins only after a complete HTTP exchange.

Reuse Needs Complete Message Ends

Node can reuse a connection only after the HTTP parser knows the current message is complete.

For responses with Content-Length, the parser counts bytes. For chunked transfer coding, it reads chunks until the zero-length terminating chunk and any trailers that follow it. For responses where the body ends only when the connection closes, the client learns the response ended at the same moment the reusable socket is gone.

HTTP/1.1 messages sit on a byte stream. The socket gives Node ordered bytes. The HTTP parser decides where the current message ends and where the next one can begin. If even one byte from the previous body is still unread, the next byte is ambiguous. It might be more body data. It might be the start of the next request. The parser can decide only from the framing rules and its current parsing state.

For a client response, the reuse path looks like this -

text
response headers parsed
body bytes consumed
message reaches end
agent receives socket as a reuse candidate
socket returns to pool or closes

For an inbound server request, the path looks like this -

text
request headers parsed
request body consumed or absent
response finishes
socket waits for next request bytes
idle keep-alive timer starts

message.complete helps during failures. It tells you whether Node received a complete HTTP message before the connection ended. A response can emit 'close' after a complete message, or because the connection ended early. Completion state lets you tell those cases apart.
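A sketch of how that check might look in practice. `reportOnClose` and its `report` callback are hypothetical names; the only API used is the message's `'close'` event and its `complete` property.

```javascript
// `message` is an http.IncomingMessage (a client response or a server request).
// `report` is a hypothetical logging callback supplied by the caller.
function reportOnClose(message, report) {
  message.once('close', () => {
    report(message.complete
      ? 'closed after a complete message'
      : 'closed before the message finished');
  });
}
```

A client would call `reportOnClose(res, console.log)` inside its response handler. The second wording is the one worth alerting on, because it means the body the caller saw was truncated.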

Connection: close is explicit. The peer is saying the socket should close after the current exchange. Node treats that as retirement, even if the current message arrived successfully.

On the client side, your code also has to consume the response body. The core docs call this out for http.ClientRequest - when a response handler exists, the response data must be read through read(), a 'data' handler, or .resume(). The 'end' event waits for body consumption. Until the body reaches its end path, the agent is still waiting for the response to finish.

Small clients often miss this in error branches -

js
http.get(url, { agent }, res => {
  if (res.statusCode >= 400) {
    res.resume();
    return;
  }

  res.pipe(destination);
});

The rejected response still has a body. Draining it lets Node finish the response lifecycle and decide whether the socket can return to the pool. Leaving it unread keeps the response attached to the socket and delays reuse. If the server closes before the body finishes, the agent removes the socket.

The same issue appears when the caller only wanted headers. A health-check client may inspect statusCode and return early. A metadata client may read one header and ignore a JSON body. Both still received a response stream. If the application wants the socket to remain reusable, it must either consume the body or intentionally destroy the request path.

Destroying is a valid policy. It closes the socket and frees the agent from waiting on unread bytes. The next request will pay for a new connection. Draining keeps reuse available, but it can spend time reading data the caller no longer needs. Pick the policy based on body size, where the data came from, and latency budget.
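One way to encode that choice is a helper that drains small bodies and destroys everything else. This is a sketch under assumptions: `discardBody` is a hypothetical name, the 64 KiB threshold is arbitrary, and a response without a usable Content-Length (for example a chunked one) is treated as "size unknown" and destroyed.

```javascript
// `res` is the http.IncomingMessage passed to the client's response handler.
function discardBody(res, maxDrainBytes = 64 * 1024) {
  const length = Number(res.headers['content-length']);
  const knownAndSmall =
    Number.isInteger(length) && length >= 0 && length <= maxDrainBytes;

  if (knownAndSmall) {
    res.resume(); // read and drop the bytes, the socket stays reusable
    return 'drained';
  }

  res.destroy(); // give up on the body, the socket is closed and not reused
  return 'destroyed';
}
```

In the error branch shown earlier, `res.resume()` could be replaced with `discardBody(res)` so that an unexpectedly large error page does not hold the socket while it downloads.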

Servers have the same problem with request bodies. A handler may reject a request before reading its body, but the connection still contains those body bytes. The parser cannot treat the next byte as a new request line while unread body bytes from the previous request are still ahead of it.

For a route that rejects a body but wants the connection to stay reusable, drain or destroy intentionally -

js
if (req.headers['content-type'] !== 'application/json') {
  req.resume();
  res.writeHead(415);
  res.end();
  return;
}

That code drains the incoming request stream while sending a short response. For small unwanted bodies, this is fine. For large or abusive bodies, draining can waste bandwidth and memory on input the server already rejected. A bounded body parser gives you a better route policy - read up to a limit, then close or destroy according to what that route promises.
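A bounded drain can be sketched with plain stream events. `drainWithLimit` is a hypothetical helper, not a core API, and the limit is whatever the route decides. It works on any Readable, so the usage lines below use in-memory streams instead of a live request.

```javascript
import { Readable } from 'node:stream';

// Read and discard up to `limitBytes`. Past the limit, destroy the stream.
function drainWithLimit(stream, limitBytes) {
  return new Promise(resolve => {
    let seen = 0;
    stream.on('data', chunk => {
      seen += chunk.length;
      if (seen > limitBytes) stream.destroy(); // stop reading, give up on reuse
    });
    // 'end' fires only when the whole body was consumed.
    stream.on('end', () => resolve({ drained: true, seen }));
    // 'close' also fires after 'end'; the promise is already settled by then.
    stream.on('close', () => resolve({ drained: false, seen }));
  });
}

const small = Readable.from([Buffer.alloc(10), Buffer.alloc(10)]);
const large = Readable.from([Buffer.alloc(10), Buffer.alloc(10)]);

console.log(await drainWithLimit(small, 100)); // fits under the limit
console.log(await drainWithLimit(large, 5));   // exceeds the limit
```

With a real `req`, destroying the incoming request also tears down its socket in core node:http, so the over-limit path is the "close" half of the policy and the under-limit path is the "stay reusable" half.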

Keep-alive depends on framing. The parser needs a known message end. JavaScript needs to consume the stream far enough for that end to occur. The socket needs to remain open after both sides finish.

Some messages have no body. A HEAD response, a 204, a 304, and every 1xx informational response have rules that make the body end immediate. For keep-alive, the takeaway is simple - a body can end because of the method and status code, Content-Length, chunked framing, or connection close. Only the first three leave the socket available for reuse.
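The four body-end cases can be sketched as a classifier for responses. This is a simplified reading of the HTTP/1.1 framing rules, not Node's parser: `responseFraming` is a hypothetical name, header keys are assumed to be lower-cased as Node exposes them, and the CONNECT tunnel case is left out.

```javascript
function responseFraming(method, statusCode, headers) {
  const informational = statusCode >= 100 && statusCode < 200;
  if (method === 'HEAD' || informational || statusCode === 204 || statusCode === 304) {
    return { kind: 'none', reusable: true }; // ends right after the headers
  }

  // Transfer-Encoding takes precedence over Content-Length.
  const te = headers['transfer-encoding'];
  if (te !== undefined) {
    const codings = te.toLowerCase().split(',').map(s => s.trim());
    return codings[codings.length - 1] === 'chunked'
      ? { kind: 'chunked', reusable: true }
      : { kind: 'close-delimited', reusable: false };
  }

  if (headers['content-length'] !== undefined) {
    return { kind: 'content-length', reusable: true };
  }

  // No length information: the body ends when the connection closes.
  return { kind: 'close-delimited', reusable: false };
}

console.log(responseFraming('GET', 200, { 'content-length': '12' }));
console.log(responseFraming('GET', 200, {}));
```

Here `reusable` only says the framing permits reuse. A `Connection: close` header, a timeout, or an error can still retire the socket.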

Server Timers After the Response

A Node HTTP server has a keep-alive timer for the quiet period after it finishes writing a response.

js
const server = http.createServer({
  keepAliveTimeout: 10_000,
  keepAliveTimeoutBuffer: 1_000,
}, (req, res) => {
  req.resume();
  res.end('ok\n');
});

server.keepAliveTimeout is the idle wait after the last response has been written. In Node v24, the default is 5000 milliseconds.

server.keepAliveTimeoutBuffer is extra internal time added to the actual socket timeout. In current Node v24, the default is 1000 milliseconds. Node advertises one timeout and keeps the internal socket timer slightly longer. That extra time reduces resets when a client sends the next request near the advertised cutoff.

server.keepAliveTimeoutBuffer requires Node v24.6.0 or newer. Early v24 builds expose keepAliveTimeout only. Since this book targets current v24, the examples use the current option. Libraries that support older runtimes should check for support before relying on it.

The actual socket timeout is -

text
socket timeout = keepAliveTimeout + keepAliveTimeoutBuffer

The setting applies to new incoming connections. Existing sockets keep the timer setup they already received. Configure these values near server construction so the behavior is easy to read.
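As a worked instance of the formula, the sketch below reads the two server properties and adds them. `idleSocketTimeoutMs` is a hypothetical helper. It falls back to a buffer of 0 on runtimes that do not expose `keepAliveTimeoutBuffer`, and it returns 0 when `keepAliveTimeout` is 0, which Node documents as disabling the keep-alive timeout.

```javascript
// Works on an http.Server or any object with the same two properties.
function idleSocketTimeoutMs(server) {
  const advertised = server.keepAliveTimeout;
  if (advertised === 0) return 0; // keep-alive idle timer disabled
  const buffer = server.keepAliveTimeoutBuffer ?? 0; // absent on older runtimes
  return advertised + buffer;
}

// Values from the server example above: 10_000 + 1_000.
console.log(idleSocketTimeoutMs({ keepAliveTimeout: 10_000, keepAliveTimeoutBuffer: 1_000 }));
```

With the v24 defaults quoted earlier (5000 and 1000), the same arithmetic gives a 6000 ms internal timer behind a 5 second advertised timeout.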

Here is the sequence after res.end() -

text
response is marked as ending
final headers and body bytes are queued
bytes are handed to the socket
response emits 'finish'
request side is complete
socket becomes eligible for idle keep-alive
idle timer starts

The response 'finish' event means Node handed the bytes to the operating system for transmission. It does not prove the client received them.

Once the socket is idle and reusable, Node arms the keep-alive timer. If another request byte arrives before the timer fires, the parser starts reading the next request. If the timer fires first, Node destroys the socket. That close is normal server policy.

Be careful with raw socket timeouts from the lower connection event -

js
server.on('connection', socket => {
  socket.setTimeout(60_000);
});

For HTTP servers, a timeout set there can later be replaced by the server's HTTP keep-alive timeout after the socket has served a request. The HTTP server knows when a socket has moved from active request handling to idle reuse. Prefer HTTP server options for HTTP timing, and use the raw socket hook only for raw socket policy.

server.requestTimeout and server.headersTimeout protect earlier periods. headersTimeout limits how long the parser waits for complete request headers. requestTimeout limits how long the server waits to receive the full request. keepAliveTimeout starts after the server has finished the last response and is waiting for another request on a reusable connection.

server.timeout is another timer. It is a generic socket inactivity timeout that the server applies to each accepted socket, and its default is 0, which disables it. For HTTP request protection, headersTimeout, requestTimeout, and keepAliveTimeout usually describe intent better. A generic socket inactivity timer can still be useful in custom server setups, but mixing all four without naming the period each one covers makes debugging harder.

The symptoms are different -

text
headers too slow -> 408 and close
whole request too slow -> 408 and close
idle after response -> socket destroyed

An idle keep-alive close is normal. A client that sends a request right as the server closes the idle socket can see ECONNRESET or a failed write. The server decided the reusable period expired. The client decided the socket was still worth trying. keepAliveTimeoutBuffer makes that timing less tight on the server side, but timing races across processes and networks can still happen.

The server also has a keepAlive creation option -

js
const server = http.createServer({
  keepAlive: true,
  keepAliveInitialDelay: 30_000,
}, handler);

That option enables TCP keep-alive probing on accepted sockets. It works at the TCP layer. HTTP reuse still depends on complete messages, headers, server limits, and HTTP idle timers.

server.maxRequestsPerSocket retires sockets by request count -

js
server.maxRequestsPerSocket = 1_000;

A value of 0 means no request count limit. When the limit is reached, Node sets Connection: close on the response. If a client sends another request after the limit, Node emits 'dropRequest' and sends 503 Service Unavailable.

That gives a busy server a way to cycle long-lived HTTP/1.1 sockets instead of letting one descriptor carry unlimited exchanges.

Count limits are useful when per-socket state can build up, when downstream policy wants periodic reconnection, or when you want a bounded lifetime for long-lived connections. A low limit also increases connection setup work. Set it intentionally.

Clients can see this policy. A well-behaved client reads Connection: close and retires the socket after the response. A client that sends another request anyway receives the 503 drop path. That response comes from server connection policy, not ordinary application routing.
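A small loopback demo makes the client's view visible. This is a sketch with a deliberately tiny limit of 1 so that the very first response reaches it. The port, the log wording, and the variable names are assumptions for the illustration.

```javascript
import http from 'node:http';
import { once } from 'node:events';

const server = http.createServer((req, res) => res.end('ok\n'));
server.maxRequestsPerSocket = 1; // tiny on purpose: request #1 hits the limit

// Fires only if a client keeps sending on a socket that is past its limit.
server.on('dropRequest', (request, socket) => {
  console.warn('socket count close - maxRequestsPerSocket', socket.remotePort, request.url);
});

server.listen(0, '127.0.0.1');
await once(server, 'listening');
const { port } = server.address();

const agent = new http.Agent({ keepAlive: true });
const connectionHeader = await new Promise((resolve, reject) => {
  http.get({ host: '127.0.0.1', port, agent }, res => {
    res.resume();
    res.on('end', () => resolve(res.headers.connection));
  }).on('error', reject);
});

// The agent asked for keep-alive, but the server's count limit answered with close.
console.log({ connectionHeader });

agent.destroy();
server.close();
```

Because the core agent respects `Connection: close`, the `'dropRequest'` listener stays silent here. It would fire for a client that ignores the header and pipelines or sends another request on the same socket.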

Good logs should name the reason for retirement -

text
socket idle close - keepAliveTimeout
socket count close - maxRequestsPerSocket
socket request close - requestTimeout

Without that detail, every closed connection becomes a vague "client disconnected" message. Keep-alive bugs are much easier to debug when the log says which timer or limit owned the close.

The Agent Owns Client Reuse

An http.Agent owns client-side socket reuse for core node:http. It decides whether a request gets a new socket, a free socket, or a place in a pending request queue.

http.globalAgent is the default agent used by http.request() and http.get() when the call omits agent. Since Node 19, the global agent has HTTP keep-alive enabled and a 5-second socket timeout.

A custom new http.Agent() starts from constructor defaults, so pass keepAlive: true when you want the custom agent to retain free sockets.

That global default can surprise older codebases. Before Node 19, many developers treated the global agent as mostly non-persistent unless they opted into keep-alive. Current Node defaults are better for repeated outbound calls, but they also mean a process can hold idle outbound sockets through the global agent. Explicit agents make the pooling policy easier to review.

Most service clients should make the agent explicit -

js
const agent = new http.Agent({
  keepAlive: true,
  maxSockets: 50,
  maxFreeSockets: 10,
  scheduling: 'lifo',
});

That creates a pool policy for outbound requests made through this agent. maxSockets limits active sockets per origin. maxFreeSockets limits idle retained sockets per origin. scheduling controls which free socket is selected when several are available. Current Node defaults to 'lifo', which selects the most recently used free socket.

Attach the agent to a request -

js
const req = http.get('http://api.local/users', { agent }, res => {
  console.log({ reused: req.reusedSocket });
  res.resume();
});

request.reusedSocket is a client-side boolean. It is true when the request used a socket that persisted through keep-alive. Use it as a diagnostic flag for the agent path. A reused socket can succeed. A new socket can fail. The flag only tells you how the agent assigned the socket.

The outbound path through core HTTP is compact, but it carries several states -

text
http.request()
normalize URL and options
compute agent key
find free socket, open socket, or queue request
attach socket to ClientRequest
write request bytes
read IncomingMessage response
free or close socket

Each step can add latency or change failure behavior. DNS work can happen before a new socket connects. Pool waiting can happen before a socket exists. Request writing can fail if the socket closes. Response reading keeps the socket active until the body ends.

A deadline around the whole operation includes all of that. A socket timeout only starts after a socket exists.

The agent groups sockets by agent key. The agent key is derived from connection options that decide socket compatibility. For core HTTP, agent.getName() uses host, port, local address, and address family. HTTPS adds TLS options through its own agent path because certificate and cipher choices affect reuse.

For plain HTTP, the key is based on inputs like these -

text
host:port:localAddress
host:port:localAddress:family

The exact string is an internal detail exposed through agent.getName(), but the inputs are important. localAddress binds outbound connections to a specific local interface. family separates IPv4 and IPv6. Two requests that look similar at the URL level can use separate pools if these connection options vary.

Different hostnames can resolve to the same IP address and still use separate pools. The agent groups by request options, not by a later guess about service identity. A request to api.local:80 and a request to 10.0.0.7:80 can hit the same machine and still live in separate agent entries.
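Calling `agent.getName()` directly shows how those inputs separate pools. A minimal sketch - the hostnames and addresses are the illustrative ones from the text, plus a made-up local address.

```javascript
import http from 'node:http';

const agent = new http.Agent({ keepAlive: true });

// Same port everywhere; only the connection options differ.
const keys = {
  byName: agent.getName({ host: 'api.local', port: 80 }),
  byAddress: agent.getName({ host: '10.0.0.7', port: 80 }),
  ipv4Only: agent.getName({ host: 'api.local', port: 80, family: 4 }),
  bound: agent.getName({ host: 'api.local', port: 80, localAddress: '192.168.1.5' }),
};

console.log(keys);
console.log('distinct pools:', new Set(Object.values(keys)).size);

agent.destroy();
```

Four option sets produce four different keys, so each would get its own active, free, and pending lists even if all four reach the same machine.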

DNS changes follow the same idea. An agent key can stay the same while new sockets later resolve the hostname to another address. Existing sockets keep talking to the address they already connected to. New sockets use DNS at connection time.

The agent: false option creates an isolated one-use agent for that request -

js
http.get({
  host: 'localhost',
  port: 3000,
  agent: false,
}, res => res.resume());

That is useful for one-off calls that should avoid shared pool state. It also disables reuse for that request, so the call pays for a new connection.

Long-lived requests have another escape hatch. A socket can emit 'agentRemove' to leave the agent. That is useful when a request turns into behavior that should no longer occupy the shared pool policy. Most application code should use higher-level client behavior, but the event explains why a socket can disappear from an agent without a normal close.

Agents need shutdown too. agent.destroy() destroys sockets currently owned by the agent. A short-lived script can call it after work completes. A service usually keeps the agent for the process lifetime and destroys it during shutdown or when removing an upstream client. Letting idle sockets wait for the peer to close them works, but those descriptors stay open until that happens.

Inside the Agent Pool

The agent tracks three groups per key - active sockets, free sockets, and pending requests.

text
agent key: api.local:80:

active sockets: [#1 handling /users] [#2 handling /teams]
free sockets:   [#3 idle, reusable]
pending queue:  [/events] [/stats]

An active socket is attached to an in-flight ClientRequest. The request has a socket and is using it to write the request and read the response. The socket stays active until the response finishes, the request fails, or the connection closes.

A free socket is a connected socket retained by a keep-alive agent after a response finishes. It has no active request attached. The agent can attach it to a later compatible request.

A pending request is waiting because no socket is available under the current pool limits. When a socket becomes available, the agent assigns it to one of the queued requests.

The public properties mirror those groups -

text
agent.sockets     -> active sockets by key
agent.freeSockets -> free sockets by key
agent.requests    -> pending requests by key

Treat those objects as diagnostic views. Do not mutate them from application code. Configure the agent through options, then observe request events, socket events, and errors.

You can count them while debugging -

js
function count(map) {
  return Object.values(map).reduce((n, xs) => n + xs.length, 0);
}

console.log({
  active: count(agent.sockets),
  free: count(agent.freeSockets),
  pending: count(agent.requests),
});

Those counts are snapshots. Under load, they can change before the next log line. They are still useful because each one points to a separate situation. High active count means the socket budget is in use. High pending count means callers are waiting for sockets. High free count means descriptors are being retained for later reuse.

Here is the normal request path.

A request enters http.request(). Node normalizes the URL and options. The agent computes the key. If a free socket exists for that key, the agent can reuse it. The socket moves out of freeSockets, attaches to the request, and becomes active under sockets. Node calls the agent's reuseSocket() hook. The default hook refs the socket again so active work can keep the process alive.

If no free socket exists, the agent checks limits. maxSockets applies per key. maxTotalSockets applies across all keys in that agent. If creating a new socket stays within the limits, the agent creates one through createConnection(), which follows the net.createConnection() path for core HTTP.

If the limits are full, the request goes into agent.requests for that key. From the caller's point of view, the request has started. From the network's point of view, it may still be waiting for a socket. A user-level deadline with AbortSignal can cancel the request while it is still pending inside the agent. A socket timeout cannot fire before a socket exists.
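The pending case can be reproduced on loopback. This sketch assumes a server that holds every response for 200 ms, an agent with `maxSockets: 1`, and a 50 ms deadline from `AbortSignal.timeout()` on the second request. The helper name `get` and all timings are illustrative.

```javascript
import http from 'node:http';
import { once } from 'node:events';

// Slow upstream: every response is held for 200 ms.
const server = http.createServer((req, res) => {
  setTimeout(() => res.end('ok\n'), 200);
});
server.listen(0, '127.0.0.1');
await once(server, 'listening');
const { port } = server.address();

const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

// Resolves with 'ok' or with the error code, so both outcomes can be logged.
function get(path, signal) {
  return new Promise(resolve => {
    const req = http.get({ host: '127.0.0.1', port, path, agent, signal }, res => {
      res.resume();
      res.on('end', () => resolve('ok'));
    });
    req.on('error', err => resolve(err.code ?? err.name));
  });
}

const first = get('/slow');                             // takes the only socket
const second = get('/queued', AbortSignal.timeout(50)); // waits in agent.requests
const pendingWhileQueued = Object.values(agent.requests).flat().length;

const outcomes = await Promise.all([first, second]);
console.log({ pendingWhileQueued, outcomes });

agent.destroy();
server.close();
```

The second request never receives a socket, so a socket-level timeout would have nothing to attach to. The signal is what ends it, and the error arrives as an abort, not as a network failure.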

For a busy origin, the pending queue is an early signal that the client has more outbound work than the configured socket budget can run at once. Raising maxSockets may reduce local waiting and increase pressure on the upstream. Lowering it may protect the upstream and increase local wait time. That is a client policy decision.

When a response finishes, the agent decides whether the socket becomes free or retires. Several things can force retirement -

text
server sent Connection: close
response ended by closing the socket
socket errored
socket timed out
free socket pool is full
agent keepSocketAlive() returned false

The default keepSocketAlive() behavior enables TCP keep-alive on the socket, unreferences it, and returns true. Unref lets the process exit if the only remaining work is an idle pooled socket.

That ref and unref behavior affects CLIs and background tools. An active request refs its socket. A free pooled socket is unrefed by default. When the agent reuses that socket, the default reuseSocket() hook refs it again. Free idle state allows process exit. Active request state keeps the process alive.
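Both hooks can be overridden in a subclass, which is a convenient way to watch a socket move between states. `TracingAgent` below is a hypothetical class for illustration. It records each hook call and then defers to the default behavior.

```javascript
import http from 'node:http';
import { once } from 'node:events';

class TracingAgent extends http.Agent {
  events = [];

  // Called when a finished socket is about to enter the free pool.
  keepSocketAlive(socket) {
    const keep = super.keepSocketAlive(socket); // default: TCP keep-alive on, unref
    this.events.push(keep ? 'pooled' : 'retired');
    return keep;
  }

  // Called when a free socket is attached to a new request.
  reuseSocket(socket, request) {
    this.events.push('reused');
    super.reuseSocket(socket, request); // default: ref the socket again
  }
}

const server = http.createServer((req, res) => res.end('ok\n'));
server.listen(0, '127.0.0.1');
await once(server, 'listening');
const { port } = server.address();

const agent = new TracingAgent({ keepAlive: true });
const get = path => new Promise((resolve, reject) => {
  http.get({ host: '127.0.0.1', port, path, agent }, res => {
    res.resume();
    res.on('end', resolve);
  }).on('error', reject);
});

await get('/a');
await get('/b');
console.log(agent.events);

agent.destroy();
server.close();
```

Returning a falsy value from `keepSocketAlive()` makes the agent destroy the socket instead of pooling it, which is how a subclass can add its own retirement rule.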

If a pending request exists for the same key, a just-finished socket can go straight into the next request instead of sitting in the free pool.

That means freeSockets may stay small even while keep-alive is working -

text
request A finishes
pending request B exists
same key can reuse the socket
socket moves directly to B
freeSockets never grows

During a burst, you may see high active count and high pending count with a small free count. Keep-alive may still be active. The pool is busy enough that finished sockets are immediately reused by queued work.

request.reusedSocket is most reliable when a request receives a socket from the free pool. A queued request can still receive the same TCP connection through a direct handoff from the previous request while the flag remains false.

maxSockets and maxTotalSockets control concurrency. With maxSockets: 1, requests to the same key serialize through one active socket unless the code uses another agent. With maxSockets: 50, the agent can open up to fifty active sockets for that key. Extra requests wait. The default Infinity gives the agent permission to open as many concurrent sockets as the process and host can support, which can be too much during bursts.

maxTotalSockets helps when one process talks to many origins. Per-origin limits alone can still produce a large total descriptor count. Fifty sockets to origin A, fifty to origin B, and fifty to origin C add up quickly. A total cap gives the agent a process-level ceiling across all keys it owns.

maxFreeSockets controls idle retention. A high value keeps more ready sockets around. A low value closes more sockets after responses finish. The default is 256 per host when keep-alive is enabled. That is generous for many apps. A process that talks to many origins can still accumulate many idle descriptors if each origin keeps a free pool.

Free sockets also expire. A socket can emit 'timeout', and the agent removes it from freeSockets. The server can close it, which emits 'close' and removes it. The free pool is always temporary. Reading agent.freeSockets twice during traffic can produce two valid but different answers.

scheduling controls which free socket gets picked. 'lifo' selects the most recently used free socket. At lower request rates, that often reduces the chance of picking a socket that has been idle long enough for the peer to close it. 'fifo' selects the least recently used free socket. At high request rates, that can spread work across the free pool. Current Node defaults to 'lifo'.

Node v24.7.0 or newer also has agentKeepAliveTimeoutBuffer. It subtracts time from a server-provided Keep-Alive: timeout=... hint when deciding when a free socket should expire. The goal is narrow - retire a pooled socket a little before the server's advertised idle cutoff, so the next request is less likely to land on a socket the server is about to close.

These limits only govern sockets owned by one agent. If each module creates its own agent, each module creates its own pool and its own limits. A service client should usually own one agent per upstream policy and share it across calls. That gives you one place to cap active sockets, idle sockets, scheduling, and shutdown.

Put that policy near the client code -

js
export const apiAgent = new http.Agent({
  keepAlive: true,
  maxSockets: 100,
  maxFreeSockets: 20,
});

Callers import the client or the agent instead of creating new pools per request. One upstream gets one pool policy. One shutdown path can destroy the agent. One metrics path can record active, free, and pending counts.

Shared agents and isolated agents both have a place. A shared agent gives reuse and one socket budget. An isolated agent gives one caller independent pool behavior. Use isolation when the caller has separate timeouts, localAddress, proxy settings, or lifetime. Use sharing when the calls belong to the same upstream policy.

Reading Pool State

Pool state gives you a fast first read before packet captures and source inspection.

Start with active sockets. A high active count means requests have sockets and are using them. If latency is high while active count is high, the wait is probably inside the exchange - upstream response time, request upload, response body consumption, stream backpressure, or network transfer. Raising maxSockets can add parallelism, but it also sends more concurrent work to the same origin.

Pending requests mean the agent has more work than its socket budget can run right now -

text
active=maxSockets
pending grows
free stays low
caller latency includes pool wait

That points to pool admission. The request may still be waiting before DNS or TCP connect. A caller with a short deadline can time out while waiting inside the agent queue. If the upstream can handle more concurrency, raise the socket limit. If the upstream is already under pressure, keep the limit and make callers back off earlier.

High free count means the process is retaining idle connection state -

text
active low
pending zero
free high
descriptors retained for reuse

That can be good for frequently used clients. It can waste descriptors for clients that call rarely. If the free count stays high and the next burst sees many ECONNRESET errors with reusedSocket: true, idle sockets are probably being closed by the peer while they wait in the pool. Lower maxFreeSockets, shorten the client socket timeout, or align the server's keep-alive timeout with the client's pool policy.

Zero free sockets with repeated traffic usually means one of these -

text
keepAlive disabled
server sends Connection: close
responses are still active
free pool limit is zero or very small
socket errors remove reuse candidates

The fix depends on which line is true. request.reusedSocket tells you when Node marked a request as reused. Connection counts and pool state reveal direct handoff cases. Response headers tell you whether the server asked for close. Active counts tell you whether requests are finishing. Error logs tell you whether sockets close before they can become free.
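The first-read logic from this section can be condensed into one helper. This is a rough sketch: `readPool` and its wording are hypothetical, the `count` function repeats the one shown earlier, and the result is a hint about where to look next, not a diagnosis.

```javascript
// Sum the per-key arrays in agent.sockets, agent.freeSockets, or agent.requests.
function count(map) {
  return Object.values(map).reduce((n, xs) => n + xs.length, 0);
}

function readPool(agent) {
  const active = count(agent.sockets);
  const free = count(agent.freeSockets);
  const pending = count(agent.requests);

  let reading;
  if (pending > 0) {
    reading = 'pool admission: callers are waiting for a socket';
  } else if (active > 0) {
    reading = 'in exchange: requests hold sockets, check upstream and body handling';
  } else if (free > 0) {
    reading = 'idle retention: descriptors are kept for later reuse';
  } else {
    reading = 'empty: no sockets, check keepAlive, Connection: close, and errors';
  }
  return { active, free, pending, reading };
}
```

Calling `readPool(agent)` on a timer, or whenever a request fails, gives each log line a snapshot that already says which of the patterns above it resembles.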

One agent per request creates the most misleading version of the problem. Each request gets its own private pool, then that pool dies with the request. The code can contain keepAlive: true and still get almost no cross-request reuse because no later request uses the same agent. Keep the agent alive for at least as long as the client object that represents the upstream.

Stale Pooled Sockets

A stale pooled socket is a free socket the agent still has, even though the peer or network path has already closed, reset, or forgotten the connection.

The failure usually appears on reuse -

text
response finishes
agent stores socket as free
server idle timer closes socket
client assigns socket to next request
write or read reports ECONNRESET

The client sees the error after choosing a reused socket whose underlying connection state changed while it was idle. The server may have closed with FIN. A network device may have removed the flow. The peer process may have restarted and lost its old connection state.

Chapter 9 covered the TCP side. The HTTP agent adds the pool state that makes the error show up on the next request.

The hardest timing happens when a close or reset reaches the client while the agent is moving the socket out of the free list. The socket looked reusable when the agent selected it. The write starts. Then the pending close or reset becomes visible. JavaScript sees the error on the new request, even though the cause began while the socket was idle.

That timing explains why this can appear more often under low traffic. During steady traffic, sockets spend little time free. During sparse traffic, a socket can sit close to the server idle cutoff. The next request can arrive right as the server closes the connection.

The server can retire a socket for several valid HTTP reasons -

text
idle keep-alive timeout fired
maxRequestsPerSocket limit reached
response used Connection: close
parser rejected later bytes
application destroyed the socket

The client-side symptom may still be one ECONNRESET on a reused request. The server-side reason decides the fix. A count limit asks the client to respect Connection: close. An idle timeout asks for timer alignment. A parser rejection asks for malformed traffic analysis. An application destroy asks for route-level cleanup.

request.reusedSocket helps separate this from a first-use connection failure -

js
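// `url` stands for the upstream address used by this client.
const req = http.get(url, { agent }, res => {
  res.resume(); // drain the body so the socket can return to the pool
});

req.on('error', err => {
  // reusedSocket is true when the agent assigned a kept-alive socket
  console.error(req.reusedSocket, err.code);
});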

When the flag is true and the error is ECONNRESET, a stale pooled socket is plausible. Keep the word plausible. A reset can still happen for other reasons. The flag only says the agent used a persisted socket.

For a first-use request, reusedSocket is false. Errors there point more toward DNS, connect, routing, refusal, TLS (covered in later chapters), or an early server reset. For a reused request, the connection setup had already succeeded earlier. That narrows the search to idle closure, server retirement, network path expiry, or request timing on a reused descriptor.

A narrow diagnostic can use the flag -

js
req.on('error', err => {
  if (req.reusedSocket && err.code === 'ECONNRESET') {
    console.warn('stale pooled socket');
  }
});

This chapter stops at diagnosis. Retrying a request depends on method semantics, whether the request body was sent, whether the origin processed it, and what the application considers safe to repeat. Chapter 27 owns retry policy. Here, the useful signal is that the request reused a pooled socket and hit a reset.

Body uploads make retry decisions harder. A client can start writing a request body on a reused socket and then receive ECONNRESET. The local process may not know how many bytes reached the peer kernel, whether the peer application read any of them, or whether the operation committed side effects. request.reusedSocket only answers how the socket was selected.

Even for GET, retries need a budget. One retry can be reasonable in many internal clients when the error came from a stale pooled socket. An unlimited retry loop can make an outage worse. Leave the policy for the resilience chapter. The HTTP mechanism gives you the input signal.

You can reduce stale reuse in a few practical ways.

Keep client idle timeouts shorter than server idle timeouts when you control both sides. Use scheduling: 'lifo' for low-rate clients so recently used sockets are picked first. Keep free pools small when traffic is sparse. Destroy agents during shutdown or when an upstream is removed from configuration. Consume responses fully so sockets reach a known free or closed state.

A timeout buffer helps from the server side. server.keepAliveTimeoutBuffer keeps the internal socket timer slightly longer than the advertised keep-alive timeout. That gives clients a small grace period around the advertised cutoff. The agent-side buffer helps from the client side when a server provides a keep-alive timeout hint.

A client can also use a shorter socket timeout on the agent -

js
const agent = new http.Agent({
  keepAlive: true,
  timeout: 4_000,
});

The agent applies that timeout to each socket when the socket is created. Treat it as part of the pool policy and test it against the server's keep-alive timer. A client timeout shorter than the server's advertised idle period can retire sockets before the server does. A longer client timeout can leave stale sockets available if the server closes first.
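
The comparison is plain arithmetic, so it can live in a startup check. This is a sketch under stated assumptions - checkIdleAlignment is a hypothetical helper, and marginMs is an arbitrary allowance for network delay and timer drift, not a value from Node -

```javascript
// Compare the client's idle cutoff with the server's keep-alive cutoff.
// clientRetiresFirst is true when the client drops idle sockets at least
// marginMs before the server would close them.
function checkIdleAlignment({ clientIdleMs, serverKeepAliveMs, marginMs = 1000 }) {
  const headroomMs = serverKeepAliveMs - clientIdleMs;
  return {
    headroomMs,
    clientRetiresFirst: headroomMs >= marginMs,
  };
}

// Agent timeout of 4000ms against a server keepAliveTimeout of 5000ms.
console.log(checkIdleAlignment({ clientIdleMs: 4000, serverKeepAliveMs: 5000 }));
// Equal timers leave no headroom, so reuse can race the server close.
console.log(checkIdleAlignment({ clientIdleMs: 5000, serverKeepAliveMs: 5000 }));
```

The check only compares configured numbers. It cannot see proxies or load balancers between the two processes, which may enforce their own idle cutoffs.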

Logs should include reuse, error code, target key, and timing -

text
target=api.local:80 reused=true code=ECONNRESET
socketAgeMs=29844 idleMs=4998 method=GET

Those fields show whether failures cluster around a server idle cutoff, a client socket timeout, or a traffic pattern where pooled sockets sit idle between calls.
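
Core node:http does not report socket age or idle time, so the client has to record them. A minimal sketch, assuming the application calls the tracker itself - createSocketTimes, assigned, and released are hypothetical names, and socket age is measured from the first time the tracker sees the socket, which only approximates its creation time -

```javascript
// Track per-socket timing so error logs can include socketAgeMs and idleMs.
// `now` is injectable to keep the helper easy to test.
function createSocketTimes(now = Date.now) {
  const times = new WeakMap(); // socket -> { createdAt, freedAt }
  return {
    // Call when a request is assigned a socket (the request 'socket' event).
    assigned(socket) {
      const t = now();
      let entry = times.get(socket);
      if (!entry) {
        entry = { createdAt: t, freedAt: null };
        times.set(socket, entry);
      }
      const idleMs = entry.freedAt === null ? 0 : t - entry.freedAt;
      return { socketAgeMs: t - entry.createdAt, idleMs };
    },
    // Call when the response on that socket has been fully consumed.
    released(socket) {
      const entry = times.get(socket);
      if (entry) entry.freedAt = now();
    },
  };
}
```

Call assigned from the request's socket event and keep the returned numbers with the request. Call released from the response's end event. The error handler can then log those numbers next to req.reusedSocket and err.code.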

Counters help too -

text
http.client.reused_socket_errors{code="ECONNRESET"}
http.client.free_sockets{origin="api.local:80"}
http.client.pending_requests{origin="api.local:80"}

One reset can be packet timing. A cluster near the keep-alive cutoff is pool policy. A rising pending queue means the socket budget is too tight for the current workload. A large free pool with sparse traffic is a stale-socket candidate. Name the state instead of treating every client error as the same outage.

The easiest local reproduction is a small timing mismatch -

text
server keepAliveTimeout = 5000ms
client sends every 5000ms
agent keeps free socket
next request sometimes races server close

The race may appear only on some runs. That inconsistency is part of the behavior. Idle close and reuse happen in separate processes with timers and packet delivery between them.

Closing Idle and Active Server Connections

Server shutdown has two socket groups - idle keep-alive connections and active connections.

server.close() stops accepting new connections. Since Node 19, it also closes idle HTTP connections before returning. Active exchanges keep running until they finish or close. This matches the lifecycle from Chapter 9 - the listener and accepted sockets are managed separately.

An idle HTTP connection is between exchanges. It has no request currently being parsed and no response currently being written. An active HTTP connection is carrying an exchange. Headers may be arriving, a request body may be streaming, application code may be preparing a response, or response bytes may be flushing.

server.closeIdleConnections() closes HTTP connections that are connected to the server and idle between request-response exchanges -

js
server.close(() => {
  console.log('listener closed');
});

server.closeIdleConnections();

For Node 19 and newer, calling it with server.close() is usually redundant. It is still harmless and useful for libraries that support older Node versions. Call it after server.close() when using both. That order avoids a gap where new connections can arrive between cleanup calls.

The order is important because server.close() owns listener shutdown. If you call server.closeIdleConnections() first on a busy server, a new connection can arrive immediately after. Calling server.close() first starts listener shutdown, then idle cleanup handles the accepted connections that remain.

server.closeAllConnections() is stronger -

js
server.close(() => {
  console.log('listener closed');
});

server.closeAllConnections();

That closes established HTTP connections, including active ones that are sending a request or waiting for a response. It skips upgraded sockets such as WebSocket and HTTP/2 upgrade paths. Use it when local policy says the server is done with active HTTP exchanges. Any request in progress can fail because the server destroyed the connection under it.

The behavior is easy to remember -

text
idle keep-alive socket -> closeIdleConnections() destroys it
request in progress -> closeIdleConnections() leaves it alone
request in progress -> closeAllConnections() destroys it

Upgraded sockets need separate tracking. Once an HTTP connection upgrades, Node hands the raw socket to application code. The HTTP server cleanup APIs skip that upgraded state. WebSocket chapters handle the long-lived protocol behavior. The local point is that HTTP cleanup APIs target HTTP connections.

Those APIs are connection cleanup tools. Production draining also needs readiness changes, deadlines, supervisor behavior, load balancer timing, and application shutdown rules. Later chapters own that choreography. The HTTP fact here is simple - idle sockets can be closed separately from active sockets, and active sockets can be cut when local policy demands it.
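
Sticking to the connection cleanup part only, the three calls can be combined into one helper. This is a minimal sketch, not a full drain procedure. drainHttpServer and graceMs are hypothetical names, and the grace period stands in for whatever local policy decides that active exchanges have had enough time -

```javascript
// Stop the listener, close idle keep-alive sockets, and cut active
// connections only if they are still open after graceMs.
// Requires a Node version that has closeIdleConnections() and
// closeAllConnections() on http.Server.
function drainHttpServer(server, graceMs = 10_000) {
  return new Promise((resolve, reject) => {
    // Last resort: destroy whatever is still connected after the grace period.
    const timer = setTimeout(() => {
      server.closeAllConnections();
    }, graceMs);

    // Listener shutdown first. The callback runs once all connections end.
    server.close(err => {
      clearTimeout(timer);
      if (err) reject(err);
      else resolve();
    });

    // Redundant on Node 19 and newer, useful on older versions.
    server.closeIdleConnections();
  });
}
```

The promise settles when the server reports that it has closed. Upgraded sockets are outside this helper, because the HTTP cleanup calls skip them.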

A good server also avoids invisible idle buildup during normal operation. Keep keepAliveTimeout finite. Set maxRequestsPerSocket when long-lived sockets need request count limits. Watch descriptor usage. Track connection counts separately from request counts.

A service can have low request throughput and high idle connection count if clients hold many persistent sockets. CPU can look fine while memory and descriptors grow. Request metrics alone miss that retained connection state. The fix might be lower idle timeouts, lower client free-socket counts, fewer client processes, or a different client pooling policy.

Server-side counters should separate accepted connections from requests -

text
requests per second
current HTTP connections
idle HTTP connections
active HTTP exchanges
closed by keepAliveTimeout

Those names match the APIs. server.closeIdleConnections() targets idle HTTP connections. server.closeAllConnections() targets established HTTP connections, whether active or idle. Connection state tells you what those calls will affect.
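
Most of those counters can be derived from two server events. The sketch below is an approximation under stated assumptions - trackHttpServer is a hypothetical helper, idle connections are estimated as open connections minus active exchanges, and pipelined requests can make that estimate slightly off. It does not cover the closed-by-keepAliveTimeout counter -

```javascript
// Count requests and connections separately on an http.Server.
function trackHttpServer(server) {
  const stats = { requests: 0, connections: 0, activeExchanges: 0 };

  // One event per accepted TCP connection.
  server.on('connection', socket => {
    stats.connections += 1;
    socket.once('close', () => { stats.connections -= 1; });
  });

  // One event per HTTP request, however many share a connection.
  server.on('request', (req, res) => {
    stats.requests += 1;
    stats.activeExchanges += 1;
    res.once('close', () => { stats.activeExchanges -= 1; });
  });

  return {
    read() {
      const idleConnections = Math.max(0, stats.connections - stats.activeExchanges);
      return { ...stats, idleConnections };
    },
  };
}
```

A reading with a high connections value and a low requests rate is the retained idle state described above. Request metrics alone would not show it.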

HTTP/1.1 Reuse Still Serializes Work

HTTP/1.1 keep-alive reuses a connection. Core node:http still runs ordinary HTTP/1.1 request-response sequencing on that connection.

HTTP/1.1 head-of-line blocking happens when later exchanges on the same connection wait behind an earlier exchange on that connection. Responses on a single HTTP/1.1 connection have to come back in request order. If one request takes a long time, the next exchange on that same connection waits.

One socket gives this timeline -

text
/slow starts
/fast waits for the socket
/slow response completes
/fast request starts

Two sockets allow this timeline -

text
socket #1 handles /slow
socket #2 handles /fast
both responses progress independently

The agent controls which timeline is possible through socket limits. With maxSockets: 1, the second request waits in the pending queue. With a higher limit, the agent can open another socket and run both exchanges at once.

Core HTTP gets concurrency by using more sockets, up to the configured limits -

text
maxSockets: 1  -> one active exchange per origin
maxSockets: 10 -> up to ten active sockets per origin
extra requests -> pending queue

More sockets give more HTTP/1.1 concurrency. They also consume more descriptors, memory, kernel state, and upstream capacity. Reuse reduces repeated connection setup. Socket count controls parallel work. Treat them as separate knobs.

The common mistake is treating keep-alive as a concurrency feature. Keep-alive is reuse. A single persistent HTTP/1.1 connection can be efficient for sequential work and still poor for parallel work. Parallel work needs multiple active sockets in core HTTP, or a client and protocol model that supports multiplexing.

Latency makes the ordering clear -

text
one socket: /slow 200ms, /fast 10ms -> about 210ms total
two sockets: /slow 200ms, /fast 10ms -> about 200ms total

The numbers are synthetic, but the ordering is real. On one ordinary HTTP/1.1 connection, /fast waits behind /slow when both exchanges use that connection. With two sockets, each socket has its own request-response sequence.
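
The same arithmetic can be written as a toy model. This is not how the agent schedules work internally. It is a sketch that assumes requests are admitted in order and each socket runs one exchange at a time, and simulateFinishTimes is a hypothetical name -

```javascript
// Toy model of HTTP/1.1 sequencing across a fixed number of sockets.
// Returns the finish time in ms for each request, in request order.
function simulateFinishTimes(durationsMs, maxSockets) {
  const socketFreeAt = new Array(maxSockets).fill(0);
  return durationsMs.map(duration => {
    // The next request takes whichever socket becomes free first.
    let idx = 0;
    for (let i = 1; i < socketFreeAt.length; i++) {
      if (socketFreeAt[i] < socketFreeAt[idx]) idx = i;
    }
    const finish = socketFreeAt[idx] + duration;
    socketFreeAt[idx] = finish;
    return finish;
  });
}

console.log(simulateFinishTimes([200, 10], 1)); // [ 200, 210 ]
console.log(simulateFinishTimes([200, 10], 2)); // [ 200, 10 ]
```

With one socket, /fast finishes at 210ms because it starts only after /slow completes. With two sockets, /fast finishes at 10ms and the total is bounded by /slow.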

HTTP pipelining can send multiple HTTP/1.1 requests before earlier responses finish, but core http.Agent pooling is still organized around assigning requests to sockets and freeing sockets after response completion. Undici owns the modern client model, dispatchers, pools, and pipelining behavior in the next subchapter. HTTP/2 multiplexing belongs to Chapter 11. Both change concurrency by changing the protocol machinery.

For core node:http, read pool behavior through three groups - active, free, and pending. If active sockets are full, new work waits. If free sockets are stale, reuse can fail. If too many sockets sit idle, descriptors stay open. Keep-alive is the policy that lets a socket stay useful for the next exchange after the current one completes.

That is enough to debug most keep-alive behavior. Count requests. Count TCP connections. Then count active, free, and pending sockets for the client or server side you own. The mismatch usually shows where the state is sitting.