Node.js HTTP Keep-Alive, Agents, and Connection Pools

Ishtmeet Singh @ishtms/June 10, 2026/39 min read

#nodejs#http#keep-alive#agents#connection-pooling

You might assume every HTTP request to a host opens its own TCP connection. It does not have to. Two requests to the same host can use one connection, one after the other. That reuse only works once the first request and its response have both finished cleanly. The client has to know the response is complete, and the server has to know the request is complete. The socket also has to stay open after that exchange, so either side can put another HTTP message on the same TCP connection.

Start with a tiny server.

import http from 'node:http';

const server = http.createServer((req, res) => {
  res.end(`ok ${req.url}\n`);
});

server.listen(3000);

This server sends a complete response for every request. Calling res.end() finishes the HTTP response from Node's side. After that, Node checks the HTTP connection state and decides whether the accepted socket can stay open for another request.

Add a TCP connection log.

server.on('connection', socket => {
  console.log('tcp', socket.remoteAddress, socket.remotePort);
});

That log runs when Node accepts a new TCP connection. If two sequential HTTP requests reuse one socket, the server prints one tcp line and sends two responses. If reuse does not happen, the server prints two tcp lines.

So you are tracking two separate numbers: request count and TCP connection count. They do not have to move together.

Now create a client-side agent.

import http from 'node:http';

const agent = new http.Agent({ keepAlive: true });

The agent manages reusable client sockets for core node:http. With keepAlive: true, a socket can return to the agent after a response finishes. The agent can then assign that same socket to another compatible request.

Here is a helper that makes one request and reads the response body.

function once(path) {
  return new Promise((resolve, reject) => {
    const req = http.get({ port: 3000, path, agent }, res => {
      console.log(path, req.reusedSocket);
      res.resume();
      res.on('end', resolve);
    });
    // Reject on connection failures so the promise always settles.
    req.on('error', reject);
  });
}

That res.resume() call drains the response body. Without a drained body, Node cannot finish the response stream, so the agent cannot safely return the socket to the free pool.

Run two requests one after another.

await once('/a');
await once('/b');
agent.destroy();

The first request usually logs false. The second usually logs true, assuming the server kept the connection open and the first response body reached 'end'.

Several things happened to produce that second true. The first request made the agent create a socket. The response finished. The agent detached that socket from the completed request and kept it as a free socket. The second request targeted the same origin, so the agent found a compatible free socket, attached it to the new request, and marked it active again.

All of that is connection pooling in core node:http, with no framework involved. The agent keeps sockets, groups them by origin, and moves them between active and free states.

HTTP Keep-Alive Is Protocol Reuse

HTTP keep-alive means one TCP connection can carry more than one HTTP request-response exchange.

A persistent connection is the TCP connection that stays open after an HTTP exchange finishes. Connection reuse is what happens when a later HTTP exchange uses that same connection.

Chapter 9 covered TCP keep-alive. Keep that separate from HTTP keep-alive.

TCP keep-alive is an operating-system feature that probes an idle TCP connection. HTTP keep-alive is reuse at the HTTP level after a complete request and response. The two are independent. A socket can have TCP probes enabled and still close after a single HTTP response, and it can carry many HTTP exchanges with no TCP probes at all, as long as the HTTP layer keeps it reusable.

Client requests expose that lower TCP probe switch too.

const req = http.get(url, { agent }, res => {
  res.resume();
});

req.setSocketKeepAlive(true, 30_000);

That call configures TCP keep-alive probing on the underlying socket when available. The agent still handles HTTP reuse.

For HTTP/1.1, persistent connections are the normal case. A peer can still retire a connection by sending Connection: close, using response framing that ends only when the socket closes, hitting a server limit, timing out, or running into an error.

The Connection header is the HTTP signal. In HTTP/1.1, the connection can stay open across messages unless the framing or headers say otherwise. Connection: close says the current exchange is the final exchange on that connection. The peer can finish the current message and then close.

Core http.Agent also writes the connection intent on client requests. Node's docs describe the constructor behavior this way: an agent sends Connection: keep-alive unless the caller supplied a Connection header, or unless the agent is using the no-keep-alive default configuration with unlimited sockets, in which case Connection: close is used. In normal application code, set the agent options and let the agent manage the header.

A reusable HTTP/1.1 connection needs this sequence.

request headers and body complete
response headers and body complete
neither side asked for close
socket remains open
next HTTP exchange starts
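The "neither side asked for close" condition can be checked with a small experiment. The sketch below assumes a throwaway server on an ephemeral local port and a hypothetical measureReuse helper. It runs the same two sequential requests twice, once against a server that leaves the connection alone and once against a server that sets Connection: close, and compares request count with TCP connection count.

```javascript
import http from 'node:http';

// Run two sequential requests against a throwaway local server and report
// how many TCP connections the server accepted. Hypothetical helper.
async function measureReuse({ sendClose }) {
  let tcpConnections = 0;
  const server = http.createServer((req, res) => {
    // Demo assumption: the server opts out of reuse with a response header.
    if (sendClose) res.setHeader('Connection', 'close');
    res.end('ok\n');
  });
  server.on('connection', () => { tcpConnections += 1; });
  await new Promise(resolve => server.listen(0, '127.0.0.1', resolve));

  const agent = new http.Agent({ keepAlive: true });
  const { port } = server.address();
  const reused = [];

  const once = path => new Promise((resolve, reject) => {
    const req = http.get({ host: '127.0.0.1', port, path, agent }, res => {
      reused.push(req.reusedSocket);
      res.resume();                 // drain so the exchange can finish
      res.on('end', resolve);
    });
    req.on('error', reject);
  });

  try {
    await once('/a');
    await once('/b');
  } finally {
    agent.destroy();
    server.closeAllConnections();
    await new Promise(resolve => server.close(resolve));
  }
  return { requests: reused.length, tcpConnections, reused };
}

const keepAliveRun = await measureReuse({ sendClose: false });
const closeRun = await measureReuse({ sendClose: true });
console.log({ keepAliveRun, closeRun });
```

Both runs send two requests. The run without the header should need one TCP connection, and the run with Connection: close should need two.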

Keep-alive saves connection setup work. The process does not need to create a fresh socket, complete another TCP handshake, allocate another descriptor, and rebuild connection-level kernel state for every request to the same origin.

That saved work has a cost. Idle sockets are still open sockets. They consume descriptors and memory while they wait in the pool.

At low request rates, keep-alive can cut latency for calls spaced a few seconds apart, though the peer may close the socket while it sits idle. At higher rates it saves repeated handshakes and keeps pooled sockets in regular use, so they rarely sit idle for long, but the process may then hold many active and idle descriptors at once. During a burst the agent may open several sockets, finish the work, and keep some free until limits or timeouts shrink the pool.

Keep-alive, then, is a retention decision. The HTTP layer holds onto lower connection state because the finished message and headers allow another exchange. TCP keep-alive probes can run underneath. Either way, HTTP reuse begins only after a complete HTTP exchange.

Reuse Needs Complete Message Ends

Node reuses a connection only after the HTTP parser confirms the current message is complete.

When a response carries Content-Length, the parser counts bytes. With chunked transfer coding, it reads chunks until the terminating chunk and trailers. And when the body ends only because the connection closes, the client finds out the response is done at the same moment the reusable socket is gone.

HTTP/1.1 messages sit on a byte stream. The socket gives Node ordered bytes. The HTTP parser decides where the current message ends and where the next one can begin. If even one byte from the previous body is still unread, the next byte is ambiguous. It could be leftover body data, or the first byte of the next request. The parser can only decide from the framing rules and its current parsing state.

For a client response, the reuse path looks like this.

response headers parsed
body bytes consumed
message reaches end
agent receives socket as a reuse candidate
socket returns to pool or closes

For an inbound server request, it looks like this.

request headers parsed
request body consumed or absent
response finishes
socket waits for next request bytes
idle keep-alive timer starts

message.complete helps during failures. It tells you whether Node received a complete HTTP message before the connection ended. A response can emit 'close' after a complete message, or because the connection ended early. The completion flag is how you tell those two apart.
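A minimal sketch of that check follows. It assumes a hypothetical demo server with two routes: /full sends a whole body, while /cut promises ten bytes, sends three, and drops the connection. The completeness helper is also an invented name. It resolves with res.complete as seen when the response emits 'close'.

```javascript
import http from 'node:http';

// Hypothetical demo server: '/cut' truncates its own response on purpose.
const server = http.createServer((req, res) => {
  if (req.url === '/cut') {
    res.writeHead(200, { 'Content-Length': 10 });
    res.write('abc', () => res.socket.destroy());
    return;
  }
  res.end('complete\n');
});
await new Promise(resolve => server.listen(0, '127.0.0.1', resolve));
const { port } = server.address();

// Resolve with res.complete as observed when the response emits 'close'.
function completeness(path) {
  return new Promise((resolve, reject) => {
    let sawResponse = false;
    const req = http.get({ host: '127.0.0.1', port, path, agent: false }, res => {
      sawResponse = true;
      res.on('error', () => {});    // a truncated body surfaces as an error here
      res.on('close', () => resolve(res.complete));
      res.resume();
    });
    // Before any response arrives, a failure is a plain request error.
    req.on('error', err => { if (!sawResponse) reject(err); });
  });
}

const full = await completeness('/full');
const cut = await completeness('/cut');
console.log({ full, cut });

server.closeAllConnections();
server.close();
```

Both responses emit 'close'. Only the flag separates the finished message from the truncated one.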

Connection: close is explicit. The peer is saying the socket should close after the current exchange. Node treats that as retirement, even if the current message arrived successfully.

On the client side, your code also has to consume the response body. The core docs call this out for http.ClientRequest: when a response handler exists, the response data must be read through read(), a 'data' handler, or .resume(). The 'end' event waits for body consumption. Until the body reaches its end path, the agent is still waiting for the response to finish.

Small clients often miss this in error branches.

http.get(url, { agent }, res => {
  if (res.statusCode >= 400) {
    res.resume();
    return;
  }

  res.pipe(destination);
});

The rejected response still has a body. When you drain it, Node can finish the response lifecycle and then decide whether the socket returns to the pool. An unread body stays attached to the socket and delays reuse, and if the server closes before that body finishes, the agent drops the socket.

⚠️

Warning

The socket only returns to the pool after you drain the response body. If you return early on a 4xx, or read one header and skip the rest, the response stays attached to that socket and reuse stalls. Call res.resume() or read to 'end' on every path, including error branches, or destroy the request when you do not want the body.

The same issue appears when the caller only wanted headers. A health-check client may inspect statusCode and return early. A metadata client may read one header and ignore a JSON body. Both still received a response stream. If the application wants the socket to remain reusable, it must either consume the body or intentionally destroy the request path.

Destroying the request is a fine choice. It closes the socket and frees the agent from waiting on unread bytes, and the next request pays for a new connection. Draining keeps reuse available but can spend time reading data the caller no longer needs. Which one to use comes down to body size, where the data came from, and your latency budget.

Servers have the same problem with request bodies. A handler may reject a request before reading its body, but the connection still contains those body bytes. The parser cannot treat the next byte as a new request line while unread body bytes from the previous request are still ahead of it.

For a route that rejects a body but still wants the connection reusable, drain or destroy on purpose.

if (req.headers['content-type'] !== 'application/json') {
  req.resume();
  res.writeHead(415);
  res.end();
  return;
}

That code drains the incoming request stream while sending a short response. For small unwanted bodies that is fine. Large or abusive bodies are different, because draining wastes bandwidth and memory on input the server already rejected. A bounded body parser gives you a better route policy. It reads up to a limit, then closes or destroys according to what that route promises.

🚨

Caution

Draining an unwanted request body keeps the connection reusable. But calling req.resume() on a route that already rejected large or hostile input means reading every byte the server just refused, with the sender controlling how many bytes arrive. For untrusted bodies, cap the read with a limit and destroy the socket once it crosses that limit, rather than draining with no ceiling.
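One way to put a ceiling on a drain is sketched below. drainWithLimit is a hypothetical helper, not a Node API, and the limit of 100 bytes is only a placeholder. It works on any readable stream, so the usage lines feed it in-memory streams rather than real requests. On a server you would pass req.

```javascript
import { Readable } from 'node:stream';

// Drain a readable (such as an incoming request) but give up once more than
// `limit` bytes have arrived. Hypothetical helper for illustration.
function drainWithLimit(stream, limit) {
  return new Promise(resolve => {
    let seen = 0;
    let gaveUp = false;
    stream.on('data', chunk => {
      if (gaveUp) return;           // ignore anything that was already buffered
      seen += chunk.length;
      if (seen > limit) {
        gaveUp = true;
        stream.destroy();           // stop reading instead of draining further
        resolve({ outcome: 'destroyed', bytes: seen });
      }
    });
    // Only the first resolve() call wins, so later events are harmless.
    stream.on('end', () => resolve({ outcome: 'drained', bytes: seen }));
    stream.on('error', () => resolve({ outcome: 'failed', bytes: seen }));
    stream.on('close', () => resolve({ outcome: 'closed', bytes: seen }));
  });
}

const small = await drainWithLimit(
  Readable.from([Buffer.alloc(4), Buffer.alloc(4)]), 100);
const large = await drainWithLimit(
  Readable.from([Buffer.alloc(64), Buffer.alloc(64), Buffer.alloc(64)]), 100);
console.log({ small, large });
```

The small stream drains to its end. The large one is destroyed as soon as the running total crosses the limit, so the sender stops controlling how much the server reads.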

Keep-alive depends on framing. The parser needs a known message end, the code has to consume the stream far enough to reach that end, and the socket has to stay open after both sides finish.

Some messages have no body at all. A response to a HEAD request, a 204, a 304, and any 1xx informational response end at the header section, whatever the headers say about length. So a body can end for four reasons: the method and status code, Content-Length, chunked framing, or connection close. Only the first three leave the socket available for reuse.

Server Timers After the Response

After a Node HTTP server finishes writing a response, it starts an idle timer.

const server = http.createServer({
  keepAliveTimeout: 10_000,
  keepAliveTimeoutBuffer: 1_000,
}, (req, res) => {
  req.resume();
  res.end('ok\n');
});

server.keepAliveTimeout is the idle wait after the last response has been written. In Node v24, the default is 5000 milliseconds.

server.keepAliveTimeoutBuffer is extra internal time added to the actual socket timeout. In current Node v24, the default is 1000 milliseconds. Node advertises one timeout and keeps the internal socket timer slightly longer. That extra time reduces resets when a client sends the next request near the advertised cutoff.

server.keepAliveTimeoutBuffer requires Node v24.6.0 or newer. Early v24 builds expose keepAliveTimeout only. Since this book targets current v24, the examples use the current option. Libraries that support older runtimes should check for support before relying on it.

The actual socket timeout is the sum of the two.

socket timeout = keepAliveTimeout + keepAliveTimeoutBuffer

The setting applies to new incoming connections. Existing sockets keep the timer setup they already received. Configure these values near server construction so the behavior is easy to read.
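For code that has to run on older runtimes too, a support check might look like the sketch below. It assumes the property shows up on the server object wherever the runtime supports it, and idleSocketTimeout is a hypothetical helper that treats a missing buffer as zero.

```javascript
import http from 'node:http';

// Sum the advertised keep-alive timeout and the buffer, treating a missing
// buffer (older runtimes) as zero. Hypothetical helper.
function idleSocketTimeout(server) {
  return server.keepAliveTimeout + (server.keepAliveTimeoutBuffer ?? 0);
}

const server = http.createServer((req, res) => res.end('ok\n'));
server.keepAliveTimeout = 10_000;

// Only set the buffer where the runtime already exposes the property.
if ('keepAliveTimeoutBuffer' in server) {
  server.keepAliveTimeoutBuffer = 1_000;
}

console.log(idleSocketTimeout(server));
```

The server is never started here, so the script only reports the configured sum and exits.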

Here is the sequence after res.end().

response is marked as ending
final headers and body bytes are queued
bytes are handed to the socket
response emits 'finish'
request side is complete
socket becomes eligible for idle keep-alive
idle timer starts

The response 'finish' event means Node handed the bytes to the operating system for transmission. It does not prove the client received them.

Once the socket is idle and reusable, Node arms the keep-alive timer. If another request byte arrives before the timer fires, the parser starts reading the next request. If the timer fires first, Node destroys the socket, and that close is normal server behavior.

Be careful with raw socket timeouts from the lower connection event.

server.on('connection', socket => {
  socket.setTimeout(60_000);
});

For HTTP servers, a timeout set there can later be replaced by the server's HTTP keep-alive timeout after the socket has served a request. The HTTP server knows when a socket has moved from active request handling to idle reuse. Prefer HTTP server options for HTTP timing, and use the raw socket hook only for raw socket policy.

server.requestTimeout and server.headersTimeout protect earlier periods. headersTimeout limits how long the parser waits for complete request headers. requestTimeout limits how long the server waits to receive the full request. keepAliveTimeout starts after the server has finished the last response and is waiting for another request on a reusable connection.

server.timeout is another timer from the older net.Server layer. It is a socket inactivity timeout, and its default is 0. For HTTP request protection, headersTimeout, requestTimeout, and keepAliveTimeout usually describe intent better. A generic socket inactivity timer can still be useful in custom server setups, but mixing all four without naming the period each one covers makes debugging harder.

The symptoms differ.

headers too slow -> 408 and close
whole request too slow -> 408 and close
idle after response -> socket destroyed

An idle keep-alive close is normal. A client that sends a request just as the server closes the idle socket can see ECONNRESET or a failed write. The server treated the reusable period as expired while the client still treated the socket as usable. keepAliveTimeoutBuffer loosens that timing on the server side, but races across processes and networks can still happen.

The server also has a keepAlive creation option.

const server = http.createServer({
  keepAlive: true,
  keepAliveInitialDelay: 30_000,
}, handler);

That option enables TCP keep-alive probing on accepted sockets. It works at the TCP layer. HTTP reuse still depends on complete messages, headers, server limits, and HTTP idle timers.

server.maxRequestsPerSocket retires sockets by request count.

server.maxRequestsPerSocket = 1_000;

A value of 0 means no request count limit. When the limit is reached, Node sets Connection: close on the response. If a client sends another request after the limit, Node emits 'dropRequest' and sends 503 Service Unavailable.

A busy server can use that to cycle long-lived HTTP/1.1 sockets, rather than letting one descriptor carry unlimited exchanges.

Count limits are useful when per-socket state can build up, when downstream policy wants periodic reconnection, or when you want a bounded lifetime for long-lived connections. A low limit also raises connection setup work, so choose the value deliberately.

Clients can see this policy. A well-behaved client reads Connection: close and retires the socket after the response. A client that sends another request anyway receives the 503 drop path. That response comes from server connection policy rather than ordinary application routing.
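You can watch the count limit from the client side with a sketch like this one. It assumes a throwaway local server with a deliberately tiny limit of two requests per socket, and connectionHeader is a hypothetical helper that resolves with the Connection header of each response.

```javascript
import http from 'node:http';

let tcpConnections = 0;
const server = http.createServer((req, res) => res.end('ok\n'));
server.maxRequestsPerSocket = 2;    // retire each socket after two requests
server.on('connection', () => { tcpConnections += 1; });
await new Promise(resolve => server.listen(0, '127.0.0.1', resolve));

const agent = new http.Agent({ keepAlive: true });
const { port } = server.address();

// Resolve with the Connection header the server put on the response.
const connectionHeader = path => new Promise((resolve, reject) => {
  const req = http.get({ host: '127.0.0.1', port, path, agent }, res => {
    res.resume();
    res.on('end', () => resolve(res.headers.connection));
  });
  req.on('error', reject);
});

const seen = [];
for (const path of ['/1', '/2', '/3']) {
  seen.push(await connectionHeader(path));
}
console.log({ seen, tcpConnections });

agent.destroy();
server.closeAllConnections();
server.close();
```

On a runtime that follows the documented behavior, the second response carries Connection: close, the agent retires that socket, and the third request opens a second TCP connection.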

Good logs name the reason for retirement.

socket idle close - keepAliveTimeout
socket count close - maxRequestsPerSocket
socket request close - requestTimeout

Without that detail, every closed connection becomes a vague "client disconnected" message. Keep-alive bugs are much easier to debug when the log says which timer or limit caused the close.

How the Agent Reuses Sockets

On the client side, the http.Agent handles reuse. For every request it takes one of three paths. It opens a new socket, hands over a free socket from the pool, or holds the request in a pending queue until a socket frees up.

http.globalAgent is the default agent used by http.request() and http.get() when the call omits agent. Since Node 19, the global agent has HTTP keep-alive enabled and a 5 second socket timeout.

A custom new http.Agent() starts from constructor defaults, so pass keepAlive: true when you want the custom agent to retain free sockets.

That global default can surprise older codebases. Before Node 19, many developers treated the global agent as mostly non-persistent unless they opted into keep-alive. Current Node defaults are better for repeated outbound calls, but they also mean a process can hold idle outbound sockets through the global agent. Explicit agents make the pooling policy easier to review.

Most service clients should make the agent explicit.

const agent = new http.Agent({
  keepAlive: true,
  maxSockets: 50,
  maxFreeSockets: 10,
  scheduling: 'lifo',
});

That creates a pool policy for outbound requests made through this agent. maxSockets limits active sockets per origin. maxFreeSockets limits idle retained sockets per origin. scheduling controls which free socket is selected when several are available. Current Node defaults to 'lifo', which selects the most recently used free socket.

Attach the agent to a request.

const req = http.get('http://api.local/users', { agent }, res => {
  console.log({ reused: req.reusedSocket });
  res.resume();
});

request.reusedSocket is a client-side boolean. It is true when the request used a socket that persisted through keep-alive. Treat it as a diagnostic flag for the agent path, and read it carefully. Plenty of reused sockets succeed and plenty of fresh sockets fail. The flag tells you how the agent assigned the socket, and nothing about whether the request will succeed.

The outbound path through core HTTP is short, but it carries several states.

http.request()
normalize URL and options
compute agent key
find free socket, open socket, or queue request
attach socket to ClientRequest
write request bytes
read IncomingMessage response
free or close socket

Each step can add latency or change how the request fails. DNS work can happen before a new socket connects, and pool waiting can happen before a socket even exists. Writing the request can fail if the socket closes, and reading the response keeps the socket active until the body ends.

A deadline around the whole operation includes all of that. A socket timeout only starts after a socket exists.

The agent groups sockets by agent key. The agent key is derived from connection options that decide socket compatibility. For core HTTP, agent.getName() uses host, port, local address, and address family. HTTPS adds TLS options through its own agent path because certificate and cipher choices affect reuse.

For plain HTTP, the key is built from inputs like these.

host:port:localAddress
host:port:localAddress:family

The exact string is an internal detail exposed through agent.getName(), but the inputs are important. localAddress binds outbound connections to a specific local interface. family separates IPv4 and IPv6. Two requests that look similar at the URL level can use separate pools if these connection options vary.

Different hostnames can resolve to the same IP address and still use separate pools. The agent groups by request options rather than by any later guess about service identity. A request to api.local:80 and a request to 10.0.0.7:80 can hit the same machine and still live in separate agent entries.
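A quick way to see the grouping is to ask the agent for its keys directly. In this sketch the hosts are the ones used above, and the local address 10.0.0.5 is a made-up value for illustration.

```javascript
import http from 'node:http';

const agent = new http.Agent({ keepAlive: true });

// Possibly the same machine, but different request options give different keys.
const byName = agent.getName({ host: 'api.local', port: 80 });
const byAddress = agent.getName({ host: '10.0.0.7', port: 80 });

// localAddress and family also feed into the key.
const bound = agent.getName({ host: 'api.local', port: 80, localAddress: '10.0.0.5' });
const ipv6 = agent.getName({ host: 'api.local', port: 80, family: 6 });

console.log({ byName, byAddress, bound, ipv6 });
agent.destroy();
```

Four option sets produce four distinct keys, so requests made with them would land in four separate pools.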

DNS changes follow the same idea. An agent key can stay the same while new sockets later resolve the hostname to another address. Existing sockets keep talking to the address they already connected to. New sockets use DNS at connection time.

The agent: false option creates an isolated one-use agent for that request.

http.get({
  host: 'localhost',
  port: 3000,
  agent: false,
}, res => res.resume());

That is useful for one-off calls that should avoid shared pool state. It also disables reuse for that request, so the call pays for a new connection.

Long-lived requests have another option. A socket can emit 'agentRemove' to leave the agent. That is useful when a request turns into behavior that should no longer occupy the shared pool policy. Most application code should use higher-level client behavior, but the event explains why a socket can disappear from an agent without a normal close.

Agents need shutdown too. agent.destroy() destroys sockets currently held by the agent. A short-lived script can call it after work completes. A service usually keeps the agent for the process lifetime and destroys it during shutdown or when removing an upstream client. Letting idle sockets wait for the peer to close them works, but those descriptors stay open until that happens.

Inside the Agent Pool

The agent tracks three groups per key: active sockets, free sockets, and pending requests.

agent key: api.local:80:

active sockets: [#1 handling /users] [#2 handling /teams]
free sockets:   [#3 idle, reusable]
pending queue:  [/events] [/stats]

Figure: state diagram of an Agent-owned socket moving between active, free, and destroyed, including the direct handoff path. When a response finishes and a pending request is waiting, the socket is handed straight to it and never enters the free pool. Without keep-alive, or when the free pool is full, it is destroyed instead.

An active socket is attached to an in-flight ClientRequest. The request has a socket and is using it to write the request and read the response. The socket stays active until the response finishes, the request fails, or the connection closes.

A free socket is a connected socket retained by a keep-alive agent after a response finishes. It has no active request attached. The agent can attach it to a later compatible request.

A pending request is waiting because no socket is available under the current pool limits. When a socket becomes available, the agent assigns it to one of the queued requests.

The public properties mirror those groups.

agent.sockets     -> active sockets by key
agent.freeSockets -> free sockets by key
agent.requests    -> pending requests by key

Treat those objects as diagnostic views. Do not mutate them from application code. Configure the agent through options, then observe request events, socket events, and errors.

You can count them while debugging.

function count(map) {
  return Object.values(map).reduce((n, xs) => n + xs.length, 0);
}

console.log({
  active: count(agent.sockets),
  free: count(agent.freeSockets),
  pending: count(agent.requests),
});

Those counts are snapshots. Under load they can change before the next log line. Each one still points to a separate situation. A high active count means the socket budget is in use, a high pending count means callers are waiting for sockets, and a high free count means descriptors are being held for later reuse.

Here is the normal request path.

A request enters http.request(). Node normalizes the URL and options. The agent computes the key. If a free socket exists for that key, the agent can reuse it. The socket moves out of freeSockets, attaches to the request, and becomes active under sockets. Node calls the agent's reuseSocket() hook. The default hook refs the socket again so active work can keep the process alive.

If no free socket exists, the agent checks limits. maxSockets applies per key. maxTotalSockets applies across all keys in that agent. If creating a new socket stays within the limits, the agent creates one through createConnection(), which follows the net.createConnection() path for core HTTP.

If the limits are full, the request goes into agent.requests for that key. To the caller the request has started, but on the network it may still be waiting for a socket. A user-level deadline with AbortSignal can cancel it while it is still pending inside the agent. A socket timeout cannot fire before a socket exists.

For a busy origin, a growing pending queue is an early sign that the client has more outbound work than its socket budget can run at once. Raising maxSockets tends to reduce local waiting while adding pressure on the upstream. A lower limit does the reverse, protecting the upstream but making callers wait longer. Where to set it is a client policy choice.

When a response finishes, the agent decides whether the socket becomes free or retires. Several conditions force retirement.

server sent Connection: close
response ended by closing the socket
socket errored
socket timed out
free socket pool is full
agent keepSocketAlive() returned false

The default keepSocketAlive() behavior enables TCP keep-alive on the socket, unreferences it, and returns true. Unref lets the process exit if the only remaining work is an idle pooled socket.

That ref and unref behavior affects CLIs and background tools. An active request refs its socket. A free pooled socket is unrefed by default, and when the agent reuses it, the default reuseSocket() hook refs it again. So an idle pool lets the process exit, while an in-flight request holds it open.

If a pending request exists for the same key, a just-finished socket can go straight into the next request instead of sitting in the free pool.

That is why freeSockets can stay small even while keep-alive is working.

request A finishes
pending request B exists
same key can reuse the socket
socket moves directly to B
freeSockets never grows

During a burst, you may see high active count and high pending count with a small free count. Keep-alive may still be active. The pool is busy enough that finished sockets are immediately reused by queued work.

request.reusedSocket is most reliable when a request receives a socket from the free pool. A queued request can still receive the same TCP connection through a direct handoff from the previous request while the flag remains false.

ℹ️

Note

request.reusedSocket reads true only when the request pulled a socket from the free pool. A socket handed straight from a finishing request to a queued one reuses the same TCP connection while the flag stays false. To confirm real reuse, compare request count against TCP connection count rather than trusting the flag alone.
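The direct handoff can be observed with a pool of one. This sketch assumes a throwaway local server and starts three requests at once through an agent with maxSockets: 1. It snapshots the pool before anything finishes, then compares the request count with the TCP connection count.

```javascript
import http from 'node:http';

let tcpConnections = 0;
const server = http.createServer((req, res) => res.end('ok\n'));
server.on('connection', () => { tcpConnections += 1; });
await new Promise(resolve => server.listen(0, '127.0.0.1', resolve));

// One socket per key, so extra requests must wait in the pending queue.
const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });
const { port } = server.address();
const count = map => Object.values(map).reduce((n, xs) => n + xs.length, 0);

// Resolve with the request's reusedSocket flag once the body has ended.
const once = path => new Promise((resolve, reject) => {
  const req = http.get({ host: '127.0.0.1', port, path, agent }, res => {
    res.resume();
    res.on('end', () => resolve(req.reusedSocket));
  });
  req.on('error', reject);
});

// Start three requests at once, then snapshot the pool before any finishes.
const inFlight = [once('/a'), once('/b'), once('/c')];
const snapshot = {
  active: count(agent.sockets),
  free: count(agent.freeSockets),
  pending: count(agent.requests),
};
const reusedFlags = await Promise.all(inFlight);
console.log({ snapshot, reusedFlags, tcpConnections });

agent.destroy();
server.closeAllConnections();
server.close();
```

Three requests travel over one TCP connection. The logged flags show how the agent labelled each request, which is why the connection count is the more dependable signal here.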

maxSockets and maxTotalSockets control concurrency. With maxSockets: 1, requests to the same key serialize through one active socket unless the code uses another agent. With maxSockets: 50, the agent can open up to fifty active sockets for that key. Extra requests wait. The default Infinity gives the agent permission to open as many concurrent sockets as the process and host can support, which can be too much during bursts.

⚠️

Warning

A custom agent sets maxSockets to Infinity by default. Under a burst it opens one socket per concurrent request to an origin, enough to run local descriptors dry and flood the upstream. Set maxSockets to a deliberate ceiling, and maxTotalSockets too when one process talks to many origins.

maxTotalSockets helps when one process talks to many origins. Per-origin limits alone can still produce a large total descriptor count. Fifty sockets to origin A, fifty to origin B, and fifty to origin C add up quickly. A total cap gives the agent a process-level ceiling across all its keys.

maxFreeSockets controls idle retention. A higher value keeps more ready sockets around, a lower one closes more of them after responses finish. The default is 256 per host when keep-alive is enabled, which is generous for many apps. A process that talks to many origins can still pile up idle descriptors when each origin keeps its own free pool.

Free sockets also expire. A socket can emit 'timeout', and the agent removes it from freeSockets. The server can close it, which emits 'close' and removes it. The free pool is always temporary. Reading agent.freeSockets twice during traffic can produce two valid but different answers.

scheduling controls which free socket gets picked. 'lifo' selects the most recently used free socket. At lower request rates, that often reduces the chance of picking a socket that has been idle long enough for the peer to close it. 'fifo' selects the least recently used free socket. At high request rates, that can spread work across the free pool. Current Node defaults to 'lifo'.

Node v24.7.0 or newer also has agentKeepAliveTimeoutBuffer. It subtracts time from a server-provided Keep-Alive: timeout=... hint when deciding when a free socket should expire. Its job is small. It retires a pooled socket a little before the server's advertised idle cutoff, so the next request is less likely to land on a socket the server is about to close.
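The arithmetic behind that option is small enough to sketch. The function below is an illustration only, not Node's internal code, and freeSocketLifetimeMs and its parameters are invented names. It reads the timeout value, which the Keep-Alive header expresses in seconds, and subtracts the buffer.

```javascript
// Illustrative arithmetic only; this is not Node's internal implementation.
// `freeSocketLifetimeMs` and its parameters are hypothetical names.
function freeSocketLifetimeMs(keepAliveHeader, bufferMs) {
  const match = /timeout=(\d+)/.exec(keepAliveHeader ?? '');
  if (!match) return null;          // the server gave no hint
  const advertisedMs = Number(match[1]) * 1000;
  // Never go negative: a hint shorter than the buffer leaves no idle time.
  return Math.max(advertisedMs - bufferMs, 0);
}

console.log(freeSocketLifetimeMs('timeout=5, max=1000', 1_000));
console.log(freeSocketLifetimeMs('timeout=1', 1_000));
console.log(freeSocketLifetimeMs(undefined, 1_000));
```

With a five second hint and a one second buffer, the pooled socket would be retired after four seconds, one second ahead of the server's advertised cutoff.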

These limits only apply to sockets held by one agent. If each module creates its own agent, each module creates its own pool and its own limits. A service client should usually keep one agent per upstream policy and share it across calls. Then you have one place to cap active sockets, idle sockets, scheduling, and shutdown.

Put that policy near the client code.

export const apiAgent = new http.Agent({
  keepAlive: true,
  maxSockets: 100,
  maxFreeSockets: 20,
});

Callers import the client or the agent instead of building new pools per request. One upstream ends up with one pool policy, one place to destroy the agent at shutdown, and one place to record active, free, and pending counts.

Shared agents and isolated agents both have a place. A shared agent gives you reuse under one socket budget, while an isolated agent gives a single caller its own pool behavior. Reach for isolation when the caller needs separate timeouts, a different localAddress, its own proxy settings, or a different lifetime. Share when the calls belong to the same upstream policy.

Reading Pool State

Pool state is a quick first thing to check, before you reach for packet captures and source inspection.

Start with active sockets. A high active count means requests have sockets and are using them. If latency is high while active count is high, the wait is probably inside the exchange: upstream response time, request upload, response body consumption, stream backpressure, or network transfer. Raising maxSockets can add parallelism, but it also sends more concurrent work to the same origin.

Pending requests mean the agent has more work than its socket budget can run right now.

active=maxSockets
pending grows
free stays low
caller latency includes pool wait

That points to pool admission. The request may still be waiting before DNS or TCP connect. A caller with a short deadline can time out while waiting inside the agent queue. If the upstream can handle more concurrency, raise the socket limit. If the upstream is already under pressure, keep the limit and make callers back off earlier.

A high free count means the process is holding idle connection state.

active low
pending zero
free high
descriptors retained for reuse

A client that fires often gets real value from holding those connections ready. A client that calls rarely just leaves the same descriptors sitting idle. If the free count stays high and the next burst brings many ECONNRESET errors with reusedSocket: true, the idle sockets are probably expiring underneath the client: the server or the network path closed them while the agent still listed them as free. Lower maxFreeSockets, shorten the client socket timeout, or line up the server's keep-alive timeout with the client's pool policy.

Zero free sockets with repeated traffic usually points to one of these.

keepAlive disabled
server sends Connection: close
responses are still active
free pool limit is zero or very small
socket errors remove reuse candidates

Which fix you need depends on which line is true. request.reusedSocket shows whether Node gave the request a socket that had already carried an earlier exchange. Connection counts and pool state expose direct handoff cases. Response headers show whether the server asked for close, active counts show whether requests are finishing, and error logs show whether sockets close before they can become free.
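
The three patterns above can be turned into a rough triage helper. This is only a sketch: the function name, the label strings, the completed field (requests finished so far), and the comparison rules are all assumptions chosen for illustration, not thresholds that Node defines.

```javascript
// Hypothetical classifier for one pool snapshot. Labels and rules are illustrative.
function classifyPool({ active, free, pending, maxSockets, completed }) {
  // Every socket is busy and work is queued: callers are waiting for admission.
  if (pending > 0 && active >= maxSockets) return 'pool-admission';
  // Traffic has completed, yet nothing is in use and nothing was kept for reuse.
  if (completed > 0 && active === 0 && free === 0) return 'no-reuse';
  // Nothing is queued and idle sockets outnumber busy ones.
  if (pending === 0 && free > active) return 'idle-retention';
  return 'unremarkable';
}

console.log(classifyPool({ active: 10, free: 0, pending: 5, maxSockets: 10, completed: 100 }));
```

A label like 'no-reuse' only says where to look next. The list of causes above still has to be checked one line at a time.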

One agent per request creates the most misleading version of the problem. Each request gets its own private pool, then that pool dies with the request. The code can contain keepAlive: true and still get almost no cross-request reuse because no later request uses the same agent. Keep the agent alive for at least as long as the client object that represents the upstream.

⚠️

Warning

keepAlive: true reuses sockets only across requests that share one agent instance. If you build a new agent inside the request handler, every call gets a private pool that dies when the request ends, so reuse never happens even though the flag says it should. Create the agent once and share it for as long as the upstream client lives.
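
A small experiment makes the difference visible. The sketch below assumes a throwaway loopback server on an ephemeral port and two hypothetical helpers, get and countConnections. It sends two sequential requests, once through a shared agent and once through a fresh agent per request, and counts accepted TCP connections.

```javascript
import http from 'node:http';

// Make one GET request, drain the body, and resolve when the response ends.
function get(port, agent) {
  return new Promise((resolve, reject) => {
    const req = http.get({ host: '127.0.0.1', port, agent }, res => {
      res.resume();
      res.on('end', resolve);
    });
    req.on('error', reject);
  });
}

// Run two sequential requests and return how many TCP connections were accepted.
async function countConnections(makeAgent) {
  let connections = 0;
  const server = http.createServer((req, res) => res.end('ok\n'));
  server.on('connection', () => { connections += 1; });
  await new Promise(resolve => server.listen(0, '127.0.0.1', resolve));
  const { port } = server.address();

  const agents = [];
  for (let i = 0; i < 2; i += 1) {
    const agent = makeAgent(); // called once per request
    agents.push(agent);
    await get(port, agent);
  }

  for (const agent of new Set(agents)) agent.destroy();
  await new Promise(resolve => server.close(resolve));
  return connections;
}

const shared = new http.Agent({ keepAlive: true });
console.log('shared agent:', await countConnections(() => shared));
console.log('agent per request:', await countConnections(() => new http.Agent({ keepAlive: true })));
```

With the shared agent the second request should ride the first socket, so the server accepts one connection. With a new agent per request, each call opens its own connection even though keepAlive is set on both.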

Stale Pooled Sockets

A stale pooled socket is a free socket the agent still has, even after the peer or network path has already closed, reset, or dropped the connection.

The failure usually shows up on reuse.

response finishes
agent stores socket as free
server idle timer closes socket
client assigns socket to next request
write or read reports ECONNRESET

Figure: sequence of the stale pooled socket race ending in ECONNRESET on a reused socket. The server's keepAliveTimeout closes an idle socket that the agent still lists as free. The next request writes onto that dead socket and fails with ECONNRESET while reusedSocket is true. This is the normal idle-close race, not a bug.

⚠️

Warning

When request.reusedSocket is true and the code is ECONNRESET, the usual cause is a stale pooled socket. The server closed the idle connection while the agent still listed it as free. Keep the client idle timeout shorter than the server's keepAliveTimeout, prefer scheduling: 'lifo' for low-rate clients, and keep free pools small when traffic is sparse.

The client sees the error after choosing a reused socket whose lower connection state changed while it was idle. Maybe the server closed with FIN, maybe a network device removed the flow, or maybe the peer process restarted and lost its old connection state.

Chapter 9 covered the TCP side. The HTTP agent adds the pool state that makes the error show up on the next request.

The hardest timing happens when a close or reset reaches the client while the agent is moving the socket out of the free list. The socket looked reusable when the agent selected it. The write starts. Then the pending close or reset becomes visible. JavaScript sees the error on the new request, even though the cause began while the socket was idle.

That timing explains why this can appear more often under low traffic. During steady traffic, sockets spend little time free. During sparse traffic, a socket can sit close to the server idle cutoff. The next request can arrive right as the server closes the connection.

The server can retire a socket for several valid HTTP reasons.

idle keep-alive timeout fired
maxRequestsPerSocket limit reached
response used Connection: close
parser rejected later bytes
application destroyed the socket

The client-side symptom may still be a single ECONNRESET on a reused request, but the server-side reason decides the fix. If a count limit caused it, the client needs to respect Connection: close. An idle timeout calls for aligning the timers. A parser rejection means looking into malformed traffic. An application destroy points to route-level cleanup.

request.reusedSocket helps separate this from a first-use connection failure.

const req = http.get(url, { agent }, res => {
  res.resume();
});

req.on('error', err => {
  console.error(req.reusedSocket, err.code);
});

When the flag is true and the error is ECONNRESET, a stale pooled socket is a likely cause, though only a likely one. A reset can still happen for other reasons. The flag only says the agent used a persisted socket.

For a first-use request, reusedSocket is false. Errors there point more toward DNS, connect, route, refusal, TLS (covered in later chapters), or an early server reset. For a reused request, the connection setup had already succeeded earlier. That narrows the search to idle closure, server retirement, network path expiry, or request timing on a reused descriptor.

A narrow diagnostic can use the flag.

req.on('error', err => {
  if (req.reusedSocket && err.code === 'ECONNRESET') {
    console.warn('stale pooled socket');
  }
});
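
A slightly wider version of that diagnostic separates the first-use case as well. The label strings and both function names here are hypothetical. Only request.reusedSocket and err.code come from node:http.

```javascript
// Hypothetical triage labels built from the two signals discussed above.
function classifyRequestError(reusedSocket, code) {
  if (reusedSocket && code === 'ECONNRESET') return 'stale-pooled-socket-suspect';
  if (!reusedSocket) return 'first-use-failure';
  return 'reused-socket-other';
}

// Attach the classifier to any ClientRequest returned by http.request or http.get.
function logRequestErrors(req) {
  req.on('error', err => {
    console.warn(classifyRequestError(req.reusedSocket, err.code), err.code);
  });
}

console.log(classifyRequestError(true, 'ECONNRESET'));
```

The label is a hint for log grouping. It does not prove that the socket was stale, and it makes no decision about retrying.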

This chapter stops at diagnosis. Retrying a request depends on method semantics, whether the request body was sent, whether the origin processed it, and what the application considers safe to repeat. Chapter 27 covers retry policy. Here, the useful signal is that the request reused a pooled socket and hit a reset.

Body uploads make retry decisions harder. A client can start writing a request body on a reused socket and then receive ECONNRESET. The local process may not know how many bytes reached the peer kernel, whether the peer application read any of them, or whether the operation committed side effects. request.reusedSocket only answers how the socket was selected.

Even for GET, retries need a budget. One retry can be reasonable in many internal clients when the error came from a stale pooled socket. An unlimited retry loop can make an outage worse. Leave the policy for the resilience chapter. The HTTP mechanism gives you the input signal.

You can reduce stale reuse in a few practical ways.

Keep client idle timeouts shorter than server idle timeouts when you control both sides. Use scheduling: 'lifo' for low-rate clients so recently used sockets are picked first. Keep free pools small when traffic is sparse. Destroy agents during shutdown or when an upstream is removed from configuration. Consume responses fully so sockets reach a known free or closed state.

There is a buffer on each side. server.keepAliveTimeoutBuffer keeps the internal socket timer slightly longer than the advertised keep-alive timeout, which leaves clients a small grace period around the cutoff. The agent-side buffer does the same job from the client side when a server sends a keep-alive timeout hint.

A client can also set a shorter socket timeout on the agent.

const agent = new http.Agent({
  keepAlive: true,
  timeout: 4_000,
});

That timeout is set on each socket when the agent creates it. Treat it as part of the pool policy and test it against the server's keep-alive timer. When the client timeout sits below the server's advertised idle period, the client retires sockets before the server does. When it sits above, stale sockets can remain available after the server has already closed them.
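
One way to test that relationship is to compare the agent timeout with the hint a server advertises in its Keep-Alive response header, for example timeout=5. The sketch below assumes the header value is read from res.headers['keep-alive']. The helper names and the 500 ms safety margin are assumptions chosen for illustration.

```javascript
// Parse the timeout hint (given in seconds) from a Keep-Alive header value.
// Returns milliseconds, or null when no hint is present.
function keepAliveHintMs(headerValue) {
  if (typeof headerValue !== 'string') return null;
  const match = /(?:^|,)\s*timeout\s*=\s*(\d+)/i.exec(headerValue);
  return match ? Number(match[1]) * 1000 : null;
}

// True when the client retires idle sockets before the server's hinted cutoff.
// marginMs is a hypothetical allowance for timers and packet delivery.
function clientRetiresFirst(clientTimeoutMs, headerValue, marginMs = 500) {
  const hintMs = keepAliveHintMs(headerValue);
  if (hintMs === null) return null; // no hint, nothing to compare against
  return clientTimeoutMs + marginMs <= hintMs;
}

console.log(clientRetiresFirst(4_000, 'timeout=5')); // true: 4000 + 500 <= 5000
```

A false result means the pool policy leaves a window where the server can close a socket that the agent still offers for reuse.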

Logs should include reuse, error code, target key, and timing.

target=api.local:80 reused=true code=ECONNRESET
socketAgeMs=29844 idleMs=4998 method=GET

Those fields show whether failures cluster around a server idle cutoff, a client socket timeout, or a traffic pattern where pooled sockets sit idle between calls.
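
A sketch of a formatter for that line, assuming the caller already records when each socket was created and when it last went idle. The function name and field names are hypothetical, and the timestamps come from the application's own bookkeeping rather than from node:http.

```javascript
// Build one log line from caller-supplied timestamps (all in milliseconds).
function reuseLogLine({ target, reused, code, method, createdAt, lastIdleAt, now }) {
  const socketAgeMs = now - createdAt; // how long the socket has existed
  const idleMs = now - lastIdleAt;     // how long it sat free before this request
  return `target=${target} reused=${reused} code=${code} ` +
    `socketAgeMs=${socketAgeMs} idleMs=${idleMs} method=${method}`;
}

console.log(reuseLogLine({
  target: 'api.local:80',
  reused: true,
  code: 'ECONNRESET',
  method: 'GET',
  createdAt: 156,
  lastIdleAt: 25_002,
  now: 30_000,
}));
```

Keeping idleMs next to the error code is what makes a cluster around the server's idle cutoff easy to spot.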

Counters help too.

http.client.reused_socket_errors{code="ECONNRESET"}
http.client.free_sockets{origin="api.local:80"}
http.client.pending_requests{origin="api.local:80"}

A single reset can just be packet timing. A cluster near the keep-alive cutoff is pool policy. A rising pending queue means the socket budget is too tight for the current workload, and a large free pool during sparse traffic is a stale-socket candidate. The point is to name the state rather than file every client error under the same outage.

The easiest local reproduction is a small timing mismatch.

server keepAliveTimeout = 5000ms
client sends every 5000ms
agent keeps free socket
next request sometimes races server close

The race may appear only on some runs. That inconsistency is part of the behavior. Idle close and reuse happen in separate processes with timers and packet delivery between them.

Closing Idle and Active Server Connections

Server shutdown has two socket groups - idle keep-alive connections and active connections.

server.close() stops accepting new connections. Since Node 19, it also closes idle HTTP connections before returning. Active exchanges keep running until they finish or close. This matches the lifecycle from Chapter 9 - the listener and accepted sockets are managed separately.

An idle HTTP connection is between exchanges. It has no request currently being parsed and no response currently being written. An active HTTP connection is carrying an exchange. Headers may be arriving, a request body may be streaming, application code may be preparing a response, or response bytes may be flushing.

server.closeIdleConnections() closes HTTP connections that are connected to the server and idle between request-response exchanges.

server.close(() => {
  console.log('listener closed');
});

server.closeIdleConnections();

For Node 19 and newer, calling it with server.close() is usually redundant. It is still harmless and useful for libraries that support older Node versions. Call it after server.close() when using both. That order avoids a gap where new connections can arrive between cleanup calls.

The order is important because server.close() is what shuts down the listener. If you call server.closeIdleConnections() first on a busy server, a new connection can arrive immediately after. Calling server.close() first starts listener shutdown, then idle cleanup handles the accepted connections that remain.

server.closeAllConnections() is stronger.

server.close(() => {
  console.log('listener closed');
});

server.closeAllConnections();

That closes established HTTP connections, including active ones that are sending a request or waiting for a response. It skips upgraded sockets such as WebSocket and HTTP/2 upgrade paths. Use it when local policy says the server is done with active HTTP exchanges. Any request in progress can fail because the server destroyed the connection under it.

The behavior comes down to three cases.

idle keep-alive socket -> closeIdleConnections() destroys it
request in progress -> closeIdleConnections() leaves it alone
request in progress -> closeAllConnections() destroys it

Upgraded sockets need separate tracking. Once an HTTP connection upgrades, Node hands the raw socket to application code, and the HTTP server cleanup APIs skip it from then on. The WebSocket chapters cover the long-lived protocol behavior. Either way, the HTTP cleanup APIs act on HTTP connections only.

Those APIs are connection cleanup tools. Production draining also needs readiness changes, deadlines, supervisor behavior, load balancer timing, and application shutdown rules, which later chapters cover. The HTTP part is just this. Idle sockets can be closed separately from active sockets, and active sockets can be cut when local policy calls for it.
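
The connection-cleanup part alone can be sketched as one helper. The function name shutdown and the grace period are assumptions. The three server methods are the ones described above, and closeIdleConnections() and closeAllConnections() need Node 18.2 or newer.

```javascript
import http from 'node:http';

// Stop the listener, drop idle keep-alive sockets, then cut whatever is still
// active once the (hypothetical) grace period runs out.
function shutdown(server, graceMs = 10_000) {
  return new Promise((resolve, reject) => {
    // Local policy: after graceMs, active exchanges are destroyed too.
    const timer = setTimeout(() => server.closeAllConnections(), graceMs);

    // close() first, so no new connection can slip in between the calls.
    server.close(err => {
      clearTimeout(timer);
      if (err) reject(err);
      else resolve();
    });

    // Redundant on Node 19+, still useful on older versions.
    server.closeIdleConnections();
  });
}
```

Upgraded sockets are outside this helper. If any remain open, the close callback may not fire until the application closes them itself.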

A good server also avoids invisible idle buildup during normal operation. Keep keepAliveTimeout finite. Set maxRequestsPerSocket when long-lived sockets need request count limits. Watch descriptor usage. Track connection counts separately from request counts.

A service can have low request throughput and high idle connection count if clients hold many persistent sockets. CPU can look fine while memory and descriptors grow. Request metrics alone miss that retained connection state. The fix might be lower idle timeouts, lower client free-socket counts, fewer client processes, or a different client pooling policy.

Server-side counters should separate accepted connections from requests.

requests per second
current HTTP connections
idle HTTP connections
active HTTP exchanges
closed by keepAliveTimeout

Those names match the APIs. server.closeIdleConnections() targets idle HTTP connections. server.closeAllConnections() targets established HTTP connections, whether active or idle. Connection state tells you what those calls will affect.
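
A rough way to collect the first few of those counters is to listen on the server's own events. The helper name attachCounters and the field names are hypothetical. The 'connection' and 'request' events and the socket and response 'close' events are standard node:http and node:net events.

```javascript
import http from 'node:http';

// Count accepted connections separately from requests on one server.
function attachCounters(server) {
  const counters = { requests: 0, accepted: 0, current: 0, active: 0 };

  server.on('connection', socket => {
    counters.accepted += 1;
    counters.current += 1;
    socket.once('close', () => { counters.current -= 1; });
  });

  server.on('request', (req, res) => {
    counters.requests += 1;
    counters.active += 1;
    // 'close' fires when the response completes or the connection drops early.
    res.once('close', () => { counters.active -= 1; });
  });

  return counters;
}

const server = http.createServer((req, res) => res.end('ok\n'));
const counters = attachCounters(server);
// Idle HTTP connections can be estimated as counters.current - counters.active.
```

This sketch does not attribute closes to keepAliveTimeout. Telling a timer close from a client close needs extra bookkeeping that is not shown here.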

HTTP/1.1 Reuse Still Serializes Work

Keep-alive reuses a connection. Through core node:http, it does not run two exchanges on that connection at once. The agent still drives ordinary HTTP/1.1 request-response sequencing over each reused socket.

HTTP/1.1 head-of-line blocking happens when later exchanges on the same connection wait behind an earlier exchange on that connection. Responses on a single HTTP/1.1 connection have to come back in request order. If one request takes a long time, the next exchange on that same connection waits.

One socket gives this timeline.

/slow starts
/fast waits for the socket
/slow response completes
/fast request starts

Two sockets allow this one.

socket #1 handles /slow
socket #2 handles /fast
both responses progress independently

The agent controls which timeline is possible through socket limits. With maxSockets: 1, the second request waits in the pending queue. With a higher limit, the agent can open another socket and run both exchanges at once.

Core HTTP gets concurrency by using more sockets, up to the configured limits.

maxSockets: 1  -> one active exchange per origin
maxSockets: 10 -> up to ten active sockets per origin
extra requests -> pending queue

More sockets give more HTTP/1.1 concurrency. They also consume more descriptors, memory, kernel state, and upstream capacity. Reuse cuts repeated connection setup, while socket count controls how much runs in parallel. They are two separate settings, so tune them separately.

People often treat keep-alive as a concurrency feature. It is reuse. A single persistent HTTP/1.1 connection can be efficient for sequential work and still slow for parallel work. Parallel work needs multiple active sockets in core HTTP, or a client and protocol model that supports multiplexing.

📌

Important

Keep-alive reuses one connection for back-to-back exchanges. It does not run them at the same time. On a single HTTP/1.1 socket, a slow response holds up the next request behind it. Concurrency in core node:http comes from opening more sockets through maxSockets, rather than from turning keep-alive on.

Latency makes the ordering clear.

one socket: /slow 200ms, /fast 10ms -> about 210ms total
two sockets: /slow 200ms, /fast 10ms -> about 200ms total

The numbers are synthetic, but the ordering is real. On one ordinary HTTP/1.1 connection, /fast waits behind /slow when both exchanges use that connection. With two sockets, each socket has its own request-response sequence.
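
The ordering can be reproduced locally. This sketch assumes a throwaway loopback server where /slow answers after 200 ms and every other path answers immediately. The helper names get and completionOrder are made up for the example.

```javascript
import http from 'node:http';

// Test server: /slow answers after 200 ms, everything else answers at once.
const server = http.createServer((req, res) => {
  if (req.url === '/slow') setTimeout(() => res.end('slow\n'), 200);
  else res.end('fast\n');
});

// GET one path, drain the body, resolve with the path when the response ends.
function get(port, path, agent) {
  return new Promise((resolve, reject) => {
    http.get({ host: '127.0.0.1', port, path, agent }, res => {
      res.resume();
      res.on('end', () => resolve(path));
    }).on('error', reject);
  });
}

// Start /slow, then /fast, and return the paths in completion order.
async function completionOrder(port, maxSockets) {
  const agent = new http.Agent({ keepAlive: true, maxSockets });
  const order = [];
  await Promise.all([
    get(port, '/slow', agent).then(path => order.push(path)),
    get(port, '/fast', agent).then(path => order.push(path)),
  ]);
  agent.destroy();
  return order;
}

await new Promise(resolve => server.listen(0, '127.0.0.1', resolve));
const { port } = server.address();
console.log(await completionOrder(port, 1)); // /slow first: /fast waited for the socket
console.log(await completionOrder(port, 2)); // /fast first: it had its own socket
await new Promise(resolve => server.close(resolve));
```

With maxSockets: 1 the /fast request sits in the pending queue until /slow releases the socket. With maxSockets: 2 the agent opens a second socket and /fast finishes on its own schedule.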

HTTP pipelining can send multiple HTTP/1.1 requests before earlier responses finish, but core http.Agent pooling is still organized around assigning one request to a socket and freeing that socket after the response completes. The next subchapter covers Undici's modern client model, including dispatchers, pools, and pipelining. HTTP/2 multiplexing comes in Chapter 11. Both change concurrency by changing the protocol machinery itself.

For core node:http, read pool behavior through three groups, active, free, and pending. When the active sockets are all in use, new work waits. Stale free sockets make reuse fail. A pile of idle sockets keeps descriptors open for nothing. Keep-alive is what lets a finished socket stay useful for the next exchange.

That covers most keep-alive debugging. Count requests. Count TCP connections. Then count active, free, and pending sockets for whichever side you control, client or server. The mismatch usually shows you where the state is sitting.
