API Design, Contracts & Frameworks

Error Responses, Idempotency, Pagination, and Filtering

Ishtmeet Singh (@ishtms) / June 11, 2026 / 48 min read

#nodejs #api-design #idempotency #pagination

You can build an API that passes every schema check and still break the clients calling it. The route is there, the method is correct, the request body validates, and the response matches the schema in the OpenAPI operation. The behavior can still shift. A validation error body comes back with a different structure, a retry quietly creates two payments, page two starts skipping records, a filter stops applying without telling anyone, or an error code gets renamed because someone cleaned up an enum on the server side. Those are all contract breaks, even though every schema check passed.

The earlier subchapters built up to this. Subchapter 01 covered API structure, the resources, endpoints, route patterns, representations, and status behavior. Subchapter 02 handled schema documents and where validation runs. Subchapter 03 dropped those ideas into the Express and Fastify request flow. What is left is the behavior clients lean on after the handler returns. That means error responses, repeated writes, paginated reads, filters, sorting, projection, and compatibility rules.

These details are small, but a client's whole behavior turns on them. A client that gets back PAYMENT_DECLINED knows to show one screen, while VALIDATION_FAILED tells it to mark the bad fields instead. When a client retries POST /payments with the same idempotency key, it should get the original result rather than a second charge. Following nextPageToken from one page to the next lets a client walk a collection that changes underneath it without losing its place.

Passing validation only gets a request as far as the handler. Everything it does after that has to be spelled out.

Behavior Contracts After Schema Contracts

An API contract covers more than the path, method, request schema, and response schema. Those parts are easy to see. The behavior clients actually build against is in everything around them.

which errors have stable machine-readable codes
how validation errors point back to input fields
whether repeated write requests collapse into one logical operation
how pages advance through a collection
which query parameters the server accepts
which fields a client can request or sort by
which behavior changes require a migration path

Query parameters are a good example. They are just text in a URL, so it is tempting to treat them as flexible. Once clients ship code against them, they are not flexible at all. The names you accept, their types, the operators, the defaults, which combinations are allowed, and what happens on a bad value all become part of the contract. That whole set of rules has a name, the query parameter contract. When you give a query string the same care you already give a request body, the service gets a lot more predictable.

GET /orders?status=paid&limit=50&sort=-createdAt

That single request makes three separate behavior choices. status=paid narrows the collection, limit=50 asks for a page of a certain size, and sort=-createdAt puts the newest records first. The names, the values, the defaults, and the response to an invalid value are all part of the contract.

Your framework parses the URL and a schema validator checks the structure, but neither of them decides what any of it means. Defining that is your job.
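One way to make that meaning explicit is a small parser that owns the accepted names, the defaults, and the rejection rules. The sketch below is only an illustration. The allowed statuses, the sort keys, the default and maximum limit, and the validation codes are all assumptions, and it expects each raw value to be a single string.

```javascript
// Hypothetical contract for GET /orders. Every value here is an assumption.
const ORDER_QUERY_CONTRACT = {
  statuses: ["pending", "paid", "shipped", "cancelled"],
  sortKeys: ["createdAt", "total"],
  defaultSort: "-createdAt", // newest first, stated instead of accidental
  defaultLimit: 20,
  maxLimit: 100,
};

// Turns raw query-string values (assumed to be strings) into a checked query,
// or into a validation error map keyed by parameter name.
function parseOrdersQuery(raw, contract = ORDER_QUERY_CONTRACT) {
  const fields = {};
  const known = new Set(["status", "limit", "sort"]);

  // Unknown parameters are rejected so a misspelled filter never silently stops applying.
  for (const name of Object.keys(raw)) {
    if (!known.has(name)) fields[name] = ["unknown_parameter"];
  }

  const status = raw.status;
  if (status !== undefined && !contract.statuses.includes(status)) {
    fields.status = ["invalid_value"];
  }

  let limit = contract.defaultLimit;
  if (raw.limit !== undefined) {
    limit = Number(raw.limit);
    if (!/^\d+$/.test(raw.limit) || limit < 1 || limit > contract.maxLimit) {
      fields.limit = ["invalid_value"];
    }
  }

  const sort = raw.sort ?? contract.defaultSort;
  const descending = sort.startsWith("-");
  const sortKey = descending ? sort.slice(1) : sort;
  if (!contract.sortKeys.includes(sortKey)) fields.sort = ["invalid_value"];

  if (Object.keys(fields).length > 0) {
    return { ok: false, error: { code: "VALIDATION_FAILED", fields } };
  }
  return { ok: true, query: { status, limit, sortKey, descending } };
}
```

Calling it with `Object.fromEntries(new URLSearchParams("status=paid&limit=50&sort=-createdAt"))` yields a checked query with `limit` as the number 50 and `descending` set to true. Rejecting unknown names is a design choice, and some APIs ignore them instead, but whichever you pick belongs in the contract.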

A lot of broken behavior contracts start the same way, with a default that nobody ever wrote down. The first version of GET /orders returns the newest orders first, only because the database query happened to end with ORDER BY created_at DESC. Six months later someone rewrites that query to make it faster, and now the same route returns the oldest orders first. The schema validated cleanly through all of it, so nothing flagged the change, but every client that relied on the old ordering just broke.

Fixing it is mostly mechanical. Name the ordering rule, write a test that locks it in, document it, and keep it stable until a versioned change replaces it.

This behavior usually exists in three places at once, and over time they stop agreeing. The first is the public contract document, which might be the OpenAPI text, the examples, the SDK docs, or whatever artifact the team treats as the source of truth. In the code, the same behavior is scattered across validation branches, idempotency storage, query builders, serializer rules, and response builders. The tests hold a third copy, the cases that prove a repeated write, an invalid filter, an empty page, or a field projection still does what clients expect.

Then they drift. The route handler returns PAYMENT_ALREADY_SETTLED while the spec still promises PAYMENT_CAPTURED. A framework validation hook emits one field-path format and the application validator emits a different one. A pagination helper adds hasMore on one route and forgets it on another. What you want is to pull all of those scattered choices back into a single named behavior.

Start by sorting each behavior as either endpoint-local or API-wide. The error envelope structure is API-wide, and so are the pagination field names. Allowed filters and sort keys are usually endpoint-local, since they depend on the specific resource. Idempotency support is endpoint-local too, but once more than one write route uses it, the key format, the retention language, and the replay response style all belong in the API-wide set.

The payoff is a readable contract. A client learns one error envelope and one page-response structure, then turns to the per-endpoint docs for the domain-specific filters, sort options, and conflict codes.

Error Responses as Client Input

An error body is data the client has to parse and act on, so write it for the program reading it rather than for a human skimming logs. When the server rejects a request or cannot finish one, the client gets back a status code and a body. The status code gives the broad HTTP result, and the body carries the API-specific detail that the client actually reads and branches on.

Internally, a handler might throw an error object and a framework might push the response through error-handling middleware, but that machinery is covered in Chapter 27. What the client sees is only the status, the headers, and the body, so the error response contract has to be defined at that level.

You want one body structure that stays the same for every error in the API. That structure is called the error envelope. It gives clients one consistent place to find a machine-readable code, a human-facing message, field-level validation details, and optional request-correlation fields.

{
  "error": {
    "code": "ORDER_TOTAL_INVALID",
    "message": "Order total does not match line items",
    "fields": { "total": ["does_not_match_items"] }
  }
}

The envelope is that outer object with the single error property. Inside it, ORDER_TOTAL_INVALID is the machine-readable code, and message is the human-readable text meant for a developer console, a support tool, or a product screen. The fields object is a validation error map, which ties each input field path to one or more validation codes.

The code is the thing the client branches on, so it is worth getting right. A machine-readable error code is a stable string meant for a program to switch on. It should be specific enough that the client can act on it and stable enough that an internal refactor will not change it. A code like BAD_REQUEST only repeats the status code, and INVALID tells the client nothing at all. ORDER_TOTAL_INVALID gives the client a real domain-level branch, and it does that without exposing the internal validators, function names, database constraints, or framework error types underneath.

The human message has a different purpose. It helps a developer or a support engineer understand the response. Copy that an end user reads, translated for the product, belongs somewhere else entirely. Clients still branch on codes and field paths, never on the message text.

Validation errors need more structure than that, because a single request can fail in several places at the same time.

{
  "error": {
    "code": "VALIDATION_FAILED",
    "fields": {
      "items[0].sku": ["required"],
      "shipping.postalCode": ["invalid_format"]
    }
  }
}

The map uses field paths as its keys and arrays of machine-readable validation codes as its values. The syntax of those field paths is part of the contract as well. Choose one representation, whether that is dotted paths with bracketed array indexes, JSON Pointer, or schema-native paths, and then keep it the same across the whole API. A frontend that wants to highlight a nested form field needs a path it can rely on to find the input that failed.
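Validators often report a failing location as a list of path segments, so formatting those segments in one place helps keep the syntax identical across routes. Here is a minimal sketch, assuming the dotted style with bracketed indexes from the example above. The `{ path, code }` issue shape is hypothetical and does not match any particular validator's output.

```javascript
// Formats path segments such as ["items", 0, "sku"] as "items[0].sku".
function toFieldPath(segments) {
  let path = "";
  for (const segment of segments) {
    if (typeof segment === "number") {
      path += `[${segment}]`; // array index
    } else {
      path += path === "" ? segment : `.${segment}`; // object property
    }
  }
  return path;
}

// Groups issues into the { path: [codes] } map that the envelope exposes.
// The issue shape { path, code } is an assumption for this sketch.
function toFieldErrorMap(issues) {
  const fields = {};
  for (const issue of issues) {
    const key = toFieldPath(issue.path);
    (fields[key] ??= []).push(issue.code);
  }
  return fields;
}
```

Every validation source, whether a framework hook or an application validator, would convert its own issue format into segments first and then share this formatter.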

Status codes still do real work here. A 400 means the request structure or a value was rejected as it came in. 401 and 403 belong to the authentication and authorization chapters. A 404 means the target resource cannot be reached through that route by that principal. A 409 reports a conflict with the current server state, like a version conflict or an idempotency key reused with different input. Some APIs use 422 for input that parses but is semantically wrong. Whichever set you pick, apply it the same way on every endpoint.

The body adds detail to the status code, and the two should agree. A 409 paired with the code VALIDATION_FAILED leaves the client guessing which one to trust. A 400 carrying PAYMENT_ALREADY_CAPTURED reports a state conflict as if it were a generic client error. The exact status vocabulary varies between API styles, but a client should be able to read the status and the code together as a single, consistent result.

Internal errors are handled the same way. A TypeError thrown somewhere in handler code turns into a generic error response, usually a 500, with a stable code like INTERNAL_ERROR. The stack trace, file paths, SQL, framework names, and the raw exception message all stay in the logs. What the contract exposes is only the stable, public result.

🚨

Caution

A leaked stack trace, SQL fragment, or raw exception message gives an attacker free reconnaissance. Default framework error handlers often serialize the real exception straight into the body, so route every error through one builder that emits only { status, code } and a safe message, and check that the fallback 500 does the same.

Correlation fields help, as long as they stay separate from the core error code. A field like requestId lets a support engineer pull the logs for one specific response. Request context and structured logging get a full treatment in Chapter 29, so there is less to say about them here. The one rule worth stating now is that if the API returns a correlation field, it should use the same name everywhere and appear on every error path, including framework validation errors and the fallback 500s.

The error bugs that cause the most trouble tend to show up on the edge paths, the ones that are easy to forget about. These are things like a body parser that fails, a request to a route that does not exist, a schema validator that rejects the input, an async handler that rejects, or a response that was already sent. On every one of those, the framework's defaults will generate their own response structure unless application code steps in first. A consistent API routes all of those paths through the same response builder before anything reaches the client.

That builder should accept normalized error facts and keep the raw exceptions out of its input. A framework validation error turns into { status: 400, code: "VALIDATION_FAILED", fields }. A domain conflict produces { status: 409, code: "PAYMENT_ALREADY_CAPTURED" }. An unexpected exception maps to { status: 500, code: "INTERNAL_ERROR" }, plus a log event that keeps the original exception around.

The normalized object stays internal while the serialized response is what goes public, and keeping them separate pays off. You can swap internal error classes later without changing any of the codes clients see, and Express middleware, Fastify error handlers, and hand-written route code can all share one output path.
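That split can be sketched as two small functions, one that normalizes whatever was thrown and one that serializes normalized facts. The `ValidationError` and `DomainError` classes are hypothetical application types, and placing `requestId` inside the envelope is an assumption made for this example.

```javascript
// Hypothetical application error classes.
class ValidationError extends Error {
  constructor(fields) {
    super("Request failed validation");
    this.fields = fields;
  }
}

class DomainError extends Error {
  constructor(status, code, message) {
    super(message);
    this.status = status;
    this.code = code;
  }
}

// Step 1: turn anything that was thrown into normalized, public-safe facts.
function normalizeError(err) {
  if (err instanceof ValidationError) {
    return { status: 400, code: "VALIDATION_FAILED", message: err.message, fields: err.fields };
  }
  if (err instanceof DomainError) {
    return { status: err.status, code: err.code, message: err.message };
  }
  // Unexpected exception: the original goes to the logs, never to the client.
  return { status: 500, code: "INTERNAL_ERROR", message: "Internal server error" };
}

// Step 2: serialize normalized facts into the public envelope.
// This function never sees a raw exception.
function buildErrorResponse(facts, requestId) {
  const error = { code: facts.code, message: facts.message };
  if (facts.fields) error.fields = facts.fields;
  if (requestId) error.requestId = requestId;
  return { status: facts.status, body: { error } };
}
```

An Express error middleware or a Fastify error handler would call `normalizeError`, log the original exception when the result is a 500, and send whatever `buildErrorResponse` returns.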

Error envelopes need version discipline even before the API has formal versions. Some changes are safe. Adding an optional requestId is one, since clients that do not know about it can ignore it. Other changes are breaking. Renaming fields to errors breaks every client that reads field errors. Switching from items[0].sku to /items/0/sku breaks clients that bind those errors to form fields. Even adding a brand-new validation code can trip a client whose enum is strict. So decide up front whether clients should treat an unknown code as a generic failure, or whether every new code has to be coordinated with them first.
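On the client side, the tolerant policy might look like the sketch below. The action names are hypothetical UI decisions, and falling back on the status class for unknown codes is one possible rule, not the only one.

```javascript
// Client-side sketch: branch on the machine-readable code,
// and treat an unknown or missing code as a generic failure.
function decideClientAction(status, body) {
  const code = body?.error?.code;
  switch (code) {
    case "VALIDATION_FAILED":
      return { action: "highlight_fields", fields: body.error.fields ?? {} };
    case "PAYMENT_DECLINED":
      return { action: "show_payment_declined" };
    default:
      // A code this client has never seen must not crash it.
      return status >= 500
        ? { action: "retry_later" }
        : { action: "show_generic_error" };
  }
}
```

A client written this way keeps working when the server adds a new code, which is what makes that kind of addition safe to ship without coordination.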

Problem Details

Problem Details is a standard you can adopt instead of inventing your own envelope. It is a JSON structure for HTTP API errors defined in RFC 9457, which registers the media type application/problem+json along with a small set of common members. Those members are type, title, status, detail, and instance, and APIs add their own extension fields for domain-specific data.

The main thing you get from it is interoperability. Any client or tool that already understands Problem Details knows where to find the broad error type and the status without being told. Your API still adds its own machine-readable code and validation map alongside those standard members.

{
  "type": "https://api.example.com/problems/validation",
  "title": "Request body failed validation",
  "status": 400,
  "code": "VALIDATION_FAILED",
  "errors": { "shipping.postalCode": ["invalid_format"] }
}

type is the problem type identifier, and many APIs make it a URL. That URL can point at documentation, but the part that counts for the contract is the identifier string itself. title is a short summary of the problem type. status, when it is present, repeats the HTTP status code for consumers that store or inspect the problem document separately from the HTTP response, though generic HTTP software still goes by the real response status. code and errors are the extension fields the API defines for itself.

Problem Details fits best when you want a standard outer structure and have a manageable number of distinct problem types. It also fits OpenAPI well, since the response schema can describe the base structure together with your extensions.

A custom envelope is a perfectly good alternative. Plenty of APIs ship { "error": ... }, { "errors": [...] }, or some other in-house structure, and that is fine. What you pick is less important than keeping it stable, consistent, and documented. Clients pay a real price when one API mixes several error structures, and that price is higher than the cost of any single non-standard envelope.

Problem type identifiers need the same care as machine-readable error codes. If you change https://api.example.com/problems/validation to https://api.example.com/problems/bad-request, any client that branched on the old identifier stops working. Treat the identifiers themselves as contract values, and keep any wording changes confined to the documentation around them.

The mapping work is still on your side. Your API decides which framework errors map to which problem type, which validation path format appears under errors, which fields are safe to expose, and which status code goes with each problem. Problem Details gives you a common body structure, but the behavior behind it is still something you have to define.
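A minimal sketch of that mapping follows. It assumes the server already has normalized facts of the form `{ status, code, fields }`. The registry contents are hypothetical apart from the validation identifier used above. Codes without a registered type fall back to `about:blank`, the default type in RFC 9457, with the title taken from Node's built-in `http.STATUS_CODES` table.

```javascript
const { STATUS_CODES } = require("node:http");

// Hypothetical registry mapping API error codes to problem type identifiers.
const PROBLEM_TYPES = {
  VALIDATION_FAILED: {
    type: "https://api.example.com/problems/validation",
    title: "Request body failed validation",
  },
};

// Builds an application/problem+json response from normalized error facts.
function toProblemResponse(facts) {
  const registered = PROBLEM_TYPES[facts.code];
  const body = {
    // "about:blank" signals that the problem has no semantics beyond the status code.
    type: registered ? registered.type : "about:blank",
    title: registered ? registered.title : STATUS_CODES[facts.status] ?? "Unknown Error",
    status: facts.status,
    code: facts.code, // extension member
  };
  if (facts.fields) body.errors = facts.fields; // extension member
  return {
    status: facts.status,
    headers: { "content-type": "application/problem+json" },
    body,
  };
}
```

Keeping the registry in one module means the identifiers are reviewed in one place, which suits values that clients are allowed to branch on.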

Idempotency for Write APIs

Clients retry requests because networks fail partway through.

The common case goes like this. A client sends POST /payments, the server processes it, and the response gets lost because the connection resets before the client can read the body. Now the client has an ambiguous outcome. The payment might have gone through, or the server might have died before doing anything at all.

In most resource APIs, PUT and DELETE are already idempotent by HTTP method semantics. POST carries no such guarantee, which is why a repeated POST can create a second resource, send a second email, enqueue a second job, or charge a card twice.

Idempotency is the property that solves this. With it, repeated attempts at the same logical operation produce one committed effect and one stable result. For write APIs, the usual mechanism is an idempotency key, a value the client supplies that names a single write attempt across all of its retries.

POST /payments HTTP/1.1
Idempotency-Key: pay_2z8ve9q1
Content-Type: application/json

{"orderId":"ord_123","amount":4999}

The key in that request is pay_2z8ve9q1. The client generated it before the first attempt and sends the exact same value on every retry of that one payment. The server uses it to tie all of those repeated HTTP requests back to a single logical write.

For that link to survive a server crash, the state behind it has to be durable. That durable state is the idempotency record, the server-side record stored for one key. It usually holds the key, its scope, a request fingerprint, the current processing state, the final status, the final body, a few selected response headers, a creation time, and an expiration time.

{
  "scope": "acct_42 POST /payments",
  "key": "pay_2z8ve9q1",
  "requestHash": "sha256:3b7c...",
  "state": "succeeded",
  "status": 201,
  "body": { "paymentId": "pay_789" }
}

The scope is what stops two clients from colliding on the same key. The string pay_2z8ve9q1 sent by account acct_42 has to mean something different from the same string sent by account acct_99. Many APIs scope a key by the authenticated principal plus the method plus the route, and some add tenant, region, or API product on top of that. Write the scope down in the contract or the implementation notes, since it is the thing that decides when a key counts as reused.
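As a rough sketch, a scope built from principal, method, and route template could look like this. The string format mirrors the record example above, and both function names are hypothetical.

```javascript
// Builds the scope that namespaces an idempotency key.
// Uses the route template ("/payments"), not the raw URL, so query strings do not split scopes.
function buildIdempotencyScope({ principalId, method, routeTemplate }) {
  return `${principalId} ${method.toUpperCase()} ${routeTemplate}`;
}

// Two requests address the same record only when both scope and key match.
// JSON encoding avoids ambiguity if either part contains a separator character.
function recordId(scope, key) {
  return JSON.stringify([scope, key]);
}

const scopeA = buildIdempotencyScope({ principalId: "acct_42", method: "post", routeTemplate: "/payments" });
const scopeB = buildIdempotencyScope({ principalId: "acct_99", method: "post", routeTemplate: "/payments" });

console.log(scopeA); // acct_42 POST /payments
console.log(recordId(scopeA, "pay_2z8ve9q1") === recordId(scopeB, "pay_2z8ve9q1")); // false
```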

The request hash binds the key to the actual input. When the key and the payload both match a stored record, the server can replay the result it saved. When the key matches but the payload differs, the server should return a conflict response, which signals that the request clashes with existing server state. In idempotency, that clash almost always means the client reused a key for a different operation.

first request:  key K, hash A -> process and store result
retry request:  key K, hash A -> return stored result
bad reuse:      key K, hash B -> return 409 conflict

A replay is just a later request carrying a key the server has already seen. This is the normal, expected case, and it is the whole reason the key exists in the first place. The replay path should be simple and deterministic. The server compares the scope, compares the request hash, and returns the outcome it already recorded.

Replay status codes need a stated policy too. Most APIs return the original status code and body, so if the first request created a payment and returned 201, the replay also returns 201 with the same body. Some add a header like Idempotency-Replayed: true, which is an optional detail in the contract. The part you cannot skip is returning the stored result.

Require an idempotency key only on the endpoints where a duplicate effect would actually cause harm, like a payment capture, an order submission, a call out to an external provider, a message being enqueued, an account mutation, or the start of a bulk import. A read endpoint gets almost nothing from a key, since reads do not have write effects to begin with. A write endpoint built on PUT /resource/{id} already has a stable resource identifier, though a key can still help it replay the exact response.

The format of the key is itself part of the contract. You define a maximum length, the allowed characters, and a retention period. Past that, treat the key as an opaque token the client generated and nothing more. The business meaning lives elsewhere, in the authenticated principal, the route, and the request body. The server can store the key, hash it, log a shortened form, and compare it, while the authorization, the amount, and the target resource all come from validated request context. The key string stays a plain operation token and nothing else.
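A format check of that kind is short. In the sketch below, the 64-character limit, the URL-safe alphabet, and both error codes are assumptions chosen for illustration.

```javascript
// Hypothetical key rules: 1 to 64 characters from a URL-safe alphabet.
const IDEMPOTENCY_KEY_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Checks the Idempotency-Key header value without reading any meaning into it.
function validateIdempotencyKey(headerValue) {
  if (headerValue === undefined) {
    return { ok: false, code: "IDEMPOTENCY_KEY_REQUIRED" };
  }
  if (typeof headerValue !== "string" || !IDEMPOTENCY_KEY_PATTERN.test(headerValue)) {
    return { ok: false, code: "IDEMPOTENCY_KEY_INVALID" };
  }
  // The key is returned untouched: it is an opaque token, never parsed for business data.
  return { ok: true, key: headerValue };
}
```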

Retention is part of the guarantee the API makes. Say the API keeps records for 24 hours. A replay that shows up at hour 25 either runs as a brand-new operation or fails with an expired-key response, depending on which policy you chose. Clients need to know that window, and the server team needs a cleanup job that enforces it. If records were kept forever, every key would become permanent state, which causes its own problems.

The replay matrix is worth writing down before any storage code exists.

same key + same input + succeeded -> return stored response
same key + same input + processing -> return in-progress response
same key + different input -> return conflict response
same key + expired record -> follow expiration policy

Each branch carries both a status code and a machine-readable code. You might use 409 IDEMPOTENCY_IN_PROGRESS for the in-progress case and 409 IDEMPOTENCY_KEY_CONFLICT for the different-input case. For the expired case, you might return 409 IDEMPOTENCY_KEY_EXPIRED or run a fresh attempt, depending on the retention policy you documented. The specific codes are up to you, but the matrix itself has to exist.
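Written as code, the matrix is a single pure function over an existing record. This sketch uses the example codes from this section. The record fields and the `expiredPolicy` switch are assumptions, as is checking expiry before anything else.

```javascript
// Resolves the replay matrix for a record that already exists for (scope, key).
// record: { requestHash, state, status, body, expiresAt }, with expiresAt in epoch milliseconds.
function resolveReplay(record, incomingHash, now, expiredPolicy = "reject") {
  const conflict = (code) => ({ action: "respond", status: 409, body: { error: { code } } });

  // same key + expired record -> follow expiration policy
  if (record.expiresAt <= now) {
    return expiredPolicy === "run_fresh"
      ? { action: "run_fresh" }
      : conflict("IDEMPOTENCY_KEY_EXPIRED");
  }
  // same key + different input -> conflict
  if (record.requestHash !== incomingHash) {
    return conflict("IDEMPOTENCY_KEY_CONFLICT");
  }
  // same key + same input + processing -> in-progress response
  if (record.state === "processing") {
    return conflict("IDEMPOTENCY_IN_PROGRESS");
  }
  // same key + same input + finished -> return exactly what was stored
  return { action: "respond", status: record.status, body: record.body };
}
```

Because the function takes `now` as an argument and touches no storage, each row of the matrix can be covered by a plain unit test.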

Figure: Decision flowchart for an idempotent write. The request is validated, the canonical body is hashed, and the key is claimed atomically, after which the flow branches on whether a record already exists and on its state. One incoming idempotent write resolves through a single race-safe atomic claim. A won claim runs the write and stores the response. An existing record replays the stored response, returns 409 on a hash mismatch or while still processing, or follows the retention policy once expired.

Store enough metadata that a support engineer can explain any branch after the fact. The key, the scope, the request hash, the state, the creation and expiration times, the route identity, and the final response status usually cover it. Store the request body only when your policy and data classification allow it. Plenty of APIs keep only the hash, so they never hold payment details, personal data, or large payloads in idempotency storage.

Once the storage rule is clear, the full sequence is short.

1. Validate key and request body
2. Claim or load idempotency record
3. Execute or replay the write
4. Store or return the public response

Validation and the ownership decision both happen before any side effect, so a malformed request or a losing claim never reaches the domain write. Only the request that wins the claim runs the actual write; every replay returns the stored result instead. And the public result gets committed into idempotency storage before the response leaves the process, so a later replay always has something to return.
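The four steps above can be sketched with an in-memory Map standing in for durable storage. This is a single-process illustration only. The function names and the `runWrite` callback are hypothetical, and expiry and failure handling are left out to keep the flow visible.

```javascript
// In-memory stand-in for durable storage. Safe only inside one process;
// a real deployment needs an atomic claim in shared storage.
const records = new Map();

// Step 2: claim or load. Synchronous here, so no other request can interleave.
function claimOrLoad(id, requestHash) {
  const existing = records.get(id);
  if (existing) return { claimed: false, record: existing };
  const record = { requestHash, state: "processing" };
  records.set(id, record);
  return { claimed: true, record };
}

// Step 1 (validating the key and body) is assumed to have happened already.
// runWrite is the hypothetical domain write; it resolves to { status, body }.
async function handleIdempotentWrite({ scope, key, requestHash }, runWrite) {
  const { claimed, record } = claimOrLoad(JSON.stringify([scope, key]), requestHash);

  if (!claimed) {
    if (record.requestHash !== requestHash) {
      return { status: 409, body: { error: { code: "IDEMPOTENCY_KEY_CONFLICT" } } };
    }
    if (record.state === "processing") {
      return { status: 409, body: { error: { code: "IDEMPOTENCY_IN_PROGRESS" } } };
    }
    // Step 3, replay branch: return the stored public response.
    return { status: record.status, body: record.body };
  }

  // Step 3, execute branch: only the claim winner runs the write.
  const result = await runWrite();

  // Step 4: store the public response before it leaves the process.
  Object.assign(record, { state: "succeeded", status: result.status, body: result.body });
  return result;
}
```

If `runWrite` throws in this sketch, the record stays in processing, and that is the situation a recovery policy has to cover.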

This structure also gives you something concrete to test. One test sends the same key and payload twice and asserts exactly one resource exists with one identical response body. Another changes the amount under the same key and expects the conflict code. A third fires a second request while the first is still processing and expects the in-progress response. None of these reach into the storage engine, which is deliberate, since the thing they protect is the promise clients depend on.

Idempotency Records Under Load

The idempotency implementation has one strict job to do. For a given key, the first accepted request creates exactly one record inside the retention window, and every later request for that key reads the record that first one created.

That work starts before the handler ever runs the write. The server parses the request, validates the key, computes a request fingerprint from the method, the route, and the normalized body, and then tries to create the idempotency record in storage. That create has to be atomic, an insert that succeeds only if no record is already there, with no chance of a race. SQL handles it with a unique constraint over (scope, key) and an insert inside a transaction. Redis or Valkey can do it with a conditional set that claims the key, after which later updates carry the state forward. A document store does the same with a conditional create on a unique key.

Which storage engine you use is less important than the ownership rule it enforces. One process has to create the new record while every other process finds the one that already exists. Without the atomic claim, two Node processes can pick up the same retry at the exact same moment and both run the payment write before either of them has stored a result.

🚨

Caution

The claim has to be a single atomic operation, either a unique constraint on (scope, key) together with an insert, or a conditional SET NX. A read followed by an insert in handler code is a race, and losing that race means two real payments. Never gate the side effect on a separate query that just checks whether the key already exists.
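The difference is easy to reproduce with a fake store that adds a little latency to every call. Everything in this sketch is a simulation. `FakeStore` is a hypothetical stand-in for a database, and `insertIfAbsent` models what a unique-constraint insert or a conditional set gives you.

```javascript
// Each storage call takes one timer tick, the way a network round trip would.
const tick = () => new Promise((resolve) => setTimeout(resolve, 1));

class FakeStore {
  constructor() {
    this.rows = new Map();
  }
  async get(id) {
    await tick();
    return this.rows.get(id);
  }
  async put(id, row) {
    await tick();
    this.rows.set(id, row);
  }
  // Check and write happen in one step, with no await between them.
  async insertIfAbsent(id, row) {
    await tick();
    if (this.rows.has(id)) return false;
    this.rows.set(id, row);
    return true;
  }
}

// Racy: another request can run between get() and put().
async function racyClaim(store, id) {
  if (await store.get(id)) return false;
  await store.put(id, { state: "processing" });
  return true;
}

// Safe: the storage layer picks the winner in a single operation.
async function atomicClaim(store, id) {
  return store.insertIfAbsent(id, { state: "processing" });
}

// Fires two overlapping claims for the same key and counts how many believe they won.
async function countWinners(claim) {
  const store = new FakeStore();
  const results = await Promise.all([claim(store, "k"), claim(store, "k")]);
  return results.filter(Boolean).length;
}
```

With two overlapping requests, `countWinners(racyClaim)` resolves to 2 and `countWinners(atomicClaim)` resolves to 1. In a payment handler, every extra winner is an extra charge.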

The record often moves through a small state machine.

absent -> processing -> succeeded
absent -> processing -> failed
processing -> expired

absent means the server has no record for the key in this scope yet. processing means one request has claimed the key and started the work. succeeded means a successful response is stored and ready to replay. failed means a failure response is stored against the same logical attempt. expired means the record has aged out or needs recovery.
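Encoding the allowed transitions as data gives you a guard against illegal updates, such as a late writer flipping a succeeded record back to processing. The table below transcribes only the transitions listed above, and the helper name is hypothetical.

```javascript
// Allowed transitions, transcribed from the state list above.
const TRANSITIONS = {
  absent: ["processing"],
  processing: ["succeeded", "failed", "expired"],
  succeeded: [],
  failed: [],
  expired: [],
};

// Returns a copy of the record in the next state, or throws on an illegal move.
function transition(record, nextState) {
  const allowed = TRANSITIONS[record.state] ?? [];
  if (!allowed.includes(nextState)) {
    throw new Error(`Illegal idempotency transition: ${record.state} -> ${nextState}`);
  }
  return { ...record, state: nextState };
}
```

In real storage the same guard usually becomes a conditional update, for example one that only applies while the stored state is still processing.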

The processing state is the one that produces the hard bugs.

Say two requests overlap. The first one claims the key and starts the write, and the second arrives before that first request has stored its final response. The API has to define what the second request gets back. Common choices are a 409 Conflict with a code like IDEMPOTENCY_IN_PROGRESS, a 202 Accepted with a status resource when the operation model supports polling, or a short block until the first request finishes. There is also 425 Too Early, but that one is meant for HTTP early-data replay protection, so use it only when the request actually arrived in early data or through a hop that set Early-Data: 1. Blocking works fine within a single process, but across a fleet of processes it needs polling, pub/sub, or storage notifications to coordinate. Returning a conflict is simpler and hands the retry timing back to the client, and the full retry policy comes later in Chapter 27.

A stored failure response needs more thought, because not every failure should replay the same way. If the payload fails semantic validation after the key was already recorded, replaying that same 400 keeps the behavior stable and is the right call. The harder case is an upstream payment provider that times out, where the server genuinely cannot tell whether the charge committed. Storing a generic 500 there can freeze a wrong answer in place and hide the real state of the operation. Many payment-style APIs handle that by treating an unknown outcome as a recoverable record, one that needs reconciliation, a provider lookup, or a later status endpoint. The contract itself can stay small. When the server has a stored outcome, a replay returns it; when the outcome is still in progress or unknown, the replay returns a stable code that tells the client to try again later or inspect the resource directly.

The body you store should be the public API response, not the internal entity that sits behind it. Storing the public response means a replay can skip the handler code and any serialization differences entirely, and it freezes the response for that one attempt. If the payment resource changes ten seconds later, the replay still returns the original 201 body from the creation. Some APIs instead store a pointer to the created resource and rebuild the response on each replay. That saves storage space, but the rebuilt body can drift if the representation rules change later. Make that tradeoff deliberately.

A few response headers are worth keeping in the record as well. Content type, location, and idempotency replay metadata are the usual ones. Headers like the date, tracing headers, connection headers, cookies, and anything the framework generates per response should come from the live response path each time instead of the record. The rule of thumb is to store only the headers a client actually depends on.

Request fingerprints have to run on canonical input. JSON key order, extra whitespace, and defaulted fields can all turn the same logical request into different byte sequences. So for a JSON body, fingerprint the parsed and normalized value, after structural validation and before any semantic side effect. The fingerprint should cover the route identity, the method, and the fields that actually change the operation, and it should leave out headers that have no effect on the write. Tenant and principal come in through the idempotency scope, which keeps the body itself focused on operation data.

⚠️

Warning

Hash the parsed and normalized value rather than the raw request bytes. Reordered JSON keys, added whitespace, or a defaulted field all change the byte stream even though the operation is identical, and a hash mismatch then turns a valid retry into a 409 conflict that can block the client from ever finishing.
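Here is a minimal sketch of a canonical fingerprint using Node's built-in `crypto` module. The `canonicalize` helper is a simple sorted-key serializer written for this example, not a full canonicalization standard, and it assumes the body has already been parsed from JSON and had its defaults applied.

```javascript
const { createHash } = require("node:crypto");

// Serializes a JSON-compatible value with object keys sorted,
// so key order in the request body cannot change the output.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

// Fingerprints the method, the route identity, and the normalized body together.
function requestFingerprint(method, routeTemplate, normalizedBody) {
  const input = canonicalize({ method, route: routeTemplate, body: normalizedBody });
  return "sha256:" + createHash("sha256").update(input).digest("hex");
}

const a = requestFingerprint("POST", "/payments", { orderId: "ord_123", amount: 4999 });
const b = requestFingerprint("POST", "/payments", { amount: 4999, orderId: "ord_123" });
const c = requestFingerprint("POST", "/payments", { orderId: "ord_123", amount: 5000 });

console.log(a === b); // true: key order does not matter
console.log(a === c); // false: the amount changed the operation
```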

The body parser can fail before any record exists at all. A malformed JSON request returns 400 before the server ever claims an idempotency slot, and that is usually fine, because the server never reached a valid logical operation. If your clients expect every failed attempt to be replayable, you can record parse failures as well, though that makes key retention noisier and the error handling more involved.

The main failure window is between the side effect and the record update.

1. Claim idempotency key
2. Commit payment write
3. Store final idempotency response
4. Send HTTP response

If the process dies after step 2 but before step 3, a later replay sees processing even though the write already went through. How you fix that depends on where the side effect actually happens. If the idempotency record and the resource write share one database, write both in a single transaction wherever that is possible. If the side effect happens in an external payment provider, pass that provider the same idempotency key, or a provider-specific operation key, so the provider runs its own duplicate protection. Your service can then reconcile the state by asking the provider for the result of that operation.

The mechanics of database transaction isolation come in Chapter 18, but the API-level rule is already clear. Idempotency state and side-effect state have to stay in close enough agreement that a replay can pick the right response. If they can drift apart in your design, build a recovery path for processing records. That recovery might expire old records, look up created resources by a natural unique key, query an external provider, or return a stable response that says the operation is still resolving until a background repair finishes.

Idempotency keys also need abuse limits. A single client can fire millions of unique keys and force the server to store a record for each one. The deeper treatment of rate limiting and abuse detection waits until Chapter 25, but the local contract can already cap key length, require authentication before it records any key, and reject keys outright on endpoints that do not support idempotency.

Idempotency has a real limit worth being honest about. It guarantees a bounded result for repeated attempts inside one service contract, and nothing beyond that. Global exactly-once execution across databases, queues, external providers, and downstream systems is a distributed workflow problem, and distributed idempotent workflows get their own chapter later. At the level of one API, the promise is narrower but still useful. The same key, used for the same scoped operation while its record is still retained, gets the same result back or a defined conflict.

Pagination Is a Read Contract

A collection endpoint with no pagination will eventually cause a production incident. The first hundred orders come back in one response without any trouble. Once the collection grows to a million rows, that same query has to load all of them, and it pushes against memory, database time, serialization time, response size, and client parsing time at the same moment.

Pagination splits that collection response into bounded pages. The server picks a page size, returns one part of the collection, and gives the client enough information to ask for the next part. With that in place, both sides can repeat the read safely as many times as they need.

Offset pagination uses a numeric offset and limit.

GET /orders?limit=50&offset=100

That request asks for 50 records starting after the first 100 in the ordered result set. It is easy for a client developer to reason about, and it also supports page numbers through something like page=3&pageSize=50, which usually maps down to an offset internally.

Offset pagination drifts whenever the collection changes between two requests. A client fetches offset 0 and gets records 1 through 50. While it is reading them, another user inserts a brand-new order at the top. Every existing row moves down one position, so when the client fetches offset 50 it sees the old record 50 a second time. A deletion shifts the window the other way, and the client skips a record it never saw. The query was valid both times, but the set of rows the client saw was wrong.
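The drift is easy to reproduce with a plain array. This is a small simulation, not server code: offsetPage and the order identifiers are made up for the illustration, and the list is assumed to be sorted newest first.

```javascript
// Offset paging over an in-memory list that is already sorted newest first.
function offsetPage(rows, offset, limit) {
  return rows.slice(offset, offset + limit);
}

// Insert between requests: every row moves down one slot and a row repeats.
let growing = ["ord_5", "ord_4", "ord_3", "ord_2", "ord_1"];
const pageOne = offsetPage(growing, 0, 2);   // ord_5, ord_4
growing = ["ord_6", ...growing];             // a newer order arrives
const pageTwo = offsetPage(growing, 2, 2);   // ord_4, ord_3
const repeated = pageTwo.filter((id) => pageOne.includes(id));

// Delete between requests: every row moves up one slot and a row is skipped.
let shrinking = ["ord_5", "ord_4", "ord_3", "ord_2", "ord_1"];
const firstWindow = offsetPage(shrinking, 0, 2);   // ord_5, ord_4
shrinking = shrinking.filter((id) => id !== "ord_5");
const secondWindow = offsetPage(shrinking, 2, 2);  // ord_2, ord_1
const seen = [...firstWindow, ...secondWindow];
const skipped = shrinking.filter((id) => !seen.includes(id));

console.log({ repeated, skipped }); // repeated: ["ord_4"], skipped: ["ord_3"]
```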

Offset also gets slow at large offsets in a lot of data stores, because the storage engine often still has to walk past every row it skips. The details of index design come later, but the consequence is something an API designer should know about now. An offset=500000 can run far slower than offset=50 even though both return only 50 records.

Cursor pagination is the answer to that drift. The server hands back a cursor, an opaque value that marks a position in one specific ordered result set, and the client sends that value back on its next request.

GET /orders?limit=50&cursor=eyJjcmVhdGVkQXQiOi...

Keep the cursor opaque from the client's point of view. The server can pack a timestamp, an identifier, a direction, a filter hash, or a version into it. As long as clients only store the value and send it back untouched, the server is free to change that encoding later while the contract field name stays the same.

A page token is the client-facing value used to fetch another page. Different APIs name it differently, cursor in some, pageToken in others, and some return links instead. The token can wrap a cursor along with other state. The contract should spell out whether the token is opaque, how long it stays valid, and which query parameters have to stay the same when the client uses it.

{
  "items": [{ "id": "ord_841" }],
  "nextPageToken": "eyJjcmVhdGVkQXQiOi...",
  "hasMore": true
}

The response carries the data and the traversal state together. Here items holds the current page, nextPageToken points at the next position, and hasMore is a convenience flag the client can check. Some APIs skip hasMore, return only the token, and treat a missing token as the end of the collection. Either style is fine, as long as you document the one you chose and stay with it.

Stable ordering means every request in one pagination flow uses a deterministic order that includes a complete tie-breaker. The sort key is the field, or the set of fields, that orders the results. A sort key of createdAt DESC on its own is usually incomplete, because two records can share the exact same timestamp. createdAt DESC, id DESC is stable once id is unique and the direction of the comparison is fixed.

{
  "createdAt": "2026-06-10T12:30:00.000Z",
  "id": "ord_841",
  "direction": "next"
}

That decoded cursor holds the createdAt and id of the last item from the previous page. The next query asks for the records that come after that pair in the same ordering. The server can sign or encrypt the token before it goes out, and the decoded structure stays entirely on the server side.
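One way to turn that position into an opaque string is base64url-encoded JSON. The sketch below assumes that encoding and leaves out signing and encryption, so a client could still decode or forge the value. The names encodePageToken and decodePageToken are illustrative.

```javascript
// Encode a cursor position as base64url JSON. Unsigned in this sketch,
// so treat it as a format illustration rather than a tamper-proof token.
function encodePageToken(position) {
  return Buffer.from(JSON.stringify(position), "utf8").toString("base64url");
}

// Returns the decoded position, or null when the token is not usable.
function decodePageToken(token) {
  try {
    const value = JSON.parse(Buffer.from(token, "base64url").toString("utf8"));
    if (typeof value?.createdAt !== "string" || typeof value?.id !== "string") {
      return null;
    }
    return value;
  } catch {
    return null; // malformed base64url or malformed JSON
  }
}

const token = encodePageToken({
  createdAt: "2026-06-10T12:30:00.000Z",
  id: "ord_841",
  direction: "next",
});
console.log(decodePageToken(token).id); // ord_841
```

Returning null for anything the server cannot read gives the handler one place to map a bad token onto a stable error code.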

Cursor pagination handles a changing collection better than offset does, because the next page begins right after the last sort position the client saw and never relies on a numeric count. A record inserted before the cursor does not change the next page, and a record deleted before the cursor does not change what the cursor points to. Inserts or deletes that happen after the cursor can still move later pages, since the collection is live, but the traversal always has a stable anchor to resume from.

Figure: Two columns showing the same rows under offset and cursor pagination after a new row is inserted, with offset re-showing a row and cursor resuming cleanly. Offset pagination counts positions, so a row inserted between requests shifts the window and page two re-shows an already-returned row. A cursor anchored on a (createdAt, id) tuple resumes strictly after that anchor, so the same insert produces no duplicate.
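The same insert can be replayed against a cursor anchored on (createdAt, id). This is an in-memory sketch: cursorPage and compareDesc are hypothetical helpers, timestamps are assumed to be ISO strings in one fixed UTC format, and ids are assumed to compare correctly as strings.

```javascript
// Order: createdAt DESC, then id DESC as the tie-breaker.
// A positive result means `a` comes later in the listing than `b`.
function compareDesc(a, b) {
  if (a.createdAt !== b.createdAt) return a.createdAt < b.createdAt ? 1 : -1;
  if (a.id !== b.id) return a.id < b.id ? 1 : -1;
  return 0;
}

function cursorPage(rows, limit, after = null) {
  const sorted = [...rows].sort(compareDesc);
  // Resume strictly after the anchor, never at a numeric position.
  const remaining = after ? sorted.filter((row) => compareDesc(row, after) > 0) : sorted;
  const items = remaining.slice(0, limit);
  const hasMore = remaining.length > limit;
  const last = items[items.length - 1];
  return { items, hasMore, next: hasMore ? { createdAt: last.createdAt, id: last.id } : null };
}

const orders = [
  { id: "ord_1", createdAt: "2026-06-10T12:00:00.000Z" },
  { id: "ord_2", createdAt: "2026-06-10T12:30:00.000Z" },
  { id: "ord_3", createdAt: "2026-06-10T12:30:00.000Z" }, // same timestamp as ord_2
  { id: "ord_4", createdAt: "2026-06-10T13:00:00.000Z" },
];

const firstPage = cursorPage(orders, 2);                 // ord_4, ord_3
orders.push({ id: "ord_5", createdAt: "2026-06-10T14:00:00.000Z" }); // insert at the top
const secondPage = cursorPage(orders, 2, firstPage.next); // ord_2, ord_1
```

The new ord_5 sorts before the anchor, so the second page is unaffected by it. The two rows sharing 12:30 are split across the page edge without a repeat because the id tie-breaker is part of the anchor.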

What you pay for that is more contract complexity. Cursor pagination needs stable sort keys, opaque token handling, token validation, and defined behavior for when the filters or sort order change between requests. A token issued for status=paid belongs to that status=paid traversal and nothing else. If a client pairs that token with status=refunded, the server can reject the request and return a stable code like PAGE_TOKEN_QUERY_MISMATCH.

Page size is part of the contract as well. You define a default, a maximum, and the behavior on an invalid value. A missing limit might default to 50. A limit=10000 might clamp down to 100, or it might fail outright with LIMIT_TOO_LARGE, and failing is usually easier for a client to notice than a silent clamp. A limit=0 needs a defined result too, because returning an empty page together with a next token can trap a client in a loop, which is why a lot of APIs simply reject zero and negative limits.
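Those rules fit in a few lines. The sketch below uses the default of 50 and the maximum of 100 from the paragraph above and takes the stricter option of rejecting oversized values. LIMIT_INVALID is an assumed code name, and parseLimit is a hypothetical helper that receives the raw query string value.

```javascript
const DEFAULT_LIMIT = 50;
const MAX_LIMIT = 100;

// `raw` is the query string value, or undefined when the parameter is absent.
function parseLimit(raw) {
  if (raw === undefined) return { ok: true, limit: DEFAULT_LIMIT };
  // Digits only: rejects negatives, decimals, empty strings, and repeated params.
  if (typeof raw !== "string" || !/^\d+$/.test(raw)) {
    return { ok: false, code: "LIMIT_INVALID" };
  }
  const limit = Number(raw);
  if (limit < 1) return { ok: false, code: "LIMIT_INVALID" };
  if (limit > MAX_LIMIT) return { ok: false, code: "LIMIT_TOO_LARGE" };
  return { ok: true, limit };
}
```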

Empty pages need their own rules. A filter can legitimately match nothing, and that case should return items: [] with no next token at all. A cursor can also end up pointing past the last visible item after records are deleted, and there you can return either an empty terminal page or a token-expired response. Pick whichever one fits, and write a test that proves a client loop actually terminates against it.

⚠️

Warning

Reject limit=0 and negative limits outright, and never return a nextPageToken on a terminal or empty page. A token sitting on an empty page is the most common reason a client ends up polling forever. A next token has to mean that more rows exist, every single time, with no exceptions.
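A termination test needs a client loop to run against. Here is one possible shape, written as a synchronous generator so it is easy to exercise in a unit test; a real client would await a network call instead. walkCollection, fetchPage, and the maxPages guard are all illustrative.

```javascript
// Walks a collection page by page until the server stops returning a token.
// fetchPage(token) is injected and returns { items, nextPageToken? }.
function* walkCollection(fetchPage, maxPages = 1000) {
  let token; // undefined asks for the first page
  for (let page = 0; page < maxPages; page += 1) {
    const { items, nextPageToken } = fetchPage(token);
    yield* items;
    if (!nextPageToken) return; // a missing token marks the terminal page
    token = nextPageToken;
  }
  // Guard against a server that keeps handing out tokens forever.
  throw new Error("pagination did not terminate");
}
```

Pointing this loop at a fake server that returns an empty page with a token makes the guard throw, which is the failure the warning above describes.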

Links are another way to package this. Some APIs return a next URL as the way to continue.

{
  "items": [{ "id": "ord_841" }],
  "links": { "next": "/orders?pageToken=eyJj..." }
}

The link carries the same contract information that nextPageToken would, and the client just follows it directly. The server still has to validate the token's scope, filters, sort order, and expiration before it serves the page. Switching to links only changes how the continuation is packaged, while the traversal mechanics underneath are identical.

Sorting, Cursors, and Page Drift

A cursor only works when the ordering is total, meaning any two records have a defined relative order that the server can resume from.

ORDER BY created_at DESC leaves groups of records that share the same timestamp. The database is free to return those tied records in a different order from one query to the next, especially after writes, query plan changes, or vacuuming in storage engines that physically move rows around. A cursor built from created_at alone has no way to tell which of the tied records the client has already seen.

⚠️

Warning

Every cursor sort needs a fully unique tail key. With createdAt DESC alone, two rows that share a timestamp can land on opposite sides of a page edge, so the next page either skips them or repeats them. Always append a unique tie-breaker like id, and store both values in the token.

Add a tie-breaker.

GET /orders?limit=50&sort=-createdAt,-id

That sort means the newest createdAt comes first, and for records that share a timestamp, the higher id comes first. The cursor stores both of those values from the last item on the page, and that pair is what the next query resumes from.

Keep the set of public sort options small. An API that accepts arbitrary sort values lets clients order by fields that have no stable meaning, surface internal column names, or kick off expensive queries. A tighter contract spells out exactly which sort keys and directions it allows.

allowed sort:
- createdAt
- updatedAt
- id

That short list is itself a contract. An unknown sort key should come back as an error response with a stable code. Falling back silently to the default sort creates hidden bugs, because the client goes on believing its sort took effect.

The default ordering needs to be named and pinned down as well. If GET /orders defaults to -createdAt,-id, write that down and put a test around it. Changing that default to id later would quietly reorder results for every client that left sort off the request. Adding a default record-status filter could hide records that some dashboard expected to see. In both cases the route and the response schema look exactly the same, and only the behavior underneath has moved.
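A sort parser built on that allow-list might look like the following. It assumes the comma-separated, minus-prefixed syntax shown earlier and a default of -createdAt,-id. The SORT_INVALID and SORT_KEY_UNSUPPORTED codes are assumed names for the stable errors.

```javascript
const ALLOWED_SORT_KEYS = new Set(["createdAt", "updatedAt", "id"]);

// Assumed default ordering, pinned here so a test can assert on it.
const DEFAULT_SORT = [
  { key: "createdAt", direction: "desc" },
  { key: "id", direction: "desc" },
];

function parseSort(raw) {
  if (raw === undefined) return { ok: true, sort: DEFAULT_SORT };
  if (typeof raw !== "string" || raw === "") {
    return { ok: false, code: "SORT_INVALID" };
  }
  const sort = [];
  for (const part of raw.split(",")) {
    const descending = part.startsWith("-");
    const key = descending ? part.slice(1) : part;
    // Unknown keys fail loudly instead of falling back to the default.
    if (!ALLOWED_SORT_KEYS.has(key)) {
      return { ok: false, code: "SORT_KEY_UNSUPPORTED", key };
    }
    sort.push({ key, direction: descending ? "desc" : "asc" });
  }
  return { ok: true, sort };
}
```

The parser returns public key names only. Translating them to storage columns, and appending a tie-breaker when the client leaves one off, would happen in a later step inside the handler.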

Cursor tokens should usually bind to the query parameters that produced them. The token can carry a hash of the filter and sort parameters. Then if a client changes status, customerId, sort, or limit while reusing the same token, the server can reject it. Some APIs do allow limit to change across pages while others keep it fixed, and either choice works fine as long as the token validation matches the rule you documented.

Opaque tokens are what give you implementation freedom. A token can be a base64url-encoded JSON object today and a signed binary blob six months from now, and clients should send it back exactly as they received it. As soon as clients start parsing cursor tokens, the internal fields inside them become API fields by accident, and what was a private storage detail becomes a compatibility obligation you did not mean to take on.

Token expiration is a behavior contract in its own right. How long a token stays valid depends on what it carries. One that holds raw record positions might be good for hours, while one tied to a temporary search snapshot might expire in minutes, and one that embeds a signing key version can expire the moment that key rotates. Whenever a token does expire, return a clear error like PAGE_TOKEN_EXPIRED and tell the client to restart from the first page.
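Binding and expiry can both live inside the token payload. The sketch below stores a canonical form of the bound parameters rather than a hash, and it does not sign the token, so treat it as an illustration of the checks, not of tamper resistance. It assumes limit is allowed to change between pages. PAGE_TOKEN_INVALID is an assumed code; the other two codes come from the text above.

```javascript
// Parameters the token is bound to. limit is deliberately left out here.
const BOUND_PARAMS = ["customerId", "sort", "status"];

function canonicalQuery(query) {
  return BOUND_PARAMS
    .map((name) => `${name}=${encodeURIComponent(String(query[name] ?? ""))}`)
    .join("&");
}

function issuePageToken(position, query, now, ttlMs) {
  const payload = { position, boundTo: canonicalQuery(query), expiresAt: now + ttlMs };
  return Buffer.from(JSON.stringify(payload), "utf8").toString("base64url");
}

function readPageToken(token, query, now) {
  let payload;
  try {
    payload = JSON.parse(Buffer.from(token, "base64url").toString("utf8"));
  } catch {
    return { ok: false, code: "PAGE_TOKEN_INVALID" };
  }
  if (payload === null || typeof payload !== "object" || typeof payload.expiresAt !== "number") {
    return { ok: false, code: "PAGE_TOKEN_INVALID" };
  }
  if (now >= payload.expiresAt) {
    return { ok: false, code: "PAGE_TOKEN_EXPIRED" };
  }
  if (payload.boundTo !== canonicalQuery(query)) {
    return { ok: false, code: "PAGE_TOKEN_QUERY_MISMATCH" };
  }
  return { ok: true, position: payload.position };
}
```

A production version would more likely hash the canonical query and sign the whole payload, which keeps the token short and prevents clients from editing it.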

Backward pagination adds another layer of rules. Some APIs support a previousPageToken, while others only ever move forward. Going both directions means putting a direction into the token and getting the comparison operators right for each way. A forward-only API is simpler to reason about, and it is often all that backend clients syncing a list actually need.

Counting the total number of results is a separate contract again. A totalCount field is tempting, but it can be expensive to compute or stale by the time the client reads it. If the API does return it, define whether the number is exact, approximate, or computed once at the first page. For a lot of APIs, hasMore together with nextPageToken gives clients everything they need to traverse the collection without the server ever promising a global count.

Filtering and Sorting

Filtering narrows a collection down using conditions the client supplies. The syntax and the set of operators the API accepts for those conditions make up the filter grammar, and the first version of that grammar should stay small.

GET /orders?status=paid&createdAfter=2026-06-01

That request uses named query parameters. Here status probably maps to an enum and createdAfter to an ISO timestamp. The server validates each field on its own and rejects any name it does not recognize. For most REST-style APIs, named parameters stay easier to document than a small query language of your own.

A richer grammar tends to grow fast.

GET /orders?filter=status:eq:paid,createdAt:gte:2026-06-01

That single filter parameter packs fields, operators, and values into one string. It can work, but now every character in that string is part of a parser contract you have to define. You have to decide how commas get escaped, which operators exist, and how an invalid field gets reported. A grammar that ships with two operators can grow into a full custom query language after only a few product requests.

Named parameters usually make the better first contract.

status=paid
customerId=cus_123
createdAfter=2026-06-01T00:00:00Z
createdBefore=2026-07-01T00:00:00Z
minTotal=1000

Each one has a type, a validation rule, and an error code of its own. Unknown parameters need a deliberate response rather than an accident. If the server silently ignores them, client bugs stay hidden, and a simple typo in a filter name can quietly change which production data comes back.

⚠️

Warning

Reject unknown query parameters with a 400 and the validation envelope. When a misspelled filter is silently dropped, the client believes it narrowed the result while the server quietly returns everything, and that bug shows up only as wrong data in production instead of as a clear error.
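A minimal sketch of that rejection, assuming the parsed query arrives as a plain object. The list of known names and the unknown_parameter reason string are assumptions for illustration; the envelope follows the error.code plus error.fields structure this article uses for query errors.

```javascript
// Every query parameter the endpoint documents. Anything else is rejected.
const KNOWN_PARAMS = new Set([
  "status", "customerId", "createdAfter", "createdBefore", "minTotal",
  "limit", "sort", "pageToken",
]);

// Returns null when the query is clean, or a 400 response description.
function rejectUnknownParams(query) {
  const unknown = Object.keys(query).filter((name) => !KNOWN_PARAMS.has(name));
  if (unknown.length === 0) return null;
  const fields = Object.fromEntries(unknown.map((name) => [name, ["unknown_parameter"]]));
  return {
    status: 400,
    body: { error: { code: "QUERY_PARAMETER_INVALID", fields } },
  };
}

// A misspelled filter is reported instead of being dropped.
console.log(rejectUnknownParams({ staus: "paid" }).body.error.fields);
```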

Filtering also touches authorization, though this chapter stays shallow on that. A user can ask for customerId=cus_123 and still be allowed to see only some of those orders. Authorization policy gets its full treatment in Chapter 24. At the contract level, the thing to state plainly is what a client sees, which is that the collection returns only the resources the caller is allowed to view, and any filters apply inside that already-visible set.

Sorting follows the same pattern as filtering. You allow a small list of public sort keys and define the syntax for direction. Common forms are sort=-createdAt for descending, sort=createdAt for ascending, or a separate pair like sortBy=createdAt&sortDir=desc. Settle on one of those, validate it on the way in, and return stable errors for any unsupported field or direction.

Reject internal field names outright. A value like sort=db_created_ts leaks your storage naming, and filter[deleted_at]=null leaks the fact that you use soft deletes. The public filter names should line up with the concepts in the resource representation that clients already see, while the actual table columns stay inside the handler, which translates between the two.

Filtering can change pagination behavior as well. Cursor tokens should include, or at least bind to, the filter set that produced them. A token issued for one filtered collection only belongs to that filtered collection, so when a client changes the filter, the traversal has to start over. That rule is what keeps clients from getting page sequences that are half old data and half new.

Open-ended filters need resource limits of their own. A request with no filter at all, or one that reaches far back into old data, can force the server to scan far too many rows. The API can require a date range, enforce a maximum window size, cap the page size, or return a validation error when a request is too broad. Index design shows up in later chapters, but the API structure itself can already stop pathological requests from becoming a normal, accepted path through the contract.

Invalid filter behavior should reuse the same error envelope you already use for body validation. The request was structurally something the API understood, but the query parameter contract rejected one of its fields, values, operators, or combinations.

{
  "error": {
    "code": "QUERY_PARAMETER_INVALID",
    "fields": { "createdAfter": ["invalid_timestamp"] }
  }
}

With that, a client can flag a broken filter control the same way it flags a bad body field. The query errors reuse the same envelope, the field paths point at query parameter names, and the codes come from the same validation vocabulary as the rest of the API.

Combinations of parameters need rules of their own. A createdAfter that lands later than createdBefore can pass field-level parsing and still fail semantic validation. A pairing like status=archived&includeDrafts=true might not be supported at all. And sort=total might be allowed only when currency is held fixed. All of these are cross-parameter rules, and they belong in the query parameter contract, because a client can run into them using nothing but query parameters.
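The date-range rule shows the two layers clearly: each value is parsed on its own first, and the cross-parameter check runs only when both parsed. In this sketch the accepted timestamp formats and the range_start_after_end reason string are assumptions, and validateCreatedRange is a hypothetical helper.

```javascript
// Accepts YYYY-MM-DD or a full UTC timestamp such as 2026-06-01T00:00:00Z.
const ISO_DATE_OR_UTC = /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(\.\d{3})?Z)?$/;

function parseTimestamp(raw) {
  if (typeof raw !== "string" || !ISO_DATE_OR_UTC.test(raw)) return null;
  const ms = Date.parse(raw);
  return Number.isNaN(ms) ? null : ms;
}

// Returns null when the range is acceptable, or a 400 response description.
function validateCreatedRange(query) {
  const fields = {};
  const bounds = {};
  for (const name of ["createdAfter", "createdBefore"]) {
    if (query[name] === undefined) continue;
    const ms = parseTimestamp(query[name]);
    if (ms === null) fields[name] = ["invalid_timestamp"];
    else bounds[name] = ms;
  }
  // Cross-parameter rule: both values are valid alone but contradict each other.
  if ("createdAfter" in bounds && "createdBefore" in bounds
      && bounds.createdAfter > bounds.createdBefore) {
    fields.createdAfter = ["range_start_after_end"];
  }
  if (Object.keys(fields).length === 0) return null;
  return { status: 400, body: { error: { code: "QUERY_PARAMETER_INVALID", fields } } };
}
```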

Field Projection and Response Size

Field projection lets a client request a subset of fields in the response representation.

GET /orders/ord_42?fields=id,status,total,createdAt

The client is asking for four fields out of the full order representation. The server returns those four and leaves out the rest, following a documented projection rule. Some APIs call this sparse fieldsets. What it buys you is control over response size, and it helps most when a representation carries large nested objects or optional expansions.

Projection helps most when different clients want different amounts of data. A list screen might only need id, status, and total, whereas a detail screen needs the line items, the shipping address, adjustments, and audit fields. Sending the full detail representation for every row in a list wastes both bytes on the wire and serialization work on the server.

Projection also creates pressure on the contract. Every field a client is allowed to project becomes a public name. If clients can request internalFraudScore, marginCents, or deletedAt, those names are now part of the API whether the team intended to publish them or not. The projection list should be an allow-list built from the public representation fields, and nothing else.

🚨

Caution

Resolve fields against an explicit allow-list of public names rather than against the underlying entity. A pass-through projection would let a client name internalFraudScore, marginCents, or deletedAt and read fields you never published. The allow-list is the thing that keeps internal columns out of responses.
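As a sketch, the allow-list check can run before any data is touched, and the trimming can run against the public representation instead of the stored entity. The field list, the FIELDS_INVALID and FIELDS_UNSUPPORTED codes, and both function names are assumptions for illustration.

```javascript
// Public, projectable field names. Internal columns never appear here.
const PROJECTABLE_FIELDS = new Set(["id", "status", "total", "createdAt"]);

function parseFields(raw) {
  if (raw === undefined) return { ok: true, fields: null }; // null means full representation
  if (typeof raw !== "string" || raw === "") {
    return { ok: false, code: "FIELDS_INVALID" };
  }
  const requested = raw.split(",");
  const unknown = requested.filter((name) => !PROJECTABLE_FIELDS.has(name));
  if (unknown.length > 0) {
    return { ok: false, code: "FIELDS_UNSUPPORTED", unknown };
  }
  return { ok: true, fields: [...new Set(requested)] };
}

// `representation` is the already-public response object, not the database row.
function project(representation, fields) {
  if (fields === null) return representation;
  return Object.fromEntries(
    fields
      .filter((name) => Object.hasOwn(representation, name))
      .map((name) => [name, representation[name]]),
  );
}
```

Because internalFraudScore is not in the allow-list, a request for fields=id,internalFraudScore fails in parseFields and never reaches the data layer.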

Nested projection needs more caution.

GET /orders/ord_42?fields=id,customer.id,customer.name

Nested paths can work well. They also drag in a path syntax, rules for missing fields, and rules for arrays. You have to decide whether items.sku means every item should carry sku, whether an unknown nested field is an error or just ignored, and whether a parent object shows up at all when only its child fields were requested.

For this subchapter, keep projection limited to trimming an existing representation. The full execution model is what GraphQL handles, and that comes next. A REST-style projection parameter works best when all it does is drop fields from a representation the client already knows. It should refuse arbitrary joins, computed subqueries, or any resolver-style behavior presented as a plain field name.

Projection also interacts with response validation. Subchapter 02 treated the response schema as a complete representation, and a projected response breaks that assumption, so it needs either its own schema or a documented sparse-response policy. A generated client has to be able to tell a field that was omitted on purpose from a field that is missing because of a server bug.

Default representations need to stay stable as well. Adding a new field to the default response is usually safe, since most clients ignore fields they do not know, although a strict decoder can still choke on it. Removing a field that clients were relying on is a straight break. Projection takes some of this pressure off, because it lets clients ask for exactly the fields they want, but that only holds as long as the projection contract itself stays stable.

Documenting Behavior in OpenAPI

OpenAPI can hold all of these behavior contracts, as long as the team actually writes them down as operation details instead of leaving them buried in handler-only code.

Error responses need schemas of their own. For an API that uses Problem Details, define the problem schema together with its extensions. For one with a custom envelope, define that envelope a single time and reference it from every operation. Each operation should then list the status codes it can return for the expected, client-visible failures.

Idempotency keys belong in header parameters for the write operations that support them.

parameters:
  - name: Idempotency-Key
    in: header
    required: false
    schema: { type: string, maxLength: 128 }
responses:
  "409": { description: Idempotency key conflict }

That snippet documents the header and one conflict response. The operation around it still has to describe the retention window, the replay behavior, and the same-key-different-payload behavior, either in prose or in a shared description. A schema field can enforce something like the string length of the key, but it cannot capture the storage semantics behind idempotency on its own.

Pagination needs both a response schema and its query parameters documented. Define limit, the pageToken or cursor parameter, the maximum page size, the default ordering, and the token error responses. The schema should show items along with nextPageToken or the links. The operation description should spell out that the token is opaque and how it binds to the query parameters.

Filters and sort fields should use enums wherever the set is known.

parameters:
  - name: sort
    in: query
    schema:
      type: string
      enum: ["createdAt", "-createdAt", "id", "-id"]

The enum turns the supported sort values into a machine-readable contract. A generated client can surface exactly those options, and a test can check the spec against the handler's validation to make sure they agree. An unsupported value then gets a defined error response rather than whatever undefined behavior the server would otherwise fall into.
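That agreement check can be a very small test helper. The sketch below assumes the enum has already been read out of the OpenAPI document into an array and that the handler exposes its validation as a function. findContractDrift, handlerAcceptsSort, and the probe values are all hypothetical.

```javascript
// The enum as it would be read from the OpenAPI document.
const specSortEnum = ["createdAt", "-createdAt", "id", "-id"];

// Stand-in for the handler's own validation of a single sort value.
const HANDLER_SORT_KEYS = new Set(["createdAt", "id"]);
function handlerAcceptsSort(value) {
  const key = value.startsWith("-") ? value.slice(1) : value;
  return HANDLER_SORT_KEYS.has(key);
}

// Reports values where the document and the handler disagree.
// `probes` are extra values to try beyond what the spec lists.
function findContractDrift(specEnum, accepts, probes = []) {
  const documented = new Set(specEnum);
  return {
    documentedButRejected: specEnum.filter((value) => !accepts(value)),
    acceptedButUndocumented: probes.filter((value) => accepts(value) && !documented.has(value)),
  };
}

const drift = findContractDrift(specSortEnum, handlerAcceptsSort, ["updatedAt", "-updatedAt", "total"]);
```

Both arrays in the result should be empty. A non-empty documentedButRejected means the spec promises something the server refuses, and a non-empty acceptedButUndocumented means the server has grown behavior the contract does not mention.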

Projection parameters can document their allowed field names as well, though OpenAPI gets verbose once you try to describe a comma-separated field list. A schema can describe the parameter as a plain string while prose or examples carry the actual grammar. If projection is central to the API, a more structured query parameter style can be worth the extra specification effort.

Examples help most when they show the behavior and the response structure side by side. A good set includes a validation error map, an idempotency replay response, a paginated response carrying a token, and an invalid filter response. Once tests compare real response fixtures against the contract, those examples turn into executable expectations.

OpenAPI should carry the client-facing behavior and leave the implementation notes out. There is a real difference between a deployment detail and a contract. Saying the records are stored in Redis for 24 hours is a deployment detail, while promising that idempotency keys are retained for at least 24 hours is a contract clients can build on. Saying the query uses a created_at,id index is a database concern, while stating that the default ordering is createdAt descending and then id descending is part of the API contract.

Compatibility Risks

Behavior changes can break clients even when every schema check still passes.

Start with errors. A client that branches on a machine-readable error code stops working the moment that code is renamed, and a client with a strict enum can fail just from seeing a code it has never encountered. Clients that map validation errors onto form fields depend on both the field map and the path syntax staying put, so removing a field from the map or switching the path format quietly breaks their error display.

Idempotency has its own set of risks. If you shorten key retention, retries that used to be protected can come back as duplicate operations once they fall outside the shorter window. Changing the key scope is subtle, because keys that were separate can start colliding, or keys that were shared can split apart. There is also the replay response itself. A client that expects a 201 after creation will break if the replay starts returning 200, and a client that expected the replayed values to match the original attempt can be surprised if the server rebuilds a fresh representation instead.

Pagination changes tend to stay invisible until the underlying data moves. A change to the default sort order shifts every page for clients that never sent a sort of their own. A dropped tie-breaker brings back exactly the duplicate and skipped records it was there to prevent. The cursor token format is only safe to change when clients have always treated tokens as opaque and the server keeps honoring old tokens through a transition window. A different maximum page size breaks any client that assumed a fixed batch count.

Filtering and projection carry the same kind of risk. A new default filter makes records disappear from views that used to show them. A renamed filter parameter stops every URL built on the old name. When an unknown filter is silently ignored, the client stays convinced it narrowed the data while the server hands back a broader collection. And once a projectable field is removed, any client that trimmed its payloads around it breaks.

The mechanics of actually migrating through these changes belong to the versioning chapter. The rule to take away from this one is simpler. If a client can observe something, branch on it, store it, retry with it, or send it back to you later, then it is part of the API contract. Handler code underneath is free to change however you like. Anything a client can see deserves the same care you already give to route names and JSON schemas.

Related Reading