API Design, Contracts & Frameworks

OpenAPI, JSON Schema, and Validation Boundaries

Ishtmeet Singh (@ishtms) · June 11, 2026 · 52 min read

#nodejs #api-design #openapi #json-schema #validation

A single Node route can accept one structure, document a second, and type-check against a third. Now you have three descriptions of the same endpoint, and they do not have to agree. Nothing in the build catches the mismatch.

Everything reads as healthy at first. The route compiles, the OpenAPI document looks plausible, and the TypeScript type makes the handler feel pinned down. At runtime that confidence breaks down. The handler accepts a field the spec never mentioned and coerces a query string into the wrong value. It returns an internal field that was supposed to stay private. Then the generated client you shipped weeks ago is still trusting a document that has since gone stale.

A route this short already has the problem.

type CreateOrder = { sku: string; quantity: number };

app.post('/orders', async (req, res) => {
  const order = await createOrder(req.body);
  res.status(201).json(order);
});

Notice how little this handler actually guarantees. It reads whatever the framework left on req.body. The TypeScript type exists only where the source code is compiled, and by the time a real request runs, that type is already gone. The input arrives as bytes, gets parsed as JSON, and becomes ordinary JavaScript values. Somewhere between that raw input and your application code, something has to check it.
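A minimal sketch of that gap, assuming the body arrives as a JSON string. The payload and the isCreateOrder helper are made up for illustration and do not come from any framework.

```typescript
type CreateOrder = { sku: string; quantity: number };

// A cast only informs the compiler. Nothing is checked when this line runs.
const raw = '{"sku": 42, "quantity": "many"}';
const trusted = JSON.parse(raw) as CreateOrder;

// Hypothetical runtime guard: the check the type alone cannot perform.
function isCreateOrder(value: unknown): value is CreateOrder {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.sku === 'string' && typeof record.quantity === 'number';
}

// The compiler believes sku is a string; at runtime it is still a number.
const castLooksTyped = typeof trusted.sku;
const guardResult = isCreateOrder(JSON.parse(raw));
```

The cast compiles cleanly and the wrong value still reaches the code that trusts it. Only the guard runs against the real input.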

OpenAPI and JSON Schema both do their work at exactly that point.

These are two different tools. OpenAPI is a machine-readable contract document. It describes paths, methods, parameters, request bodies, responses, and the reusable pieces those descriptions share. JSON Schema is narrower. It is a runtime schema language for data, and it describes which JSON values count as acceptable in a form a validator can run against real input.

Both live right next to the route, and they still do different jobs.

raw HTTP input
  -> HTTP parser and body parser
  -> request validation
  -> handler input
  -> handler output
  -> serialization
  -> response validation
  -> HTTP response

Call the inbound check the validation point. It is where data crosses from an untrusted external form into the structure the handler is allowed to depend on. For a Node HTTP API it usually covers path parameters, query parameters, a few selected headers, and the parsed request body. On the way out, a separate check can verify the response structure before any bytes leave the process.

The framework name is less important than where you put the check. Express, Fastify, Hono, Koa, and raw node:http all take in external input, and each one needs its own policy for validating, coercing, and serializing it. Framework internals come up in Subchapter 03. This subchapter stays on the contract layer.

One Route, Several Artifacts

We can keep the route from the previous chapter's resource model small. It creates an order.

paths:
  /orders:
    post:
      operationId: createOrder
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateOrder'

That fragment carries two things, a path and one operation. The path item is the object stored under a path like /orders, and it groups whatever operations that path supports. An operation is one method-level contract under it, like post, get, or delete. Each operation spells out the request and response a client can see for that one method on that one path.

operationId gives tooling a stable name to work from. A generator turns createOrder into a client method, documentation anchors its examples to the same id, and server-side tooling uses it to map the operation back to a route. Once a consumer generates code from that name, it becomes part of your public API surface.

Responses live on the same operation.

responses:
  '201':
    description: Created
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/Order'

The operation now points in two directions at once. Its request body references CreateOrder, and its 201 response references Order. The handler sits between them, taking validated request data and returning data that is supposed to match the response schema.

Reusable named objects live under components. In backend APIs the common one is a schema, but the same place holds parameters, responses, headers, examples, and security scheme descriptions. Schemas are what this subchapter spends its time on.

components:
  schemas:
    CreateOrder:
      type: object
      required: [sku, quantity]
      properties:
        sku:
          type: string

That begins a schema component. Anything else in the document can reference it. References use $ref, and the string after it points at another location, either inside the same document or in an external one.

properties:
  sku:
    type: string
  quantity:
    type: integer
    minimum: 1
additionalProperties: false

The rest of the object finishes the request schema. quantity joins sku under properties. additionalProperties: false sits next to properties, so it covers the whole object. quantity has to be an integer of at least 1. Any field other than sku and quantity gets rejected, as long as the validator enforces that keyword.

Now the handler has a concrete structure to work against.

app.post('/orders', async (req, res) => {
  const input = validateCreateOrder(req.body);
  const order = await createOrder(input);
  res.status(201).json(order);
});

validateCreateOrder() is the runtime gate. You can generate the TypeScript type from the schema, write it by hand next to it, or infer it through a library. The runtime check is what decides whether the real request body reaches application code.
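One way validateCreateOrder() could look when written by hand against the CreateOrder schema above. This is a sketch, not a recommendation over a schema-driven validator, and the ValidationError class and its issues field are assumptions introduced here.

```typescript
type CreateOrder = { sku: string; quantity: number };

// Hypothetical error type; real validators report failures in their own shape.
class ValidationError extends Error {
  readonly issues: string[];
  constructor(issues: string[]) {
    super(`Invalid CreateOrder: ${issues.join('; ')}`);
    this.issues = issues;
  }
}

const allowedKeys = new Set(['sku', 'quantity']);

function validateCreateOrder(body: unknown): CreateOrder {
  if (typeof body !== 'object' || body === null || Array.isArray(body)) {
    throw new ValidationError(['body must be an object']);
  }
  const record = body as Record<string, unknown>;
  const issues: string[] = [];

  // required: [sku, quantity], plus the per-property rules
  const sku = record.sku;
  if (typeof sku !== 'string') issues.push('sku must be a string');
  const quantity = record.quantity;
  if (typeof quantity !== 'number' || !Number.isInteger(quantity) || quantity < 1) {
    issues.push('quantity must be an integer >= 1');
  }
  // additionalProperties: false
  for (const key of Object.keys(record)) {
    if (!allowedKeys.has(key)) issues.push(`unexpected property: ${key}`);
  }

  if (issues.length > 0) throw new ValidationError(issues);
  return { sku: sku as string, quantity: quantity as number };
}
```

The function returns a fresh object with only the declared fields, so the handler depends on the validated value and never on req.body directly.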

Traced end to end, the flow is mechanical.

POST /orders
  -> OpenAPI operation: createOrder
  -> request schema: CreateOrder
  -> validator function
  -> handler input
  -> response schema: Order

Every one of those arrows can drift apart. Maybe the route path changes while the OpenAPI document stays old. A schema might accept a field that the handler ignores, or the handler might start requiring a field that the schema still marks optional. A generated client can keep calling an operationId that was renamed since. A response can return data that the response schema never mentioned.

Most contract work comes down to keeping those arrows pointed at the same thing.
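A drift check for the first arrow can be small. The sketch below assumes you can list the routes your server registered and that they use Express-style :param segments. The OpenApiDoc and RegisteredRoute shapes are simplified stand-ins, not a real library's types.

```typescript
type HttpMethod = 'get' | 'post' | 'put' | 'patch' | 'delete';
type OpenApiDoc = {
  paths: Record<string, Partial<Record<HttpMethod, { operationId?: string }>>>;
};
type RegisteredRoute = { method: HttpMethod; path: string };

// Express-style ":id" segments become OpenAPI-style "{id}" segments.
function toOpenApiPath(path: string): string {
  return path.replace(/:([A-Za-z0-9_]+)/g, '{$1}');
}

function findDrift(doc: OpenApiDoc, routes: RegisteredRoute[]) {
  const documented = new Set<string>();
  for (const [path, item] of Object.entries(doc.paths)) {
    for (const method of Object.keys(item)) documented.add(`${method} ${path}`);
  }
  const implemented = new Set(
    routes.map((route) => `${route.method} ${toOpenApiPath(route.path)}`),
  );
  return {
    // Routes the server exposes that the document never mentions.
    undocumented: Array.from(implemented).filter((key) => !documented.has(key)),
    // Operations the document promises that no route serves.
    unimplemented: Array.from(documented).filter((key) => !implemented.has(key)),
  };
}
```

Run in CI, a non-empty result in either list fails the build. It only covers paths and methods; schema-level drift needs its own comparison.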

Figure: One operation points at its request schema and its response schema, with the handler between them. The same inert document is read independently by a doc generator, a client generator, a server-type generator, a runtime validator, and a CI drift check. Each solid arrow is a place two artifacts can drift apart.

Reading an OpenAPI Operation

Most backend work needs far less of OpenAPI than the full specification. You only really need a handful of pieces, and the same handful covers almost every route. Those pieces are the path and its methods, the parameters, the body, the responses, and the shared components that everything references.

Start with a route containing a path parameter.

paths:
  /accounts/{accountId}/orders:
    post:
      operationId: createAccountOrder
      parameters:
        - $ref: '#/components/parameters/AccountId'

OpenAPI writes {accountId} into the path template. That value comes out of the route path, so the operation has to list it as a parameter. Parameters can sit in the path, the query string, headers, or cookies. For resource APIs you mostly deal with the first two.

The reusable parameter can live under components.

components:
  parameters:
    AccountId:
      name: accountId
      in: path
      required: true
      schema:
        type: string

The in: path field says where the value comes from. required: true fits a path parameter because the route template cannot match without that segment. The nested schema says the raw value should validate as a string. Whether the account actually exists, and whether this caller may create an order for it, is left to a later semantic check.

$ref is the reference keyword. In both OpenAPI and JSON Schema it points from one spot in the document to another schema or object. A local reference uses a fragment that starts with #.

#/components/parameters/AccountId is JSON Pointer syntax after the #. JSON Pointer is a small path language for picking a value out of a JSON document. Each / steps down one object key or array index. A key that contains / gets escaped as ~1, and a key that contains ~ gets escaped as ~0. Most specs keep those characters out of component names, which keeps the references easy to read.
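The pointer rules fit in a few lines. This is a sketch of local reference resolution only; resolveLocalRef is a name chosen here, and real tools also handle external documents and reference cycles.

```typescript
// Resolves a local reference such as "#/components/parameters/AccountId".
function resolveLocalRef(doc: unknown, ref: string): unknown {
  if (!ref.startsWith('#')) throw new Error(`not a local reference: ${ref}`);
  const pointer = decodeURIComponent(ref.slice(1));
  if (pointer === '') return doc; // "#" alone points at the whole document
  if (!pointer.startsWith('/')) throw new Error(`invalid JSON Pointer: ${pointer}`);

  let current: unknown = doc;
  for (const rawToken of pointer.slice(1).split('/')) {
    // Order matters: unescape ~1 before ~0 so "~01" becomes "~1", not "/".
    const token = rawToken.replace(/~1/g, '/').replace(/~0/g, '~');
    if (typeof current !== 'object' || current === null || !(token in current)) {
      throw new Error(`unresolved reference: ${ref}`);
    }
    current = (current as Record<string, unknown>)[token];
  }
  return current;
}
```

With this, '#/paths/~1orders' reaches the /orders path item, because the escaped ~1 turns back into / only after the pointer has been split on its separators.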

The request body hangs off the operation.

requestBody:
  required: true
  content:
    application/json:
      schema:
        $ref: '#/components/schemas/CreateOrder'

OpenAPI keeps the HTTP body separate from path and query parameters, and that split matches what actually happens in Node. The route matcher captures the path values, the URL parser hands you the query values, and the body parser turns the raw bytes into a JavaScript value. Validation then runs against each source under its own rules.

Responses use status codes as keys.

responses:
  '201':
    description: Created
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/Order'

The quotes around '201' are YAML hygiene. Status codes are object keys here, and quoting keeps them as strings instead of numbers. OpenAPI also supports default for a catch-all response, but listing the explicit success code and the known error codes tends to produce better generated clients and docs.

An operation can also carry examples, tags, summaries, callbacks, links, security requirements, and more. Those fields become useful in bigger systems. For validation work you can start from the smaller set above and pull in the rest only when a specific route needs it.

The OpenAPI document is still only data. It does nothing until something reads it.

Different tools read it for different reasons. A documentation generator turns it into endpoint docs. A client generator produces typed functions for consumers. From the same file, a server generator can emit route stubs or server-side types, and a validator can pull out the schemas to check runtime values against them. CI uses it too, comparing the spec against your examples or your handler metadata.

Two tools can read the same document and still disagree on edge behavior when their OpenAPI version, JSON Schema dialect, or configuration differs. Pin those choices in the repo, and treat the OpenAPI version and validator settings as part of the contract itself.

Version, Dialect, and Media Type Choices

The openapi: line at the top of the document looks like a formality. It is not. Which version you write there changes how every tool interprets your schemas.

OpenAPI 3.0 and 3.1 look close in most backend specs, but their schema behavior differs in ways you will hit. 3.0 uses an OpenAPI Schema Object, which descends from JSON Schema but adds OpenAPI-specific fields like nullable. 3.1 lines its schema vocabulary up much more directly with JSON Schema. A service runs fine on either one, as long as the validator, generator, linter, and documentation renderer all assume the same version.

A version mismatch usually surfaces in small places. One tool accepts type: ["string", "null"] where another only understands nullable: true. A generator might treat format as a real validation claim while the next tool down the line treats it as a documentation note. A linter can reject a keyword that some other validator just ignores. The schema reads fine the entire time, and two tools end up producing different runtime behavior from it.
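To make the nullable case concrete, here is a sketch of the rewrite a migration script might apply to a single schema object. The SchemaLike type and the function name are assumptions, and the sketch deliberately ignores nested schemas and schemas without a single string type.

```typescript
type SchemaLike = {
  type?: string | string[];
  nullable?: boolean;
  [keyword: string]: unknown;
};

// Rewrites the 3.0 form { type: "string", nullable: true }
// into the 3.1 / JSON Schema form { type: ["string", "null"] }.
// Top level only: nested properties and items are left untouched.
function nullableToTypeArray(schema: SchemaLike): SchemaLike {
  const { nullable, ...rest } = schema;
  if (nullable !== true || typeof rest.type !== 'string') return rest;
  return { ...rest, type: [rest.type, 'null'] };
}

const converted = nullableToTypeArray({ type: 'string', nullable: true, maxLength: 10 });
```

The two forms describe the same values. A tool that only understands one of them will not treat them the same, which is the whole reason to pin the version.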

Write the version at the top of the document and make every tool in the chain assume that same version.

openapi: 3.1.0
info:
  title: Orders API
  version: 1.4.0
paths: {}

The info.version field is the document's own version, the revision of your API contract that this file represents. It can match the service's package version or track the public contract on its own schedule. The openapi field is unrelated. It names the specification version, the set of OpenAPI rules a tool uses to parse the document. They answer two genuinely different questions, so keep them straight.

Media types are part of the contract too. A request body schema under application/json applies only to JSON input for that operation. A different media type can carry a different schema, or the same request expressed in a different representation.

content:
  application/json:
    schema:
      $ref: '#/components/schemas/CreateOrder'

That fragment says the operation accepts JSON that matches the CreateOrder schema. It also implies how the Node service parses the body. By that point the HTTP layer has already parsed the headers, the body parser has picked JSON because the request Content-Type matches the route's accepted media type, and the schema validator runs against the parsed value.

If the service accepts both JSON and form data, the operation should show both.

content:
  application/json:
    schema:
      $ref: '#/components/schemas/CreateOrder'
  application/x-www-form-urlencoded:
    schema:
      $ref: '#/components/schemas/CreateOrderForm'

Those two inputs can land as different JavaScript values before validation ever runs. JSON gives you real numbers and booleans, while form fields all arrive as strings, and arrays or nested objects depend on the form parsing convention you use. Whatever the validator checks has to match what the parser actually produced. You can share one conceptual structure across media types, but the runtime parser decides the first value the validator sees.
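A small sketch of that difference using only built-in parsers. The payloads are made up; the point is what each parser hands to the validator.

```typescript
// The same conceptual order, sent in two representations.
const fromJson = JSON.parse('{"sku":"book-1","quantity":2}') as { quantity: unknown };
const fromForm = new URLSearchParams('sku=book-1&quantity=2');

const jsonQuantity = fromJson.quantity; // a number
const formQuantity = fromForm.get('quantity'); // a string

// An integer rule passes for one representation and fails for the other.
const jsonIsInteger = Number.isInteger(jsonQuantity);
const formIsInteger = Number.isInteger(formQuantity);
```

If both media types share the CreateOrder schema unchanged, the form variant fails on quantity unless something coerces the string first. That is the reason the example above points the form body at its own CreateOrderForm schema.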

Responses work the same way. One route can return application/json for normal clients and text/csv for export clients. The operation should describe the representation it returns for each media type, because generated clients and docs need to know which bytes are coming.

content:
  application/json:
    schema:
      $ref: '#/components/schemas/Order'

OpenAPI can describe a lot of response metadata. Keep the response schema tied to what the client actually receives. The internal database row, the domain object, and the HTTP response body can each have different fields. The schema describes the HTTP response, nothing further back than that.

Parameters have style and explode rules. Those describe how arrays and objects show up in paths and query strings. Most teams inherit the defaults their tools pick, where query arrays become repeated values or comma-separated text, objects expand into key/value pairs, and path values stay simple segments. The rule you land on changes how parsing works. A generated client has to use the same rule the server does.

parameters:
  - name: include
    in: query
    schema:
      type: array
      items:
        type: string

That says include is an array in the contract. The wire format still needs a convention. Many OpenAPI tools default query arrays to repeated parameters like ?include=items&include=totals. Node parsers disagree on what to do with the repeats. One collapses them to a single value, another preserves the array, and a third gives you only the last value. Either normalize the parser output at the route before validation, or use a parser whose behavior you have pinned down.

⚠️ Warning

A generated client serializes array query parameters according to the OpenAPI style and explode rules. Your server parses them according to its runtime parser. When those two disagree, the same input can come through as ["a","b"], as only "b", or as the literal string "a,b". Normalize the parser output to the contract before validation, otherwise one request behaves one way from the SDK and a different way from a raw HTTP call.

The contract starts to split apart when these choices live only in framework defaults. The generated client serializes by the OpenAPI rules and the server parses by its runtime, so the two need the same value model or they drift.
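A sketch of normalizing one query key before validation. It assumes the parser yields a string, an array of strings, or nothing for a key, which is a common but not universal shape. normalizeQueryArray and its commaSeparated option are hypothetical names.

```typescript
// Coerce whatever the query parser produced for one key into the string[]
// the contract declares. `commaSeparated` should mirror the style/explode
// choice written into the OpenAPI parameter.
function normalizeQueryArray(
  value: string | string[] | undefined,
  options: { commaSeparated: boolean } = { commaSeparated: false },
): string[] {
  if (value === undefined) return [];
  const parts = Array.isArray(value) ? value : [value];
  if (!options.commaSeparated) return parts;
  return parts.flatMap((part) => part.split(',')).filter((part) => part !== '');
}

// URLSearchParams keeps repeats; getAll() returns every occurrence in order.
const repeated = new URLSearchParams('include=items&include=totals').getAll('include');
```

This cannot recover values a parser has already thrown away. If the parser keeps only the last repeat, the fix is a different parser or parser setting, not a normalizer.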

Keep components public. A component schema is reusable because several operations share the same public structure, which means it should model the API representation that consumers see. The moment you reuse an ORM model as a component, internal fields ride along into the contract with it. Soft-delete timestamps, internal account IDs, audit metadata, and field names that follow your table conventions all leak straight into the published schema.
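One way to keep that leak from happening is an explicit mapping at the HTTP edge. The OrderRow columns and the Order fields below are invented for the example; your own row and component will differ.

```typescript
// Hypothetical persistence shape, including fields that must stay internal.
type OrderRow = {
  id: string;
  sku: string;
  quantity: number;
  created_at: Date;
  deleted_at: Date | null;
  internal_account_id: number;
};

// Public representation, matching an assumed Order component.
type Order = { id: string; sku: string; quantity: number; createdAt: string };

// Explicit mapping: a new column reaches clients only if it is added here.
function toOrderResponse(row: OrderRow): Order {
  return {
    id: row.id,
    sku: row.sku,
    quantity: row.quantity,
    createdAt: row.created_at.toISOString(),
  };
}
```

Spreading the row into the response would be shorter, and it would publish every future column by default.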

Component names turn into public vocabulary too. CreateOrder, Order, OrderLine, and Money read as sensible generated type names. OrderDtoV2Internal leaks your implementation history straight into client code. These names show up in SDKs, docs, validation errors, and examples. Treat them as API names from the start.

A useful default is to keep request and response components separate unless they genuinely share the same structure.

CreateOrder:
  type: object
Order:
  type: object

A create request usually has no server-assigned fields, while a response carries identifiers, timestamps, status, links, or computed totals. Update requests often need even more separation, since a partial update has a different set of required fields. Sharing a component for a nested value like Money or Address is fine. Top-level request and response schemas usually do better as separate named schemas.

JSON Schema as Runtime Data

Strip away the tooling and a JSON Schema is structured data that describes other data. In this subchapter it is a JSON-compatible object that your validation code reads at runtime and runs against a value.

JSON Schema gives validation rules names, and those names are its keywords. type, required, properties, items, enum, minimum, maxLength, and additionalProperties are all keywords. A validator reads them and checks a value against each one.

{
  "type": "object",
  "required": ["sku", "quantity"],
  "properties": {
    "sku": { "type": "string" },
    "quantity": { "type": "integer", "minimum": 1 }
  }
}

The schema is data, nothing more. You can store it in an OpenAPI document, pass it to a validator, generate it from another source, or compile it into a JavaScript function. None of those actions checks anything by itself. The runtime check happens only when code feeds an actual value to a validator.

For the request body, the actual value might look like this.

{
  "sku": "book-1",
  "quantity": "2",
  "coupon": "SUMMER"
}

Look at that value against the schema and you can see three separate situations at once. The sku field matches the string rule cleanly. The quantity field is a string where the schema asked for an integer. And coupon is an extra field, which only counts against the request when the schema closes the object. What happens next is up to the validator's configuration. It decides whether coercion rewrites "2" into 2, whether the extra field gets rejected, stripped, or kept, and whether any declared defaults get inserted.

Structural validation looks at how the data is put together, plus the local constraints on each value. That covers object fields and required keys, primitive types, the rules for array items, enum membership, string length, and numeric ranges. Format checks fall in here too, when you turn them on.

Semantic validation checks the facts a schema can never know on its own, like whether the account exists, whether the SKU is in that account's catalog, and whether the caller is allowed to create an order of this size. None of those answers live in the request structure. They need application state, external services, or database reads, which is why they run after structural validation has already handed back typed, bounded input.

That separation is what keeps the handler readable.

const input = validateCreateOrder(req.body);
const account = await loadAccount(req.params.accountId);
await assertCanCreateOrder(account, input);

The first line is structural validation. The next two are semantic, and they run in that order on purpose. By the time application code touches the body, its local structure is already checked, so the later code can think about business facts instead of guarding every property access.

What the Validator Actually Runs

From the outside, JSON Schema validation has a tiny contract. Value in, pass or fail out, usually with a list of errors. Behind that small surface, a runtime validator does a lot of document work before the first request ever reaches your handler.

It starts with a schema graph. One operation may point at #/components/schemas/CreateOrder, and that schema may point on at Address, Money, or OrderLine. The $ref edges connect those pieces. Before validation can run fast, the tool has to resolve every reference, catch any missing targets, and work out which schema dialect or OpenAPI schema rules apply.

A local $ref is a JSON Pointer into the current document. An external one can point at another file or a URL, depending on tool policy. Plenty of production services bundle external references at build or startup time, because fetching schemas over the network during validation turns into a deployment dependency you do not want. Bundling gives the process one complete schema graph the moment it starts.
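A startup check along those lines might walk the bundled document and report any reference that does not resolve. This sketch assumes bundling has already happened, so it reports anything other than a resolvable "#/..." reference. findDanglingRefs is a name chosen here.

```typescript
// Walks a bundled document and returns every $ref whose target is missing.
function findDanglingRefs(doc: unknown): string[] {
  const dangling: string[] = [];

  const exists = (ref: string): boolean => {
    let current: unknown = doc;
    for (const raw of ref.slice(2).split('/')) {
      const key = raw.replace(/~1/g, '/').replace(/~0/g, '~');
      if (typeof current !== 'object' || current === null || !(key in current)) return false;
      current = (current as Record<string, unknown>)[key];
    }
    return true;
  };

  const visit = (node: unknown): void => {
    if (Array.isArray(node)) {
      node.forEach((entry) => visit(entry));
      return;
    }
    if (typeof node !== 'object' || node === null) return;
    for (const [key, value] of Object.entries(node)) {
      if (key === '$ref' && typeof value === 'string') {
        // External references count as dangling: bundling should have inlined them.
        if (!value.startsWith('#/') || !exists(value)) dangling.push(value);
      } else {
        visit(value);
      }
    }
  };

  visit(doc);
  return dangling;
}
```

Failing startup on a non-empty result moves a broken reference from the first unlucky request to the deploy step.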

Once references resolve, the validator usually builds an internal representation. Some interpret the schema object fresh on every call. Many Node validators instead compile the schema into a JavaScript function. Ajv is the common one, and a lot of frameworks wire it in, but the idea does not depend on any single package. Compilation turns schema keywords into executable branches.

For the CreateOrder body, a compiled validator might run checks in roughly this order.

input is object
  -> required keys: sku, quantity
  -> sku is string
  -> quantity is integer
  -> quantity >= 1
  -> extra-key policy

Each failure can carry a path to the exact spot. A missing quantity points at the object itself, a string where quantity should be an integer points at /quantity, and a bad item deeper in a larger schema might point at /items/3/sku. Those paths are what Subchapter 04 builds on, turning them into client-facing validation responses.
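To show what compilation means in practice, here is a toy compiler for a handful of keywords. It is not how Ajv or any other package is implemented. The Schema subset, the Issue shape, and the messages are assumptions, and property names are not escaped in paths.

```typescript
type Schema = {
  type?: 'object' | 'array' | 'string' | 'integer';
  required?: string[];
  properties?: Record<string, Schema>;
  additionalProperties?: boolean;
  items?: Schema;
  minimum?: number;
};
type Issue = { path: string; message: string };
type Check = (value: unknown, path: string, issues: Issue[]) => void;

// The schema is walked once. The returned closure runs only the checks
// the schema asked for, with no keyword lookups per request.
function compile(schema: Schema): Check {
  const checks: Check[] = [];
  if (schema.type === 'string') {
    checks.push((v, p, out) => {
      if (typeof v !== 'string') out.push({ path: p, message: 'must be string' });
    });
  }
  if (schema.type === 'integer') {
    checks.push((v, p, out) => {
      if (!Number.isInteger(v)) out.push({ path: p, message: 'must be integer' });
    });
  }
  const minimum = schema.minimum;
  if (minimum !== undefined) {
    checks.push((v, p, out) => {
      if (typeof v === 'number' && v < minimum) out.push({ path: p, message: `must be >= ${minimum}` });
    });
  }
  if (schema.type === 'array') {
    const item = schema.items ? compile(schema.items) : undefined;
    checks.push((v, p, out) => {
      if (!Array.isArray(v)) {
        out.push({ path: p, message: 'must be array' });
        return;
      }
      if (item) v.forEach((entry, i) => item(entry, `${p}/${i}`, out));
    });
  }
  if (schema.type === 'object') {
    const props = Object.entries(schema.properties ?? {}).map(
      ([key, sub]) => [key, compile(sub)] as const,
    );
    const known = new Set(props.map(([key]) => key));
    checks.push((v, p, out) => {
      if (typeof v !== 'object' || v === null || Array.isArray(v)) {
        out.push({ path: p, message: 'must be object' });
        return;
      }
      const record = v as Record<string, unknown>;
      for (const key of schema.required ?? []) {
        if (!(key in record)) out.push({ path: p, message: `missing required property ${key}` });
      }
      for (const [key, check] of props) {
        if (key in record) check(record[key], `${p}/${key}`, out);
      }
      if (schema.additionalProperties === false) {
        for (const key of Object.keys(record)) {
          if (!known.has(key)) out.push({ path: `${p}/${key}`, message: 'unexpected property' });
        }
      }
    });
  }
  return (value, path, issues) => checks.forEach((check) => check(value, path, issues));
}

function validate(check: Check, value: unknown): Issue[] {
  const issues: Issue[] = [];
  check(value, '', issues);
  return issues;
}
```

A missing required key is reported at the path of the object that should contain it, while a bad value is reported at the path of the value itself.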

Composition adds another layer. The word just means a schema built out of other schemas, joined with allOf, anyOf, or oneOf. allOf requires the value to satisfy every listed schema. anyOf accepts the value if at least one schema matches. oneOf accepts it only when exactly one matches. Each of those carries its own runtime cost and its own error-reporting behavior.

{
  "allOf": [
    { "$ref": "#/components/schemas/BaseOrder" },
    { "required": ["quantity"] }
  ]
}

That schema glues a base order schema to one extra requirement, and the validator evaluates both. In real OpenAPI documents, composition usually shows up when a team wants to reuse common fields across request and response models. Reuse works as long as the structure really is shared between them. The trouble starts when create, update, and read representations actually need different fields, because then the composition is hiding accidental coupling that nobody chose on purpose.

oneOf is the one to watch. It looks perfect for payload variants, but at runtime the validator has to prove that exactly one branch matches. Two overlapping branches make valid input fail for matching more than once, and a payload that matches nothing produces an error for every branch, which buries the real cause. When the variants are part of the public contract, add a discriminator field so both validators and generated clients have one stable field to select the branch from.

⚠️ Warning

oneOf passes only when exactly one branch matches. Overlapping branches reject valid payloads for matching twice, and a total miss produces an error per branch that buries the actual cause. Always pair public oneOf variants with a discriminator field so validators and generated clients select the branch from one explicit value instead of guessing from which properties are present.
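The exactly-one rule is easy to see with stand-in branches. In this sketch each branch is a plain predicate instead of a compiled subschema, and the field names token, walletId, and type are assumptions for illustration.

```typescript
type Branch = { name: string; matches: (value: Record<string, unknown>) => boolean };

// oneOf semantics: pass only when exactly one branch matches.
function evaluateOneOf(branches: Branch[], value: Record<string, unknown>) {
  const matched = branches.filter((branch) => branch.matches(value)).map((b) => b.name);
  return { valid: matched.length === 1, matched };
}

// Overlapping branches: each tolerates the other's fields.
const overlapping: Branch[] = [
  { name: 'card', matches: (v) => typeof v.token === 'string' },
  { name: 'wallet', matches: (v) => typeof v.walletId === 'string' },
];

// Discriminated branches: the "type" field selects exactly one.
const discriminated: Branch[] = [
  { name: 'card', matches: (v) => v.type === 'card' && typeof v.token === 'string' },
  { name: 'wallet', matches: (v) => v.type === 'wallet' && typeof v.walletId === 'string' },
];

const payload = { type: 'card', token: 'tok_1', walletId: 'w_9' };
const ambiguous = evaluateOneOf(overlapping, payload);
const selected = evaluateOneOf(discriminated, payload);
```

The same payload fails the overlapping set because it matches twice, and passes the discriminated set because only one branch accepts type: "card".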

Formats sit in a gray zone. A schema can say format: "date-time" or format: "email", but what happens next depends on configuration and on which format packages are installed. One tool treats a format as a hard assertion, the next treats it as a documentation note, and a third does nothing with it until you add an extra package. Pin the format behavior your code relies on instead of assuming every consumer reads formats the way you do.

Defaults are validator policy too. A schema can declare a default, but the validator only inserts it when you turn that behavior on. A documentation tool, meanwhile, will happily display the default even though runtime validation leaves the missing field untouched, and that mismatch is a steady source of bugs. When the service does fill defaults in during validation, write tests that assert the handler actually receives those values. If the default exists for documentation only, have the handler set its own value explicitly.

Coercion is another policy that mutates. Query parameters always arrive as strings. Some body values arrive as strings just because the client sent them that way. With coercion on, a validator might turn "2" into 2, "true" into true, or a lone value into a one-item array, depending on its rules. That makes the API more forgiving, and it also hides client mistakes and makes generated clients and direct HTTP calls behave differently from each other.

Unknown fields round out the policy. An object schema can accept extra keys, reject them outright, or strip them before the handler runs. If you keep them, forward compatibility gets easier for some consumers, at the risk of a handler quietly depending on fields that were never in the contract. Rejecting them keeps the public structure tight. Stripping them gives the handler a clean object but tells the client nothing about what got dropped. Pick one policy per validation point and make every exception to it visible.
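The coercion and unknown-field policies can be sketched as explicit options on one function. Everything here is illustrative: applyPolicy, the Policy shape, and the error strings are assumptions, and real validators expose these choices through their own configuration.

```typescript
type UnknownPolicy = 'keep' | 'reject' | 'strip';
type Policy = { coerceIntegers: boolean; unknownKeys: UnknownPolicy };
type Result =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; errors: string[] };

const knownKeys = ['sku', 'quantity'];

// Applies coercion and unknown-key policy to a CreateOrder-like body.
function applyPolicy(input: Record<string, unknown>, policy: Policy): Result {
  const value: Record<string, unknown> = { ...input }; // work on a copy
  const errors: string[] = [];

  const quantity = value.quantity;
  if (policy.coerceIntegers && typeof quantity === 'string' && /^-?\d+$/.test(quantity)) {
    value.quantity = Number(quantity); // "2" becomes 2
  }
  if (!Number.isInteger(value.quantity)) errors.push('/quantity must be integer');

  for (const key of Object.keys(value)) {
    if (knownKeys.includes(key)) continue;
    if (policy.unknownKeys === 'reject') errors.push(`/${key} is not allowed`);
    if (policy.unknownKeys === 'strip') delete value[key];
  }
  return errors.length > 0 ? { ok: false, errors } : { ok: true, value };
}
```

The same body passes or fails depending only on the policy object, which is why the policy belongs in reviewed configuration and not in a library default nobody wrote down.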

Performance here comes from compiling once and reusing. Compile the schemas at startup or at route registration, then reuse the validator function on every request. Compiling on every request wastes CPU and allocates throwaway structures, and it turns a contract mistake into request latency. Doing it once at startup catches an invalid schema early and leaves each route with one stable validation function.
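A sketch of the compile-once pattern with a stand-in compiler. compileSchema fakes the expensive step and counts its calls; the registry keyed by name is an assumption about how you might organize validators, not a specific framework's API.

```typescript
type Validator = (value: unknown) => boolean;
type TinySchema = { type: 'string' | 'number' };

let compileCount = 0;

// Stand-in for an expensive schema compilation step.
function compileSchema(schema: TinySchema): Validator {
  compileCount += 1;
  return (value) => typeof value === schema.type;
}

const cache = new Map<string, Validator>();

// Compile at registration time, keyed by a stable name such as the operationId.
function registerValidator(name: string, schema: TinySchema): Validator {
  const existing = cache.get(name);
  if (existing) return existing;
  const compiled = compileSchema(schema);
  cache.set(name, compiled);
  return compiled;
}

// Done once while routes are wired up; requests only call the function.
const validateSku = registerValidator('createOrder.sku', { type: 'string' });
```

However many requests call validateSku, the compile step has run once, and an invalid schema would have failed at registration instead of under traffic.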

What the validator hands back should be predictable.

const result = validate(req.body);
if (!result.ok) return badRequest(result.errors);
return handler(result.value);

result.value is the value after policy has run. Depending on configuration it can be the original object, a clone of it, or the original object mutated in place. That single detail is easy to miss and painful to debug later. When validation strips fields or applies defaults in place, every middleware further down the chain sees the modified data. A validator that returns a fresh object instead means the handler has to read that returned value rather than the original req.body.

⚠️ Warning

Find out whether your validator mutates the input object or returns a new value, then standardize on one. In-place mutation means every middleware after the check sees stripped fields and applied defaults. A returned clone means any code still reading the raw req.body skips coercion completely. Mix the two and your logs and your handler end up disagreeing about what the request actually contained.
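The difference is easiest to see side by side. Both helpers below are hypothetical and implement the same strip policy; only the mutation behavior differs.

```typescript
type Body = Record<string, unknown>;
const allowed = ['sku', 'quantity'];

// Policy A: strip unknown keys on the object the caller passed in.
function stripInPlace(body: Body): Body {
  for (const key of Object.keys(body)) {
    if (!allowed.includes(key)) delete body[key];
  }
  return body;
}

// Policy B: leave the input alone and return a cleaned copy.
function stripToCopy(body: Body): Body {
  const copy: Body = {};
  for (const key of allowed) {
    if (key in body) copy[key] = body[key];
  }
  return copy;
}

const first: Body = { sku: 'book-1', quantity: 2, coupon: 'SUMMER' };
stripInPlace(first); // `first` itself no longer has coupon

const second: Body = { sku: 'book-1', quantity: 2, coupon: 'SUMMER' };
const cleaned = stripToCopy(second); // `second` keeps coupon, `cleaned` does not
```

With policy A a logger that runs after validation records the stripped body. With policy B the same logger, still reading the original object, records a field the handler never saw.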

Schema validation is runtime code, so give it the same discipline you give parsing, routing, and database queries. Compile it once, pin the versions it depends on, and write tests for what actually happens at the validation point. Read the error paths it produces, and keep its mutation policy out in the open.

Schemas Under Real API Pressure

JSON Schema stays simple right up until a route needs partial updates, optional fields, nullable fields, arrays, and nested objects. Once those show up, the schema is carrying real API policy on top of describing the data.

Required is an object-level decision. A field's own schema describes what the value must look like when the field is present, while the object's required array decides which fields have to be present at all.

{
  "type": "object",
  "required": ["sku"],
  "properties": {
    "sku": { "type": "string" }
  }
}

That schema requires the sku key. If sku is there, it has to be a string. Any other property stays unconstrained until it gets its own entry under properties. This split shows up clearly on update routes, where a partial update accepts a field when it is present yet still treats it as optional.

{
  "type": "object",
  "properties": {
    "quantity": { "type": "integer", "minimum": 1 }
  }
}

That schema accepts an object with no quantity key at all, and still checks the value when the key is present. For a PATCH-style route that behavior is usually exactly what you want. A PUT-style replacement is different, since it usually wants the full representation. Match the schema to the resource operation that the route and method actually describe.

Nullable is its own decision. A missing field and a field set to null are two different things. Missing means the key is absent, while null means the key is present and holds the JSON null value.

{
  "type": ["string", "null"]
}

That schema takes either a string or null. It says nothing about whether the surrounding object requires the field. Required and nullable are two separate decisions. Generated clients lean on the difference, because in many languages an optional property and a nullable property compile to different types.
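In TypeScript the two decisions land on different parts of the type. The OrderNotes fields below are invented for the example, and describeField is a hypothetical helper that shows the runtime side of the same distinction.

```typescript
// Optional: the key may be absent. Nullable: the key is present and may hold null.
type OrderNotes = {
  note?: string; // not listed in `required`, type: string
  coupon: string | null; // listed in `required`, type: ["string", "null"]
};

const example: OrderNotes = { coupon: null };

function describeField(body: Record<string, unknown>, key: string): 'missing' | 'null' | 'value' {
  if (!(key in body)) return 'missing';
  return body[key] === null ? 'null' : 'value';
}

const parsed = JSON.parse('{"coupon": null}') as Record<string, unknown>;
const noteState = describeField(parsed, 'note');
const couponState = describeField(parsed, 'coupon');
```

On a partial update the difference carries meaning: a missing key usually means "leave it alone", while an explicit null usually means "clear it". That convention is an API decision the schema should state, not something JSON Schema decides for you.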

An array gives you two things to describe, the container and the items inside it.

{
  "type": "array",
  "minItems": 1,
  "items": { "$ref": "#/components/schemas/OrderLine" }
}

The array can demand at least one entry. Each entry then validates against its own schema. A validator should report errors with item indexes, because a client needs to know which element broke. /lines/2/quantity tells them far more than "body is invalid".

Enums are public vocabulary. A schema like { "enum": ["pending", "paid"] } hands generated clients a set of literal values and hands validators a closed set to check against. Adding a new value can break any consumer that switches exhaustively over the old set, and removing a value breaks clients that still send or expect it. Either way an enum change is API evolution, because it changes both the generated code and the runtime validation.
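The exhaustive-switch hazard looks like this in a generated or hand-written TypeScript client. The labels and the assertNever helper are illustrative.

```typescript
type OrderStatus = 'pending' | 'paid';

function assertNever(value: never): never {
  throw new Error(`unhandled status: ${String(value)}`);
}

function statusLabel(status: OrderStatus): string {
  switch (status) {
    case 'pending':
      return 'Awaiting payment';
    case 'paid':
      return 'Paid';
    default:
      // Adding "refunded" to OrderStatus turns this line into a compile error.
      // A client built before the change reaches it at runtime instead.
      return assertNever(status);
  }
}
```

A consumer that regenerates its types finds out at compile time. A consumer still running the old build finds out when the server first sends the new value.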

Go easy with string patterns. A regex can check local structure, such as a short identifier prefix, and it should stay readable enough that a reviewer can see what the contract claims at a glance. Once a value needs database state, external policy, or real domain logic, move that check into semantic validation. A schema regex is good for rejecting malformed text, but the actual business rule belongs in application code.

additionalProperties controls how open an object is. For map-like objects, it also describes the value schema behind arbitrary keys.

{
  "type": "object",
  "additionalProperties": { "type": "string" }
}

That schema describes an object where the caller chooses the keys and every value has to be a string. For a fixed request body, additionalProperties: false does the opposite and closes the object. The same keyword expresses two very different contracts depending on its value, so read it in context every time.
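As a hand-written equivalent of the map-like form, a check might look like the sketch below. isStringMap is a name chosen here, and it covers only this one schema.

```typescript
// additionalProperties: { type: "string" }
// The caller picks the keys; every value must be a string.
function isStringMap(value: unknown): value is Record<string, string> {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) return false;
  return Object.values(value).every((entry) => typeof entry === 'string');
}
```

An empty object passes, since the schema places no requirement on which keys exist.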

Use composition for public reuse or public variants, and keep your internal inheritance trees out of it. A CreateOrder schema built from BaseOrder, AuditedEntity, and TenantScopedRecord is leaking application structure through the API. A PaymentMethod schema that uses oneOf over card, bank_account, and wallet can be a clean public contract, as long as each variant has an obvious discriminator.

{
  "oneOf": [
    { "$ref": "#/components/schemas/CardPayment" },
    { "$ref": "#/components/schemas/WalletPayment" }
  ]
}

That contract needs its variants kept apart. Add a field like type whenever a consumer has to choose a branch. Then generated clients can produce safer unions, validators can produce better errors, and server code can dispatch on a value that actually exists in the representation instead of guessing from which fields happen to be set.
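A rough sketch of what a discriminator buys on the server side follows. The variant fields (`last4`, `provider`) and the function names are hypothetical. The point is only that dispatch reads a value present in the payload, and that the compiler can check the switch for exhaustiveness.

```typescript
// Hypothetical payment variants; the `type` field is the discriminator.
type CardPayment = { type: 'card'; last4: string };
type WalletPayment = { type: 'wallet'; provider: string };
type PaymentMethod = CardPayment | WalletPayment;

// Runtime guard for parsed JSON, which arrives as `unknown`.
function parsePaymentMethod(value: unknown): PaymentMethod | null {
  if (typeof value !== 'object' || value === null) return null;
  const { type, last4, provider } = value as Record<string, unknown>;
  if (type === 'card' && typeof last4 === 'string') {
    return { type: 'card', last4 };
  }
  if (type === 'wallet' && typeof provider === 'string') {
    return { type: 'wallet', provider };
  }
  return null; // no discriminator, or a variant this contract does not know
}

// Dispatches on the discriminator instead of guessing from which fields are set.
function describePayment(method: PaymentMethod): string {
  switch (method.type) {
    case 'card':
      return `card ending in ${method.last4}`;
    case 'wallet':
      return `wallet via ${method.provider}`;
    default: {
      // Exhaustiveness check: a new variant without a case fails to compile here.
      const unreachable: never = method;
      throw new Error(`unhandled payment type: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The same exhaustiveness check is why adding an enum value or a union variant counts as API evolution: consumers that switch over the old set stop compiling, or fall into their default branch at runtime.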

Schema reuse should stop at the line between one representation and the next. A single domain object can look one way in memory, take a different form in a database row, another in a cache entry, another on a queue, and another again in the HTTP API. The JSON Schema in this subchapter only governs the HTTP edge. Event payload registries, database schemas, and GraphQL schemas each get their own chapters or subchapters.

Contract-First and Code-First

Teams usually pick one of two workflows, then borrow pieces of the other.

Contract-first design starts from the OpenAPI document. You edit the path, the operation, the request and response schemas, and the examples, and that document is the source of truth. From there a generator can produce an API client for consumers, plus server types or route stubs for the Node service.

Code-first generation runs the other way around. Here the application code is the source: the route declarations, decorators, schema objects, and typed handlers. A tool reads that code and emits an OpenAPI document from it.

Both workflows work in practice, and each one has its own characteristic way of going wrong.

Workflow	Source	Typical output	Main pressure
Contract-first design	OpenAPI document	clients, server types, docs	implementation catches up to the contract
Code-first contract generation	route code and schema metadata	OpenAPI document, docs	emitted contract matches runtime behavior

Contract-first pays off when several teams need to agree before anyone writes the implementation. Mobile apps, frontend teams, partner integrations, and platform SDKs can all build against the contract while the route is still a stub. The server team implements the route later and wires runtime validators into the same schemas.

What contract-first does not give you for free is a handler that still matches the contract. A generated server type can say the handler receives CreateOrder, but that promise means nothing until a runtime validator actually checks the body. A route stub can sit in place while the real handler returns something else, and the document can describe an operation that has no registered code behind it at all.

Code-first fits Node services well, where the routes and schemas already sit in the same file.

route({
  method: 'POST',
  path: '/orders',
  body: CreateOrderSchema,
  handler: createOrderHandler,
});

Now the contract metadata lives right next to the handler registration. One generator emits OpenAPI from it, the validator reuses the same schema, and a TypeScript type gets inferred from that same schema object. The pressure moves off hand-maintaining a separate document and onto your framework conventions and how much the generator actually covers.

The cost is that a generator leaves things out without telling you. A thrown error never shows up in the OpenAPI responses. A serializer might add fields the schema never declared. A route-level helper can encode query behavior the generator cannot see, or hand-written middleware can rewrite req.body after validation while the document still describes the pre-mutation structure.

Schema drift is the name for what happens when the contract artifacts stop agreeing. It can open up between the OpenAPI document and the handler code, between JSON Schema and TypeScript types, between a generated client and the deployed server, between examples and schemas, or between response validation and what the serializer actually emits.

A generated API client raises the price of drift. Generate one from the OpenAPI operation and the consumer gets a method like this.

client.createOrder({
  sku: 'book-1',
  quantity: 2,
});

That method trusts the published request schema completely. Suppose the server starts requiring warehouseId while the document stays old. The client still compiles and still sends requests the server now rejects. The reverse drifts too. Put warehouseId in the document while the deployed server ignores it, and the client looks correct against the contract while the backend runs an older version of it.

Generated server types push from the other side. They type the handler input and output to match the contract.

async function createOrder(input: CreateOrderRequest) {
  return saveOrder(input.body);
}

That type nudges the implementation toward the contract, but it validates nothing on its own. Runtime validation is still what guards external data, because TypeScript checks source code while validators check actual values. Chapter 8 covered how TypeScript execution discards its types, and the short version is that the types can vanish before Node runs a single line. That holds here too, even on a project that type-checks with a full build step.

Most real setups mix the two. One team writes schemas in code, generates OpenAPI, generates clients from that OpenAPI, then runs a CI check that diffs the generated file against the committed one. Another team edits OpenAPI first, generates server types, and runs route-level tests against a local server. The workflow you pick counts for less than two rules underneath it. There should be one declared source per artifact, and there should be a check that catches drift when it happens.
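The "diff the generated file against the committed one" check can be very small. The sketch below assumes both documents are already parsed into JSON-compatible values; `stableStringify` and `hasDrift` are names invented for this example, and a real CI job would also print where the two differ.

```typescript
// Serializes a JSON-compatible value with object keys sorted, so key
// order alone never registers as drift. Array order is preserved.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(',')}]`;
  }
  if (typeof value === 'object' && value !== null) {
    const record = value as Record<string, unknown>;
    const parts = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`);
    return `{${parts.join(',')}}`;
  }
  return JSON.stringify(value);
}

// True when the committed artifact and the freshly generated one disagree.
function hasDrift(committed: unknown, generated: unknown): boolean {
  return stableStringify(committed) !== stableStringify(generated);
}

const committedSchema = { type: 'object', required: ['sku', 'quantity'] };
const generatedSchema = { required: ['sku', 'quantity', 'warehouseId'], type: 'object' };
const driftDetected = hasDrift(committedSchema, generatedSchema);
// driftDetected is true: the generated schema requires warehouseId.
```

Failing the build on `driftDetected` is what turns "one declared source per artifact" into something the pipeline enforces.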

Request Validation Boundaries

There is exactly one good place for request validation, and that is where external data first enters the service. It runs after parsing and route matching, and before any application code reads a value.

Take POST /accounts/{accountId}/orders?dryRun=true. Several inputs land at that one check point.

path params: accountId
query params: dryRun
headers: idempotency or content negotiation fields
body: CreateOrder JSON value

Each source arrives in a different form. Path parameters come through as strings the route matcher captured. Query parameters arrive as strings, or repeated strings, pulled out of the URL. Headers are field values that Node's HTTP layer already parsed, with case-normalized access that depends on which API you use. And the body is whatever the body parser produced from the raw bytes.

Validate each source on its own terms. A path-parameter schema can require a specific string pattern. A query schema can read booleans and numbers under an explicit coercion policy. For headers, the schema mostly confirms the required ones are present, and the body schema checks the object structure itself.

{
  "type": "object",
  "properties": {
    "dryRun": { "type": "boolean", "default": false }
  }
}

That schema does nothing until the validator runs it against parsed query data. Query strings arrive as text, so ?dryRun=false reaches the validator as the string "false". With coercion on, the handler can receive an actual false. With it off, the request either fails or passes raw text through to the handler, depending on how you wrote the check.

Coercion should be explicit and written down somewhere. It is fair to coerce query booleans, query integers, and route IDs that the contract genuinely calls numeric. It is not fair to coerce payload fields where a string and a number carry different meanings. A sku of "00123" has to stay a string, and a coercion rule that turns it into 123 has just broken the resource representation.

Caution: Never set type: integer on an identifier field, and never leave coercion on for one, even when the value is all digits. Coercing "0017" to 17 drops the leading zero, and now two separate resources collapse to the same key. Lookups, dedup, and idempotency keys all break, with no validation error to show for it. Model identifiers as strings with a pattern. Keep numeric coercion for fields the contract really does treat as numbers.

Defaults need the same clarity. The policy here decides whether a missing input field gets filled in before the handler runs. In request bodies that can catch clients off guard, because the server is now inventing data the client never sent. In query parameters it usually goes down easier. limit defaults to 50, dryRun defaults to false, include defaults to an empty list.

Handler code should read the post-validation value the check returns, not the raw request objects.

const input = validateRequest({
  params: req.params,
  query: req.query,
  body: req.body,
});

That single call makes the check explicit. Its return value can hold parsed params, coerced query values, a checked body, and an error list when something fails validation. The handler then works from one object with one contract.
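One possible shape for that call is sketched here, written by hand so that every rule is visible. The result union, the four-digit `accountId` pattern, and the error strings are assumptions for illustration. In a real service a schema validator would do the checking and this function would mostly assemble its results.

```typescript
// Hypothetical shapes; a real service would derive them from its schemas.
type RawRequest = {
  params: Record<string, string | undefined>;
  query: Record<string, unknown>;
  body: unknown;
};

type CreateOrderInput = {
  accountId: string;
  dryRun: boolean;
  body: { sku: string; quantity: number };
};

type CheckedRequest =
  | { ok: true; value: CreateOrderInput }
  | { ok: false; errors: string[] };

function validateRequest(raw: RawRequest): CheckedRequest {
  const errors: string[] = [];

  // Path parameter: stays a string and is checked by pattern.
  const accountId = raw.params['accountId'] ?? '';
  if (!/^[0-9]{4}$/.test(accountId)) {
    errors.push('/params/accountId must be four digits');
  }

  // Query parameter: explicit coercion from text, defaulting to false.
  const rawDryRun = raw.query['dryRun'];
  if (rawDryRun !== undefined && rawDryRun !== 'true' && rawDryRun !== 'false') {
    errors.push('/query/dryRun must be "true" or "false"');
  }
  const dryRun = rawDryRun === 'true';

  // Body: strict, no coercion.
  let sku = '';
  let quantity = 0;
  if (typeof raw.body !== 'object' || raw.body === null) {
    errors.push('/body must be an object');
  } else {
    const { sku: rawSku, quantity: rawQuantity } = raw.body as Record<string, unknown>;
    if (typeof rawSku === 'string') sku = rawSku;
    else errors.push('/body/sku must be a string');
    if (typeof rawQuantity === 'number' && Number.isInteger(rawQuantity) && rawQuantity >= 1) {
      quantity = rawQuantity;
    } else {
      errors.push('/body/quantity must be an integer >= 1');
    }
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: { accountId, dryRun, body: { sku, quantity } } };
}
```

Because the result is a tagged union, the handler has to check `ok` before it can touch `value`, which keeps raw `req.query` and `req.body` out of application code.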

Unknown-field policy is the one most teams skip and later regret. For request bodies, three options show up.

Policy	Runtime behavior	Consequence
pass through	extra keys remain	handlers may depend on undocumented fields
strip	extra keys are removed	clients receive less feedback
reject	request fails	public structure stays tight

Which one is right depends on the API. Public create and update endpoints usually want to reject unknown fields, so a client finds out about a typo right away. Internal APIs sometimes strip extras to make rolling deploys smoother. Pass-through fits when the service deliberately stores opaque metadata, and even then the contract should name that object as an extension point.
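The three policies are easy to compare side by side in code. This is a sketch under assumed names (`applyUnknownFieldPolicy`, `UnknownFieldPolicy`), handling only top-level keys. Validator libraries usually expose the same choice as configuration instead.

```typescript
type UnknownFieldPolicy = 'pass-through' | 'strip' | 'reject';

type PolicyResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; unknownKeys: string[] };

// Applies one unknown-field policy to the top-level keys of a request body.
function applyUnknownFieldPolicy(
  body: Record<string, unknown>,
  knownKeys: readonly string[],
  policy: UnknownFieldPolicy,
): PolicyResult {
  const unknownKeys = Object.keys(body).filter((key) => !knownKeys.includes(key));

  if (policy === 'reject' && unknownKeys.length > 0) {
    return { ok: false, unknownKeys };
  }
  if (policy === 'pass-through') {
    return { ok: true, value: { ...body } };
  }
  // 'strip', or 'reject' with nothing to reject: copy only the declared keys.
  const value: Record<string, unknown> = {};
  for (const key of knownKeys) {
    if (Object.prototype.hasOwnProperty.call(body, key)) value[key] = body[key];
  }
  return { ok: true, value };
}

const typoBody = { sku: 'book-1', quantity: 2, quantiy: 3 };
const declaredKeys = ['sku', 'quantity'];

const rejected = applyUnknownFieldPolicy(typoBody, declaredKeys, 'reject');
const stripped = applyUnknownFieldPolicy(typoBody, declaredKeys, 'strip');
const passed = applyUnknownFieldPolicy(typoBody, declaredKeys, 'pass-through');
// rejected names the typo "quantiy"; stripped drops it without telling the
// client; passed keeps it where a handler might start depending on it.
```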

Structural validation finishes before semantic validation starts.

const input = validateCreateOrder(req.body);
const sku = await catalog.findSku(input.sku);
if (!sku) return notFound('sku');

The schema confirms sku is a string, and the catalog lookup confirms the SKU actually exists. Keeping those two checks separate gives you better errors. Malformed structure comes back as a validation error, while a well-formed request for a SKU that does not exist comes back as a normal application response.

Security-specific input checks get their own chapter, but where you put them does not change. A service should reject or normalize external input before it ever touches a database query, a file path, a shell command, template rendering, or a downstream request. This subchapter stays on contract structure, since the broader attack model is a chapter of its own.

Coercion and Default Policies

Coercion changes one runtime value into another before the handler ever sees it. That one fact makes it part of how the API behaves, not just a parsing detail.

The usual case is query strings. The URL gives you text and nothing but text. Your query contract, though, might want booleans, integers, string arrays, or enum values. A validator can take the raw parser output and hand back the value the handler wants.

?limit=25&dryRun=false
  -> { limit: "25", dryRun: "false" }
  -> { limit: 25, dryRun: false }

It looks like a small conversion. It becomes contract behavior the moment a client depends on it. A generated client might hold a real boolean and serialize it as false, while a raw HTTP client just sends the text false. The server has to treat both the same way, following the OpenAPI parameter style and the validator settings.

Keep the rules narrow. limit becomes an integer because the query contract calls it one, and dryRun becomes a boolean for the same reason. A string identifier stays text even when every character is a digit. accountId=0017 and accountId=17 can be two different values when the identifier is a string, and the schema should say so.

{
  "type": "string",
  "pattern": "^[0-9]{4}$"
}

That schema holds "0017" as text and checks its structure. A numeric schema would allow coercion to 17, which changes the identifier and gives you no error to notice it by.
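A narrow coercion policy can be written as a table from parameter name to declared kind. The sketch below is hand-rolled and assumes single-valued query parameters. `coerceQuery` and the `QuerySpec` shape are illustrative names, and the integer rule deliberately rejects leading zeros instead of quietly normalizing them.

```typescript
type QueryFieldKind = 'integer' | 'boolean' | 'string';
type QuerySpec = Record<string, QueryFieldKind>;
type CoercedQuery = Record<string, number | boolean | string>;

// Coerces only what the spec declares as numeric or boolean.
// Anything declared as a string is copied through untouched.
function coerceQuery(
  raw: Record<string, string | undefined>,
  spec: QuerySpec,
): { value: CoercedQuery; errors: string[] } {
  const value: CoercedQuery = {};
  const errors: string[] = [];

  for (const name of Object.keys(spec)) {
    const text = raw[name];
    if (text === undefined) continue; // absence belongs to the default policy

    const kind = spec[name];
    if (kind === 'integer') {
      // Canonical integers only: "25" passes, "025" and "2.5" do not.
      if (/^-?(0|[1-9][0-9]*)$/.test(text)) value[name] = Number(text);
      else errors.push(`${name} must be an integer`);
    } else if (kind === 'boolean') {
      if (text === 'true' || text === 'false') value[name] = text === 'true';
      else errors.push(`${name} must be "true" or "false"`);
    } else {
      value[name] = text; // identifiers stay text, digits or not
    }
  }
  return { value, errors };
}

const coerced = coerceQuery(
  { limit: '25', dryRun: 'false', accountId: '0017' },
  { limit: 'integer', dryRun: 'boolean', accountId: 'string' },
);
// coerced.value is { limit: 25, dryRun: false, accountId: '0017' }
```

Whether `025` should fail or become 25 is itself a policy decision. The sketch picks the strict reading so the choice shows up as an error instead of a silent rewrite.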

Set a higher bar for body coercion. JSON already carries booleans and numbers natively, so a client that sends "quantity": "2" in a JSON body chose to send a string. A forgiving API can coerce it, but then the contract should spell that out through tests and validator settings. Plenty of public APIs land on strict body validation with limited query coercion, which keeps the query strings convenient while the JSON payloads stay precise.

Defaults create the same kind of pressure. Three different layers can each supply one. The OpenAPI document can show a default, the runtime validator can insert it, and the handler can apply its own, and all three need to agree on the value.

Warning: Declaring default in a schema does not make it happen. Documentation tools render it, but the validator inserts it only when you turn that behavior on, so the handler often sees undefined while the published docs promise a value. Either enable default-application and test that the handler actually receives it, or treat the schema default as documentation only and set the value explicitly in code.

{
  "type": "integer",
  "minimum": 1,
  "maximum": 100,
  "default": 50
}

For a limit query parameter, filling the default during validation works well. The handler gets limit: 50 whether the client passed the parameter or left it out. Logs and metrics record the final value either way.

In a request body, defaults surprise people more. A missing priority field might mean "normal priority" in the contract. Or it might mean "the caller left it unset, and the application decides based on account policy." The schema can declare the default value, but the service still needs an explicit rule about where that value gets applied.

Watch the mutation policy. Some validators rewrite the original object as they apply defaults or drop extra fields. Others return a clone, or a wrapper result. Either way, the order your middleware runs in suddenly becomes something you can observe.

const checked = validateRequest(req.body);
audit(req.body);
handle(checked.value);

If validation mutates req.body in place, the audit call sees the post-validation object. A validator that returns a separate value leaves the audit call looking at the raw parsed body instead. Pick one convention and keep it visible. Hidden mutation here produces logs and metrics that are painful to debug, because different middleware ends up reading different data.
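The difference is easy to see with two toy validators that apply the same default. Both functions and the `priority` default are hypothetical; the only thing being compared is whether the caller's object changes.

```typescript
type OrderBody = { sku: string; priority?: string };

// Variant 1: applies the default by mutating its argument.
function validateInPlace(body: OrderBody): OrderBody {
  if (body.priority === undefined) body.priority = 'normal';
  return body;
}

// Variant 2: leaves its argument alone and returns a shallow copy.
function validateToCopy(body: OrderBody): OrderBody {
  const copy: OrderBody = { ...body };
  if (copy.priority === undefined) copy.priority = 'normal';
  return copy;
}

const mutatedRaw: OrderBody = { sku: 'book-1' };
validateInPlace(mutatedRaw);
// An audit step reading mutatedRaw now sees priority: 'normal'.

const untouchedRaw: OrderBody = { sku: 'book-1' };
const checkedCopy = validateToCopy(untouchedRaw);
// An audit step reading untouchedRaw still sees only sku;
// checkedCopy carries the default.
```

Note that the copy here is shallow. A validator that rewrites nested objects would need a deep copy to give the same guarantee.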

Tie the unknown-field policy back to how the API evolves. Rejecting extras catches typos and keeps the contract tight. Stripping them lets clients send future fields during a staggered deploy, at the cost of hiding the occasional integration bug. Passing them through is fine when the schema names an explicit extension point for that data.

{
  "type": "object",
  "properties": {
    "metadata": {
      "type": "object",
      "additionalProperties": { "type": "string" }
    }
  }
}

That field gives clients a spot for extra key/value data while the rest of the request stays closed. An explicit extension field is far safer than accidentally accepting unknown keys everywhere.

The request check should hand back a value whose policy your tests can pin down.

raw body
  -> parse JSON
  -> validate structure
  -> apply allowed coercions
  -> apply allowed defaults
  -> handle unknown fields
  -> semantic checks

That order changes the outcome. If defaults go in before validation, they can satisfy required fields on their own. Run them after validation and the required fields stay tied to what the caller actually sent. Strip unknown fields before the semantic checks and you remove data the application code might otherwise have used. A type-directed coercion policy can even make { type: "number", enum: [1] } accept "1", because the validator converts the value to 1 before it checks the enum. Write all of this down in code as one route-level policy, rather than leaving the assumptions scattered across handlers.
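The effect of ordering on required fields fits in a few lines. The field names and the `normal` default are assumptions for this sketch, and real validators bundle these steps differently, so treat it as a model of the ordering question only.

```typescript
type DraftOrder = { sku?: string; priority?: string };

const requiredKeys: (keyof DraftOrder)[] = ['sku', 'priority'];
const schemaDefaults: DraftOrder = { priority: 'normal' };

// Lists the required keys that are absent from the body.
function missingRequired(body: DraftOrder): string[] {
  return requiredKeys.filter((key) => body[key] === undefined);
}

// Fills in declared defaults without touching what the caller sent.
function withDefaults(body: DraftOrder): DraftOrder {
  return { ...schemaDefaults, ...body };
}

const sentByCaller: DraftOrder = { sku: 'book-1' };

// Defaults first: the inserted priority satisfies the required check.
const defaultsFirst = missingRequired(withDefaults(sentByCaller));

// Validation first: the required check sees only what the caller sent.
const validationFirst = missingRequired(sentByCaller);

// defaultsFirst is [], validationFirst is ['priority'].
```

Same input, same schema, two different verdicts. A route-level test that pins the expected verdict is what keeps a validator upgrade or a reordered middleware from changing it unnoticed.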

Response Validation and Serialization

Everything so far has been about data coming in. Data going out drifts just as easily.

The handler returns an object, and the serializer turns that object into bytes. The OpenAPI response schema is what tells clients which fields to expect. Response validation checks the object against that schema before serialization, or checks the serialized output afterward, depending on the framework and tool.

const order = {
  id: 'ord_1',
  sku: 'book-1',
  internalMargin: 0.41,
};
res.status(201).json(order);

If the response schema lists only id and sku, that extra internalMargin field is a contract bug. The client receives every field the serializer writes, and different clients do different things with the surprise. One ignores it, a generated client parses and drops it, and a human integrator copies it straight into their own code. By that point the published contract and the bytes on the wire have already split apart.

Caution: Whatever the serializer writes reaches the client, schema or no schema. Hand res.json() a raw database row or domain object and you can ship internal-only data along with it, including margins, cost fields, soft-delete timestamps, internal account IDs, and audit columns. Map the domain object to an explicit API representation before you respond, and set additionalProperties: false on the response schema so that a column added later cannot ride out onto the wire.

Response validation catches that split.

{
  "type": "object",
  "required": ["id", "sku"],
  "additionalProperties": false
}

A validator enforcing that schema rejects the extra field. During development and testing, that failure is exactly what you want. In production, failing a response after the application already finished its work gives the client a worse experience than just sending the data. Teams usually settle on one of three patterns. They validate responses in tests, validate selected routes in staging, or use compiled serializers that emit only the fields the response schema declares.

Serialization contracts also decide nullability. A database record can have a nullable column. The API representation can omit the field, return null, or return a concrete fallback. Those are three different contracts, and the schema has to say which one clients should expect.

{
  "type": ["string", "null"]
}

In OpenAPI 3.1-style schemas, that union accepts a string or null. Older 3.0 documents tend to write nullable: true instead. Pick the syntax your specific version supports, then test it against the actual validator and generator you run.

Response validation also guards generated clients. A generated client might type order.id as a string because the OpenAPI schema marks it required. Return { sku: "book-1" } from the server and the client code still compiles, even though the runtime data has no id at all. A contract check should catch that long before a consumer does.

There is a real performance cost here. Validating every response is extra CPU work, spent after the handler has already finished its job. For high-traffic routes, a compiled serializer can enforce the schema by construction, reading it once and then emitting only the declared fields in the declared order. Subchapter 03 covers the framework-level serializer paths. What does not change is that the outbound response structure is part of the API contract, so response data needs its own check on the way out.
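The idea behind a compiled serializer can be shown with a toy version. `compileSerializer` is a name made up for this sketch, and it only handles a flat list of declared fields. Production serializers generate specialized code from the full response schema.

```typescript
// Builds a serializer once from the declared field list. The returned
// function writes only those fields, in the declared order.
function compileSerializer(declared: readonly string[]) {
  return (source: Record<string, unknown>): string => {
    const out: Record<string, unknown> = {};
    for (const key of declared) {
      out[key] = source[key];
    }
    // JSON.stringify omits keys whose value is undefined, so a missing
    // required field still needs a separate check or a test.
    return JSON.stringify(out);
  };
}

// Compile at startup, reuse on every response.
const serializeOrder = compileSerializer(['id', 'sku']);

const wireBody = serializeOrder({
  id: 'ord_1',
  sku: 'book-1',
  internalMargin: 0.41,
});
// wireBody === '{"id":"ord_1","sku":"book-1"}'
```

The undeclared `internalMargin` never reaches the wire, and nothing runs per request beyond the copy and the stringify.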

Response schemas also decide which body belongs to which status code. A 201 from order creation returns the created Order. A 400 returns a validation error envelope, whose design Subchapter 04 covers. A 404 for a missing SKU might use a different envelope again. The operation should list every response a client has to handle.

responses:
  '201':
    $ref: '#/components/responses/CreatedOrder'
  '400':
    $ref: '#/components/responses/ValidationError'

Reusable response components keep repeated error bodies consistent. They still need to point at schemas that describe real response bodies. A named response component is part of the contract, not a documentation shortcut.

Response validation catches accidental structure changes, and it also exposes fuzzy representation decisions. Say the database returns a timestamp as a Date object, and JSON serialization turns that into a string. The response schema should describe the string, because that is what clients receive. If you validate before serialization, the validator sees a Date while the schema expects a string. Move the check after serialization and it sees the serialized JSON value instead. Either way, the service has to commit to one validation point.

const body = toOrderRepresentation(order);
validateOrderResponse(body);
res.status(201).json(body);

That ordering is explicit. First the domain object turns into the API representation, then validation checks that representation, and serialization writes the response last. The handler never hands raw domain state to the serializer and hopes the output happens to match the public schema.
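A sketch of the two helpers in that snippet might look like this. The record shape, the `createdAt` field, and the error strings are assumptions added for illustration. The mapping step is where the Date becomes the string the schema describes.

```typescript
// Hypothetical domain record, as it might come back from storage.
type OrderRecord = {
  id: string;
  sku: string;
  internalMargin: number;
  createdAt: Date;
};

// The public representation: only declared fields, timestamps as text.
type OrderRepresentation = { id: string; sku: string; createdAt: string };

function toOrderRepresentation(order: OrderRecord): OrderRepresentation {
  return {
    id: order.id,
    sku: order.sku,
    createdAt: order.createdAt.toISOString(),
  };
}

// Checks the representation before serialization: every declared field
// must be a string, and no undeclared field may be present.
function validateOrderResponse(body: unknown): string[] {
  if (typeof body !== 'object' || body === null) return ['/ must be an object'];
  const record = body as Record<string, unknown>;
  const declared = ['id', 'sku', 'createdAt'];
  const errors: string[] = [];
  for (const key of declared) {
    if (typeof record[key] !== 'string') errors.push(`/${key} must be a string`);
  }
  for (const key of Object.keys(record)) {
    if (!declared.includes(key)) errors.push(`/${key} is not declared`);
  }
  return errors;
}

const storedOrder: OrderRecord = {
  id: 'ord_1',
  sku: 'book-1',
  internalMargin: 0.41,
  createdAt: new Date(Date.UTC(2026, 0, 2)),
};

const mappedErrors = validateOrderResponse(toOrderRepresentation(storedOrder));
const rawErrors = validateOrderResponse(storedOrder);
// mappedErrors is empty. rawErrors flags createdAt (a Date, not a string)
// and the undeclared internalMargin.
```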

Response schemas should account for partial data too. A list endpoint might return a compact representation while an item endpoint returns the full detail. Both are Order resources, but their representations differ. Give them separate names when clients need different fields.

OrderListItem:
  type: object
OrderDetail:
  type: object

Shared names only make generated clients nicer when the representation really is shared. Reuse Order on every route and you create pressure to mark every field optional, which weakens the contract. Separate response components keep required fields meaningful.

TypeScript Types and Runtime Schemas

It is tempting to think a TypeScript type and a JSON Schema are two views of the same check. They run at completely different times.

A TypeScript type describes source code to a checker, while a JSON Schema describes runtime values to a validator. Generated server types can connect the two, but the jobs stay separate.

type CreateOrder = {
  sku: string;
  quantity: number;
};

That type helps a developer call createOrder(input) correctly, and it catches source-level mistakes before a build ships. The runtime validator does something different. It inspects the JSON request body after Node has parsed it.

The split gets sharper once you bring in direct Node TypeScript execution from Chapter 8. Node strips the type-only syntax and runs the JavaScript that is left. Even on a project with full tsc checks, production takes in values from clients, databases, caches, queues, and downstream services. Every one of those arrives outside the TypeScript checker.

Important: A generated CreateOrderRequest type checks your source, not the request. Node strips types before it runs, so the body, query, headers, and downstream responses all arrive from outside the checker. A typed handler with no runtime validator accepts whatever the client sends. Keep one runtime validator at every point where external data enters, no matter how thoroughly the code type-checks.

Figure: Incoming bytes reach two paths. A generated TypeScript type checks your source and is erased before Node runs, so it gates nothing at runtime. Only the Ajv validator runs on the actual bytes, which is why a typed handler still needs a runtime validator at the edge.

Types derived from the schema are still useful.

type CreateOrder = FromSchema<typeof CreateOrderSchema>;

That pattern asks tooling to derive the TypeScript type straight from the runtime schema, which cuts out one duplicate declaration. The handler uses CreateOrder, the validator uses CreateOrderSchema, and the runtime schema stays the thing that actually gates incoming data.
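To show the mechanism without pulling in a library, here is a deliberately tiny inference helper. `InferObject` and `PrimitiveFor` are toy types written for this sketch: they cover flat objects with primitive properties and treat every property as required. A real schema-to-type helper handles far more of JSON Schema.

```typescript
// The runtime schema is the single source. `as const` keeps the literal
// types so the compiler can read them.
const CreateOrderSchema = {
  type: 'object',
  properties: {
    sku: { type: 'string' },
    quantity: { type: 'integer' },
  },
  required: ['sku', 'quantity'],
} as const;

// Maps one property schema to a TypeScript type.
type PrimitiveFor<T> = T extends { type: 'string' }
  ? string
  : T extends { type: 'integer' | 'number' }
    ? number
    : T extends { type: 'boolean' }
      ? boolean
      : unknown;

// Toy inference: every listed property is treated as required.
type InferObject<S extends { properties: Record<string, unknown> }> = {
  -readonly [K in keyof S['properties']]: PrimitiveFor<S['properties'][K]>;
};

type CreateOrder = InferObject<typeof CreateOrderSchema>;
// Resolves to { sku: string; quantity: number }

// The runtime gate reads the same schema object the type came from.
function isCreateOrder(value: unknown): value is CreateOrder {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return CreateOrderSchema.required.every((key) => {
    const kind = CreateOrderSchema.properties[key].type;
    const field = record[key];
    return kind === 'string'
      ? typeof field === 'string'
      : typeof field === 'number' && Number.isInteger(field);
  });
}
```

Changing a property in `CreateOrderSchema` now changes both the compile-time type and the runtime check, which is the whole argument for deriving one from the other.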

The reverse direction works too. You can generate JSON Schema or OpenAPI from TypeScript types, and for simple structures that is fine. The trouble shows up when TypeScript expresses something that the JSON Schema generators map poorly, or when runtime validation needs policy that no type can carry, like coercion, defaults, unknown-field handling, formats, and serialized response structure.

Literal unions usually map cleanly.

type OrderStatus = 'pending' | 'paid' | 'cancelled';

A generator can emit an enum schema from that.

{ "enum": ["pending", "paid", "cancelled"] }

Other TypeScript features need careful generator rules. Branded types, conditional types, generic helpers, and inferred framework context can give a developer a pleasant model while the runtime checks behind it stay weak. When the contract has to be correct, read the emitted schema yourself. Treat generation as a build step whose input and output you can both see.

Decide on purpose how generated client and server code enters the workflow. One team commits generated clients so the diffs stay visible in review. Another publishes them as packages from CI. A third generates them during the consumer's own build. Each choice carries its own failure mode. The committed client goes stale, the CI-published one can get ahead of the deployed server, and the consumer-generated one can run a different generator version from the one the service team uses.

The contract file should name the public operation clearly enough for generated code to use.

operationId: createAccountOrder

A stable operationId becomes a method name across many clients. If you rename it, you break consumer code even when the path and payload never moved. Once clients generate from these ids, the ids themselves are public names.

Generated server types help a route handler stay aligned with the OpenAPI document. They can type the path params, the query params, the body, and the response payload. The runtime still needs validators where input enters, but the types cut down mistakes inside the handler. A handler that returns the wrong response structure can fail type-checking before response validation ever runs.

All of this can produce false confidence. A generated type can make a handler look correct even though the route registration wired the wrong schema to it. A generated client can make a request look correct even though the deployed service still runs an older contract. The types only ever help the source code, and keeping the running system aligned is a separate job that belongs to deployment checks and runtime validation.

The cleanest systems commit to one primary schema source. You might write the runtime schema first and derive TypeScript from it, or write typed route declarations first and generate OpenAPI from those, or write OpenAPI first and generate both clients and server types out of it. Any of those works. Running two primary sources at once is how you build drift in from the start.

Startup Wiring

A broken contract should stop the service from starting, not wait around to surface on a live request.

In a Node service, the moment to load and compile API schemas is startup, or route registration. Right there the process can read the OpenAPI document, resolve the $ref targets, compile the validators, attach them to routes, and refuse to listen at all when the contract is broken. A bad schema then takes down a deployment instead of surfacing as a request-time surprise.

const contract = loadOpenApi('openapi.yaml');
const validators = compileOperations(contract);

for (const route of routes) {
  route.validate = validators.get(route.operationId);
}

The snippet skips the library details to keep the lifecycle visible. The contract loads once, the operation validators compile once, and each route picks up the validator for its operationId. From then on, request handling just calls a function that is already built.

Check registration in both directions. Every public route needs a matching operation, and every operation meant for this service needs a route behind it. A missing validator counts as a startup error, because at that point the service has already lost the link between what it does at runtime and what it published.

for (const route of routes) {
  // A route without a compiled validator has lost its contract.
  if (!route.validate) failMissingContract(route);
}

The check is deliberately plain. What counts is the invariant behind it. Route metadata and contract metadata have to agree before any traffic arrives.
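Checking both directions takes only a little more code. In this sketch the route list and the list of operation ids are plain arrays, and `findContractMismatches` and `assertContractWired` are hypothetical names. A real service would read the ids from its loaded OpenAPI document.

```typescript
type RouteEntry = { name: string; operationId: string };

type ContractMismatches = {
  routesWithoutOperation: string[];
  operationsWithoutRoute: string[];
};

// Compares registered routes against the operations the contract declares.
function findContractMismatches(
  routes: readonly RouteEntry[],
  operationIds: readonly string[],
): ContractMismatches {
  const routeIds = routes.map((route) => route.operationId);
  return {
    routesWithoutOperation: routes
      .filter((route) => !operationIds.includes(route.operationId))
      .map((route) => route.name),
    operationsWithoutRoute: operationIds.filter((id) => !routeIds.includes(id)),
  };
}

// Call before listen(): any mismatch stops the service from starting.
function assertContractWired(
  routes: readonly RouteEntry[],
  operationIds: readonly string[],
): void {
  const { routesWithoutOperation, operationsWithoutRoute } =
    findContractMismatches(routes, operationIds);
  if (routesWithoutOperation.length > 0 || operationsWithoutRoute.length > 0) {
    throw new Error(
      `contract mismatch: routes without operation [${routesWithoutOperation.join(', ')}], ` +
        `operations without route [${operationsWithoutRoute.join(', ')}]`,
    );
  }
}

const mismatches = findContractMismatches(
  [
    { name: 'orders.create', operationId: 'createAccountOrder' },
    { name: 'orders.debug', operationId: 'debugOrder' },
  ],
  ['createAccountOrder', 'listAccountOrders'],
);
// mismatches.routesWithoutOperation is ['orders.debug']
// mismatches.operationsWithoutRoute is ['listAccountOrders']
```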

Startup is also the place to compile response validators or serializers. An Order response schema that references a component which does not exist should bring the service down before it accepts a single request. A schema using a keyword the chosen validator cannot handle should fail at startup the same way. And a parameter definition that uses a query-array style the parser layer never produces should fail during route setup, or get pinned by a test that runs through the same parser.

Hot reload and long-lived processes add one wrinkle. In local development, the contract files change while the server is running, so a watcher can recompile the validators or restart the process. In production, contract changes should go through the same deployment path as handler code. A running process that holds new handlers with old validators, or the other way around, is schema drift sitting in memory.

The route object should carry enough metadata to debug with.

{
  name: 'orders.create',
  operationId: 'createAccountOrder',
  validate: createAccountOrderValidator
}

Logs and validation errors can carry the route name and the operation ID, which makes a failure easy to locate. A 400 from request validation belongs to one specific operation. A 500 from response validation points at one specific response schema. The names tie a runtime failure straight back to the contract file.

Startup is also the point where generated server types and runtime validators come together. Type generation happened during the build, validator compilation happens here at startup, and the route metadata binds both of them to the handler. Inside a running Node process, that binding is what an API contract actually amounts to.

Drift Checks

Drift almost never arrives as a big obvious break. It starts with one small change that nobody propagates.

Say the handler starts returning an extra field.

return {
  id: order.id,
  sku: order.sku,
  warehouseId: order.warehouseId,
};

The response schema still lists only id and sku, the generated clients are out of date, and so are the docs. The real response carries warehouseId regardless.

Or the spec moves first.

required: [sku, quantity, warehouseId]

The server still accepts only sku and quantity. The generated clients now send the new field on every request, and the deployed handler ignores it. The contract looks like it moved, even though the runtime behavior never changed.

The drift checks that work are simple ones that run close to the artifacts.

OpenAPI document builds
  -> references resolve
  -> schemas compile
  -> generated clients update cleanly
  -> generated server types match handlers
  -> request examples validate
  -> response examples validate

CI can run those checks at this chapter's level, well short of a full contract-testing framework. Contract tests and consumer-driven contract tests get their own chapter later. Even at this layer, the simple checks catch a lot, like an unresolved $ref, an invalid schema keyword, a stale generated file, an example that fails validation, or a response fixture that drifted from the documented schema.
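As one example of how small these checks can be, the sketch below finds unresolved local `$ref` targets in an already parsed document. The function names are invented here. It handles only same-document references of the form `#/...` and reports external references as unresolved instead of fetching them. Percent-decoding of the fragment is also left out.

```typescript
// Collects every "$ref" string anywhere in a parsed document.
function collectRefs(node: unknown, found: string[] = []): string[] {
  if (Array.isArray(node)) {
    node.forEach((child: unknown) => collectRefs(child, found));
  } else if (typeof node === 'object' && node !== null) {
    const record = node as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      const child = record[key];
      if (key === '$ref' && typeof child === 'string') found.push(child);
      else collectRefs(child, found);
    }
  }
  return found;
}

// Follows a local JSON Pointer such as "#/components/schemas/Order".
function resolveLocalRef(doc: unknown, ref: string): unknown {
  if (!ref.startsWith('#/')) return undefined; // external refs are out of scope
  let current: unknown = doc;
  for (const rawToken of ref.slice(2).split('/')) {
    // JSON Pointer escapes: "~1" is "/", "~0" is "~".
    const token = rawToken.replace(/~1/g, '/').replace(/~0/g, '~');
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[token];
  }
  return current;
}

// Returns the refs that point at nothing, ready to fail a CI step.
function unresolvedRefs(doc: unknown): string[] {
  return collectRefs(doc).filter((ref) => resolveLocalRef(doc, ref) === undefined);
}

const sampleDoc = {
  paths: {
    '/orders': {
      post: {
        responses: {
          '201': { $ref: '#/components/responses/CreatedOrder' },
          '400': { $ref: '#/components/responses/ValidationError' },
        },
      },
    },
  },
  components: {
    responses: { CreatedOrder: { description: 'created' } },
  },
};

const dangling = unresolvedRefs(sampleDoc);
// dangling is ['#/components/responses/ValidationError']
```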

Runtime checks catch the rest as data moves through the service. Request validation rejects bad input before the handler runs, and response validation or schema-driven serialization catches outbound drift. A startup step can compile every route schema and fail early on an invalid document, while a generated-client check proves the published OpenAPI file still produces the package you expect.

A lot of this is habit more than tooling. When you change a route, you update its operation in the same commit. Moving a field from optional to required means the examples and generated types move with it. A change to a coercion setting comes with a test that shows the exact input and the value it produces. And a decision to strip or reject unknown fields gets tested right at the route.

The contract is never just one file. It is spread across the document, the schemas, the generated code, and the handlers, and the API only stays coherent as long as all of them keep describing the same runtime behavior.

Related Reading