Lab 03 · Node Runtime Labs

Stream processing workbench.

Build a local Node.js workbench that generates NDJSON, processes it through whole-file and streaming paths, records backpressure, compresses output, cancels pipelines, and compares the evidence.


Phases: 14

Difficulty: 8.5/10

Package installs: 0 deps

Stream commands

Workbench command path

01 Generate
$ npm run generate -- --rows 1000 --payload-bytes 32
Creates repeatable NDJSON for small test runs.

02 Baseline
$ npm run naive
Measures the whole-file implementation first.

03 Stream
$ npm run stream -- --where "status >= 500"
Runs the streaming path with a real filter.

04 Pressure
$ npm run pressure -- --slow-write-ms 5
Makes drain timing and buffer growth visible.

05 Compare
$ npm run compare
Checks output parity and report differences.

Generate data, compare processors, then pressure the pipeline.

What this lab builds

The finished project measures stream behavior.

Controlled data generator

Generate predictable NDJSON with row count, payload size, error ratio, seed, output path, byte count, and a generator report.

Naive and stream processors

Build a whole-file baseline, then build a stream path with source adapters, line splitting, JSON parsing, filtering, plugins, and stringification.

Backpressure evidence

Simulate a slow writable destination, record queue length and drain events, and produce a pressure timeline with memory and throughput samples.

Comparison reports

Write generator, naive, stream, split-check, pressure, gzip, abort, and mode comparison reports without shipping a solution on the public page.

Complete phase plan

Every phase in Lab 03.

The build starts with a predictable CLI shell and ends with a comparison report across naive, stream, pressure, gzip, and highWaterMark variants. Each phase ships one visible capability and one artifact to inspect.

00 Create the project

Create the workbench shell with ESM enabled, Node v24+ declared, stable folders, npm scripts, help output, and strict flag handling.

Tasks

Create the package manifest with ESM mode and a Node v24 or newer runtime contract.
Separate source code, generated input, processed output, and reports into dedicated folders.
Create the overview CLI entry that prints available commands, default paths, and common flags.
Add scripts for workbench, generate, naive, stream, split-check, pressure, and compare.
Treat --help as a success path and unknown flags as clear failures.
Keep the first argument parser small enough to inspect branch by branch.

Artifact

A runnable project shell with predictable commands, paths, help behavior, and failure behavior.
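
The help and unknown-flag tasks above could be sketched roughly as follows. This is a minimal illustration, not the lab's solution: the flag table, the usage text, and the names parseFlags and runCli are hypothetical.

```javascript
// Hypothetical flag table for illustration; the lab's real flags may differ.
const KNOWN_FLAGS = new Map([
  ['--help', { key: 'help', takesValue: false }],
  ['--rows', { key: 'rows', takesValue: true }],
  ['--out', { key: 'out', takesValue: true }],
]);

function parseFlags(argv) {
  const flags = { help: false };
  for (let i = 0; i < argv.length; i += 1) {
    const spec = KNOWN_FLAGS.get(argv[i]);
    if (!spec) throw new Error(`Unknown flag: ${argv[i]}`);
    if (!spec.takesValue) {
      flags[spec.key] = true;
      continue;
    }
    const value = argv[i + 1];
    if (value === undefined || value.startsWith('--')) {
      throw new Error(`Missing value for ${argv[i]}`);
    }
    flags[spec.key] = value;
    i += 1; // skip the value that was just consumed
  }
  return flags;
}

// Returns an exit code: 0 for success (including --help), 1 for bad flags.
function runCli(argv) {
  try {
    const flags = parseFlags(argv);
    if (flags.help) console.log('Usage: workbench [--rows N] [--out PATH] [--help]');
    return 0;
  } catch (error) {
    console.error(error.message);
    return 1;
  }
}
```

An entry file might then end with process.exitCode = runCli(process.argv.slice(2)), which keeps every branch of the parser visible in one short loop.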

01 Generate large NDJSON data

Generate data/access.ndjson with configurable row count, payload size, error ratio, seed, and output location.

Tasks

Create the generator entrypoint and connect it to the generate command.
Support row count, payload bytes, error ratio, seed, and output path options.
Emit deterministic rows with id, timestamp, method, path, status, IP, user agent, and payload fields.
Write each row incrementally through a file stream while respecting writable backpressure.
Report the generated line count and byte size after the writer finishes.
Write reports/generator.json with settings, output path, line count, and byte count.

Artifact

A repeatable NDJSON workload plus a generator report that later phases can cite.
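
One way to combine seeded rows with a backpressure-aware writer is sketched below. It is an assumption-laden example: the PRNG (mulberry32) is swapped in for illustration, the row shape is reduced to a few fields, and the names createRandom, makeRow, and generate are hypothetical.

```javascript
import { createWriteStream } from 'node:fs';
import { once } from 'node:events';

// mulberry32-style seeded PRNG, used here only so that runs repeat.
function createRandom(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Simplified row: the lab's rows carry more fields (timestamp, IP, user agent).
function makeRow(id, random, { errorRatio, payloadBytes }) {
  const isError = random() < errorRatio;
  return {
    id,
    method: random() < 0.5 ? 'GET' : 'POST',
    path: `/items/${Math.floor(random() * 1000)}`,
    status: isError ? 500 : 200,
    payload: 'x'.repeat(payloadBytes),
  };
}

async function generate({ output, rows, seed, errorRatio, payloadBytes }) {
  const random = createRandom(seed);
  const file = createWriteStream(output);
  let bytes = 0;
  for (let id = 1; id <= rows; id += 1) {
    const line = `${JSON.stringify(makeRow(id, random, { errorRatio, payloadBytes }))}\n`;
    bytes += Buffer.byteLength(line);
    // write() returning false means the buffer reached highWaterMark: wait for drain.
    if (!file.write(line)) await once(file, 'drain');
  }
  file.end();
  await once(file, 'finish');
  return { output, lines: rows, bytes };
}
```

The returned object holds the values the generator report needs, and it is only produced after the writer has finished.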

02 Build the naive baseline

Process the whole NDJSON file with readFile, filter error rows, write filtered output, and report the memory-heavy baseline.

Tasks

Create the naive processor and connect it to the naive command.
Support input, output, and where options with status >= 500 as the default filter.
Load the full input file, decode it, split it into lines, and count rows.
Parse every non-empty line, retain matching rows, and report parse failures with line numbers.
Write filtered NDJSON with the same logical format the stream path will produce.
Write reports/naive.json with rows, matches, elapsed time, bytes, and peak memory.

Artifact

A whole-file processor that intentionally retains the input, decoded text, line array, matches, and serialized output.
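
A rough sketch of the whole-file path follows. It deliberately holds the buffer, the decoded text, the line array, the matches, and the serialized output at once. The function names and the returned field names are assumptions for illustration.

```javascript
import { readFile, writeFile } from 'node:fs/promises';

// Pure helper: everything is already in memory by the time this runs.
function filterWholeText(text, predicate) {
  const lines = text.split('\n');
  const matches = [];
  const failures = [];
  let rows = 0;
  lines.forEach((line, index) => {
    if (line.trim() === '') return; // skip blank lines, including the final one
    rows += 1;
    try {
      const row = JSON.parse(line);
      if (predicate(row)) matches.push(row);
    } catch {
      failures.push({ line: index + 1 }); // 1-based line number
    }
  });
  return { rows, matches, failures };
}

async function runNaive({ input, output }) {
  const buffer = await readFile(input);
  const text = buffer.toString('utf8');
  const result = filterWholeText(text, (row) => row.status >= 500);
  const serialized = result.matches.map((row) => `${JSON.stringify(row)}\n`).join('');
  await writeFile(output, serialized);
  return {
    rows: result.rows,
    matches: result.matches.length,
    parseFailures: result.failures,
    inputBytes: buffer.length,
    outputBytes: Buffer.byteLength(serialized),
  };
}
```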

03 Add measurement utilities

Create shared timing, memory, byte-count, and report-writing utilities so later modes can be compared.

Tasks

Create the measurement module used by every command that produces evidence.
Record wall-clock start and end timestamps plus monotonic elapsed milliseconds.
Sample RSS, heap used, heap total, external memory, and ArrayBuffer memory on an interval.
Track input bytes, output bytes, and mode-specific byte counters as raw numbers.
Write pretty JSON reports with stable top-level groups for mode, status, input, output, counts, timing, and memory.
Move the naive command onto the shared timer, sampler, counters, and report writer.

Artifact

One report shape shared by naive, stream, pressure, compression, abort, and comparison runs.
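
A measurement helper along these lines could back the shared report. This is a sketch under stated assumptions: createMeasurement, sampleEveryMs, and the returned field names are hypothetical, and only peak values are kept.

```javascript
function createMeasurement({ sampleEveryMs = 50 } = {}) {
  const startedAt = new Date().toISOString(); // wall-clock start
  const startMs = performance.now(); // monotonic clock for elapsed time
  const peak = { rss: 0, heapUsed: 0, heapTotal: 0, external: 0, arrayBuffers: 0 };

  const sample = () => {
    const usage = process.memoryUsage();
    for (const key of Object.keys(peak)) {
      peak[key] = Math.max(peak[key], usage[key]);
    }
  };

  sample();
  const timer = setInterval(sample, sampleEveryMs);
  timer.unref(); // sampling alone should not keep the process alive

  return {
    finish() {
      clearInterval(timer);
      sample(); // one last sample so short runs still report memory
      return {
        timing: {
          startedAt,
          endedAt: new Date().toISOString(),
          elapsedMs: performance.now() - startMs,
        },
        memory: { peak: { ...peak } },
      };
    },
  };
}
```

A command would call createMeasurement() before its work and merge the result of finish() into the timing and memory groups of its report.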

04 Build source adapters

Create compatible readable sources for regular files, stdin, and gzipped input.

Tasks

Create the stream command boundary with source, input, and read highWaterMark options.
Create a source adapter module that returns a readable stream and source metadata.
Implement file input through a file stream with optional highWaterMark tuning.
Implement stdin input while keeping diagnostics off stdout.
Implement gzip input by reading compressed bytes and exposing decompressed NDJSON downstream.
Record source kind, input path, compressed state, declared bytes, and counted bytes read.
Keep all source-specific branching inside the adapter so downstream stages receive one byte-stream contract.

Artifact

File, stdin, and gzip sources that produce compatible readable streams and source reports.
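
The adapter idea might look like the sketch below. One design assumption to flag: instead of a single pre-piped stream, this version returns a list of stages so that a later pipeline call owns error propagation. The names openSource and createByteCounter and the metadata keys are hypothetical.

```javascript
import { createReadStream, statSync } from 'node:fs';
import { Transform } from 'node:stream';
import { createGunzip } from 'node:zlib';

// Pass-through stage that counts bytes as they leave the raw source.
function createByteCounter(meta) {
  return new Transform({
    transform(chunk, _encoding, callback) {
      meta.bytesRead += chunk.length;
      callback(null, chunk);
    },
  });
}

function openSource({ kind, input, highWaterMark }) {
  const meta = {
    kind,
    input: input ?? null,
    compressed: kind === 'gzip',
    declaredBytes: null, // unknown for stdin
    bytesRead: 0,
  };
  let raw;
  if (kind === 'stdin') {
    raw = process.stdin;
  } else if (kind === 'file' || kind === 'gzip') {
    meta.declaredBytes = statSync(input).size;
    raw = createReadStream(input, highWaterMark ? { highWaterMark } : {});
  } else {
    throw new Error(`Unknown source kind: ${kind}`);
  }
  const stages = [raw, createByteCounter(meta)];
  if (kind === 'gzip') stages.push(createGunzip()); // downstream sees plain NDJSON bytes
  return { stages, meta };
}
```

Downstream code would spread source.stages into its pipeline call and never branch on the source kind again. For gzip input, bytesRead counts compressed bytes because the counter sits before the gunzip stage.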

05 Build a line splitter transform

Split arbitrary byte chunks into complete NDJSON lines while carrying partial lines between chunks.

Tasks

Create the line splitter module as a Transform stream or async generator stage.
Decode incoming Buffer chunks without corrupting UTF-8 characters split across chunk boundaries.
Carry the incomplete final segment after each newline split.
Flush the final carry when the upstream source ends and ignore the empty carry created by a final newline.
Normalize CRLF input by stripping one trailing carriage return from emitted lines.
Create a split checker that reads with tiny chunks to force carry-buffer behavior.
Use the splitter in the stream command so file, stdin, and gzip runs report both byte count and row count.

Artifact

A chunk-safe NDJSON splitter plus reports/split-check.json with expected lines, seen lines, read buffer size, and pass/fail status.
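
The carry-buffer behavior can be sketched with a small core class wrapped in a Transform. StringDecoder holds back incomplete UTF-8 sequences between chunks. The names LineCarry and createLineSplitter are hypothetical, and this is one possible shape rather than the lab's implementation.

```javascript
import { Transform } from 'node:stream';
import { StringDecoder } from 'node:string_decoder';

const stripCarriageReturn = (line) => (line.endsWith('\r') ? line.slice(0, -1) : line);

class LineCarry {
  constructor() {
    this.decoder = new StringDecoder('utf8'); // buffers split multi-byte characters
    this.carry = '';
  }

  // Returns the complete lines found so far and keeps the unfinished tail.
  push(chunk) {
    const parts = (this.carry + this.decoder.write(chunk)).split('\n');
    this.carry = parts.pop();
    return parts.map(stripCarriageReturn);
  }

  // Emits the final carry, unless it is the empty remainder after a last newline.
  flush() {
    const rest = this.carry + this.decoder.end();
    this.carry = '';
    return rest === '' ? [] : [stripCarriageReturn(rest)];
  }
}

function createLineSplitter() {
  const state = new LineCarry();
  return new Transform({
    readableObjectMode: true, // bytes in, one string per line out
    transform(chunk, _encoding, callback) {
      for (const line of state.push(chunk)) this.push(line);
      callback();
    },
    flush(callback) {
      for (const line of state.flush()) this.push(line);
      callback();
    },
  });
}
```

Feeding LineCarry one byte at a time is a cheap way to approximate what the split checker does with tiny read buffers.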

06 Build a filter DSL

Add a small query syntax for filtering parsed rows and share it between naive and stream modes.

Tasks

Create the filter module with separate parsing and predicate compilation boundaries.
Parse one-clause expressions for status comparisons, method equality, path containment, and inequality.
Compile each filter once so the row loop receives a ready predicate function.
Add a JSON row parser stage that converts line strings into row objects and reports malformed lines with numbers.
Add the stream filter stage and count rows seen and rows matched.
Move naive mode onto the shared compiled predicate so output comparisons use the same selection rules.

Artifact

Shared filter behavior for whole-file and streaming modes, with matching counts for the same input and query.
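
A one-clause filter with separate parse and compile steps might be sketched like this. Only the status >= 500 form appears on this page; the = , != , and contains spellings, along with the names parseFilter and compileFilter, are assumptions for illustration.

```javascript
// Assumed grammar: <field> <operator> <value>, one clause only.
const CLAUSE = /^\s*(\w+)\s*(>=|<=|!=|=|>|<|contains)\s*(.+?)\s*$/;

const OPERATORS = {
  '=': (a, b) => a === b,
  '!=': (a, b) => a !== b,
  '>': (a, b) => a > b,
  '>=': (a, b) => a >= b,
  '<': (a, b) => a < b,
  '<=': (a, b) => a <= b,
  contains: (a, b) => String(a).includes(String(b)),
};

// Parsing boundary: text in, plain description out.
function parseFilter(expression) {
  const match = CLAUSE.exec(expression);
  if (!match) throw new Error(`Cannot parse filter: ${expression}`);
  const [, field, op, raw] = match;
  const asNumber = Number(raw);
  return { field, op, value: Number.isNaN(asNumber) ? raw : asNumber };
}

// Compilation boundary: done once, so the row loop only calls a function.
function compileFilter(expression) {
  const { field, op, value } = parseFilter(expression);
  const compare = OPERATORS[op];
  return (row) => compare(row[field], value);
}
```

Both the naive and the stream mode could call compileFilter once at startup and pass the resulting predicate into their row loops.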

07 Add transform plugins

Support pluggable row transformations for redaction, field selection, and enrichment.

Tasks

Create a plugin module or plugin directory that builds an ordered list of row transforms from CLI options.
Implement redaction for requested fields before data leaves the process.
Implement field selection while preserving the requested output field order.
Implement enrichment fields such as status class and payload length.
Encode the plugin order in one place: filter, redact, enrich, select, then stringify.
Add a composition check that proves redaction, enrichment, and selection run in the intended order.
Record active plugins and their order in the stream report.

Artifact

A stream path that can redact, enrich, narrow, and serialize rows while preserving auditable plugin order.
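
The fixed ordering can be encoded in a single builder, as in the sketch below. The option names, the '[redacted]' marker, and the enrichment field names are hypothetical. Filtering and stringification are assumed to happen in neighboring stages.

```javascript
const redact = (fields) => (row) => {
  const copy = { ...row };
  for (const field of fields) {
    if (field in copy) copy[field] = '[redacted]';
  }
  return copy;
};

const enrich = () => (row) => ({
  ...row,
  statusClass: `${Math.floor(row.status / 100)}xx`,
  payloadLength: typeof row.payload === 'string' ? row.payload.length : 0,
});

// Builds a new object in the requested field order.
const select = (fields) => (row) =>
  Object.fromEntries(fields.filter((field) => field in row).map((field) => [field, row[field]]));

// The only place that knows the order: redact, then enrich, then select.
function buildPlugins({ redactFields = [], enrichRows = false, selectFields = [] } = {}) {
  const plugins = [];
  if (redactFields.length > 0) plugins.push({ name: 'redact', apply: redact(redactFields) });
  if (enrichRows) plugins.push({ name: 'enrich', apply: enrich() });
  if (selectFields.length > 0) plugins.push({ name: 'select', apply: select(selectFields) });
  return plugins;
}

const applyPlugins = (plugins, row) => plugins.reduce((current, plugin) => plugin.apply(current), row);
```

Because enrichment runs after redaction in this order, a redacted payload would report the length of the marker string, which is the kind of effect a composition check should make visible. The plugin names from buildPlugins can go straight into the stream report.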

08 Add a slow sink simulator

Create a Writable stream that intentionally writes slowly and records backpressure signals.

Tasks

Create a custom SlowWritable destination with configurable delay, highWaterMark, discard mode, and optional output path.
Count chunks, bytes, drain events, current writable length, writable highWaterMark, and max buffered bytes.
Delay each write callback so the destination becomes slower than the upstream stages.
Create the pressure command using the existing source, splitter, parser, filter, plugin, stringifier, and slow sink stages.
Support slow-write delay, write highWaterMark, read highWaterMark, and filter options.
Write reports/pressure.json with slow sink metrics and elapsed time.

Artifact

A controlled slow writable destination that makes backpressure measurable.
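
A slow destination could be sketched as a Writable subclass that delays each write callback. The option names and the metrics keys below are assumptions, and discard mode and the optional output path are left out to keep the example short.

```javascript
import { Writable } from 'node:stream';

class SlowWritable extends Writable {
  constructor({ delayMs = 5, highWaterMark = 16 * 1024 } = {}) {
    super({ highWaterMark });
    this.delayMs = delayMs;
    this.metrics = { chunks: 0, bytes: 0, drains: 0, maxBufferedBytes: 0 };
    this.on('drain', () => {
      this.metrics.drains += 1;
    });
  }

  _write(chunk, _encoding, callback) {
    this.metrics.chunks += 1;
    this.metrics.bytes += chunk.length;
    // writableLength counts bytes queued in this stream, including the current chunk.
    this.metrics.maxBufferedBytes = Math.max(this.metrics.maxBufferedBytes, this.writableLength);
    // Delaying the callback is what makes the sink slower than its upstream.
    setTimeout(callback, this.delayMs);
  }
}
```

When a write pushes the queued bytes to highWaterMark or beyond, write() returns false and a later drain event signals that the producer may continue, so both signals end up in the metrics.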

09 Add backpressure visualization

Produce a timeline of readable pressure, writable pressure, throughput, drain events, and memory.

Tasks

Add a sample interval option to the pressure command and record it in the report.
Collect timeline samples during the run and force one final sample during shutdown.
Sample RSS, heap used, external memory, readable queue length when available, and writable queue length from the slow sink.
Sample bytes read, bytes written, rows seen, rows matched, and per-window throughput.
Record drain events with elapsed timestamps from inside the drain listener.
Summarize max readable length, max writable length, max RSS, total drains, total bytes, and elapsed time.

Artifact

A pressure timeline that shows queue length, drain events, throughput, and memory over time.
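
Per-window throughput can be derived from cumulative counters, as in this sketch. The sample shape and the names createTimeline, record, and summarize are hypothetical. The caller is assumed to supply elapsed time and queue lengths from its own timer and streams.

```javascript
function createTimeline() {
  const samples = [];
  let previous = { elapsedMs: 0, bytesWritten: 0 };

  return {
    samples,

    // Call on each interval tick, and once more during shutdown.
    record({ elapsedMs, bytesRead, bytesWritten, readableLength = null, writableLength = 0 }) {
      const windowMs = elapsedMs - previous.elapsedMs;
      const windowBytes = bytesWritten - previous.bytesWritten;
      samples.push({
        elapsedMs,
        bytesRead,
        bytesWritten,
        readableLength, // null when the source does not expose it
        writableLength,
        rss: process.memoryUsage().rss,
        bytesPerSecond: windowMs > 0 ? (windowBytes / windowMs) * 1000 : 0,
      });
      previous = { elapsedMs, bytesWritten };
    },

    summarize() {
      return {
        samples: samples.length,
        maxReadableLength: Math.max(0, ...samples.map((s) => s.readableLength ?? 0)),
        maxWritableLength: Math.max(0, ...samples.map((s) => s.writableLength)),
        maxRss: Math.max(0, ...samples.map((s) => s.rss)),
      };
    },
  };
}
```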

10 Make pipeline lifecycle explicit

Record completion, failure, and cleanup details around the stream/promises.pipeline boundary.

Tasks

Refactor stream and pressure commands so each run creates fresh source, transform, and destination stages.
Use the promise-based pipeline API as the operation boundary.
Normalize the stream command stage list: source, counters, optional gunzip, splitter, parser, filter, plugins, stringifier, optional gzip, destination.
Normalize the pressure command through the same pipeline boundary while preserving slow sink metrics.
Record pipeline status, completion, error code, and tracked stage-close information.
Keep stream errors flowing through one try/catch/finally path before report finalization.

Artifact

Stream and pressure reports that say whether the pipeline completed, failed, or cleaned up after an error.
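
The single try/catch/finally path might be sketched as follows. The lifecycle field names and the function name runPipeline are assumptions. The stages are expected to be freshly created stream objects for each run.

```javascript
import { pipeline } from 'node:stream/promises';

async function runPipeline(stages) {
  const lifecycle = {
    status: 'running',
    completed: false,
    errorCode: null,
    errorMessage: null,
    closedStages: [], // indexes of stages that emitted 'close'
    finishedAt: null,
  };

  stages.forEach((stage, index) => {
    stage.once('close', () => lifecycle.closedStages.push(index));
  });

  try {
    await pipeline(...stages);
    lifecycle.status = 'completed';
    lifecycle.completed = true;
  } catch (error) {
    lifecycle.status = 'failed';
    lifecycle.errorCode = error.code ?? null;
    lifecycle.errorMessage = error.message;
  } finally {
    // Runs on both paths, so the report always gets a finish timestamp.
    lifecycle.finishedAt = new Date().toISOString();
  }
  return lifecycle;
}
```

Some close events may arrive shortly after the pipeline settles, so a report writer that cares about closedStages may want to finalize on a later tick. That timing is an assumption worth checking in your own runs.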

11 Add abort and partial output cleanup

Cancel a running pipeline safely and keep partial files from being mistaken for successful output.

Tasks

Support row-count aborts, timer aborts, and a keep-partial inspection flag.
Create one AbortController per run and pass its signal into the pipeline boundary.
Abort after the configured row threshold from inside the data path.
Abort after the configured timeout from outside the stream graph and clear the timer during cleanup.
Write file output through a temporary path and promote it only after successful completion.
Remove or mark partial output on abort while preserving the original abort reason in the report.

Artifact

Aborted runs that report status, reason, partial-file policy, and cleanup state, and that never promote corrupt output to the final path.
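
Abort handling and temp-file promotion could be combined roughly like this. The .partial suffix, the report keys, and the name writeWithAbort are hypothetical, and edge cases such as aborting before the file has opened are not handled in this sketch.

```javascript
import { createWriteStream } from 'node:fs';
import { rename, rm } from 'node:fs/promises';
import { pipeline } from 'node:stream/promises';

async function writeWithAbort({ source, output, signal, keepPartial = false }) {
  const tempPath = `${output}.partial`; // assumed naming convention
  const report = { status: 'running', reason: null, partial: null };
  try {
    // The signal is passed as the trailing options argument of pipeline().
    await pipeline(source, createWriteStream(tempPath), { signal });
    await rename(tempPath, output); // promote only after a clean finish
    report.status = 'completed';
  } catch (error) {
    report.status = signal.aborted ? 'aborted' : 'failed';
    // Prefer the reason given to controller.abort() over the generic AbortError.
    report.reason = signal.aborted ? String(signal.reason) : error.message;
    if (keepPartial) {
      report.partial = tempPath;
    } else {
      await rm(tempPath, { force: true });
      report.partial = 'removed';
    }
  }
  return report;
}
```

A caller would create one AbortController per run, call controller.abort('row limit') from inside the data path or controller.abort('timeout') from a timer, and clear that timer in its own finally block.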

12 Add compression

Add gzip compression for output and decompression for input, then report compression ratio and pressure effects.

Tasks

Support gzip output mode, gzip level, and compressed output paths.
Read gzipped input through the existing gzip source path before line splitting.
Track compressed input bytes, uncompressed input bytes, uncompressed output bytes, and compressed output bytes.
Record gzip mode, gzip level, and compression ratio when compression is enabled.
Compare decompressed gzip content against raw NDJSON content for equivalent runs.
Run pressure tests in raw and gzip modes and compare drain counts, elapsed time, and byte counts.

Artifact

Stream reports that show compression settings, byte boundaries, ratios, and pressure changes.
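
Byte boundaries on both sides of the gzip stage can be counted with pass-through transforms, as sketched below. The ratio is defined here as compressed bytes divided by uncompressed bytes, which is an assumption. The names countBytes and gzipWithReport are hypothetical, and the output is collected in memory only to keep the example self-contained.

```javascript
import { Readable, Transform } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { createGzip } from 'node:zlib';

// Pass-through stage that adds chunk sizes to counters[key].
function countBytes(counters, key) {
  counters[key] = 0;
  return new Transform({
    transform(chunk, _encoding, callback) {
      counters[key] += chunk.length;
      callback(null, chunk);
    },
  });
}

async function gzipWithReport(lines, { level = 6 } = {}) {
  const counters = {};
  const chunks = [];
  await pipeline(
    Readable.from(lines),
    countBytes(counters, 'uncompressedBytes'), // before gzip
    createGzip({ level }),
    countBytes(counters, 'compressedBytes'), // after gzip
    async function collect(source) {
      for await (const chunk of source) chunks.push(chunk);
    },
  );
  const ratio = counters.uncompressedBytes > 0
    ? counters.compressedBytes / counters.uncompressedBytes
    : null;
  return {
    report: { gzip: true, level, ...counters, ratio },
    compressed: Buffer.concat(chunks),
  };
}
```

Decompressing the returned buffer and comparing it with the raw NDJSON is one way to check that a gzip run and a raw run are logically equivalent.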

13 Compare modes

Compare naive, stream, slow sink, gzip, and highWaterMark variants in one report.

Tasks

Create the comparison command and connect it to the compare script.
Define named scenarios for naive default, stream default, pressure default, slow pressure, gzip output, low read highWaterMark, and larger write highWaterMark.
Compute line counts, match counts, and SHA-256 content hashes for outputs that should be equivalent.
Read rows, matches, byte counts, elapsed time, peak RSS, peak heap, peak external memory, and drain count from scenario reports.
Write reports/comparison.md with one row per scenario and links to the JSON reports used.
Fail the comparison command when equivalent logical outputs disagree.
End the comparison with the production rule you would apply based on the measured result.

Artifact

A mode comparison report that validates output equality and exposes memory, timing, pressure, and compression differences.
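
The equivalence check could be sketched with SHA-256 hashes over output text. This version treats the first scenario as the reference and requires byte-identical output, which is stricter than logical equality. The names summarizeOutput and compareScenarios are hypothetical.

```javascript
import { createHash } from 'node:crypto';

// Line count plus a content hash for one scenario's (decompressed) output text.
function summarizeOutput(name, text) {
  const lines = text.split('\n').filter((line) => line !== '');
  return {
    name,
    lines: lines.length,
    sha256: createHash('sha256').update(text, 'utf8').digest('hex'),
  };
}

// Compares every scenario against the first one.
function compareScenarios(summaries) {
  const [reference, ...others] = summaries;
  const mismatches = others
    .filter((s) => s.sha256 !== reference.sha256 || s.lines !== reference.lines)
    .map((s) => s.name);
  return { reference: reference.name, mismatches, ok: mismatches.length === 0 };
}
```

A compare command might set process.exitCode = 1 when ok is false, so that disagreeing outputs fail the run instead of only appearing in the table.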

Workbench evidence

The reports make each mode auditable.

Lab 03 keeps the experiment grounded in files you can inspect: generator settings, whole-file memory, stream counts, splitter checks, pressure timelines, compression data, abort cleanup, and final mode comparison.

generator.json: NDJSON settings, output path, rows, and bytes

naive.json: whole-file counts, timing, bytes, and peak memory

stream.json: source, filter, plugin, lifecycle, and output data

split-check.json: tiny-chunk line splitting validation

pressure.json: slow sink queues, drains, memory, and throughput

stream-gzip.json: compressed and uncompressed byte boundaries

stream-abort.json: abort reason, cleanup state, and partial files

comparison.md: mode table with hashes, metrics, and final rule


Other labs in the bundle

Lab 03 sits inside the runtime lab set.

The bundle includes seven runtime projects covering process observation, binary storage, streams, module resolution, file watching, async orchestration, and custom protocols.

Lab 01