Streaming HTML Architecture Patterns

6 min read Chapter 29 of 33

The Symptom

The e-commerce product detail page uses SSR with streaming, but FCP is 600ms instead of the expected 200ms. The streaming implementation renders the entire page inside a single Suspense boundary, so the slowest data source (recommendations, 460ms) blocks everything, including the shell.

The Cause

A single Suspense boundary wrapping the whole page defeats streaming. The onShellReady callback fires only when all non-suspended content is rendered. If the shell itself depends on slow data, the shell is not ready until that data arrives.

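import { Suspense, use } from "react";

// SLOW: Single Suspense boundary wraps everything
function ProductPage({ productId }: { productId: string }) {
  return (
    <Suspense fallback={<FullPageSkeleton />}>
      <ProductContent productId={productId} />
    </Suspense>
  );
}

// ProductContent waits for ALL data before rendering anything
function ProductContent({ productId }: { productId: string }) {
  // Start all three requests before the first use() call so they run in
  // parallel. Assumes the fetch helpers memoize their promises per request;
  // otherwise each retry would start the remaining requests again.
  const productPromise = fetchProduct(productId); // 40ms
  const reviewsPromise = fetchReviews(productId); // 280ms
  const recommendationsPromise = fetchRecommendations(productId); // 460ms

  const product = use(productPromise);
  const reviews = use(reviewsPromise);
  const recommendations = use(recommendationsPromise);
  // Nothing renders until all three resolve (460ms)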

  return (
    <>
      <ProductHeader product={product} />
      <ProductReviews reviews={reviews} />
      <Recommendations items={recommendations} />
    </>
  );
}

The Baseline

Single-boundary streaming:

  • Shell ready: 460ms (blocked by recommendations)
  • FCP: 600ms (460ms server + 140ms network)
  • LCP: 800ms
  • Total server render time: 460ms

The Fix

Granular Suspense Boundaries

Split the page into independent streaming boundaries based on data source latency:

// FAST: Independent Suspense boundaries per data source
function ProductPage({ productId }: { productId: string }) {
  return (
    <Layout>
      {/* Shell: only depends on product data (40ms) */}
      <ProductShell productId={productId} />

      {/* Streams independently when reviews resolve (280ms) */}
      <Suspense fallback={<ReviewsSkeleton />}>
        <ProductReviews productId={productId} />
      </Suspense>

      {/* Streams independently when recommendations resolve (460ms) */}
      <Suspense fallback={<RecommendationsSkeleton />}>
        <Recommendations productId={productId} />
      </Suspense>
    </Layout>
  );
}

// Shell component: fast data only
function ProductShell({ productId }: { productId: string }) {
  const product = use(fetchProduct(productId)); // 40ms

  return (
    <>
      <ProductHeader product={product} />
      <ProductImages images={product.images} />
      <ProductPrice
        price={product.price}
        originalPrice={product.originalPrice}
      />
      <AddToCartButton productId={product.id} />
    </>
  );
}

The shell now depends only on fetchProduct (40ms). The onShellReady callback fires at 40ms, and HTML streaming begins immediately. Reviews and recommendations stream in when their data resolves, each replacing its skeleton placeholder.
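The snippets in this chapter call helpers such as fetchProduct directly during render. That works best when each helper hands back the same promise on every render attempt, so a memoizing layer is assumed. A minimal sketch of such a layer follows; createRequestCache and loadProduct are hypothetical names, not React APIs, and on a real server the cache would be created once per request so that data never leaks between users.

```typescript
// Hypothetical per-request promise cache; none of these names come from React.
type Loader<T> = () => Promise<T>;

function createRequestCache() {
  const pending = new Map<string, Promise<unknown>>();

  // Returns the promise already stored under `key`, or starts the loader once.
  return function cached<T>(key: string, loader: Loader<T>): Promise<T> {
    const existing = pending.get(key);
    if (existing !== undefined) {
      return existing as Promise<T>;
    }
    const created = loader();
    pending.set(key, created);
    return created;
  };
}

// Usage sketch: a stand-in loader that counts how often it actually runs.
let loadCount = 0;
const cached = createRequestCache();
const loadProduct = (): Promise<{ id: string }> => {
  loadCount += 1;
  return Promise.resolve({ id: "p1" });
};

const firstCall = cached("product:p1", loadProduct);
const secondCall = cached("product:p1", loadProduct);
// Both calls share one promise instance, so the loader ran only once.
```

Keying by a string such as "product:p1" keeps the sketch simple; a production version would also need to decide how rejected promises are evicted.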

Nested Suspense for Progressive Disclosure

Some sections have sub-components with different data latencies. The reviews section has a summary (fast) and individual reviews (slower):

function ProductReviews({ productId }: { productId: string }) {
  const summary = use(fetchReviewSummary(productId)); // 80ms

  return (
    <section>
      {/* Summary renders when this boundary streams */}
      <ReviewSummary
        averageRating={summary.averageRating}
        totalReviews={summary.totalReviews}
        distribution={summary.distribution}
      />

      {/* Individual reviews stream later */}
      <Suspense fallback={<ReviewListSkeleton count={5} />}>
        <ReviewList productId={productId} />
      </Suspense>
    </section>
  );
}

function ReviewList({ productId }: { productId: string }) {
  const reviews = use(fetchReviews(productId)); // 280ms

  return (
    <ul>
      {reviews.map((review) => (
        <ReviewCard key={review.id} review={review} />
      ))}
    </ul>
  );
}

The review summary (average rating, total count) streams at 80ms. Individual reviews stream at 280ms. The user sees meaningful content (the rating breakdown) 200ms before the full review list appears.

Error Boundaries in Streaming Context

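If a streamed section fails (for example, the recommendation service is down), the error must not break the already-rendered page. Wrap each Suspense boundary in an Error Boundary. During server rendering, React handles an error inside a Suspense boundary by sending that boundary's fallback and retrying the section on the client; if the client render fails as well, the Error Boundary catches the error and shows its own fallback: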

import { Component, type ErrorInfo, type ReactNode } from "react";

interface ErrorBoundaryProps {
  fallback: ReactNode;
  children: ReactNode;
}

interface ErrorBoundaryState {
  hasError: boolean;
}

class StreamErrorBoundary extends Component<
  ErrorBoundaryProps,
  ErrorBoundaryState
> {
  state: ErrorBoundaryState = { hasError: false };

  static getDerivedStateFromError(): ErrorBoundaryState {
    return { hasError: true };
  }

  componentDidCatch(error: Error, info: ErrorInfo): void {
    console.error("Stream section failed:", error, info);
  }

  render(): ReactNode {
    if (this.state.hasError) {
      return this.props.fallback;
    }
    return this.props.children;
  }
}

// Usage: wrapping each streamed section
function ProductPage({ productId }: { productId: string }) {
  return (
    <Layout>
      <ProductShell productId={productId} />

      <StreamErrorBoundary fallback={<ReviewsUnavailable />}>
        <Suspense fallback={<ReviewsSkeleton />}>
          <ProductReviews productId={productId} />
        </Suspense>
      </StreamErrorBoundary>

      <StreamErrorBoundary fallback={<RecommendationsUnavailable />}>
        <Suspense fallback={<RecommendationsSkeleton />}>
          <Recommendations productId={productId} />
        </Suspense>
      </StreamErrorBoundary>
    </Layout>
  );
}

If recommendations fail, the user sees a “Recommendations unavailable” message instead of a broken page. The product header, images, price, and reviews remain intact.

Server Timing Headers

Measure streaming performance by exposing server-side timing data:

import { renderToPipeableStream } from 'react-dom/server';
import type { Request, Response } from 'express';

function handleProductRequest(req: Request, res: Response): void {
  const startTime = performance.now();

  const { pipe } = renderToPipeableStream(
    <ProductPage productId={req.params.id} />,
    {
      bootstrapScripts: ['/static/client.js'],
      onShellReady() {
        const shellTime = performance.now() - startTime;

        res.setHeader('Content-Type', 'text/html; charset=utf-8');
        res.setHeader(
          'Server-Timing',
          `shell;dur=${shellTime.toFixed(1)};desc="Shell render"`
        );
        res.statusCode = 200;
        pipe(res);
      },
      onAllReady() {
        const totalTime = performance.now() - startTime;
        // Log total render time for monitoring
        console.log(`Full render: ${totalTime.toFixed(1)}ms`);
      },
      onError(error: unknown) {
        console.error('Render error:', error);
      },
    }
  );
}

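The Server-Timing header appears in the Timing tab of each request in the Chrome DevTools Network panel, making shell render time visible during development. Because headers are sent before the rest of the stream, the header can only report the shell time; the total render time is logged from onAllReady instead. The CI pipeline can parse this header to assert that shell render time stays below a threshold.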
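A CI check on the header could look roughly like the sketch below. It is a simplified parser that assumes descriptions contain no commas or semicolons, and both parseServerTiming and assertShellBudget are hypothetical helper names; the 100ms budget is an illustrative value, not one taken from the project.

```typescript
// One metric from a Server-Timing header, e.g. shell;dur=41.3;desc="Shell render"
interface ServerTimingEntry {
  name: string;
  dur?: number;
  desc?: string;
}

// Simplified parser: assumes descriptions contain no commas or semicolons.
function parseServerTiming(header: string): ServerTimingEntry[] {
  return header
    .split(",")
    .map((metric) => metric.trim())
    .filter((metric) => metric.length > 0)
    .map((metric) => {
      const [name = "", ...params] = metric.split(";").map((part) => part.trim());
      const entry: ServerTimingEntry = { name };
      for (const param of params) {
        const eq = param.indexOf("=");
        if (eq === -1) continue;
        const key = param.slice(0, eq).trim().toLowerCase();
        const value = param.slice(eq + 1).trim().replace(/^"|"$/g, "");
        if (key === "dur") entry.dur = Number(value);
        if (key === "desc") entry.desc = value;
      }
      return entry;
    });
}

// Hypothetical CI check: throws when the shell metric is missing or over budget.
function assertShellBudget(header: string, budgetMs: number): void {
  const shell = parseServerTiming(header).find((entry) => entry.name === "shell");
  if (shell === undefined || shell.dur === undefined) {
    throw new Error("Server-Timing header has no shell duration");
  }
  if (shell.dur > budgetMs) {
    throw new Error(`Shell render took ${shell.dur}ms, budget is ${budgetMs}ms`);
  }
}

const sampleHeader = 'shell;dur=41.3;desc="Shell render"';
const parsed = parseServerTiming(sampleHeader);
assertShellBudget(sampleHeader, 100); // does not throw: 41.3ms is within budget
```

In a pipeline, the header string would come from a request against a locally started server; that step is omitted here.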

The Proof

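| Metric | Single Boundary | Granular Boundaries | Delta |
| --- | --- | --- | --- |
| Shell ready (server) | 460ms | 40ms | -420ms |
| FCP | 600ms | 180ms | -420ms |
| LCP | 800ms | 380ms | -420ms |
| Reviews visible | 600ms | 420ms | -180ms |
| Recommendations visible | 600ms | 600ms | 0ms |
| TTI | 1,200ms | 1,100ms | -100ms |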

The recommendations appear at the same time in both approaches, because the data takes 460ms regardless. The difference is that everything else renders earlier: the shell by 420ms and the reviews by 180ms. The user sees the product header, images, and price at 180ms instead of 600ms.
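The visibility figures can be reproduced with a small back-of-the-envelope model. The sketch below assumes a fixed 140ms network cost (taken from the baseline's "460ms server + 140ms network" breakdown) and ignores rendering and paint time, so it is an approximation of the measurements rather than a replacement for them.

```typescript
// Data-source latencies quoted in this chapter (milliseconds).
const latency = { product: 40, reviews: 280, recommendations: 460 };

// Assumption: a constant network cost added to every section.
const NETWORK_MS = 140;

interface VisibleAt {
  shell: number;
  reviews: number;
  recommendations: number;
}

// Single boundary: nothing is visible until the slowest source resolves.
function singleBoundary(): VisibleAt {
  const ready =
    Math.max(latency.product, latency.reviews, latency.recommendations) +
    NETWORK_MS;
  return { shell: ready, reviews: ready, recommendations: ready };
}

// Granular boundaries: each section is visible once its own data resolves.
function granularBoundaries(): VisibleAt {
  return {
    shell: latency.product + NETWORK_MS,
    reviews: latency.reviews + NETWORK_MS,
    recommendations: latency.recommendations + NETWORK_MS,
  };
}

const single = singleBoundary();
const granular = granularBoundaries();

// Negative values mean the granular version is faster.
const improvement = {
  shell: granular.shell - single.shell,
  reviews: granular.reviews - single.reviews,
  recommendations: granular.recommendations - single.recommendations,
};
```

Under these assumptions the model yields the same FCP, reviews, and recommendations figures as the table; LCP and TTI depend on image loading and hydration, which the model does not cover.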

The Trade-off

More Suspense boundaries mean more skeleton states for the user to see. If seven sections each stream at different times, the page “assembles” over 500ms with content popping in progressively. This can feel chaotic. The guideline: group data sources with similar latencies into a single Suspense boundary. The product page uses three groups: fast (product data, 40ms), medium (reviews, 80-280ms), and slow (recommendations, 460ms).
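One way to apply the grouping guideline is to bucket data sources by measured latency and give each bucket a single boundary. The sketch below is only an illustration: the 50ms and 300ms cut-offs are assumptions chosen to reproduce the three groups above, and groupByTier is a hypothetical helper.

```typescript
interface DataSource {
  name: string;
  latencyMs: number;
}

type Tier = "fast" | "medium" | "slow";

// Assumed cut-offs, chosen only to reproduce the three groups in the text.
const FAST_MAX_MS = 50;
const MEDIUM_MAX_MS = 300;

function tierFor(latencyMs: number): Tier {
  if (latencyMs <= FAST_MAX_MS) return "fast";
  if (latencyMs <= MEDIUM_MAX_MS) return "medium";
  return "slow";
}

// Groups sources so that each tier can share one Suspense boundary.
function groupByTier(sources: DataSource[]): Record<Tier, string[]> {
  const groups: Record<Tier, string[]> = { fast: [], medium: [], slow: [] };
  for (const source of sources) {
    groups[tierFor(source.latencyMs)].push(source.name);
  }
  return groups;
}

const boundaryGroups = groupByTier([
  { name: "product", latencyMs: 40 },
  { name: "reviewSummary", latencyMs: 80 },
  { name: "reviews", latencyMs: 280 },
  { name: "recommendations", latencyMs: 460 },
]);
```

Fixed cut-offs are the simplest option; latencies measured in production (for example p75 values) would be a more reliable input than the single numbers used here.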

Skeleton components must match the dimensions of the rendered content as closely as possible. If the review skeleton is 200px tall and the rendered reviews are 340px tall, the 140px height change pushes everything below it down and, in this example, causes a CLS of 0.12 (above the 0.1 threshold). The exact score depends on the viewport size and on how much visible content shifts. Every skeleton in the streaming architecture must have an explicit height, minimum height, or aspect ratio matching the expected content size.
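To see why a 140px mismatch matters, it helps to work through how a single layout shift is scored: the score is the impact fraction multiplied by the distance fraction. The sketch below uses a hypothetical 360x800 viewport and a hypothetical offset for the content below the reviews, so it illustrates the calculation rather than reproducing the 0.12 figure.

```typescript
interface Viewport {
  width: number;
  height: number;
}

// Score for one vertical shift:
// layout shift score = impact fraction * distance fraction.
// Assumption: the shifted content spans the full viewport width and runs
// past the bottom of the viewport both before and after the shift.
function verticalShiftScore(
  viewport: Viewport,
  shiftedContentTopPx: number, // top edge of the content that moves
  shiftPx: number // how far it moves down
): number {
  // Union of the before and after positions, clipped to the viewport.
  const affectedHeight = Math.max(0, viewport.height - shiftedContentTopPx);
  const impactFraction = affectedHeight / viewport.height;

  // Distance is measured against the viewport's larger dimension.
  const distanceFraction = shiftPx / Math.max(viewport.width, viewport.height);

  return impactFraction * distanceFraction;
}

// Hypothetical phone viewport; content below the reviews starts 300px down.
const shiftScore = verticalShiftScore({ width: 360, height: 800 }, 300, 140);
const exceedsThreshold = shiftScore > 0.1;
```

With these assumed numbers the impact fraction is 500/800 and the distance fraction is 140/800, which already puts a single shift over the 0.1 threshold.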