Optimizing Docker Images: A Data-Driven Guide to Reducing Image Size with Dive

These articles are AI-generated summaries. Please check the original sources for full details.

Docker Image Diet: Find the Problem With dive Before Trying to Fix It

Engineer Recca Tsai demonstrates that guessing at image optimizations is less effective than profiling the image with diagnostic tooling first. By identifying specific layer inefficiencies, Tsai reduced a standard Node.js image from 1.25GB to 139MB.

Why This Matters

Engineers often apply generic optimization checklists without understanding why an image is large, which yields minimal gains. In practice, build-time dependencies and duplicated files (such as a 561MB apt-get layer or 107MB of wasted devDependencies) often persist in the final image, increasing storage costs and slowing deployment pipelines. Effective optimization requires visibility into the layer stack so that changes target the actual source of weight rather than its symptoms.

Key Insights

  • The ‘docker image history’ command reveals the size of each layer, for example a 561MB layer that installs build tools such as gcc and python3.
  • The ‘dive’ tool identifies file-level duplication, showing that files like typescript.js can appear in multiple layers when a .dockerignore file is missing.
  • Wasted space often stems from devDependencies; analysis showed 107MB of waste from packages like @babel/parser that serve no purpose in production.
  • Multi-stage builds effectively separate build environments from production runtimes, reducing a Node.js image to its 139MB Alpine-based floor.
  • Switching to Google’s Distroless images for Node.js can further reduce the final production footprint to approximately 100MB.
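
The two diagnostic steps above can be sketched as a pair of small shell helpers. This is a minimal sketch, assuming Docker and dive are installed locally; the function names ‘layer_sizes’ and ‘inspect_files’ and the image name in the usage note are hypothetical and do not come from the original article.

```shell
#!/bin/sh
# Hypothetical helper names; pass the name of an image you have built locally.

# Print each layer's size next to the instruction that created it,
# so an oversized layer (for example a large apt-get step) stands out.
layer_sizes() {
  docker image history --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' "$1"
}

# Open dive's interactive view to browse the files each layer adds or
# changes and to see how much space is reported as wasted.
inspect_files() {
  dive "$1"
}
```

Running ‘layer_sizes my-node-app:latest’ first and ‘inspect_files my-node-app:latest’ second mirrors the order the article recommends: find the heavy layer, then look inside it.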

Working Examples

A typical unoptimized Node.js Dockerfile that results in a 1.25GB image.

FROM node:latest
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
CMD ["node", "index.js"]

A multi-stage build using Alpine and production-only dependencies to reduce size to 139MB.

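# Build stage: installs all dependencies, including devDependencies.
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .

# Production stage: starts from a clean base and installs runtime dependencies only.
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
# Copy only the application entry point from the build stage.
COPY --from=builder /app/index.js ./
CMD ["node", "index.js"]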
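
The Key Insights mention Distroless as a further step. A hedged sketch of what that variant might look like follows; the image tags are assumptions rather than the author's exact setup, and the build stage is Debian-based here so that any native modules match the glibc-based runtime.

```dockerfile
# Dependency stage: Debian-based (assumed tag) to match the Distroless runtime below.
FROM node:20-bookworm-slim AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Runtime stage: assumed Distroless tag; it ships Node.js but no shell or npm.
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY index.js ./
# The image's entrypoint is already node, so CMD holds only the script path.
CMD ["index.js"]
```

Because the runtime image has no shell, ‘docker exec’ debugging sessions are not available, which is a trade-off to weigh against the smaller footprint.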

Using the ‘scratch’ base image for Go applications to create minimal images containing only the binary.

FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o server .

FROM scratch
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]

Practical Applications

  • Use Case: Deploying Node.js microservices where production stages use ‘--omit=dev’ to exclude large testing frameworks like Jest. Pitfall: Forgetting a .dockerignore file causes the local node_modules directory to be copied over the fresh install, which can double the image size.
  • Use Case: Identifying redundant system packages in legacy images by using ‘dive’ to find unused build-essential tools. Pitfall: Relying on ‘node:latest’, which is Debian-based and includes hundreds of megabytes of unnecessary system utilities.
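
To guard against the first pitfall, a .dockerignore file placed next to the Dockerfile keeps local artifacts out of the build context. The entries below are a minimal sketch under assumed project conventions, not the author's actual file; adjust them to what your repository contains.

```text
# Local dependencies: the image installs its own copy with npm ci
node_modules
npm-debug.log

# Version control and build metadata that the running app does not need
.git
Dockerfile
.dockerignore
```

With node_modules excluded, ‘COPY . .’ no longer overwrites the modules installed inside the image.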
