Your Docker image is large and you don’t know why. Guessing and applying a checklist of generic tips might not cut much off. Find out what’s actually causing the size, then fix that specifically.
The Tools: docker image history + dive
Two tools, different purposes.
docker image history is built into Docker. It shows how much space each layer takes:
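A typical invocation (the image name `myapp` here and throughout is a stand-in for your own tag):

```shell
docker image history myapp:latest
```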
dive is a third-party tool. It lets you browse each layer’s contents interactively and reports how much space is being wasted:
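dive ships as a standalone binary (installable via Homebrew or a GitHub release); point it at the same image:

```shell
dive myapp:latest
```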
Starting With a Fat Image
A typical Node.js Dockerfile with no optimizations:
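Something along these lines. The exact contents are illustrative, but they produce the layers analyzed below: Debian-based `node:latest`, apt-installed build tools, a plain `npm install`, and a blanket `COPY . .`:

```dockerfile
FROM node:latest

WORKDIR /app

# Build tools for native addons, not needed at runtime
RUN apt-get update && apt-get install -y build-essential python3 gcc

COPY package*.json ./
RUN npm install            # installs devDependencies too

COPY . .                   # copies node_modules in again (no .dockerignore)

EXPOSE 3000
CMD ["node", "server.js"]  # entry file is illustrative
```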
package.json has a few devDependencies: jest, typescript, @types/express.
After docker build, docker images shows:
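The ID and timestamp below are placeholders; the size is the point:

```
REPOSITORY   TAG       IMAGE ID      CREATED          SIZE
myapp        latest    <image-id>    10 seconds ago   1.25GB
```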
1.25GB. Don’t touch the Dockerfile yet — find out where the weight is.
Step 1: docker image history
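Run it against the image:

```shell
docker image history myapp:latest
```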
Output (relevant lines):
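The shape of the output, with the layer sizes discussed below (IDs and base-image layers abbreviated):

```
IMAGE      CREATED BY                                      SIZE
<layer>    COPY . . # buildkit                             49.3MB
<layer>    RUN /bin/sh -c npm install # buildkit           61MB
<layer>    RUN /bin/sh -c apt-get update && apt-get in…    561MB
<layer>    ... base image layers ...
```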
Three things stand out immediately:
- 561MB apt-get layer: `build-essential`, `python3`, `gcc` — build tools that aren’t needed at runtime, but they’re stuck in the image
- 61MB npm install: includes devDependencies (jest, typescript) that production doesn’t use
- 49.3MB COPY . .: `node_modules` got copied in (no `.dockerignore`)
Step 2: dive to Find the Waste
docker image history shows layer sizes. dive shows what’s actually inside each layer:
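Running `dive myapp:latest` opens the interactive layer browser; the `--ci` flag instead prints a non-interactive efficiency report, which is easier to quote:

```shell
dive myapp:latest --ci
```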
Output:
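Roughly the following; the wasted total and file paths come from this analysis, the efficiency score is illustrative and per-file sizes are elided:

```
Analyzing image...
  efficiency: 91 %
  wastedBytes: 107 MB

Count   Wasted Space   File Path
    2            ...   /app/node_modules/typescript/...
    2            ...   /app/node_modules/@babel/parser/...
```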
107MB wasted. dive names the culprits directly: typescript, @babel/parser — all devDependencies that serve no purpose in a production image.
“Count: 2” means the same file appears in two layers — once from npm install, once from COPY . .. That’s what happens without a .dockerignore: node_modules gets installed, then copied in again on top.
Fix What’s Actually Broken
Problems identified. Fix each one:
Problem 1: node_modules copied twice
→ Add .dockerignore
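A minimal `.dockerignore` for this case; `node_modules` is the line that matters here, the rest are common extras:

```
node_modules
npm-debug.log
.git
```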
Problem 2: devDependencies in production image
→ Multi-stage build, production stage uses --omit=dev
Problem 3: Base image is too heavy (Debian + build tools)
→ Switch to node:20-alpine
The fixed Dockerfile:
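A sketch of that shape; the `npm run build` step and the `dist/` layout are assumptions, adjust to your project:

```dockerfile
# --- build stage: devDependencies available here ---
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build              # e.g. tsc; assumes a build script exists

# --- production stage: runtime dependencies only ---
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY --from=build /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/server.js"]  # entry file is illustrative
```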
Rebuild and compare:
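Build under a new tag and list both:

```shell
docker build -t myapp:slim .
docker images
```

With sizes like:

```
REPOSITORY   TAG      IMAGE ID     CREATED          SIZE
myapp        slim     <image-id>   5 seconds ago    139MB
myapp        latest   <image-id>   20 minutes ago   1.25GB
```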
Run dive again to confirm:
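Wasted bytes should now be near zero: the duplicated node_modules is gone and there are no dev packages left to flag.

```shell
dive myapp:slim --ci
```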
Can You Go Further?
The 139MB floor is mostly the Node.js runtime inside node:20-alpine. To go lower, switch to distroless:
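One way to do it. `gcr.io/distroless/nodejs20-debian12` bundles just the Node runtime, with no shell or package manager, and its entrypoint is already `node`:

```dockerfile
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .

FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=build /app ./
CMD ["server.js"]   # passed to the built-in node entrypoint; file name is illustrative
```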
That gets you to around 100MB. Beyond that, the gains are small — unless you switch to Go.
Go: A Different Story
Go compiles to a static binary. Use scratch — a completely empty base image:
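A sketch; the Go version and package layout are assumptions:

```dockerfile
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 produces a fully static binary that can run on an empty base
RUN CGO_ENABLED=0 go build -o /server .

FROM scratch
COPY --from=build /server /server
ENTRYPOINT ["/server"]
```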
Final image is just the binary. A few MB to a few dozen MB. At this point dive has little to tell you — there’s almost nothing to optimize.
Summary
Workflow: docker image history to find the heavy layers → dive to see what’s inside them → fix the actual problem.
The common culprits:
- Bloated base image: `node:latest` (Debian) → `node:20-alpine`
- devDependencies in production: multi-stage build + `--omit=dev`
- Duplicate node_modules: add a `.dockerignore`
Use tools to diagnose, make targeted fixes, then verify with tools again. More effective than guessing.
