Docker — Complete Reference

Everything you need to go from zero to production-ready

Fundamentals

What is Docker?

A platform to build, ship, and run containerised applications. It packages code + dependencies into a portable unit that runs identically everywhere — your laptop, CI server, or cloud.

Write code
Dockerise
Build image
Push to registry
Run anywhere

Problem it solves

  • "Works on my machine" — version mismatches
  • OS-specific CLI errors
  • Manual dependency installation
  • Inconsistent dev/prod environments
  • Scaling & resource conflicts

Key benefits

  • Consistency — same env everywhere
  • Isolation — no dependency clashes
  • Scalability — spin up instances fast
  • Portability — runs on any OS
  • Speed — containers start in ms

Docker Image

A static, read-only snapshot. Blueprint used to create containers. Layered architecture — each instruction adds a layer.

Static · Immutable · Layered

Docker Container

A running instance of an image. Isolated, lightweight process with its own filesystem, networking, and process space.

Running · Isolated · Ephemeral

Docker Registry

Storage for images. Docker Hub is the default public registry. Private registries include AWS ECR, GCR, ACR.

Hub · ECR · GCR

Container vs VM

Feature    | Container          | Virtual Machine
Boot time  | Milliseconds       | Minutes
Size       | MBs                | GBs
OS         | Shares host kernel | Full OS per VM
Isolation  | Process-level      | Hardware-level
Overhead   | Very low           | High

Docker Architecture

Docker Engine — 3 Components

Docker Daemon (dockerd)
Background service on host. Manages images, containers, networks, volumes. Listens for API requests.
Docker CLI (docker)
User-facing command line tool. Sends commands to daemon via REST API.
REST API
Bridge between CLI and daemon. Enables programmatic control of Docker.

Docker Image Layers

Each Dockerfile instruction creates an immutable layer. Layers are cached and shared across images — huge efficiency gain.

FROM ubuntu:22.04          ← Layer 1 (base OS)
RUN apt-get install...     ← Layer 2 (dependencies)
COPY . /app                ← Layer 3 (app code)
CMD ["python","app.py"]    ← Layer 4 (config)
Layers are cached. If Layer 2 hasn't changed, Docker reuses it. Put rarely-changing instructions early in the Dockerfile to maximise cache hits.
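The caching advice looks like this in practice — a minimal sketch for a hypothetical Python app, where the dependency layer stays cached until requirements.txt itself changes:

```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Rarely changes → cached layer on most builds
COPY requirements.txt .
RUN pip install -r requirements.txt

# Changes on every commit → only these layers rebuild
COPY . .
CMD ["python", "app.py"]
```

Swapping the two COPY instructions would invalidate the pip layer on every code change.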

Image Lifecycle

Dockerfile
docker build
Image (local)
docker push
Registry
docker pull
docker run
Container

Image naming convention

username/image_name:tag
    ↑          ↑       ↑
DockerHub  repo name  version

Examples:
python:3.11-slim
nginx:latest
myuser/myapp:v2.1.0
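The convention can be checked mechanically — a minimal shell sketch (simplified: it ignores registry hosts that contain a colon, e.g. localhost:5000/app):

```shell
# Split an image reference into repository name and tag.
ref="myuser/myapp:v2.1.0"

name="${ref%:*}"    # strip the tag  → myuser/myapp
tag="${ref##*:}"    # strip the name → v2.1.0

# No ':' at all means Docker assumes the :latest tag
if [ "$name" = "$ref" ]; then
  tag="latest"
fi

echo "$name:$tag"   # → myuser/myapp:v2.1.0
```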

Dockerfile Reference

All instructions

FROM — Base image to build on. Must be first instruction. FROM python:3.11-slim
WORKDIR — Set working directory for subsequent instructions. Creates it if absent. WORKDIR /app
COPY — Copy files from host to image. COPY . /app or COPY requirements.txt .
ADD — Like COPY but also extracts tarballs and supports URLs. Prefer COPY unless you need these extras.
RUN — Execute a command during build (creates a new layer). RUN pip install -r requirements.txt
CMD — Default command when container starts. Can be overridden at runtime. CMD ["python","app.py"]
ENTRYPOINT — Like CMD but harder to override — always runs. Used with CMD for default args. ENTRYPOINT ["gunicorn"]
ENV — Set environment variables. ENV NODE_ENV=production
ARG — Build-time variable (not available at runtime). ARG VERSION=1.0
EXPOSE — Document which port the app uses. Doesn't actually publish it. EXPOSE 8080
VOLUME — Create a mount point for persistent data. VOLUME ["/data"]
LABEL — Add metadata to image. LABEL version="1.0" maintainer="you@email.com"
USER — Set user for subsequent instructions. Security best practice — don't run as root. USER appuser
HEALTHCHECK — Command Docker runs to check if container is healthy. HEALTHCHECK CMD curl -f http://localhost/ || exit 1
ONBUILD — Trigger instruction for child images built from this image.
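ONBUILD is easiest to see with a sketch of a hypothetical base image — the deferred steps run when a child image is built FROM it, not when the base itself is built:

```dockerfile
# Published as, say, myorg/python-base (hypothetical name)
FROM python:3.11-slim
WORKDIR /app
ONBUILD COPY requirements.txt .
ONBUILD RUN pip install -r requirements.txt

# A child Dockerfile then only needs:
#   FROM myorg/python-base
#   COPY . .
#   CMD ["python", "app.py"]
```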

CMD vs ENTRYPOINT

# CMD only — overrideable
CMD ["python", "app.py"]
# docker run img echo hi → runs "echo hi"

# ENTRYPOINT only — always runs
ENTRYPOINT ["python"]
# docker run img app.py → runs "python app.py"

# Both — ENTRYPOINT fixed, CMD = default args
ENTRYPOINT ["python"]
CMD ["app.py"]
# docker run img other.py → "python other.py"

Shell vs Exec form

# Shell form (runs in /bin/sh -c)
# Signals NOT forwarded to process
CMD python app.py

# Exec form (recommended)
# Signals ARE forwarded properly
CMD ["python", "app.py"]

# Use exec form for CMD and ENTRYPOINT
# Use shell form for RUN when piping

Production-ready Python example

# --- Stage 1: build ---
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt

# --- Stage 2: final image (smaller) ---
FROM python:3.11-slim
WORKDIR /app
RUN useradd --create-home appuser   # don't run as root
COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
COPY --chown=appuser:appuser . .
ENV PATH=/home/appuser/.local/bin:$PATH
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
EXPOSE 8000
USER appuser
# curl isn't installed in the slim image — use Python's stdlib for the check
HEALTHCHECK --interval=30s --timeout=5s \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1
CMD ["gunicorn","--bind","0.0.0.0:8000","app:app"]
Multi-stage builds drastically reduce final image size. Build tools and deps used only at build time stay in the builder stage.

Dockerfile best practices

  • Put rarely-changing instructions (base image, dependency install) first to maximise layer caching
  • Use slim or alpine base images and multi-stage builds to keep images small
  • Combine related RUN commands to reduce layer count
  • Add a .dockerignore to shrink the build context
  • Set a non-root USER — don't run as root
  • Prefer exec form for CMD and ENTRYPOINT
  • Never bake secrets in via ENV or ARG

Docker Commands

Images

List images
docker images
Pull image
docker pull <image>[:tag]
Build image
docker build -t <name>:<tag> .
Build (no cache)
docker build --no-cache -t <name> .
Delete image
docker rmi <image>
Remove unused
docker image prune
Inspect image
docker inspect <image>
Image history
docker history <image>
Save to file
docker save <image> > img.tar
Load from file
docker load < img.tar

Containers — lifecycle

Run (basic)
docker run <image>
Run detached
docker run -d <image>
Run interactive
docker run -it <image> /bin/bash
Run with name
docker run --name myapp <image>
Run with port
docker run -p 8080:80 <image>
Run with env var
docker run -e DB_URL=... <image>
Run with volume
docker run -v mydata:/app/data <image>
Auto-remove on stop
docker run --rm <image>
Start / stop
docker start|stop <container>
Restart
docker restart <container>
Pause / unpause
docker pause|unpause <container>
Delete container
docker rm <container>
Force delete
docker rm -f <container>

Containers — inspection & debug

List running
docker ps
List all
docker ps -a
View logs
docker logs <container>
Follow logs
docker logs -f <container>
Shell into running
docker exec -it <container> /bin/bash
Run command inside
docker exec <container> <cmd>
Inspect container
docker inspect <container>
Resource usage
docker stats
Running processes
docker top <container>
Copy file from container
docker cp <container>:/path ./local
Port mappings
docker port <container>
Diff from image
docker diff <container>

DockerHub & system

Login
docker login
Logout
docker logout
Push image
docker push <user>/<image>:<tag>
Search hub
docker search <term>
Tag image
docker tag <src> <user>/<image>:tag
System info
docker info
Version
docker version
Disk usage
docker system df
Remove everything unused
docker system prune -a

Docker Networking

Network drivers

Driver  | Description                                                              | Use case
bridge  | Default. Creates a virtual network; containers on a custom bridge reach each other by name. | Multi-container apps on single host
host    | Container shares host's network namespace. No isolation.                 | High-performance apps, avoid port mapping
none    | No networking. Container completely isolated.                            | Batch jobs, security-sensitive tasks
overlay | Spans multiple Docker hosts. Used with Docker Swarm.                     | Distributed/clustered applications
macvlan | Assigns a MAC address to the container — appears as a physical device.   | Legacy apps expecting direct network access

Port mapping

Syntax: -p host_port:container_port

# Map port 80 in container → 8080 on host
docker run -p 8080:80 nginx

# Bind to specific interface
docker run -p 127.0.0.1:8080:80 nginx

# Random host port
docker run -p 80 nginx

# Multiple ports
docker run -p 80:80 -p 443:443 nginx

# Expose all declared ports (random)
docker run -P nginx

Network commands

List networks
docker network ls
Create
docker network create <name>
Create bridge
docker network create --driver bridge mynet
Connect container
docker network connect <net> <container>
Disconnect
docker network disconnect <net> <container>
Inspect
docker network inspect <name>
Remove
docker network rm <name>
Remove unused
docker network prune

Container-to-container communication

# 1. Create a custom network
docker network create myapp-net

# 2. Run containers on the same network — reference by name!
#    (postgres refuses to start without a password)
docker run -d --name db --network myapp-net \
  -e POSTGRES_PASSWORD=pass postgres
docker run -d --name web --network myapp-net -p 80:80 nginx

# Inside the web container you can reach db at:
#   db:5432 (container name = DNS hostname)

# Default bridge network does NOT support name-based DNS
Always create a custom bridge network for multi-container apps. Container DNS by name only works on custom (non-default) bridge networks.

Volumes & Storage

Why volumes?

Container filesystems are ephemeral — data is lost when a container is removed. Volumes persist data outside the container lifecycle.

📦
Named Volume
Managed by Docker. Best for most cases.
📁
Bind Mount
Map host path. Good for dev with live reload.
💾
tmpfs Mount
In-memory only. Fast, non-persistent.

Named volumes

Create
docker volume create mydata
List
docker volume ls
Inspect
docker volume inspect mydata
Delete
docker volume rm mydata
Remove unused
docker volume prune
# Mount named volume (-v shorthand)
docker run -v mydata:/app/data <image>

# Mount named volume (--mount verbose form)
docker run --mount type=volume,src=mydata,dst=/app/data <image>

Bind mounts

# Mount current dir to /app — live code reload!
docker run -v $(pwd):/app <image>

# Read-only bind mount
docker run -v $(pwd)/config:/etc/config:ro <image>

# --mount form (explicit, more readable)
docker run --mount type=bind,src=$(pwd),dst=/app <image>
Bind mounts expose host filesystem to container. Use carefully — avoid mounting sensitive directories.

tmpfs (memory-only)

# Data lives in RAM only — gone when container stops
docker run --tmpfs /app/cache <image>

# Or with --mount
docker run --mount type=tmpfs,dst=/app/cache <image>

Use for sensitive data like passwords that shouldn't be written to disk, or as a high-speed scratch space.

Docker Compose

What is Compose?

Tool for defining and running multi-container applications. A single docker-compose.yml file defines all services, networks, and volumes.

Start all services
docker compose up
Start detached
docker compose up -d
Stop all
docker compose down
Stop + remove volumes
docker compose down -v
View logs
docker compose logs -f
List services
docker compose ps
Rebuild images
docker compose build
Run one-off command
docker compose run <service> <cmd>
Scale service
docker compose up --scale web=3
Exec into service
docker compose exec <service> bash

Complete docker-compose.yml example

version: '3.9'

services:
  web:
    build: .                     # build from local Dockerfile
    image: myapp:latest
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://db/mydb
      - REDIS_URL=redis://cache:6379
    volumes:
      - .:/app                   # bind mount for dev
    depends_on:
      db:
        condition: service_healthy   # wait for DB to be ready
      cache:
        condition: service_started
    restart: unless-stopped
    networks:
      - backend

  db:
    image: postgres:15
    environment:
      POSTGRES_DB: mydb
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    volumes:
      - pgdata:/var/lib/postgresql/data   # named volume
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 5s
      timeout: 5s
      retries: 5
    networks:
      - backend

  cache:
    image: redis:7-alpine
    networks:
      - backend

volumes:
  pgdata:        # declare named volume

networks:
  backend:
    driver: bridge

Compose — key fields reference

Field       | Purpose
build       | Path to Dockerfile context, or {context, dockerfile}
image       | Image name to use or tag as
ports       | Host:container port mapping
environment | Env vars as list or map
env_file    | Load env vars from file (e.g. .env)
volumes     | Mount volumes or bind mounts
depends_on  | Start order + optional healthcheck condition
networks    | Attach service to named networks
restart     | no / always / unless-stopped / on-failure
healthcheck | Check if service is ready
deploy      | Replicas, resource limits (Docker Swarm)
profiles    | Only start service when profile is active
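Two fields from the table that the full example doesn't use, env_file and profiles, sketched in a hypothetical fragment:

```yaml
services:
  web:
    image: myapp:latest
    env_file: .env              # loads KEY=value pairs from ./.env

  debug-tools:
    image: alpine
    profiles: ["debug"]         # skipped unless the profile is enabled
# Start the optional service with: docker compose --profile debug up
```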

Real-world Workflow

Complete dev-to-prod workflow

# 1. Write your app + Dockerfile

# 2. Build image locally
docker build -t myapp:v1.0 .

# 3. Test locally
docker run -d -p 8000:8000 --name test-app myapp:v1.0
docker logs test-app
docker exec -it test-app /bin/bash   # debug if needed

# 4. Tag for registry
docker tag myapp:v1.0 myusername/myapp:v1.0

# 5. Push to DockerHub
docker login
docker push myusername/myapp:v1.0

# 6. On the production server
docker pull myusername/myapp:v1.0
docker run -d -p 80:8000 --name prod-app myusername/myapp:v1.0

# 7. Update (simple pattern — brief downtime while swapping)
docker pull myusername/myapp:v1.1
docker stop prod-app && docker rm prod-app
docker run -d -p 80:8000 --name prod-app myusername/myapp:v1.1

.dockerignore

Prevents files from being sent to build context — keeps images small and fast.

.git
.gitignore
.env
node_modules
__pycache__
*.pyc
.pytest_cache
.coverage
dist
build
*.log
README.md
docker-compose*.yml

Restart policies

Policy         | Behaviour
no             | Never restart (default)
always         | Always restart; even a manually stopped container comes back when the daemon restarts
unless-stopped | Restart unless explicitly stopped
on-failure     | Restart only on non-zero exit code
docker run --restart unless-stopped nginx

Environment variables & secrets

# Pass single var
docker run -e DB_PASS=secret <image>

# Load from .env file
docker run --env-file .env <image>

# .env file format (no quotes needed)
DB_HOST=localhost
DB_PORT=5432
DB_NAME=myapp

# Compose automatically reads .env from the project dir for variable
# substitution; use env_file: to inject vars into a service
# Never commit .env to git — add to .gitignore!
Never bake secrets into images via ARG or ENV in Dockerfile — they're visible in docker history. Use Docker Secrets (Swarm) or an external vault like HashiCorp Vault.
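With Compose (or Swarm), a file-based secret is mounted at /run/secrets/<name> instead of living in the environment — a minimal sketch, assuming the password sits in a local db_password.txt:

```yaml
services:
  web:
    image: myapp:latest
    secrets:
      - db_password         # appears as /run/secrets/db_password in the container
secrets:
  db_password:
    file: ./db_password.txt # keep this file out of git
```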

Advanced Topics

Multi-stage builds

Dramatically shrink production image size by discarding build tools.

# Node.js example
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build            # produces /app/dist

FROM nginx:alpine            # tiny final image (~23MB)
COPY --from=build /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx","-g","daemon off;"]

Resource limits

# Limit CPU and memory
docker run --cpus="1.5" --memory="512m" <image>

# Memory + memory-swap (swap = total, not extra)
docker run --memory="512m" --memory-swap="1g" <image>

# In docker-compose.yml (compose v3 with deploy):
services:
  web:
    deploy:
      resources:
        limits:
          cpus: '1.5'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 128M

Container health checks

# In Dockerfile
HEALTHCHECK --interval=30s \
            --timeout=10s \
            --start-period=5s \
            --retries=3 \
  CMD curl -f http://localhost/health || exit 1

# docker ps shows:
# STATUS: Up (healthy) / (unhealthy) / (starting)

Docker context

# Manage remote Docker hosts
docker context create prod \
  --docker "host=ssh://user@server"

docker context use prod
docker ps                    # ← runs on remote server!

docker context use default   # back to local
docker context ls

Debugging patterns

# Container exited immediately — check logs
docker logs <container>

# Container won't start — run interactively to debug
docker run -it --entrypoint /bin/bash <image>

# Debug a running container's network
docker exec -it <container> curl http://other-service

# Inspect full container config (env vars, mounts, etc.)
docker inspect <container> | grep -A 20 '"Env"'

# Monitor resource usage live
docker stats <container>

# See what changed from base image
docker diff <container>

# Copy files out of a stopped container
docker cp <container>:/var/log/app.log ./logs/

Security hardening checklist

  • Run as a non-root user (USER in the Dockerfile)
  • Use minimal, pinned base images — avoid :latest in production
  • Scan images for vulnerabilities (e.g. docker scout, trivy)
  • Never bake secrets into images; use a secrets manager
  • Drop unneeded kernel capabilities (--cap-drop)
  • Run with a read-only root filesystem (--read-only) where possible
  • Limit resources (--memory, --cpus) to contain runaway processes
  • Keep Docker Engine and base images up to date

Use cases where Docker shines

Microservices

Each service in its own container — independent deployment and scaling.

CI/CD pipelines

Identical environments from dev → test → prod. No "it works on CI" issues.

Local dev

Run Postgres, Redis, Kafka locally in seconds without installing anything.

ML/AI workloads

Package models with all CUDA deps. Reproducible experiments.

Cloud migration

Containerise legacy apps once — run on AWS, GCP, or Azure unchanged.

Testing & QA

Spin up fresh environments per test run. Destroy after. No state leakage.