Docker: Complete Containerization Guide
Master Docker from basics to advanced concepts including multi-stage builds, orchestration, and production deployment. Learn containerization best practices for modern applications.
Docker has revolutionized how we develop, package, and deploy applications. This comprehensive guide covers everything from Docker basics to advanced production deployment strategies, helping you master containerization for modern applications.
What is Docker?
Understanding Containers
Docker is a containerization platform that allows you to package applications and their dependencies into lightweight, portable containers. Unlike virtual machines, containers share the host OS kernel, making them more efficient and faster to start.
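One way to observe the shared kernel is to compare the kernel release reported on the host with the one reported inside a container. The sketch below assumes a Linux host with the docker CLI on PATH; `alpine:3.18` is used only as a small throwaway image, and the function name `compare_kernels` is illustrative. On Docker Desktop for macOS or Windows, containers run inside a Linux VM, so the two values will generally differ.

```shell
# Print the kernel release seen by the host and by a throwaway container.
# Assumes the docker CLI is on PATH and alpine:3.18 can be pulled.
compare_kernels() {
    host_kernel=$(uname -r)
    container_kernel=$(docker run --rm alpine:3.18 uname -r)
    echo "host:      $host_kernel"
    echo "container: $container_kernel"
    if [ "$host_kernel" = "$container_kernel" ]; then
        echo "same kernel: the container runs on the host's kernel"
    else
        echo "different kernel: Docker is probably running inside a Linux VM"
    fi
}

# Usage:
#   compare_kernels
```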
Key Benefits
- Consistency: Same environment across development, testing, and production
- Portability: Run anywhere Docker is installed
- Efficiency: Lightweight compared to virtual machines
- Scalability: Easy to scale applications horizontally
- Isolation: Applications run in isolated environments
- Version Control: Dockerfiles and image tags can be versioned, so changes to the application environment are tracked alongside the code
Docker vs Virtual Machines
| Feature | Docker Containers | Virtual Machines |
|---|---|---|
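| Resource Usage | Lightweight, shares the host OS kernel | Heavy, full guest OS per VM |
| Startup Time | Typically seconds or less | Typically tens of seconds to minutes |
| Isolation | Process-level (kernel namespaces and cgroups) | Hardware-level (hypervisor) |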
| Portability | High | Medium |
| Overhead | Minimal | Significant |
Docker Fundamentals
Core Concepts
Images
Docker images are read-only templates that define how to create containers. They include:
- Application code
- Runtime environment
- System libraries
- Dependencies
- Configuration files
Containers
Containers are runnable instances of Docker images; a container can be created, started, stopped, and removed independently of its image. They include:
- The application
- Runtime environment
- Isolated filesystem
- Network interface
- Process space
Dockerfile
A Dockerfile is a text file containing the instructions used to build a Docker image:
# Use official Node.js runtime
FROM node:18-alpine
# Set working directory
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install production dependencies only (--omit=dev replaces the deprecated --only=production)
RUN npm ci --omit=dev
# Copy application code
COPY . .
# Expose port
EXPOSE 3000
# Start application
CMD ["npm", "start"]
Basic Docker Commands
Image Management
# List images
docker images
# Pull image from registry
docker pull nginx:latest
# Remove image
docker rmi nginx:latest
# Build image from Dockerfile
docker build -t myapp:latest .
# Tag image
docker tag myapp:latest myapp:v1.0
Container Management
# Run container
docker run -d -p 8080:80 --name myapp nginx
# List running containers
docker ps
# List all containers
docker ps -a
# Stop container
docker stop myapp
# Start container
docker start myapp
# Remove container
docker rm myapp
# Execute command in running container
docker exec -it myapp bash
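Re-running `docker run --name myapp` fails with a name conflict while a container of that name still exists, even a stopped one. A minimal sketch of a cleanup helper follows; the function name `remove_if_exists` is an assumption for illustration, and the docker CLI is assumed to be on PATH.

```shell
# Remove a container by name if it exists, so a later `docker run --name`
# does not fail with a name conflict.
remove_if_exists() {
    name=$1
    if docker container inspect "$name" >/dev/null 2>&1; then
        # -f stops the container first if it is still running
        docker rm -f "$name" >/dev/null
        echo "removed $name"
    else
        echo "no container named $name"
    fi
}

# Usage:
#   remove_if_exists myapp
#   docker run -d -p 8080:80 --name myapp nginx
```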
Dockerfile Best Practices
Multi-Stage Builds
Single-Stage Build (Inefficient)
FROM node:18-alpine
WORKDIR /app
# Install all dependencies including dev dependencies
COPY package*.json ./
RUN npm install
# Copy source code
COPY . .
# Build application
RUN npm run build
# Start application
CMD ["npm", "start"]
Multi-Stage Build (Efficient)
# Build stage
FROM node:18-alpine AS builder
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install all dependencies
RUN npm ci
# Copy source code
COPY . .
# Build application
RUN npm run build
# Production stage
FROM node:18-alpine AS production
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install only production dependencies
RUN npm ci --omit=dev
# Copy built application from builder stage
COPY --from=builder /app/dist ./dist
# Create a non-root user and group, then hand over ownership of /app
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001 -G nodejs && \
    chown -R nextjs:nodejs /app
USER nextjs
# Expose port
EXPOSE 3000
# Start application
CMD ["npm", "start"]
Security Best Practices
Use Specific Base Images
# Bad: Using latest tag
FROM node:latest
# Good: Using specific version
FROM node:18.17.0-alpine
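A version tag can still be re-pushed with different content, so some teams go one step further and pin the base image by digest. The sketch below looks up the digest of an image that has already been pulled; the helper name `image_digest` is hypothetical, and the lookup is expected to fail for images that were only built locally, since they have no registry digest.

```shell
# Print the registry digest reference for a locally pulled image tag.
# Assumes the docker CLI is on PATH and the image was pulled from a registry.
image_digest() {
    image=$1
    docker image inspect -f '{{index .RepoDigests 0}}' "$image"
}

# Usage:
#   docker pull node:18.17.0-alpine
#   image_digest node:18.17.0-alpine
# The printed reference has the form node@sha256:<digest> and can be used
# in a FROM line to pin the base image to exact content.
```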
Create Non-Root User
# Create user and group
RUN addgroup -g 1001 -S appgroup
RUN adduser -S appuser -u 1001 -G appgroup
# Change ownership
RUN chown -R appuser:appgroup /app
# Switch to non-root user
USER appuser
Minimize Attack Surface
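# Use minimal base images
FROM alpine:3.18
# Install only what is needed at runtime; --no-cache leaves no apk index behind.
# Packages deleted in a later RUN still occupy space in the earlier layer,
# so build-only tools such as npm are left out rather than removed afterwards.
RUN apk add --no-cache nodejs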
Layer Optimization
Combine RUN Commands
# Bad: Multiple layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN apt-get clean
# Good: Single layer
RUN apt-get update && \
    apt-get install -y curl git && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
Use .dockerignore
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.nyc_output
coverage
.vscode
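A missing `.dockerignore` entry can send secrets or a large `node_modules` directory into the build context. The following sketch checks that a few entries are present; the required list (`node_modules`, `.git`, `.env`) is an assumption taken from the example above, and `check_dockerignore` is an illustrative name.

```shell
# Report required entries that are missing from a .dockerignore file.
# Returns 0 when all entries are present, 1 otherwise.
check_dockerignore() {
    file=${1:-.dockerignore}
    missing=0
    for entry in node_modules .git .env; do
        # -x matches whole lines only, -F treats the entry as a literal string
        if ! grep -qxF "$entry" "$file" 2>/dev/null; then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_dockerignore .dockerignore && docker build -t myapp:latest .
```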
Docker Compose
Basic Compose File
docker-compose.yml
version: '3.8'
services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
    depends_on:
      - db
      - redis
  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
volumes:
  postgres_data:
Advanced Compose Features
Environment Variables
version: '3.8'
services:
  web:
    build: .
    environment:
      - NODE_ENV=${NODE_ENV:-development}
      - DATABASE_URL=${DATABASE_URL}
    env_file:
      - .env
      - .env.local
Health Checks
version: '3.8'
services:
  web:
    build: .
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
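A health check is mostly useful when something waits on it, for example a deploy script that should not continue until the service is ready. The sketch below polls the status Docker records for a container; the function name `wait_healthy`, the 2-second poll interval, and the default 60-second timeout are assumptions, not part of the Compose file above.

```shell
# Poll a container's health status until it reports "healthy" or the
# timeout (in seconds) expires. Assumes the docker CLI is on PATH and the
# container defines a health check.
wait_healthy() {
    name=$1
    timeout=${2:-60}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Errors (container missing, no health check) count as "not healthy yet"
        status=$(docker inspect -f '{{.State.Health.Status}}' "$name" 2>/dev/null || true)
        if [ "$status" = "healthy" ]; then
            echo "$name is healthy after ${elapsed}s"
            return 0
        fi
        sleep 2
        elapsed=$((elapsed + 2))
    done
    echo "$name did not become healthy within ${timeout}s" >&2
    return 1
}

# Usage with Compose, where `docker compose ps -q web` prints the container ID:
#   wait_healthy "$(docker compose ps -q web)" 120
```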
Networks
version: '3.8'
services:
  web:
    build: .
    networks:
      - frontend
      - backend
  db:
    image: postgres:15-alpine
    networks:
      - backend
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
Production Deployment
Container Orchestration
Docker Swarm
# Initialize swarm
docker swarm init
# Create service
docker service create \
  --name web \
  --replicas 3 \
  --publish 80:3000 \
  myapp:latest
# Scale service
docker service scale web=5
# Update service
docker service update --image myapp:v2.0 web
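By default a service update replaces tasks according to the service's stored update settings. A hedged sketch of a more cautious rollout is shown below: one task at a time, a pause between tasks, and an automatic rollback if the update fails. The wrapper name `rolling_update` and the 10-second delay are assumptions chosen for illustration.

```shell
# Update a Swarm service one task at a time and let Swarm roll back
# automatically if the update fails. Assumes the docker CLI is on PATH
# and the node is a Swarm manager.
rolling_update() {
    service=$1
    image=$2
    docker service update \
        --update-parallelism 1 \
        --update-delay 10s \
        --update-failure-action rollback \
        --image "$image" \
        "$service"
}

# Usage:
#   rolling_update web myapp:v2.0
```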
Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
Monitoring and Logging
Health Checks
# Add health check to Dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
Logging Configuration
version: '3.8'
services:
  web:
    build: .
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
Security Considerations
Image Scanning
# Scan image for vulnerabilities (Docker Scout replaces the retired `docker scan` command)
docker scout cves myapp:latest
# Use Trivy for security scanning
trivy image myapp:latest
Secrets Management
version: '3.8'
services:
  web:
    build: .
    secrets:
      - db_password
    environment:
      - DB_PASSWORD_FILE=/run/secrets/db_password
secrets:
  db_password:
    file: ./secrets/db_password.txt
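The Compose file above only mounts the secret and passes its path in `DB_PASSWORD_FILE`; something inside the container still has to read that file. One common pattern is an entrypoint script that resolves `*_FILE` variables before starting the application. The helper below is a minimal sketch of that idea; the name `load_secret` is hypothetical, and it assumes the variable name passed in is trusted input.

```shell
# Populate VAR from the file named in VAR_FILE, if that file is readable.
# Example: DB_PASSWORD_FILE=/run/secrets/db_password sets DB_PASSWORD.
load_secret() {
    var=$1
    file_var="${var}_FILE"
    # Indirect lookup of the *_FILE variable; empty if it is unset
    eval "file_path=\${$file_var:-}"
    if [ -n "$file_path" ] && [ -r "$file_path" ]; then
        value=$(cat "$file_path")
        export "$var=$value"
    else
        echo "no readable secret file for $var" >&2
        return 1
    fi
}

# Usage in an entrypoint script:
#   load_secret DB_PASSWORD
#   exec npm start
```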
Advanced Docker Techniques
Custom Networks
Create Custom Network
# Create network
docker network create mynetwork
# Run containers on custom network
docker run -d --name web --network mynetwork nginx
docker run -d --name db --network mynetwork postgres
Network Configuration
version: '3.8'
services:
  web:
    build: .
    networks:
      - frontend
  api:
    build: ./api
    networks:
      - frontend
      - backend
  db:
    image: postgres:15-alpine
    networks:
      - backend
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true
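With `internal: true`, containers attached only to `backend` have no route to the outside world. A quick way to confirm that the setting was applied is to inspect the network after it has been created. The helper name `network_is_internal` is an assumption, and note that Compose usually prefixes network names with the project name (for example `myproject_backend`, a hypothetical name).

```shell
# Succeed if the named Docker network is marked as internal.
# Assumes the docker CLI is on PATH and the network exists.
network_is_internal() {
    [ "$(docker network inspect -f '{{.Internal}}' "$1")" = "true" ]
}

# Usage:
#   docker network ls          # find the actual network name
#   network_is_internal myproject_backend && echo "backend is isolated"
```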
Volume Management
Named Volumes
version: '3.8'
services:
  db:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
volumes:
  postgres_data:
    driver: local
Bind Mounts
version: '3.8'
services:
  web:
    build: .
    volumes:
      - ./src:/app/src
      - ./logs:/app/logs
Development Workflow
Development Compose
version: '3.8'
services:
  web:
    build:
      context: .
      target: development
    volumes:
      - .:/app
      - /app/node_modules
    environment:
      - NODE_ENV=development
    command: npm run dev
  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=myapp_dev
    volumes:
      - postgres_dev_data:/var/lib/postgresql/data
volumes:
  postgres_dev_data:
Production Compose
version: '3.8'
services:
  web:
    build:
      context: .
      target: production
    environment:
      - NODE_ENV=production
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - web
Performance Optimization
Image Size Optimization
Use Alpine Images
# Use Alpine Linux for smaller images
FROM node:18-alpine
# Install only necessary packages
RUN apk add --no-cache curl
Multi-Stage Builds
# Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM node:18-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
CMD ["npm", "start"]
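To check whether a multi-stage build actually pays off, it helps to measure the resulting images instead of assuming. The sketch below reads the size Docker records for an image and prints it in MiB; the function name `image_size_mb` and the tags `myapp:single` and `myapp:multi` are hypothetical and stand for a single-stage and a multi-stage build of the same application.

```shell
# Print the size of a local image in MiB (integer division).
# Assumes the docker CLI is on PATH and the image exists locally.
image_size_mb() {
    bytes=$(docker image inspect -f '{{.Size}}' "$1") || return 1
    echo $((bytes / 1024 / 1024))
}

# Usage:
#   echo "single-stage: $(image_size_mb myapp:single) MiB"
#   echo "multi-stage:  $(image_size_mb myapp:multi) MiB"
```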
Resource Limits
Memory and CPU Limits
version: '3.8'
services:
  web:
    build: .
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'
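Whether limits declared this way are enforced depends on the Compose version and deployment mode, so it is worth verifying them on a running container. The sketch below reads the limits Docker recorded for a container; the function name `show_limits` is an assumption, and a value of 0 is taken to mean that no limit was applied.

```shell
# Print the memory and CPU limits recorded for a container.
# Assumes the docker CLI is on PATH and the container exists.
show_limits() {
    name=$1
    mem_bytes=$(docker inspect -f '{{.HostConfig.Memory}}' "$name") || return 1
    nano_cpus=$(docker inspect -f '{{.HostConfig.NanoCpus}}' "$name") || return 1
    echo "memory: $((mem_bytes / 1024 / 1024)) MiB"
    # NanoCpus is in billionths of a CPU; print it as a percentage of one CPU
    echo "cpu: $((nano_cpus / 10000000))% of one CPU"
}

# Usage:
#   show_limits "$(docker compose ps -q web)"
```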
Troubleshooting
Common Issues
Container Won't Start
# Check container logs
docker logs container_name
# Check container status
docker ps -a
# Inspect container
docker inspect container_name
Performance Issues
# Monitor container resources
docker stats
# Check container processes
docker top container_name
# Check disk usage inside the container
docker exec -it container_name df -h
Network Issues
# List networks
docker network ls
# Inspect network
docker network inspect network_name
# Test connectivity
docker exec -it container_name ping other_container
Debugging Techniques
Interactive Debugging
# Run container with shell
docker run -it --rm myapp:latest sh
# Execute shell in running container
docker exec -it container_name sh
# Copy files from container
docker cp container_name:/app/logs ./logs
Log Analysis
# Follow logs in real-time
docker logs -f container_name
# Filter logs by timestamp
docker logs --since="2024-01-01T00:00:00" container_name
# Save logs to file (2>&1 also captures the container's stderr stream)
docker logs container_name > app.log 2>&1
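When scanning logs for problems, a count of matching lines is often a quicker first signal than reading the full output. The helper below is a small sketch; the name `count_log_matches` is illustrative. It merges stderr into the pipe because `docker logs` replays the container's stderr stream on its own stderr.

```shell
# Count log lines of a container that match a pattern.
# Assumes the docker CLI is on PATH. grep -c prints 0 and exits with
# status 1 when nothing matches.
count_log_matches() {
    name=$1
    pattern=$2
    docker logs "$name" 2>&1 | grep -c -- "$pattern"
}

# Usage:
#   count_log_matches container_name ERROR
```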
Conclusion
Docker has become an essential tool for modern application development and deployment. By mastering Docker fundamentals, best practices, and advanced techniques, you can create efficient, scalable, and maintainable containerized applications.
The key to successful Docker implementation is understanding the containerization concepts, following security best practices, and optimizing for your specific use case. With proper Docker knowledge, you can streamline your development workflow and deploy applications with confidence across any environment.