| name | description |
|---|---|
| container-infrastructure-ops | Maintains, troubleshoots, and optimizes containerized infrastructure using Docker, Docker Compose, Kubernetes, Helm, and CI/CD pipelines. Enables system stability, security, reproducibility, and clear technical execution. Use for deployment operations, container management, networking, storage, secrets management, monitoring, and infrastructure troubleshooting. |
Container Infrastructure Operations
Comprehensive skill for managing and maintaining software stacks hosted on containerized infrastructure.
Core Capabilities
- Docker & Docker Compose: Service orchestration, container lifecycle, volume management, networking
- Kubernetes & Helm: Cluster operations, deployment manifests, package management, upgrades
- CI/CD Pipelines: GitLab CI, GitHub Actions, and runner configuration
- Networking & Routing: Reverse proxies (Traefik, nginx), TLS/HTTPS, service discovery
- Storage & Data: Volume mounting, backup/restore, database operations, data persistence
- Security: Secrets management, access control, network policies, RBAC
- Monitoring & Logging: Health checks, log aggregation, observability
- Troubleshooting: Container debugging, resource issues, log analysis, dependency resolution
Operational Workflows
1. Service Startup & Deployment
Docker Compose:
# Start all services
docker compose up -d
# Start specific service
docker compose up -d [service_name]
# Build and start
docker compose up --build -d
# With environment file
docker compose --env-file .env up -d
Kubernetes:
# Apply manifest
kubectl apply -f deployment.yaml
# Rolling update
kubectl set image deployment/[name] [container]=[image]:[tag]
# Check rollout status
kubectl rollout status deployment/[name]
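A rolling update is often paired with an automatic revert when the rollout does not become ready. The following is a minimal sketch of that pattern, not a definitive procedure; the function name `deploy_image` and the default timeout of 120s are assumptions made for illustration.

```shell
# Hypothetical helper: update a Deployment's image and revert the rollout
# if it does not become ready within the timeout.
deploy_image() {
  local name="$1"
  local container="$2"
  local image="$3"
  local timeout="${4:-120s}"   # assumed default, adjust per workload

  kubectl set image "deployment/$name" "$container=$image" || return 1

  if kubectl rollout status "deployment/$name" --timeout="$timeout"; then
    echo "rollout of $name succeeded"
  else
    echo "rollout of $name failed; reverting" >&2
    kubectl rollout undo "deployment/$name"
    return 1
  fi
}
```

Example call: `deploy_image web app registry.example.com/web:1.4.2 180s` (all values are placeholders).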
Helm:
# Install release
helm install [release-name] [chart] -f values.yaml
# Upgrade existing release
helm upgrade [release-name] [chart] -f values.yaml
# Rollback to previous version
helm rollback [release-name] [revision]
2. Service Inspection & Monitoring
Docker Compose:
# View running services
docker compose ps
# View logs (follow)
docker compose logs -f [service_name]
# View logs with time range
docker compose logs --since 10m [service_name]
# Inspect container stats
docker stats [container_id]
Kubernetes:
# List resources
kubectl get pods -n [namespace]
kubectl get svc -n [namespace]
# Describe resource (detailed info)
kubectl describe pod [pod_name] -n [namespace]
# View logs
kubectl logs [pod_name] -n [namespace]
kubectl logs -f [pod_name] -n [namespace] # Follow
# Watch resources in real-time
kubectl get pods -w -n [namespace]
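When a deployment step has to wait until a container is actually ready, polling the container's health status can replace manual inspection. This is a sketch under assumptions: the helper name `wait_healthy` and its default attempt count and delay are hypothetical, and the container must define a health check (a `HEALTHCHECK` in the image or a `healthcheck:` entry in the Compose file).

```shell
# Hypothetical helper: poll a container's health status until it reports
# "healthy" or the attempt budget is used up.
wait_healthy() {
  local container="$1"
  local attempts="${2:-30}"   # assumed default
  local delay="${3:-2}"       # seconds between checks, assumed default
  local status=""
  local i=0

  while [ "$i" -lt "$attempts" ]; do
    # Prints e.g. "starting", "healthy", or "unhealthy"; errors are ignored
    # so that a container without a health check simply never becomes healthy.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done

  echo "$container not healthy after $attempts checks (last status: ${status:-unknown})" >&2
  return 1
}
```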
3. Environment & Configuration Management
Load environment variables:
# From .env file
set -a
source .env
set +a
# Apply to a single command (skips comment lines; values must not contain spaces)
env $(grep -v '^#' .env | xargs) docker compose up -d
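The `set -a` / `source` pattern can be wrapped in a small function that fails clearly when the file is missing. This is only a sketch; the name `load_env_file` is hypothetical, and the file is assumed to contain plain `KEY=value` lines in shell syntax.

```shell
# Hypothetical helper: export every variable defined in an env file.
load_env_file() {
  local file="${1:-.env}"

  if [ ! -f "$file" ]; then
    echo "env file not found: $file" >&2
    return 1
  fi

  # A bare name such as ".env" would be looked up on PATH by POSIX shells,
  # so make the path explicit.
  case "$file" in
    */*) ;;
    *) file="./$file" ;;
  esac

  set -a        # export everything assigned while sourcing
  . "$file"
  set +a
}
```

Example call: `load_env_file .env && docker compose up -d`.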
Manage secrets:
# Docker Swarm (Compose without Swarm declares file-based secrets under the top-level secrets: key)
docker secret create [name] /path/to/secret
# Kubernetes
kubectl create secret generic [name] --from-file=key=/path/to/secret
kubectl create secret docker-registry [name] --docker-server=[url] --docker-username=[user] --docker-password=[password]
4. Troubleshooting Workflow
Container health check:
- Verify container is running: `docker compose ps` or `kubectl get pods`
- Check logs: `docker compose logs [service]` or `kubectl logs [pod]`
- Inspect configuration: Check environment variables, mounted volumes, network connectivity
- Test connectivity: `docker exec [container] curl [service]` or `kubectl exec [pod] -- curl [service]`
- Resource analysis: `docker stats` or `kubectl top pods`
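The checklist above can be run in one pass for a Compose service. The sketch below assumes a hypothetical helper name, `triage_service`, and an arbitrary log tail of 50 lines; it only collects evidence and changes nothing.

```shell
# Hypothetical helper: gather status, recent logs, environment, and resource
# usage for one Compose service, in the order of the checklist.
triage_service() {
  local service="$1"

  echo "== status =="
  docker compose ps "$service"

  echo "== recent logs =="
  docker compose logs --tail 50 "$service"

  echo "== environment =="
  # -T disables TTY allocation so the output can be piped or saved.
  docker compose exec -T "$service" env

  echo "== resources =="
  docker stats --no-stream
}
```

Example call: `triage_service [service_name] > triage.txt 2>&1` keeps the findings for later reference.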
Network troubleshooting:
# Docker Compose
docker network ls
docker network inspect [network_name]
# Kubernetes
kubectl get networkpolicies -n [namespace]
kubectl describe networkpolicy [name] -n [namespace]
Volume & storage issues:
# Docker Compose
docker volume ls
docker volume inspect [volume_name]
# Kubernetes
kubectl get pv
kubectl get pvc -n [namespace]
kubectl describe pvc [name] -n [namespace]
5. Backup & Restore Operations
Docker Compose volumes:
# Backup volume
docker run --rm -v [volume]:/data -v "$(pwd)":/backup busybox tar czf /backup/backup.tar.gz -C /data .
# Restore volume (stop services that use the volume first)
docker run --rm -v [volume]:/data -v "$(pwd)":/backup busybox tar xzf /backup/backup.tar.gz -C /data
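Repeated backups overwrite each other when they share the name `backup.tar.gz`. One possible variation, sketched here with a hypothetical function name `backup_volume`, writes a timestamped archive and mounts the source volume read-only.

```shell
# Hypothetical helper: archive a named volume into a timestamped tarball
# and print the resulting path.
backup_volume() {
  local volume="$1"
  local dest="${2:-$(pwd)}"   # destination directory on the host
  local archive="${volume}-$(date +%Y%m%d-%H%M%S).tar.gz"

  docker run --rm \
    -v "${volume}:/data:ro" \
    -v "${dest}:/backup" \
    busybox tar czf "/backup/${archive}" -C /data . || return 1

  echo "${dest}/${archive}"
}
```

Example call: `backup_volume [volume] /srv/backups` (the destination path is a placeholder).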
Database backup within containers:
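# PostgreSQL (-T disables TTY allocation so the redirected dump stays clean)
docker compose exec -T [postgres_service] pg_dump -U [user] [db] > backup.sql
# MySQL/MariaDB (a bare -p prompts interactively; pass the password non-interactively when redirecting)
docker compose exec -T [mysql_service] mysqldump -u [user] -p[password] [db] > backup.sql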
6. Security & Access Control
Docker security best practices:
- Use read-only root filesystem: `read_only: true`
- Drop unnecessary capabilities: `cap_drop: [ALL]`
- Run as non-root user: `user: "1000:1000"`
- Use secrets for sensitive data (not environment variables)
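The four practices above can be combined in a Compose override file. The sketch below writes one from the shell; the service name `app`, the secret name `db_password`, its file path, and the function name are all placeholders, and the `tmpfs` entry is an added assumption (many applications need a writable `/tmp` once the root filesystem is read-only).

```shell
# Hypothetical helper: write a hardening override for a Compose service.
write_hardening_override() {
  local out="${1:-docker-compose.hardening.yaml}"

  cat > "$out" <<'EOF'
# "app" and "db_password" are placeholder names.
services:
  app:
    read_only: true
    cap_drop:
      - ALL
    user: "1000:1000"
    tmpfs:
      - /tmp
    secrets:
      - db_password
secrets:
  db_password:
    file: ./secrets/db_password.txt
EOF
}
```

The override would then be layered on the main file: `docker compose -f docker-compose.yaml -f docker-compose.hardening.yaml up -d`.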
Kubernetes RBAC:
# Create service account
kubectl create serviceaccount [name] -n [namespace]
# Bind role to account
kubectl create rolebinding [binding-name] --clusterrole=[role] --serviceaccount=[namespace]:[account] -n [namespace]
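After creating a binding, it helps to confirm what the service account can actually do. A minimal sketch using `kubectl auth can-i` with impersonation follows; the function name `sa_can` is hypothetical, and the caller needs permission to impersonate service accounts.

```shell
# Hypothetical helper: ask the API server whether a service account may
# perform a verb on a resource in its namespace. kubectl prints "yes" or "no".
sa_can() {
  local namespace="$1"
  local account="$2"
  local verb="$3"
  local resource="$4"

  kubectl auth can-i "$verb" "$resource" \
    --as="system:serviceaccount:${namespace}:${account}" \
    --namespace "$namespace"
}
```

Example call: `sa_can [namespace] [account] list pods`.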
Debugging Strategies
Container execution:
# Docker Compose
docker compose exec [service] /bin/bash # Interactive shell
docker compose exec [service] ps aux # List processes
docker compose exec [service] env # View environment
# Kubernetes
kubectl exec -it [pod] -- /bin/bash
kubectl exec [pod] -- ps aux
Log analysis:
- Check application logs: `docker logs` or `kubectl logs`
- Check container startup logs: Look for early exit, missing dependencies, config errors
- Cross-reference with timestamps to correlate events across services
Resource constraints:
# Docker
docker inspect [container] | grep -A 10 Memory
# Kubernetes
kubectl top nodes
kubectl top pods -n [namespace]
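Instead of grepping the full `docker inspect` output, the memory limit can be read directly with a format template. The sketch below assumes two hypothetical helper names; `HostConfig.Memory` is reported in bytes, with 0 meaning no limit is set.

```shell
# Convert a byte count to whole mebibytes.
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Hypothetical helper: print a container's configured memory limit.
container_memory_limit() {
  local bytes
  bytes=$(docker inspect --format '{{.HostConfig.Memory}}' "$1") || return 1

  if [ "$bytes" -eq 0 ]; then
    echo "unlimited"
  else
    echo "$(bytes_to_mib "$bytes") MiB"
  fi
}
```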
Configuration Best Practices
- Immutable infrastructure: Rebuild containers rather than modifying running instances
- Health checks: Define liveness and readiness probes
- Resource limits: Set CPU/memory requests and limits to prevent resource contention
- Rolling updates: Use rolling deployment strategies to maintain availability
- Secrets separation: Store secrets outside version control (use `.env`, K8s secrets, or secret managers)
- Logging: Aggregate logs centrally; avoid storing logs in containers
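To make the health check, resource limit, and rolling update items concrete, the sketch below writes an example Deployment manifest. Everything specific in it is an assumption for illustration: the name `example-app`, the image, port 8080, the `/healthz` path, and the request/limit values would all need to match the real workload.

```shell
# Hypothetical helper: write an example Deployment with probes, resource
# requests/limits, and a rolling update strategy.
write_example_deployment() {
  local out="${1:-deployment.yaml}"

  cat > "$out" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: example/app:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
EOF
}
```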
Common Error Patterns
| Issue | Symptom | Troubleshooting |
|---|---|---|
| Port conflict | `bind: address already in use` | Check existing process: `lsof -i :[port]`; Kill if needed |
| Missing dependency | Service fails to start | Check logs for missing service/network; Verify service startup order |
| Resource exhaustion | Slow/hanging containers | Check CPU/memory usage; Increase limits; Reduce replica count |
| Networking | Services can't communicate | Verify network name; Check firewall rules; Test DNS resolution |
| Volume mount | Permission denied in container | Verify mount path exists; Check file permissions; Confirm user ID |
| Config error | Parse/validation error at startup | Validate YAML syntax; Check environment variable substitution |
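Config errors in particular can often be caught before anything is started. The sketch below uses `docker compose config --quiet` and `kubectl apply --dry-run=client`; the function names are hypothetical, and depending on kubectl version and flags the dry run may still need access to a cluster for schema validation.

```shell
# Hypothetical helper: validate a Compose file (syntax and variable
# substitution) without starting any service.
validate_compose() {
  local file="${1:-docker-compose.yaml}"

  if docker compose -f "$file" config --quiet; then
    echo "OK: $file"
  else
    echo "INVALID: $file" >&2
    return 1
  fi
}

# Hypothetical helper: validate a Kubernetes manifest without applying it.
validate_manifest() {
  local file="$1"

  if kubectl apply --dry-run=client -f "$file" >/dev/null; then
    echo "OK: $file"
  else
    echo "INVALID: $file" >&2
    return 1
  fi
}
```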
File Structure Reference
Docker Compose project:
project/
├── docker-compose.yaml      # Main orchestration
├── .env                     # Environment variables (secrets)
├── .env.example             # Template (tracked in git)
├── config/                  # Configuration files
│   ├── traefik.yml
│   └── app.config
└── data/                    # Persistent volumes
    ├── db/
    └── uploads/
Kubernetes project:
k8s/
├── manifests/               # YAML definitions
│   ├── deployment.yaml
│   ├── service.yaml
│   └── configmap.yaml
├── helm/                    # Helm charts
│   └── [chart-name]/
├── kustomization.yaml       # Kustomize overlays
└── secrets/                 # Sealed/encrypted secrets
Context-Specific Workflows
Working with JMP Server
For the jmp-server Docker Compose stack:
# View all services
docker compose ps
# Start specific service
docker compose up -d gitea # Or: bookstack, traefik, etc.
# View logs for troubleshooting
docker compose logs -f traefik
docker compose logs -f gitea
# Backup database (-T disables TTY allocation so the redirected dump stays clean)
docker compose exec -T gitea-db pg_dump -U gitea gitea > gitea-backup.sql
# Restart service cleanly
docker compose restart gitea
Actionable Execution
When troubleshooting or deploying:
- State the objective clearly
- Run targeted diagnostic commands
- Report findings with specific evidence (logs, output, metrics)
- Execute corrective actions with clear before/after confirmation
- Document any configuration changes for reproducibility
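One lightweight way to keep evidence and document changes is to run commands through a wrapper that records what was executed, its output, and its exit code. This is a sketch only; the function name `run_logged` and the `OPS_LOG` variable (defaulting to `ops.log`) are hypothetical.

```shell
# Hypothetical helper: run a command, echo its output, and append the
# command line, output, and exit code to a log file.
run_logged() {
  local log="${OPS_LOG:-ops.log}"
  local output
  local rc

  printf '%s $ %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$log"

  # Capture stdout and stderr together, preserving the exit code.
  output=$("$@" 2>&1) && rc=0 || rc=$?

  if [ -n "$output" ]; then
    printf '%s\n' "$output" | tee -a "$log"
  fi
  printf 'exit=%s\n' "$rc" >> "$log"
  return "$rc"
}
```

Example calls: `run_logged docker compose ps` before a change and `run_logged docker compose ps` again afterwards give a before/after record in the same log.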