Infrastructure Monitoring & Edge Routing Stack
Overview
This repository provides a standardized, portable Docker-based stack for:
- Host-level monitoring
- Metrics collection and visualization
- Service uptime tracking
- Reverse proxy routing for internal services
It is designed to be deployed consistently across multiple nodes, with minimal per-host customization via environment variables.
Core Components
| Component | Purpose |
|---|---|
| Traefik | Edge router and reverse proxy with automatic TLS |
| Prometheus | Metrics collection and storage |
| Grafana | Metrics visualization and dashboards |
| Node Exporter | Host-level metrics (CPU, memory, disk, etc.) |
| Uptime Kuma | Service uptime monitoring and alerting |
Architecture Summary
- Each host runs:
  - A local monitoring stack
  - A Traefik instance for routing
- Services are exposed internally and routed via Traefik using host-based rules
- TLS certificates are issued and renewed automatically by Traefik via ACME, using a Cloudflare DNS challenge
- Configuration is environment-driven for portability
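As a rough illustration of the routing and TLS points above, a Traefik static configuration along the following lines would enable Docker-based routing and a Cloudflare DNS challenge. This is a minimal sketch, not the contents of the repository's traefik.yml: the resolver name (cloudflare), the contact email, and the in-container storage path are assumptions.

```yaml
# Sketch of docker/traefik/traefik.yml (names below are illustrative assumptions)
entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"

providers:
  docker:
    exposedByDefault: false   # only containers labelled traefik.enable=true are routed
    network: frontend         # the external network listed under Prerequisites

certificatesResolvers:
  cloudflare:                 # assumed resolver name
    acme:
      email: admin@example.com   # placeholder contact address
      storage: /acme.json        # assumed mount point for the host's acme.json
      dnsChallenge:
        provider: cloudflare     # reads the token from CF_DNS_API_TOKEN in Traefik's environment
```

With a DNS challenge, certificates can be obtained for hosts that are only reachable internally, since validation happens through DNS records rather than an inbound HTTP request.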
Repository Structure
infra/
  docker/
    monitoring/
      docker-compose.yml
      prometheus.yml
    traefik/
      docker-compose.yml
      traefik.yml
      middlewares.yml
  env/
    example.env
    prod/
      node1.env
      node2.env
  scripts/
    bootstrap.sh
  Makefile
Prerequisites
- Docker + Docker Compose
- A configured external Docker network:
  docker network create frontend
- Cloudflare API token (for TLS certificate provisioning)
- Basic understanding of Docker and reverse proxies
Configuration
All configuration is driven via .env files.
Setup
- Copy the example environment file:
  cp env/example.env .env
- Modify values as needed:
  - Domains (e.g., grafana.vpn.savant.io)
  - Ports
  - Credentials
  - File paths
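To show how values like these could flow from the .env file into a service definition, here is a hedged sketch of a Grafana service using Compose variable interpolation. The variable names (GRAFANA_DOMAIN, GRAFANA_ADMIN_USER, GRAFANA_ADMIN_PASSWORD, GRAFANA_DATA_DIR) are hypothetical and may not match those in env/example.env.

```yaml
# Sketch only: variable names are assumptions, not taken from env/example.env
services:
  grafana:
    image: grafana/grafana
    environment:
      # Grafana reads its initial admin credentials from these settings
      - GF_SECURITY_ADMIN_USER=${GRAFANA_ADMIN_USER}
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}
    volumes:
      # Host path comes from the environment file; /var/lib/grafana is Grafana's data directory
      - ${GRAFANA_DATA_DIR}:/var/lib/grafana
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.grafana.rule=Host(`${GRAFANA_DOMAIN}`)"
```

Keeping every host-specific value behind a variable is what lets the same compose file be reused across nodes.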
Deployment
Start Traefik
docker compose -f docker/traefik/docker-compose.yml up -d
Start Monitoring Stack
docker compose -f docker/monitoring/docker-compose.yml up -d
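For orientation, the prometheus.yml consumed by the monitoring stack might look roughly like the sketch below. The scrape interval and the target addresses are assumptions; in particular, host.docker.internal only resolves on Linux if the Prometheus container is given a matching extra_hosts entry (host.docker.internal:host-gateway), so the host's IP address may be a simpler choice.

```yaml
# Sketch of docker/monitoring/prometheus.yml (targets are assumptions)
global:
  scrape_interval: 15s

scrape_configs:
  # Prometheus scraping its own metrics endpoint
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  # Node Exporter listens on port 9100 by default
  - job_name: node
    static_configs:
      - targets: ["host.docker.internal:9100"]
```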
Access Points
| Service | URL |
|---|---|
| Traefik UI | https://traefik.vpn.savant.io |
| Grafana | https://grafana.vpn.savant.io |
| Prometheus | http://<host>:9090 |
| Uptime Kuma | http://<host>:3001 |
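One way the Traefik UI entry in the table could be wired up is by routing the built-in dashboard service through Traefik itself. The following is a sketch under assumptions: the router name (dashboard) and the middleware name (dashboard-auth, presumed to be defined in middlewares.yml) are hypothetical, and the dashboard must be enabled in the static configuration.

```yaml
# Sketch: labels on the Traefik service itself (router and middleware names are assumptions)
services:
  traefik:
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.dashboard.rule=Host(`traefik.vpn.savant.io`)"
      - "traefik.http.routers.dashboard.entrypoints=websecure"
      - "traefik.http.routers.dashboard.tls=true"
      # api@internal is Traefik's built-in dashboard/API service
      - "traefik.http.routers.dashboard.service=api@internal"
      # Hypothetical basic-auth middleware loaded from the file provider
      - "traefik.http.routers.dashboard.middlewares=dashboard-auth@file"
```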
Customization
Adding a New Service Behind Traefik
Add labels to any container:
labels:
- "traefik.enable=true"
- "traefik.http.routers.myapp.rule=Host(`myapp.example.com`)"
- "traefik.http.routers.myapp.entrypoints=websecure"
- "traefik.http.routers.myapp.tls=true"
- "traefik.http.services.myapp.loadbalancer.server.port=3000"
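Placed in context, those labels might sit in a service definition like the one below. This is an illustrative sketch: the image name is a placeholder, and the certresolver label assumes a resolver named cloudflare exists in the Traefik static configuration.

```yaml
# Sketch of a complete service behind Traefik (image and resolver name are assumptions)
services:
  myapp:
    image: example/myapp:latest   # placeholder image
    networks:
      - frontend                  # must share a network with Traefik to be reachable
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.myapp.rule=Host(`myapp.example.com`)"
      - "traefik.http.routers.myapp.entrypoints=websecure"
      - "traefik.http.routers.myapp.tls=true"
      # Assumed resolver name; omit if a default resolver is configured elsewhere
      - "traefik.http.routers.myapp.tls.certresolver=cloudflare"
      - "traefik.http.services.myapp.loadbalancer.server.port=3000"

networks:
  frontend:
    external: true                # created beforehand with: docker network create frontend
```

Note that the service publishes no ports; Traefik reaches the container on port 3000 over the shared network.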
Per-Node Configuration
Each node can override the following values through its own .env file (env/prod/nodeX.env):
- Domain names
- Ports
- Credentials
- File paths
Operational Notes
- Do not expose Grafana or Traefik publicly without authentication
- Change default credentials immediately
- Ensure acme.json has correct permissions:
  chmod 600 /opt/traefik/acme.json
- node-exporter runs in host mode and requires elevated visibility into the system
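To make the last note concrete, a host-mode Node Exporter service is commonly defined along these lines. This sketch follows the pattern from the Node Exporter project's own Docker instructions and is not necessarily what docker/monitoring/docker-compose.yml contains; the image tag and restart policy are assumptions.

```yaml
# Sketch of a host-mode Node Exporter service (tag and restart policy are assumptions)
services:
  node-exporter:
    image: quay.io/prometheus/node-exporter:latest
    command:
      - "--path.rootfs=/host"   # read host filesystem metrics from the bind mount below
    network_mode: host          # exposes port 9100 directly on the host
    pid: host                   # lets the exporter see host processes
    restart: unless-stopped
    volumes:
      - "/:/host:ro,rslave"     # read-only view of the host root filesystem
```

Because the container shares the host's network and PID namespaces and can read the root filesystem, access to port 9100 is best limited to the Prometheus instance.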
Scaling Strategy
For multiple nodes:
- Maintain a single Git repository
- Use per-node .env files
- Deploy via:
  - Manual git pull + docker compose
  - Or automation tools (e.g., Ansible)
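If Ansible is the chosen automation tool, the rollout could be sketched as a playbook like the one below. Everything specific here is an assumption: the inventory group (monitoring_nodes), the repository URL, the checkout path, the branch name, and the convention that each inventory hostname matches an env file name such as node1 or node2. It also assumes the frontend network already exists on each host.

```yaml
# Sketch of an Ansible playbook (group, URL, paths, and naming convention are assumptions)
- name: Deploy routing and monitoring stack
  hosts: monitoring_nodes
  become: true
  vars:
    repo_url: "https://git.example.com/infra.git"   # placeholder URL
    repo_dir: /opt/infra
  tasks:
    - name: Check out the repository
      ansible.builtin.git:
        repo: "{{ repo_url }}"
        dest: "{{ repo_dir }}"
        version: main

    - name: Start Traefik, then the monitoring stack
      ansible.builtin.command:
        cmd: >-
          docker compose
          --env-file env/prod/{{ inventory_hostname }}.env
          -f docker/{{ item }}/docker-compose.yml
          up -d
        chdir: "{{ repo_dir }}"
      loop:
        - traefik
        - monitoring
```

Passing the per-node file explicitly with --env-file avoids copying it to .env on every host.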
Future Improvements
- Centralized logging (Loki / ELK)
- Alerting integration (Alertmanager)
- SSO / authentication layer (e.g., OAuth / Authelia)
- GitOps-based deployment model
Quick Start (TL;DR)
git clone <repo>
cd infra
cp env/example.env .env
# edit .env
vim .env
docker network create frontend
docker compose -f docker/traefik/docker-compose.yml up -d
docker compose -f docker/monitoring/docker-compose.yml up -d
Purpose
This project aims to provide:
- A repeatable infrastructure baseline
- Immediate visibility into host and service health
- A clean entry point for expanding internal services
It is intentionally minimal, composable, and environment-driven.