Infrastructure Testing & Verification Guide

This guide provides procedures for testing and verifying Smart Smoker infrastructure deployments.

Overview

Infrastructure testing ensures that all Proxmox containers are properly configured, secured, and operational. These tests should be run after any infrastructure changes or deployments.

Current Infrastructure

LXC Containers

| Container | ID | Resources | IP Address | Purpose |
|---|---|---|---|---|
| github-runner | 104 | 2 CPU, 4 GB RAM, 50 GB disk | 10.20.0.10 | Self-hosted GitHub Actions runner |
| smart-smoker-dev-cloud | 105 | 2 CPU, 4 GB RAM, 20 GB disk | 10.20.0.20 | Development cloud environment |
| smart-smoker-cloud-prod | 106 | 4 CPU, 8 GB RAM, 40 GB disk | 10.20.0.30 | Production cloud environment |

Network Configuration

| Network | CIDR | Purpose |
|---|---|---|
| vmbr0 | 192.168.1.0/24 | External network |
| vmbr0 (secondary) | 10.20.0.0/24 | Container internal network |
| vmbr1 | 10.30.0.0/24 | Isolated network for virtual devices |

Proxmox Host: 192.168.1.151

Quick Verification

Automated Verification

The fastest way to verify infrastructure:

cd infra/proxmox/ansible
ansible-playbook playbooks/verify-all.yml

This playbook checks:

  • Docker installation and status
  • UFW firewall configuration
  • fail2ban service status
  • Docker and Node.js versions
  • GitHub runner configuration (if applicable)
  • Application directories

Quick Connectivity Test

# Test SSH connectivity to all servers
cd infra/proxmox/ansible
ansible all -m ping

Expected: All servers return pong with SUCCESS status.

Detailed Testing Procedures

Test 1: Container Connectivity

SSH to Each Container

Test SSH connectivity to each container, using the Proxmox host as the jump host:

# Test github-runner
ssh -J root@192.168.1.151 root@10.20.0.10 'hostname && uptime'

# Test dev-cloud
ssh -J root@192.168.1.151 root@10.20.0.20 'hostname && uptime'

# Test prod-cloud
ssh -J root@192.168.1.151 root@10.20.0.30 'hostname && uptime'

Expected: All commands return hostname and uptime without errors.
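
The three commands above differ only in the container IP, so they can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: `run_on_all`, `JUMP_HOST`, `CONTAINER_IPS`, and `SSH_BIN` are hypothetical names introduced here, and the default addresses are the ones listed in this guide.

```shell
# Hypothetical helper: run one command on every container through the jump host.
# Defaults are copied from this guide; override them through the environment.
JUMP_HOST="${JUMP_HOST:-root@192.168.1.151}"
CONTAINER_IPS="${CONTAINER_IPS:-10.20.0.10 10.20.0.20 10.20.0.30}"
SSH_BIN="${SSH_BIN:-ssh}"   # set SSH_BIN=echo to print the commands instead

run_on_all() {
  # $1: command string to execute on each container
  failed=0
  for ip in $CONTAINER_IPS; do
    echo "=== $ip ==="
    "$SSH_BIN" -J "$JUMP_HOST" "root@$ip" "$1" || failed=1
  done
  # Non-zero when at least one container failed
  return "$failed"
}
```

For example, `run_on_all 'hostname && uptime'` repeats the check above for all three containers, and the exit status is non-zero if any of them fails.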

Network Connectivity

# Test internet connectivity
ssh -J root@192.168.1.151 root@10.20.0.10 'ping -c 3 8.8.8.8'

# Test DNS resolution
ssh -J root@192.168.1.151 root@10.20.0.10 'ping -c 3 google.com'

Expected: 0% packet loss for both tests.
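
To check the packet-loss figure in a script rather than by eye, the ping summary line can be parsed. This is a sketch under the assumption that the containers use a ping that prints a summary such as `3 packets transmitted, 3 received, 0% packet loss`; `packet_loss` is a hypothetical helper name.

```shell
# packet_loss: read ping output on stdin and print the loss percentage
# (the number in front of "% packet loss"). Prints nothing if no summary is found.
packet_loss() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

Usage: `ssh -J root@192.168.1.151 root@10.20.0.10 'ping -c 3 8.8.8.8' | packet_loss` should print `0` when the test passes.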

Test 2: Docker Verification

Check Docker Installation

# Check all containers
for ip in 10.20.0.10 10.20.0.20 10.20.0.30; do
  echo "=== Testing Docker on $ip ==="
  ssh -J root@192.168.1.151 root@$ip 'docker --version && docker compose version'
done

Expected:

  • Docker version 28.5.1 or newer
  • Docker Compose version 2.x
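
An "or newer" check can be automated with a version-aware comparison. The sketch below assumes `sort -V` is available (GNU coreutils) and that `docker --version` prints a line of the form `Docker version X.Y.Z, build ...`; `docker_version` and `version_ge` are hypothetical helper names.

```shell
# docker_version: read `docker --version` output on stdin, print only X.Y.Z.
docker_version() {
  sed -n 's/^Docker version \([0-9.]*\).*/\1/p'
}

# version_ge A B: succeed when version A is greater than or equal to version B.
# The smaller of the two sorts first, so A >= B exactly when B sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

Usage: capture the version with `v=$(ssh -J root@192.168.1.151 root@10.20.0.10 'docker --version' | docker_version)` and then test it with `version_ge "$v" 28.5.1`. The same comparison works for the Terraform minimum in Test 4.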

Check Docker Service Status

for ip in 10.20.0.10 10.20.0.20 10.20.0.30; do
  echo "=== Testing $ip ==="
  ssh -J root@192.168.1.151 root@$ip 'systemctl is-active docker'
done

Expected: Output is active for all containers.

Test Docker Functionality

ssh -J root@192.168.1.151 root@10.20.0.10 'docker run --rm hello-world'

Expected: "Hello from Docker!" message appears.

Test 3: Node.js Verification

Check Node.js Installation

for ip in 10.20.0.10 10.20.0.20 10.20.0.30; do
  echo "=== Node.js on $ip ==="
  ssh -J root@192.168.1.151 root@$ip 'node --version && npm --version'
done

Expected:

  • Node.js: v20.x.x
  • npm: 10.x.x
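
Because only the major version is pinned here, a script can compare just the leading number. A minimal sketch, with `major_of` as a hypothetical helper name:

```shell
# major_of VERSION: print the major component of a version string,
# accepting an optional leading "v" (as printed by `node --version`).
major_of() {
  printf '%s\n' "$1" | sed 's/^v\{0,1\}\([0-9]*\)\..*/\1/'
}
```

Usage: `[ "$(major_of "$(node --version)")" = 20 ]` succeeds on any v20.x.x release; the same helper applies to `npm --version`.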

Test Node.js Functionality

ssh -J root@192.168.1.151 root@10.20.0.10 'node -e "console.log(\"Node.js works!\")"'

Expected: Output is Node.js works!

Test 4: Terraform Verification (GitHub Runner Only)

Check Terraform Installation

ssh -J root@192.168.1.151 root@10.20.0.10 'terraform version'

Expected: Terraform v1.13.3 or newer

Test Terraform Functionality

ssh -J root@192.168.1.151 root@10.20.0.10 'cd /tmp && terraform init'

Expected: Terraform initializes successfully.

Test 5: GitHub Runner Verification

Check Runner Service Status

# The unit pattern is quoted so systemctl, not the remote shell, expands it
ssh -J root@192.168.1.151 root@10.20.0.10 \
  'systemctl status "actions.runner.*" --no-pager | head -20'

Expected: Service is active (running).

Check Runner Registration

gh api repos/benjr70/Smart-Smoker-V2/actions/runners \
  --jq '.runners[] | select(.name=="smart-smoker-runner-1") | {name, status, busy}'

Expected:

{
  "name": "smart-smoker-runner-1",
  "status": "online",
  "busy": false
}

Check Runner Logs

ssh -J root@192.168.1.151 root@10.20.0.10 \
  'journalctl -u "actions.runner.*" -n 50 --no-pager'

Expected: Recent activity with no errors.

Test 6: Security Configuration

UFW Firewall Status

for ip in 10.20.0.10 10.20.0.20 10.20.0.30; do
  echo "=== Checking UFW on $ip ==="
  ssh -J root@192.168.1.151 root@$ip 'ufw status verbose | head -10'
done

Expected:

  • Status: active
  • SSH port 22: ALLOW
  • Default incoming: deny
  • Default outgoing: allow
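
The expected UFW state can also be asserted from the command output. The sketch below assumes the usual `ufw status verbose` layout (a `Status:` line, a `Default:` line, and one rule per line) and that SSH is allowed by port number rather than by an application profile name; `ufw_hardened` is a hypothetical helper name.

```shell
# ufw_hardened: read `ufw status verbose` output on stdin and succeed only if
# the firewall is active, defaults are deny-in/allow-out, and port 22 is allowed.
ufw_hardened() {
  out=$(cat)
  printf '%s\n' "$out" | grep -q '^Status: active' &&
  printf '%s\n' "$out" | grep -q '^Default: deny (incoming), allow (outgoing)' &&
  printf '%s\n' "$out" | grep -Eq '^22(/tcp)?[[:space:]]+ALLOW'
}
```

Usage: `ssh -J root@192.168.1.151 root@10.20.0.10 'ufw status verbose' | ufw_hardened && echo OK`. Pass the full output rather than the first ten lines so that no rule is cut off.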

fail2ban Status

for ip in 10.20.0.10 10.20.0.20 10.20.0.30; do
  echo "=== Checking fail2ban on $ip ==="
  ssh -J root@192.168.1.151 root@$ip 'systemctl is-active fail2ban'
done

Expected: Output is active for all containers.

SSH Configuration

ssh -J root@192.168.1.151 root@10.20.0.10 \
  'grep "^PasswordAuthentication" /etc/ssh/sshd_config'

Expected: PasswordAuthentication no
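
The grep above only inspects the main configuration file. If the containers' sshd_config includes drop-in files (an assumption about their setup, not something this guide states), the effective value is more reliably read from `sshd -T`, which prints the resolved configuration with lowercase keywords. A sketch, with `password_auth_disabled` as a hypothetical helper name:

```shell
# password_auth_disabled: read `sshd -T` output on stdin and succeed only if
# the effective setting is exactly "passwordauthentication no".
password_auth_disabled() {
  grep -qx 'passwordauthentication no'
}
```

Usage: `ssh -J root@192.168.1.151 root@10.20.0.10 'sshd -T' | password_auth_disabled && echo hardened`.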

Test 7: Application Environment

Dev Cloud Directories

ssh -J root@192.168.1.151 root@10.20.0.20 'ls -la /opt/smart-smoker-dev'

Expected: Directory exists with subdirectories: data, logs, backups, config

Prod Cloud Directories

ssh -J root@192.168.1.151 root@10.20.0.30 'ls -la /opt/smart-smoker-prod'

Expected: Directory exists with subdirectories: data, logs, backups, config

MongoDB Data Directories

# Dev
ssh -J root@192.168.1.151 root@10.20.0.20 'ls -la /opt/smart-smoker-dev/data/mongodb'

# Prod
ssh -J root@192.168.1.151 root@10.20.0.30 'ls -la /opt/smart-smoker-prod/data/mongodb'

Expected: Directories exist with proper permissions (owned by smoker user).

Test 8: Container Resource Usage

Check Memory and Disk

for ip in 10.20.0.10 10.20.0.20 10.20.0.30; do
  echo "=== Resources on $ip ==="
  ssh -J root@192.168.1.151 root@$ip 'free -h && df -h /'
done

Expected:

  • Memory usage < 80% under normal load
  • Disk usage has adequate free space
  • No swap usage under normal conditions
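
The 80% memory threshold can be computed from the `Mem:` row instead of being estimated by eye. The sketch below assumes numeric output from `free -m` (not the human-readable `free -h` used above), where the second column is total and the third is used; `mem_used_pct` is a hypothetical helper name.

```shell
# mem_used_pct: read `free -m` output on stdin and print used/total memory
# as an integer percentage (fraction truncated).
mem_used_pct() {
  awk '/^Mem:/ { printf "%d\n", $3 * 100 / $2 }'
}
```

Usage: `pct=$(ssh -J root@192.168.1.151 root@10.20.0.10 'free -m' | mem_used_pct)` followed by `[ "$pct" -lt 80 ]`.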

Test 9: Inter-Container Communication

Container-to-Container Connectivity

# From github-runner, ping dev-cloud
ssh -J root@192.168.1.151 root@10.20.0.10 'ping -c 3 10.20.0.20'

# From github-runner, ping prod-cloud
ssh -J root@192.168.1.151 root@10.20.0.10 'ping -c 3 10.20.0.30'

Expected: 0% packet loss between containers.

Test 10: Proxmox Network Configuration

Verify Bridge Configuration

ssh root@192.168.1.151 'ip addr show vmbr0 | grep "inet "'

Expected: Both addresses are listed:

  • inet 192.168.1.151/24 (external network)
  • inet 10.20.0.1/24 (container network)

Verify NAT Configuration

ssh root@192.168.1.151 'iptables -t nat -L POSTROUTING -n -v | grep 10.20.0.0'

Expected: MASQUERADE rule for 10.20.0.0/24 network.
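
For a scripted check, the rule-specification output of `iptables -S` is easier to match than the table listing. A sketch, assuming the rule was added with an explicit `-s` source match; `has_masquerade` is a hypothetical helper name.

```shell
# has_masquerade CIDR: read `iptables -t nat -S POSTROUTING` output on stdin
# and succeed if some rule with source CIDR jumps to MASQUERADE.
has_masquerade() {
  grep -F -- "-s $1 " | grep -q -- '-j MASQUERADE'
}
```

Usage: `ssh root@192.168.1.151 'iptables -t nat -S POSTROUTING' | has_masquerade 10.20.0.0/24`.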

Verify IP Forwarding

ssh root@192.168.1.151 'sysctl net.ipv4.ip_forward'

Expected: net.ipv4.ip_forward = 1

Troubleshooting

SSH Connection Failures

Problem: Cannot connect to a container through the SSH jump host

Solutions:

# Check container is running
ssh root@192.168.1.151 'pct list | grep -E "104|105|106"'

# Check network interface
ssh root@192.168.1.151 'pct exec 104 -- ip addr show'

# Restart container if needed
ssh root@192.168.1.151 'pct reboot 104'

Docker Service Not Running

Problem: Docker service is inactive

Solution:

ssh -J root@192.168.1.151 root@10.20.0.10 \
  'systemctl restart docker && systemctl status docker'

GitHub Runner Offline

Problem: Runner shows as offline in GitHub

Solutions:

# Restart runner service (the pattern is quoted so systemctl expands it)
ssh -J root@192.168.1.151 root@10.20.0.10 \
  'systemctl restart "actions.runner.*"'

# Check service status
ssh -J root@192.168.1.151 root@10.20.0.10 \
  'systemctl status "actions.runner.*" --no-pager'

# View recent logs
ssh -J root@192.168.1.151 root@10.20.0.10 \
  'journalctl -u "actions.runner.*" -n 100 --no-pager'

Network Connectivity Issues

Problem: Containers cannot reach the internet

Solutions:

# Check Proxmox gateway
ssh root@192.168.1.151 'ip addr show vmbr0 | grep 10.20.0.1'

# Check NAT rules
ssh root@192.168.1.151 'iptables -t nat -L POSTROUTING -n'

# Verify IP forwarding
ssh root@192.168.1.151 'sysctl net.ipv4.ip_forward'

# Reboot the container to reset its networking (this restarts the whole container)
ssh root@192.168.1.151 'pct reboot 104'

UFW Blocks Required Ports

Problem: The firewall is blocking necessary connections

Solutions:

# Check current UFW rules
ssh -J root@192.168.1.151 root@10.20.0.10 'ufw status verbose'

# Allow a specific port (8080/tcp is shown as an example)
ssh -J root@192.168.1.151 root@10.20.0.10 'ufw allow 8080/tcp'

# Reload firewall
ssh -J root@192.168.1.151 root@10.20.0.10 'ufw reload'

Testing Checklist

Use this checklist after infrastructure changes:

  • [ ] SSH connectivity to all containers works
  • [ ] Docker installed and functional on all containers
  • [ ] Node.js installed and functional on all containers
  • [ ] Terraform installed on github-runner
  • [ ] GitHub runner is online and registered
  • [ ] UFW firewall active on all containers
  • [ ] fail2ban running on all containers
  • [ ] SSH hardened (password auth disabled)
  • [ ] Application directories exist with proper structure
  • [ ] Ansible verification playbook passes
  • [ ] Container resources are healthy (CPU, memory, disk)
  • [ ] Containers can communicate with each other
  • [ ] Proxmox network configuration is correct
  • [ ] NAT and IP forwarding configured
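
The checklist can be turned into a pass/fail report by wrapping each command from this guide in a small runner. This is a minimal sketch rather than an existing script in the repository; `check`, `summary`, `PASS`, and `FAIL` are hypothetical names.

```shell
# Counters for the report
PASS=0
FAIL=0

# check LABEL COMMAND [ARGS...]: run the command quietly and print a
# checklist line, ticked when the command exits with status 0.
check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    PASS=$((PASS + 1))
    echo "[x] $label"
  else
    FAIL=$((FAIL + 1))
    echo "[ ] $label"
  fi
}

# summary: print the totals and fail if any check failed.
summary() {
  echo "$PASS passed, $FAIL failed"
  [ "$FAIL" -eq 0 ]
}
```

Usage: `check "Docker active on dev-cloud" ssh -J root@192.168.1.151 root@10.20.0.20 'systemctl is-active docker'`, repeated for each item, then `summary` as the last line so the script's exit status reflects the result.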

Automated Testing

CI/CD Workflows

The following GitHub Actions workflows provide automated testing:

  • ansible-lint.yml: Validates Ansible syntax and best practices
  • terraform-validate.yml: Validates Terraform configuration
  • runner-test.yml: Tests self-hosted runner capabilities

Ansible Verification Playbook

The verify-all.yml playbook provides automated verification:

cd infra/proxmox/ansible
ansible-playbook playbooks/verify-all.yml

This checks:

  • Docker installation and status
  • UFW and fail2ban services
  • Docker and Node.js versions
  • Application directories
  • GitHub runner configuration

Performance Monitoring

Resource Usage

# Take a one-off snapshot of current resource usage
ssh -J root@192.168.1.151 root@10.20.0.10 'top -bn1 | head -20'

# Check system load
ssh -J root@192.168.1.151 root@10.20.0.10 'uptime'

# Check disk I/O (iostat is provided by the sysstat package)
ssh -J root@192.168.1.151 root@10.20.0.10 'iostat -x 1 5'

Network Performance

# Measure round-trip latency between containers (ping reports latency, not throughput)
ssh -J root@192.168.1.151 root@10.20.0.10 \
  'ping -c 10 -i 0.2 10.20.0.20 | tail -1'
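
To track latency over time, the average can be extracted from the summary line that `tail -1` returns. The sketch assumes a summary of the form `rtt min/avg/max/mdev = a/b/c/d ms`; `avg_rtt_ms` is a hypothetical helper name.

```shell
# avg_rtt_ms: read a ping summary line on stdin and print the average
# round-trip time in milliseconds (the second value after " = ").
avg_rtt_ms() {
  awk -F' = ' '/min\/avg\/max/ { split($2, t, "/"); print t[2] }'
}
```

Usage: append `| avg_rtt_ms` to the command above to print only the average.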
