Multi-Node Installation

Production high-availability deployment

You're viewing a development version of manager, the latest released version is 1.6.1

Overview

This guide describes the installation of the AgileTV CDN Manager across multiple nodes for production deployments. This configuration provides high availability and horizontal scaling capabilities.

Air-Gapped Deployment? This guide assumes internet connectivity. For air-gapped deployments, see the Air-Gapped Deployment Guide for additional requirements and procedures.

Prerequisites

Hardware Requirements

Refer to the System Requirements Guide for hardware specifications. Production deployments require:

Minimum 3 Server nodes (Control Plane Only or Combined role)
Optional Agent nodes for additional workload capacity

Operating System

Refer to the System Requirements Guide for supported operating systems.

Software Access

Installation ISO: esb3027-acd-manager-X.Y.Z.iso (for each node)
Extras ISO (air-gapped only): esb3027-acd-manager-extras-X.Y.Z.iso

Network Configuration

Ensure that required firewall ports are configured between all nodes before installation. See the Configuring Segregated Networks guide for the standard firewall configuration.

Note: When using segregated networks, the K3s API server on the primary node will be reachable via its internal/private interface. Consequently, when joining additional nodes, the <primary-server-ip> provided to the join script must be the internal/private IP address of the primary node to ensure the join request is routed correctly through the private network.

Single-NIC Deployments: If your nodes have only a single network interface, see the Shared Interface Setup guide instead. This guide assumes segregated networks with separate interfaces for cluster traffic (eth1) and external access (eth0).

Segregated Network Configuration

If your nodes have multiple network interfaces and you want to use a separate interface for cluster traffic (not the default route interface), configure the INSTALL_K3S_EXEC environment variable before installing the cluster or joining nodes.

For segregated networks (private cluster network on eth1 + public external access on eth0), set all three K3s flags:

# For server nodes
export INSTALL_K3S_EXEC="server --node-ip=<ETH1_IP> --node-external-ip=<ETH0_IP> --flannel-iface=eth1 --advertise-address=<ETH1_IP>"

# For agent nodes  
export INSTALL_K3S_EXEC="agent --node-ip=<ETH1_IP> --node-external-ip=<ETH0_IP> --flannel-iface=eth1"

Where:

Mode: Use server for the primary node establishing the cluster, or for additional server nodes. Use agent for agent nodes joining the cluster.
--node-ip=<ETH1_IP>: The internal/private IP address of eth1 for cluster communication
--node-external-ip=<ETH0_IP>: The public IP address of eth0 for external access (LoadBalancer services, ingress)
--flannel-iface=eth1: The network interface name for Flannel VXLAN overlay traffic
--advertise-address=<ETH1_IP>: The address the API server uses to advertise itself to cluster members. Must be set to the internal/private IP address in a segregated-network deployment; without this flag, k3s defaults to the external IP when --node-external-ip is set, causing the kubernetes service endpoint to register as an unreachable address. This flag is required for server nodes only; agent nodes do not run an API server.

Set this variable on each node before running the install or join scripts.

SELinux

If SELinux is to be used, it must be set to “Enforcing” mode before running the installer script. The installer will configure appropriate SELinux policies automatically. SELinux cannot be enabled after installation.

Installation Steps

Step 1: Prepare the Primary Server Node

Mount the installation ISO on the primary server node:

mkdir -p /mnt/esb3027
mount -o loop,ro esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027

Replace X.Y.Z with the actual version number.

Step 2: Install the Base Cluster on Primary Server

Segregated Networks: If your node has multiple network interfaces, set the INSTALL_K3S_EXEC environment variable with the complete segregated network configuration before running the installer (see Segregated Network Configuration):
export INSTALL_K3S_EXEC="server --node-ip=<ETH1_IP> --node-external-ip=<ETH0_IP> --flannel-iface=eth1 --advertise-address=<ETH1_IP>"
Replace <ETH1_IP> with the internal/private IP address and <ETH0_IP> with the public IP address.

If your node has only a single network interface, do not set INSTALL_K3S_EXEC. K3s will use the default interface automatically.

Run the installer to set up the K3s Kubernetes cluster:

/mnt/esb3027/install

This installs:

K3s Kubernetes distribution
Longhorn distributed storage
Cloudnative PG operator for PostgreSQL
Base system dependencies

Important: After the installer completes, verify that all system pods in both namespaces are in the Running state before proceeding:

# Check kube-system namespace (Kubernetes core components)
kubectl get pods -n kube-system

# Check longhorn-system namespace (distributed storage)
kubectl get pods -n longhorn-system

All pods should show Running status. If any pods are still Pending or ContainerCreating, wait until they are ready. Proceeding with incomplete system pods can cause subsequent steps to fail in unpredictable ways.

This verification confirms:

K3s cluster is operational
Longhorn distributed storage is running
Cloudnative PG operator is deployed
All core components are healthy before continuing

Step 3: Retrieve the Node Token

Retrieve the node token for joining additional nodes:

cat /var/lib/rancher/k3s/server/node-token

Save this token for use on additional nodes. Also note the IP address of the primary server node.

Step 4: Server vs Agent Node Roles

Before joining additional nodes, determine which nodes will serve as Server nodes vs Agent nodes:

Role	Control Plane	Workloads	HA Quorum	Use Case
Server Node (Combined)	Yes (etcd, API server)	Yes	Participates	Default production role; minimum 3 nodes
Server Node (Control Plane Only)	Yes (etcd, API server)	No	Participates	Dedicated control plane; requires separate Agent nodes
Agent Node	No	Yes	No	Additional workload capacity only

Guidance:

Combined role (default): Server nodes run both control plane and workloads; minimum 3 nodes required for HA
Control Plane Only: Dedicate nodes to control plane functions; requires at least 3 Server nodes plus 3+ Agent nodes for workloads
Agent nodes are required if using Control Plane Only servers; optional if using Combined role servers
For most deployments, 3 Server nodes (Combined role) with no Agent nodes is sufficient
Add Agent nodes to scale workload capacity without affecting control plane quorum

Proceed to Step 5 to join Server nodes. Agent nodes are joined after all Server nodes are ready.

Step 5: Join Additional Server Nodes

On each additional server node:

Mount the ISO:

mkdir -p /mnt/esb3027
mount -o loop,ro esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027

Join the cluster:

Segregated Networks: If your node has multiple network interfaces, set the INSTALL_K3S_EXEC environment variable with the complete segregated network configuration before running the join script (see Segregated Network Configuration):
export INSTALL_K3S_EXEC="server --node-ip=<ETH1_IP> --node-external-ip=<ETH0_IP> --flannel-iface=eth1 --advertise-address=<ETH1_IP>"
Replace <ETH1_IP> with the internal/private IP address and <ETH0_IP> with the public IP address.

If your node has only a single network interface, do not set INSTALL_K3S_EXEC. K3s will use the default interface automatically.

Note for Segregated Networks: When joining nodes in a segregated network environment, ensure the <primary-server-ip> used in the join command is the internal/private IP address (the eth1 address) of the primary server. Using the external IP will cause the join attempt to fail as the service will be listening on the private interface.

Run the join script:

/mnt/esb3027/join-server https://<primary-server-ip>:6443 <node-token>

Replace <primary-server-ip> with the IP address of the primary server and <node-token> with the token retrieved in Step 3.

Verify the node joined successfully:

kubectl get nodes

Repeat for each server node. A minimum of 3 server nodes is required for high availability.

Step 5b: Taint Control Plane Only Nodes (Optional)

If you are using dedicated Control Plane Only nodes (not Combined role), apply taints to prevent workload scheduling:

kubectl taint nodes <node-name> CriticalAddonsOnly=true:NoSchedule

Apply this taint to each Control Plane Only node. Verify taints are applied:

kubectl describe nodes | grep -A 5 "Taints"

Note: This step is only required if you want dedicated control plane nodes. For Combined role deployments, do not apply taints.

Important: Control Plane Only Server nodes can be deployed with lower hardware specifications (2 cores, 4 GiB, 64 GiB) than the installer’s default minimum requirements. If your Control Plane Only Server nodes do not meet the Single-Node Lab configuration minimums (8 cores, 16 GiB, 128 GiB), you must set the SKIP_REQUIREMENTS_CHECK environment variable before running the installer or join command:
# For the primary server node
export SKIP_REQUIREMENTS_CHECK=1
/mnt/esb3027/install

# For additional Control Plane Only Server nodes
export SKIP_REQUIREMENTS_CHECK=1
/mnt/esb3027/join-server https://<primary-server-ip>:6443 <node-token>
Note: This applies to Server nodes only. Agent nodes have separate minimum requirements.

Step 6: Join Agent Nodes (Optional)

On each agent node:

Mount the ISO:

mkdir -p /mnt/esb3027
mount -o loop,ro esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027

Join the cluster as an agent:

Segregated Networks: If your node has multiple network interfaces, set the INSTALL_K3S_EXEC environment variable with the complete segregated network configuration before running the join script (see Segregated Network Configuration):
export INSTALL_K3S_EXEC="agent --node-ip=<ETH1_IP> --node-external-ip=<ETH0_IP> --flannel-iface=eth1"
Replace <ETH1_IP> with the internal/private IP address and <ETH0_IP> with the public IP address.

If your node has only a single network interface, do not set INSTALL_K3S_EXEC. K3s will use the default interface automatically.

Run the join script:

/mnt/esb3027/join-agent https://<primary-server-ip>:6443 <node-token>

Note for Segregated Networks: When joining nodes in a segregated network environment, ensure the <primary-server-ip> used in the join command is the internal/private IP address (the eth1 address) of the primary server. Using the external IP will cause the join attempt to fail as the service will be listening on the private interface.

Verify the node joined successfully from an existing server node:
```
kubectl get nodes
```

Agent nodes provide additional workload capacity but do not participate in the control plane quorum.

Step 7: Verify Cluster Status

After all nodes are joined, verify the cluster is operational:

1. Verify all nodes are ready:

kubectl get nodes

Expected output:

NAME                 STATUS   ROLES                       AGE   VERSION
k3s-server-0         Ready    control-plane,etcd,master   5m    v1.33.4+k3s1
k3s-server-1         Ready    control-plane,etcd,master   3m    v1.33.4+k3s1
k3s-server-2         Ready    control-plane,etcd,master   2m    v1.33.4+k3s1
k3s-agent-1          Ready    <none>                      1m    v1.33.4+k3s1
k3s-agent-2          Ready    <none>                      1m    v1.33.4+k3s1

2. Verify system pods in both namespaces are running:

# Check kube-system namespace (Kubernetes core components)
kubectl get pods -n kube-system

# Check longhorn-system namespace (distributed storage)
kubectl get pods -n longhorn-system

All pods should show Running status. If any pods are still Pending or ContainerCreating, wait until they are ready.

This verification confirms:

K3s cluster is operational across all nodes
Longhorn distributed storage is running
Cloudnative PG operator is deployed
All core components are healthy before proceeding to application deployment

Step 9: Air-Gapped Deployments (If Applicable)

If deploying in an air-gapped environment, on each node:

mkdir -p /mnt/esb3027-extras
mount -o loop,ro esb3027-acd-manager-extras-X.Y.Z.iso /mnt/esb3027-extras
/mnt/esb3027-extras/load-images

Step 10: Deploy the Manager Helm Chart

For complete instructions on deploying the CDN Manager Helm chart, including configuration file setup, MaxMind GeoIP database loading, TLS certificate configuration, deployment commands, and verification steps, see the Helm Chart Installation Guide.

This guide covers the common deployment steps that apply to all installation types. After completing the helm chart installation steps, proceed to Post-Installation below.

Step 15: Configure DNS (Optional)

Add DNS records for the manager hostname. For high availability, configure multiple A records pointing to different server nodes:

manager.example.com.  IN  A  <server-1-ip>
manager.example.com.  IN  A  <server-2-ip>
manager.example.com.  IN  A  <server-3-ip>

Alternatively, configure a load balancer to distribute traffic across nodes.

Post-Installation

After installation completes, proceed to the Next Steps guide for:

Initial user configuration
Accessing the web interfaces
Configuring authentication
Setting up monitoring

Accessing the System

Refer to the Accessing the System section in the Getting Started guide for service URLs and default credentials.

Note: A self-signed SSL certificate is deployed by default. For production deployments, configure a valid SSL certificate before exposing the system to users.

High Availability Considerations

Pod Distribution

The Helm chart configures pod anti-affinity rules to ensure:

Kafka controllers are scheduled on separate nodes
PostgreSQL cluster members are distributed across nodes
Application pods are spread across available nodes

Data Replication and Failure Tolerance

For detailed information on data replication strategies and failure scenario tolerance, refer to the Architecture Guide and System Requirements Guide.

Troubleshooting

If pods fail to start or nodes fail to join:

Check node status: kubectl get nodes
Describe problematic pods: kubectl describe pod <pod-name>
Review logs: kubectl logs <pod-name>
Check cluster events: kubectl get events --sort-by='.lastTimestamp'

Nodes Ready but Workloads Cannot Reach the API Server (Segregated Networks)

Symptom: All nodes show Ready status, but cluster components (kubelet, controller-manager, scheduler) or workloads fail to communicate with the API server. Pods in kube-system or longhorn-system may fail to start or remain in a crash loop.

Cause: This is caused by omitting --advertise-address from the server-node INSTALL_K3S_EXEC. When --node-external-ip is set without --advertise-address, k3s defaults the API server’s advertise address to the external IP (eth0). In a segregated-network topology where nodes are not routable to each other over eth0, the kubernetes service ClusterIP endpoint registers as an unreachable address.

Diagnostic check:

kubectl get endpoints kubernetes -n default

If the IP shown is the eth0 (external) address rather than the eth1 (internal) address, the cluster was installed without --advertise-address.

Remediation: The kubernetes service endpoint cannot be corrected by reconfiguration alone. K3s must be reinstalled on all server nodes with the correct flags:

export INSTALL_K3S_EXEC="server --node-ip=<ETH1_IP> --node-external-ip=<ETH0_IP> --flannel-iface=eth1 --advertise-address=<ETH1_IP>"

After reinstallation, re-run the diagnostic check to confirm the endpoint IP is now the eth1 (internal) address.

See the Troubleshooting Guide for additional assistance.

Next Steps

After successful installation:

Next Steps Guide - Post-installation configuration
Configuration Guide - System configuration
Operations Guide - Day-to-day operations