Cluster Configuration Migration: On-Premises to AKS

Status: Authored 2026-04-30
Audience: Platform engineers and infrastructure architects migrating cluster-level configuration from self-managed Kubernetes or OpenShift to AKS.
Scope: Node pools, VM sizing, availability zones, CNI selection, kubelet configuration, cluster autoscaler, node auto-provisioning, and maintenance windows.


1. Cluster design decisions

Before creating your first AKS cluster, make these design decisions. Each maps to a configuration that is difficult or impossible to change after cluster creation.

Cluster identity

| Decision | Options | Recommendation |
|---|---|---|
| Cluster identity type | System-assigned managed identity, User-assigned managed identity | User-assigned managed identity for production (portable, pre-configurable RBAC) |
| Entra ID integration | Enabled, Disabled | Always enable for production. Maps Entra ID groups to K8s RBAC |
| Azure RBAC for K8s | Enabled (Azure RBAC), Disabled (K8s-native RBAC) | Azure RBAC for centralized management; K8s-native RBAC if migrating existing RBAC policies |
| Local accounts | Enabled, Disabled | Disable local accounts for production (force Entra ID auth) |
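
As an example of the Entra ID plus Azure RBAC combination, a built-in AKS role can be granted to an Entra ID group at cluster scope after creation; the group object ID below is a placeholder for your platform team's group.

# Illustrative: grant an Entra ID group cluster-admin access via Azure RBAC
AKS_ID=$(az aks show --resource-group rg-aks-prod --name aks-prod-eastus2 --query id -o tsv)

az role assignment create \
  --assignee <entra-group-object-id> \
  --role "Azure Kubernetes Service RBAC Cluster Admin" \
  --scope "$AKS_ID"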

Networking (immutable after creation)

| Decision | Options | Recommendation |
|---|---|---|
| Network plugin | Azure CNI Overlay, Azure CNI (VNet), Azure CNI + Cilium, kubenet | Azure CNI Overlay for most workloads; Azure CNI + Cilium for advanced network policy and observability |
| Network policy | Azure (Azure NPM), Calico, Cilium | Cilium (if using Azure CNI + Cilium); Calico for existing Calico policy migration |
| Pod CIDR | Custom (default: 10.244.0.0/16) | Size for growth. A /16 provides ~65K pod IPs per cluster. Overlay mode does not consume VNet IPs |
| Service CIDR | Custom (default: 10.0.0.0/16) | Non-overlapping with VNet and pod CIDR |
| DNS service IP | Within service CIDR | Default is typically fine |
| Private cluster | Enabled (no public API endpoint), Disabled | Enable for production / federal workloads. API server accessible only via Private Link |
| Outbound type | Load balancer, User-defined routing, NAT Gateway, None | User-defined routing with Azure Firewall for federal workloads (egress control) |
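
Because the service and pod CIDRs must not overlap with the VNet (or any peered or on-premises ranges), confirm the VNet address space before creating the cluster. The resource group and VNet name below are placeholders.

# Confirm the VNet address space before choosing pod/service CIDRs
az network vnet show \
  --resource-group rg-network-prod \
  --name vnet-csa-prod \
  --query "addressSpace.addressPrefixes" \
  --output tsv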

SKU and SLA

| Decision | Options | Recommendation |
|---|---|---|
| AKS SKU | Free, Standard, Premium | Standard for production (99.95% SLA). Premium for LTS + advanced features |
| Uptime SLA | Included in Standard/Premium | Standard tier includes financially backed SLA |
| AKS Automatic | Enabled, Disabled | Consider for new clusters where opinionated defaults are acceptable |
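
Unlike the networking settings, the pricing tier can be changed on an existing cluster, for example when promoting a pilot cluster to production:

# Move an existing cluster from the Free tier to Standard
az aks update \
  --resource-group rg-aks-prod \
  --name aks-prod-eastus2 \
  --tier standard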

2. Node pool configuration

System node pool

Every AKS cluster requires at least one system node pool running critical system pods (CoreDNS, metrics-server, kube-proxy, Azure CNI, CSI drivers).

# Create AKS cluster with system node pool
az aks create \
  --resource-group rg-aks-prod \
  --name aks-prod-eastus2 \
  --location eastus2 \
  --kubernetes-version 1.30 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium \
  --enable-managed-identity \
  --assign-identity /subscriptions/.../resourceGroups/.../providers/Microsoft.ManagedIdentity/userAssignedIdentities/umi-aks-prod \
  --enable-aad \
  --enable-azure-rbac \
  --disable-local-accounts \
  --enable-private-cluster \
  --private-dns-zone system \
  --outbound-type userDefinedRouting \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --nodepool-name system \
  --nodepool-labels nodepool=system \
  --nodepool-taints CriticalAddonsOnly=true:NoSchedule \
  --zones 1 2 3 \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 5 \
  --tier standard \
  --enable-defender \
  --enable-workload-identity \
  --enable-oidc-issuer \
  --attach-acr /subscriptions/.../resourceGroups/.../providers/Microsoft.ContainerRegistry/registries/csainaboxacr \
  --tags environment=production team=platform

User node pools

Create separate node pools for different workload types. This replaces the single undifferentiated worker pool common in self-managed Kubernetes with targeted node pools per workload profile.

# General-purpose workload pool
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name workload \
  --node-vm-size Standard_D8s_v5 \
  --node-count 5 \
  --zones 1 2 3 \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 20 \
  --labels workload-type=general \
  --max-pods 110 \
  --mode User

# Memory-optimized pool (for data-intensive workloads)
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name highmem \
  --node-vm-size Standard_E16s_v5 \
  --node-count 2 \
  --zones 1 2 3 \
  --enable-cluster-autoscaler \
  --min-count 2 \
  --max-count 10 \
  --labels workload-type=memory-intensive \
  --node-taints workload=memory-intensive:NoSchedule \
  --mode User

# GPU pool (for ML inference / model serving)
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name gpu \
  --node-vm-size Standard_NC24ads_A100_v4 \
  --node-count 0 \
  --zones 1 \
  --enable-cluster-autoscaler \
  --min-count 0 \
  --max-count 4 \
  --labels workload-type=gpu accelerator=nvidia-a100 \
  --node-taints nvidia.com/gpu=present:NoSchedule \
  --mode User

# Spot pool (for batch / fault-tolerant workloads)
# AKS automatically applies the kubernetes.azure.com/scalesetpriority=spot
# label and matching NoSchedule taint to Spot pools; do not set them manually
# (the kubernetes.azure.com/ prefix is reserved).
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name spot \
  --node-vm-size Standard_D8s_v5 \
  --node-count 0 \
  --enable-cluster-autoscaler \
  --min-count 0 \
  --max-count 30 \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --labels workload-type=batch \
  --mode User

# FIPS-enabled pool (for federal compliance)
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name fips \
  --node-vm-size Standard_D8s_v5 \
  --node-count 3 \
  --zones 1 2 3 \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 15 \
  --enable-fips-image \
  --labels workload-type=fips-required \
  --mode User

VM size mapping: on-prem to Azure

| On-prem server profile | Azure VM size | vCPU | Memory | Notes |
|---|---|---|---|---|
| General worker (4C/16GB) | Standard_D4s_v5 | 4 | 16 GB | System pool, light workloads |
| General worker (8C/32GB) | Standard_D8s_v5 | 8 | 32 GB | Most application workloads |
| General worker (16C/64GB) | Standard_D16s_v5 | 16 | 64 GB | Higher-density workloads |
| Memory-optimized (8C/64GB) | Standard_E8s_v5 | 8 | 64 GB | Caching, in-memory processing |
| Memory-optimized (16C/128GB) | Standard_E16s_v5 | 16 | 128 GB | Spark executors, large caches |
| Compute-optimized (8C/16GB) | Standard_F8s_v2 | 8 | 16 GB | CPU-intensive batch jobs |
| Storage-optimized (local NVMe) | Standard_L8s_v3 | 8 | 64 GB | Local SSD for databases, etcd |
| GPU (single GPU) | Standard_NC6s_v3 | 6 | 112 GB | ML inference (V100) |
| GPU (A100) | Standard_NC24ads_A100_v4 | 24 | 220 GB | ML training and inference |
| GPU (H100) | Standard_ND96isr_H100_v5 | 96 | 1900 GB | Large-scale ML training |
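
Before finalizing a mapping, confirm that the chosen VM sizes are available (and zone-capable) in the target region. One way to check a candidate size:

# Check regional and zone availability for a candidate VM size
az vm list-skus \
  --location eastus2 \
  --size Standard_D8s_v5 \
  --output table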

3. Availability zones

AKS supports spreading node pools across Azure availability zones for high availability. This replaces the rack-aware scheduling and failure-domain configuration in self-managed clusters.

Zone-redundant deployment

# Node pool spread across all 3 zones
az aks nodepool add \
  --name workload \
  --zones 1 2 3 \
  --node-count 6  # 2 nodes per zone
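
Once nodes are up, the zone AKS assigned to each node can be read from its topology label:

# Show which availability zone each node landed in
kubectl get nodes -L topology.kubernetes.io/zone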

Zone topology constraints

Use pod topology spread constraints to ensure even pod distribution across zones:

apiVersion: apps/v1
kind: Deployment
metadata:
    name: api-server
spec:
    replicas: 6
    selector:
        matchLabels:
            app: api-server
    template:
        metadata:
            labels:
                app: api-server
        spec:
            topologySpreadConstraints:
                - maxSkew: 1
                  topologyKey: topology.kubernetes.io/zone
                  whenUnsatisfiable: DoNotSchedule
                  labelSelector:
                      matchLabels:
                          app: api-server
            containers:
                - name: api-server
                  image: registry.example.com/api-server:1.0.0 # placeholder image

Zone-aware storage

Locally redundant (LRS) Azure managed disks are zonal resources: for StatefulSets using Azure Disk, the pod and its disk must land in the same zone. AKS handles this automatically when the storage class uses volumeBindingMode: WaitForFirstConsumer. Zone-redundant (ZRS) disk SKUs, as in the example below, replicate across zones and remove that constraint.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
    name: managed-premium-zrs
provisioner: disk.csi.azure.com
parameters:
    skuName: Premium_ZRS # Zone-redundant storage
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
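
A PersistentVolumeClaim that references this class (the claim name and size below are illustrative) is only provisioned once a consuming pod is scheduled, thanks to WaitForFirstConsumer:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
    name: data-api-server # illustrative name
spec:
    accessModes:
        - ReadWriteOnce
    storageClassName: managed-premium-zrs
    resources:
        requests:
            storage: 256Gi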

4. CNI selection guide

Azure CNI Overlay (recommended default)

  • Pods get IPs from a private CIDR (not VNet IPs)
  • Scales to thousands of pods without VNet exhaustion
  • Lower IP planning overhead
  • Compatible with Calico and Cilium network policies

Azure CNI (VNet)

  • Every pod gets a VNet IP address
  • Pods are directly reachable from VNet-peered networks
  • Higher IP planning overhead (need large subnets)
  • Best when: pods must be directly addressable from VNet or on-prem

Azure CNI powered by Cilium

  • eBPF-based dataplane (replaces iptables/ipvs)
  • Advanced network policy (L7 policy, DNS policy, FQDN policy)
  • Network observability (Hubble)
  • Better performance than iptables-based CNI
  • Best when: advanced network policy, observability, or high-performance networking required

kubenet (legacy)

  • Basic overlay networking
  • Limited to 400 nodes per cluster
  • No network policy support without Calico addon
  • Only use when: migrating from kubenet-based clusters and not ready to change CNI
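
When auditing existing clusters ahead of a migration, the plugin, mode, dataplane, and policy engine in use can be read from the cluster's network profile:

# Inspect the network profile of an existing cluster
az aks show \
  --resource-group rg-aks-prod \
  --name aks-prod-eastus2 \
  --query "networkProfile.{plugin:networkPlugin, mode:networkPluginMode, dataplane:networkDataplane, policy:networkPolicy}" \
  --output table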

5. Kubelet configuration

AKS supports custom kubelet configuration via a JSON file supplied when a node pool is created. This replaces the kubelet flags and configuration files managed by hand in self-managed clusters.

# Create kubelet config file
cat > kubelet-config.json << 'EOF'
{
  "cpuManagerPolicy": "static",
  "cpuCfsQuota": true,
  "cpuCfsQuotaPeriod": "100ms",
  "topologyManagerPolicy": "best-effort",
  "allowedUnsafeSysctls": [
    "net.core.somaxconn",
    "net.ipv4.tcp_keepalive_time"
  ],
  "containerLogMaxSizeMB": 100,
  "containerLogMaxFiles": 5,
  "podMaxPids": 4096,
  "imageGcHighThreshold": 85,
  "imageGcLowThreshold": 80
}
EOF

# Apply to a new node pool
az aks nodepool add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name highperf \
  --kubelet-config kubelet-config.json \
  --node-vm-size Standard_D16s_v5 \
  --node-count 3

Common kubelet configurations for data workloads

| Setting | Effect | Use case |
|---|---|---|
| cpuManagerPolicy: static | Guaranteed QoS pods get dedicated CPUs | Spark executors, database pods |
| topologyManagerPolicy: best-effort | NUMA-aware scheduling | GPU workloads, high-performance computing |
| podMaxPids: 4096 | Higher PID limit per pod | Java applications, Spark (many threads) |
| containerLogMaxSizeMB: 100 | Larger container log files | Debug scenarios, verbose logging |
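
To confirm a custom kubelet configuration took effect, one option is to read the kubelet's effective configuration through the API server's node proxy; the node name below is a placeholder and jq is optional.

# Read the effective kubelet configuration from a node (node name is a placeholder)
kubectl get --raw "/api/v1/nodes/aks-highperf-12345678-vmss000000/proxy/configz" | jq .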

6. Cluster autoscaler configuration

Basic autoscaler

# Enable autoscaler on a node pool
az aks nodepool update \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name workload \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 20

Autoscaler profile (cluster-wide settings)

az aks update \
  --resource-group rg-aks-prod \
  --name aks-prod-eastus2 \
  --cluster-autoscaler-profile \
    scan-interval=10s \
    scale-down-delay-after-add=10m \
    scale-down-delay-after-delete=10s \
    scale-down-unneeded-time=10m \
    scale-down-utilization-threshold=0.5 \
    max-graceful-termination-sec=600 \
    balance-similar-node-groups=true \
    expander=least-waste \
    skip-nodes-with-local-storage=false \
    skip-nodes-with-system-pods=true \
    max-node-provision-time=15m \
    max-total-unready-percentage=45 \
    ok-total-unready-count=3 \
    new-pod-scale-up-delay=0s

Autoscaler profile mapping from self-managed

| Self-managed flag | AKS profile parameter | Notes |
|---|---|---|
| --scan-interval | scan-interval | How often the autoscaler checks for pending pods |
| --scale-down-delay-after-add | scale-down-delay-after-add | Wait time before scale-down after a scale-up |
| --scale-down-utilization-threshold | scale-down-utilization-threshold | Node utilization below which a node becomes a removal candidate |
| --expander | expander | Options: random, most-pods, least-waste, priority |
| --max-node-provision-time | max-node-provision-time | Timeout for a new node to become ready |
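
The autoscaler also publishes a status config map in kube-system, which is usually the first place to look when scale-up or scale-down does not behave as expected:

# Inspect cluster autoscaler status (per node group health and scaling state)
kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml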

7. Node auto-provisioning (NAP)

NAP, built on Karpenter, automatically selects the optimal VM size for pending pods based on their resource requests, node selectors, and tolerations. This replaces the manual VM size selection in self-managed clusters.

# Enable NAP on an existing cluster (requires Azure CNI Overlay with the Cilium dataplane)
az aks update \
  --resource-group rg-aks-prod \
  --name aks-prod-eastus2 \
  --node-provisioning-mode Auto

NAP automatically:

  • Selects the cheapest VM size that satisfies pod requirements
  • Uses Spot VMs when pods tolerate the kubernetes.azure.com/scalesetpriority=spot taint
  • Consolidates underutilized nodes by rescheduling pods and removing nodes
  • Respects pod topology spread constraints and anti-affinity rules
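
As an illustration of the Spot behavior, a fault-tolerant pod that declares its resource requests and tolerates the Spot taint (the image and request sizes below are placeholders) gives NAP enough information to provision an appropriately sized, potentially Spot-backed node:

apiVersion: v1
kind: Pod
metadata:
    name: batch-worker # illustrative
spec:
    tolerations:
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule
    containers:
        - name: worker
          image: mcr.microsoft.com/azuredocs/aks-helloworld:v1 # placeholder image
          resources:
              requests:
                  cpu: "4"
                  memory: 8Gi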

8. Maintenance windows

AKS planned maintenance windows replace the manual upgrade scheduling in self-managed clusters. The aksManagedAutoUpgradeSchedule configuration governs cluster auto-upgrades, and aksManagedNodeOSUpgradeSchedule governs node OS image updates.

# Configure planned maintenance window for cluster auto-upgrades
az aks maintenanceconfiguration add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name aksManagedAutoUpgradeSchedule \
  --schedule-type Weekly \
  --day-of-week Saturday \
  --interval-weeks 1 \
  --start-time 02:00 \
  --duration 4 \
  --utc-offset=-05:00

# Configure maintenance window for node OS image updates
az aks maintenanceconfiguration add \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --name aksManagedNodeOSUpgradeSchedule \
  --schedule-type Weekly \
  --day-of-week Sunday \
  --interval-weeks 1 \
  --start-time 02:00 \
  --duration 4 \
  --utc-offset=-05:00
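
The configured windows can be reviewed at any time:

# List the maintenance configurations on the cluster
az aks maintenanceconfiguration list \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --output table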

9. Bicep deployment template

For teams using infrastructure as code (recommended for CSA-in-a-Box deployments), here is a representative Bicep template:

@description('AKS cluster configuration for CSA-in-a-Box data platform')
param clusterName string = 'aks-csa-prod'
param location string = resourceGroup().location
param kubernetesVersion string = '1.30'
param systemNodeCount int = 3
param workloadNodeCount int = 5
param managedIdentityName string = 'umi-aks-prod'

// Existing user-assigned managed identity referenced by the cluster resource below
resource managedIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' existing = {
  name: managedIdentityName
}

resource aksCluster 'Microsoft.ContainerService/managedClusters@2024-06-02-preview' = {
  name: clusterName
  location: location
  identity: {
    type: 'UserAssigned'
    userAssignedIdentities: {
      '${managedIdentity.id}': {}
    }
  }
  sku: {
    name: 'Base'
    tier: 'Standard'
  }
  properties: {
    kubernetesVersion: kubernetesVersion
    dnsPrefix: clusterName
    enableRBAC: true
    aadProfile: {
      managed: true
      enableAzureRBAC: true
      tenantID: subscription().tenantId
    }
    disableLocalAccounts: true
    networkProfile: {
      networkPlugin: 'azure'
      networkPluginMode: 'overlay'
      networkDataplane: 'cilium'
      networkPolicy: 'cilium'
      podCidr: '10.244.0.0/16'
      serviceCidr: '10.0.0.0/16'
      dnsServiceIP: '10.0.0.10'
      outboundType: 'userDefinedRouting' // requires agent pools on a pre-created subnet with a route table (vnetSubnetID not shown)
      loadBalancerSku: 'standard'
    }
    apiServerAccessProfile: {
      enablePrivateCluster: true
      privateDNSZone: 'system'
    }
    autoUpgradeProfile: {
      upgradeChannel: 'patch'
      nodeOSUpgradeChannel: 'NodeImage'
    }
    securityProfile: {
      defender: {
        securityMonitoring: {
          enabled: true
        }
      }
      workloadIdentity: {
        enabled: true
      }
    }
    oidcIssuerProfile: {
      enabled: true
    }
    agentPoolProfiles: [
      {
        name: 'system'
        count: systemNodeCount
        vmSize: 'Standard_D4s_v5'
        osDiskSizeGB: 128
        osDiskType: 'Managed'
        osType: 'Linux'
        mode: 'System'
        availabilityZones: ['1', '2', '3']
        enableAutoScaling: true
        minCount: 3
        maxCount: 5
        nodeTaints: ['CriticalAddonsOnly=true:NoSchedule']
        nodeLabels: { nodepool: 'system' }
        maxPods: 110
      }
      {
        name: 'workload'
        count: workloadNodeCount
        vmSize: 'Standard_D8s_v5'
        osDiskSizeGB: 256
        osDiskType: 'Managed'
        osType: 'Linux'
        mode: 'User'
        availabilityZones: ['1', '2', '3']
        enableAutoScaling: true
        minCount: 3
        maxCount: 20
        nodeLabels: { 'workload-type': 'general' }
        maxPods: 110
      }
    ]
  }
}
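
Assuming the template is saved as aks-cluster.bicep (the file name is arbitrary), it deploys as a standard resource group deployment:

# Deploy the cluster template into an existing resource group
az deployment group create \
  --resource-group rg-aks-prod \
  --template-file aks-cluster.bicep \
  --parameters clusterName=aks-prod-eastus2 workloadNodeCount=5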

10. Post-creation cluster configuration

After creating the AKS cluster, apply these configurations:

# Get cluster credentials
az aks get-credentials --resource-group rg-aks-prod --name aks-prod-eastus2

# Install NGINX Ingress Controller
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-internal"="true" \
  --set controller.nodeSelector."kubernetes\.io/os"=linux

# Install cert-manager
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set installCRDs=true \
  --set nodeSelector."kubernetes\.io/os"=linux

# Enable Azure Key Vault Secrets Provider
az aks enable-addons \
  --resource-group rg-aks-prod \
  --name aks-prod-eastus2 \
  --addons azure-keyvault-secrets-provider

# Enable Azure Monitor Container Insights
az aks enable-addons \
  --resource-group rg-aks-prod \
  --name aks-prod-eastus2 \
  --addons monitoring \
  --workspace-resource-id /subscriptions/.../resourceGroups/.../providers/Microsoft.OperationalInsights/workspaces/law-csa-prod

# Enable Flux GitOps
az k8s-configuration flux create \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --cluster-type managedClusters \
  --name cluster-config \
  --url https://github.com/org/aks-cluster-config \
  --branch main \
  --kustomization name=infra path=./infrastructure prune=true \
  --kustomization name=apps path=./applications prune=true dependsOn=infra
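
Reconciliation status for the Flux configuration can then be checked from Azure (it should report Compliant once both kustomizations have applied):

# Check reconciliation status of the Flux configuration
az k8s-configuration flux show \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod-eastus2 \
  --cluster-type managedClusters \
  --name cluster-config \
  --query complianceState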

Maintainers: CSA-in-a-Box core team
Last updated: 2026-04-30
Related: Workload Migration | Networking Migration | Feature Mapping