Modern datacenters and beyond

Virtualization -- Quick Reference

Key Numbers

| Parameter | VMware (Current) | OVE (KubeVirt/KVM) | Azure Local (Hyper-V) |
|---|---|---|---|
| Max nodes/cluster | 64 (vSphere 8) | 250 worker nodes | 16 nodes |
| Max VMs/cluster | ~8,000 | ~5,000-20,000 (depends on sizing) | Hundreds; multi-cluster for >500 |
| Typical VMs/node | 50-100 | 30-100 (50 typ.) | 50-100 |
| Max vCPU/VM | 768 | 710 (QEMU limit) | 240 (Gen2) |
| Max RAM/VM | 24 TB | 12 TB (QEMU limit) | 12 TB |
| Live migration downtime | <1 s (vMotion) | 50-500 ms (pre-copy) | <1 s (Live), 30-90 s (Quick) |
| Concurrent migrations/node | 4 (default, max 8) | 2 (default, configurable) | 2 (default) |
| Per-VM overhead (platform) | ~50 MB (vmx+vmkernel) | ~150-300 MB (libvirtd+virt-launcher+QEMU) | ~50 MB (worker process) |
| VM boot time (warm image) | 10-30 s | 12-40 s (+2-10 s pod setup) | 10-30 s |
| Pod limit/node (OVE) | N/A | 250 default, 500 tested | N/A |
| Support SLA (P1 response) | varies | 1 h (Red Hat Premium) | varies (MS Unified) |

Decision Matrix

| Scenario | Recommended Approach | Notes |
|---|---|---|
| General-purpose Linux VM | virtio disk (block-mode PVC) + masquerade net | Standard, live-migratable |
| High-throughput DB (>10 Gbps) | virtio disk + SR-IOV NIC | SR-IOV blocks live migration |
| Windows Server VM | virtio + VirtIO drivers pre-installed | Install virtio-win drivers before migration |
| GPU/ML inference VM | VFIO passthrough | No live migration; plan maintenance windows |
| Shared GPU (VDI) | NVIDIA vGPU (time-sliced) | Requires NVIDIA license; supports live migration |
| Latency-sensitive (trading) | `dedicatedCpuPlacement: true` + hugepages + SR-IOV | Pin vCPUs, 1 GiB hugepages, NUMA passthrough |
| Ephemeral/test VM | containerDisk (OCI image) | Disk is lost on stop; fast to spin up |
| Disk format: QCOW2 vs raw | Raw (block-mode PVC) for production | QCOW2 only for filesystem-mode; raw = less overhead |
| Boot mode: BIOS vs UEFI | UEFI (q35 machine type) | Secure Boot requires `efi.secureBoot: true` + SMM |
| Migration from VMware | MTV (warm migration) | Use CDI + VDDK; install VirtIO drivers pre-cutover |
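
The first row of the matrix (virtio disk on a block-mode PVC plus masquerade networking) can be sketched as a VirtualMachine manifest. Treat this as an illustrative skeleton, not a verified template -- `my-vm`, `rhel9-root`, and the sizes are placeholders:

```yaml
# Sketch: general-purpose Linux VM -- virtio disk (block-mode PVC) + masquerade net.
# All names and sizes are placeholders.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: my-vm
spec:
  runStrategy: Always
  template:
    spec:
      domain:
        machine:
          type: q35                  # UEFI-capable machine type
        firmware:
          bootloader:
            efi:
              secureBoot: false      # set true (plus SMM feature) for Secure Boot
        cpu:
          cores: 2
        memory:
          guest: 4Gi
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio          # paravirtualized block device
          interfaces:
            - name: default
              masquerade: {}         # NAT onto the pod network; live-migratable
      networks:
        - name: default
          pod: {}
      volumes:
        - name: rootdisk
          persistentVolumeClaim:
            claimName: rhel9-root    # block-mode PVC; RWX access enables live migration
```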

Essential Commands

| Task | VMware (govc/PowerCLI) | OVE (virtctl/oc) | Azure Local (PowerShell) |
|---|---|---|---|
| List VMs | `govc vm.info '*'` | `oc get vm -A` | `Get-VM -CimSession $cluster` |
| Start VM | `govc vm.power -on my-vm` | `virtctl start my-vm` | `Start-VM -Name my-vm` |
| Stop VM (graceful) | `govc vm.power -s my-vm` | `virtctl stop my-vm` | `Stop-VM -Name my-vm` |
| Force stop | `govc vm.power -off my-vm` | `virtctl stop my-vm --force` | `Stop-VM -Name my-vm -TurnOff` |
| Restart VM | `govc vm.power -reset my-vm` | `virtctl restart my-vm` | `Restart-VM -Name my-vm` |
| Pause VM | N/A | `virtctl pause vm my-vm` | `Suspend-VM -Name my-vm` |
| Console (serial) | N/A | `virtctl console my-vm` | N/A |
| Console (VNC/RDP) | VMRC / web console | `virtctl vnc my-vm` | `vmconnect $host my-vm` |
| SSH into VM | direct SSH | `virtctl ssh user@my-vm` | direct SSH / RDP |
| Live migrate | vMotion (GUI/PowerCLI) | `virtctl migrate my-vm` | `Move-VM -Name my-vm -DestinationHost node2` |
| Drain node | Maintenance mode | `oc adm drain node-1 --delete-emptydir-data --ignore-daemonsets` | `Suspend-ClusterNode -Name node1 -Drain` |
| Create snapshot | `govc snapshot.create` | `oc apply -f vm-snapshot.yaml` | `Checkpoint-VM -Name my-vm` |
| Restore snapshot | `govc snapshot.revert` | `oc apply -f vm-restore.yaml` | `Restore-VMSnapshot -VMName my-vm -Name <snapshot>` |
| Clone VM | `govc vm.clone` | DataVolumeTemplate with source PVC | `Export-VM` + `Import-VM -Copy`, or differencing disk |
| Upload disk image | Content Library | `virtctl image-upload dv my-dv --image-path=disk.qcow2 --size=50Gi` | Azure portal / WAC |
| Get VM status | `govc vm.info my-vm` | `oc get vmi my-vm -o wide` | `Get-VM -Name my-vm \| fl *` |
| View events | `govc events` | `oc get events --field-selector involvedObject.name=my-vm` | `Get-WinEvent -LogName Microsoft-Windows-Hyper-V*` |
| Create VM from YAML | N/A | `oc apply -f my-vm.yaml` | `New-VM` or ARM template |
| Set affinity | DRS rules (GUI) | nodeAffinity / podAntiAffinity in VM YAML | `Set-ClusterOwnerNode` |
| Hot-plug CPU | vCenter GUI | `oc patch vm my-vm --type merge -p '{"spec":{"template":{"spec":{"domain":{"cpu":{"sockets":4}}}}}}'` | `Set-VM -Name my-vm -ProcessorCount 8` |
| Collect diag bundle | `vm-support` | `oc adm must-gather --image=cnv-must-gather` | `Get-SDDCDiagnosticInfo` |
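
The `vm-snapshot.yaml` and `vm-restore.yaml` files referenced above might look like the following sketch. It assumes the KubeVirt snapshot API group `snapshot.kubevirt.io/v1beta1` and a CSI driver with VolumeSnapshot support; `my-vm` and the snapshot names are placeholders:

```yaml
# vm-snapshot.yaml (sketch): point-in-time snapshot of a VM.
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineSnapshot
metadata:
  name: my-vm-snap1
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: my-vm
---
# vm-restore.yaml (sketch): roll the (stopped) VM back to that snapshot.
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineRestore
metadata:
  name: my-vm-restore1
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: my-vm
  virtualMachineSnapshotName: my-vm-snap1
```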

Architecture at a Glance (OVE)

+=============================================================================+
|  PHYSICAL LAYER                                                             |
|  [Bare-Metal Server: x86_64 + VT-x/EPT + IOMMU + NVMe + 25GbE NIC]          |
+=============================================================================+
|  OS: Red Hat CoreOS (RHCOS) -- immutable, Ignition-provisioned              |
|  Kernel: Linux 5.x/6.x + KVM module + VFIO + vhost-net                      |
+=============================================================================+
|  KUBERNETES LAYER                                                           |
|  kubelet + CRI-O (container runtime) + OVN-Kubernetes (CNI) + CSI drivers   |
+=============================================================================+
|  KUBEVIRT LAYER                                                             |
|  virt-handler (DaemonSet/node) -- registers /dev/kvm, syncs VMI state       |
|  virt-controller (Deployment) -- watches VM CRs, creates virt-launcher pods |
|  virt-api (Deployment) -- admission webhooks, subresource API (console, VNC)|
+=============================================================================+
|  PER-VM: virt-launcher Pod                                                  |
|  +-----------------------------------------------------------------------+ |
|  | virt-launcher process (PID 1) -> libvirtd (per-pod) -> qemu-kvm       | |
|  | Volumes: PVC (block/fs) via CSI | Network: tap+bridge / SR-IOV (VFIO) | |
|  | cgroup-enforced CPU/memory limits | QEMU Guest Agent for mgmt channel | |
|  +-----------------------------------------------------------------------+ |
|  Guest OS runs inside QEMU/KVM (hardware-assisted, near-native perf)        |
+=============================================================================+

VMware-to-KubeVirt Concept Mapping

| vSphere Concept | OVE (KubeVirt) Equivalent |
|---|---|
| vCenter Server | kube-apiserver + virt-controller + virt-api |
| ESXi Host | Worker node (RHCOS) + virt-handler DaemonSet |
| VMX process | virt-launcher Pod (QEMU process inside) |
| hostd + vpxa | virt-handler DaemonSet |
| VM (in inventory) | VirtualMachine CR (persists across power-off) |
| Running VM instance | VirtualMachineInstance CR + virt-launcher Pod |
| VM Template | ClusterInstancetype + ClusterPreference + golden DataVolume |
| Resource Pool | Namespace + ResourceQuota + LimitRange |
| DRS (load balancing) | kube-scheduler (placement only) + Descheduler (optional) |
| vMotion | VirtualMachineInstanceMigration CR (or `virtctl migrate`) |
| Maintenance Mode | `kubectl cordon` + `kubectl drain` |
| vDS / port group | OVN-Kubernetes CNI + NetworkAttachmentDefinition (Multus) |
| VMFS / vSAN datastore | StorageClass + PVCs (one PVC per disk) |
| VMDK | PVC (block or filesystem mode) |
| Content Library | Container registry + DataVolume sources |
| vSphere HA | Pod rescheduling + `runStrategy: Always` |
| DRS Affinity rules | podAffinity / podAntiAffinity / nodeAffinity |
| Alarms | Prometheus alerts + Alertmanager |
| RBAC (permissions) | Kubernetes RBAC (Roles, RoleBindings, ClusterRoles) |
| Tags | Labels + Annotations |
| Snapshots | VolumeSnapshot (CSI) |
| Guest Customization | cloud-init (Linux) / Sysprep (Windows) |
| OVF export/import | `virtctl image-upload` / CDI DataVolume |
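
The "VM Template" row deserves expansion: instead of one template object, KubeVirt composes sizing (instancetype), OS-specific defaults (preference), and a golden disk image (DataVolume). A sketch of a VM consuming all three -- `u1.medium`, `rhel.9`, `rhel9-golden`, and the namespaces are placeholders:

```yaml
# Sketch: VM assembled the "template" way. The instancetype supplies CPU/RAM,
# the preference supplies OS tuning, and a DataVolume clones the golden image.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: web-01
spec:
  runStrategy: Always
  instancetype:
    kind: VirtualMachineClusterInstancetype
    name: u1.medium                  # placeholder instancetype
  preference:
    kind: VirtualMachineClusterPreference
    name: rhel.9                     # placeholder preference
  dataVolumeTemplates:
    - metadata:
        name: web-01-root
      spec:
        storage:
          resources:
            requests:
              storage: 30Gi
        source:
          pvc:
            name: rhel9-golden       # golden image PVC (placeholder)
            namespace: golden-images
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
      volumes:
        - name: rootdisk
          dataVolume:
            name: web-01-root
```

Note that CPU and memory are deliberately absent from `domain` -- the instancetype owns them, and specifying both is rejected.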

Live Migration Quick Reference

| Parameter | How to Configure (MigrationPolicy unless noted) |
|---|---|
| Bandwidth cap per migration | `bandwidthPerMigration: 1Gi` |
| Auto-converge (throttle vCPU) | `allowAutoConverge: true` |
| Timeout per GiB | `completionTimeoutPerGiB: 150` (64 GB VM = ~2.7 h max) |
| Post-copy fallback | `allowPostCopy: false` (risky -- VM lost if source crashes) |
| Concurrent outbound/node | `parallelOutboundMigrationsPerNode: 5` (in KubeVirt CR) |
| Concurrent per cluster | `parallelMigrationsPerCluster: 20` (in KubeVirt CR) |
| Dedicated migration network | `network: migration-network` (NAD name, in KubeVirt CR) |
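
Assembled into manifests, the first four knobs live in a MigrationPolicy CR matched to VMIs by label, and the last three in the KubeVirt CR. A sketch -- the label, names, and namespace are examples:

```yaml
# Sketch: per-workload migration policy for VMIs labeled tier=db.
apiVersion: migrations.kubevirt.io/v1alpha1
kind: MigrationPolicy
metadata:
  name: db-migrations
spec:
  selectors:
    virtualMachineInstanceSelector:
      tier: db                       # plain label map (not matchLabels)
  bandwidthPerMigration: 1Gi
  allowAutoConverge: true
  completionTimeoutPerGiB: 150
  allowPostCopy: false
---
# Sketch: cluster-wide migration knobs in the KubeVirt CR.
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    migrations:
      parallelOutboundMigrationsPerNode: 5
      parallelMigrationsPerCluster: 20
      network: migration-network     # NetworkAttachmentDefinition name
```

On OVE the KubeVirt CR is owned by the HyperConverged operator, so cluster-wide settings are normally edited on the HyperConverged CR and propagated down rather than patched directly.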

Migration time estimates (single pre-copy pass, zero dirty pages):

| VM RAM | 10 Gbps (~1.2 GB/s) | 25 Gbps (~3.1 GB/s) | 100 Gbps (~12.5 GB/s) |
|---|---|---|---|
| 8 GB | ~7 s | ~3 s | <1 s |
| 32 GB | ~27 s | ~10 s | ~3 s |
| 64 GB | ~54 s | ~21 s | ~5 s |
| 256 GB | ~213 s | ~82 s | ~20 s |

Real-world: multiply by 2-5x for dirty page resends and convergence.

Blockers for live migration:

- SR-IOV NICs and any VFIO/host-device passthrough (GPUs, NVMe, etc.)
- Disks on PVCs without ReadWriteMany (RWX) access mode (shared storage required)
- For the authoritative per-VM verdict, check the VMI's `LiveMigratable` condition (`oc get vmi my-vm -o yaml`)

Migration Checklist (VMware to OVE)

Pre-Flight (per VM)

Migration Execution (MTV)

Post-Migration Validation