Migration Tooling & Formats
Why This Matters
Migrating 5,000+ VMs is the single largest operational risk in this project. It is not a weekend activity. It is a sustained engineering operation spanning months, requiring deep knowledge of disk image formats, conversion toolchains, driver injection, network remapping, and application-level validation. A single misconfigured virtio driver can render a Windows Server VM unbootable. A misunderstood VMDK descriptor file can silently corrupt a database disk. A migration tool that cannot leverage Changed Block Tracking (CBT) will quadruple the cutover window for every VM.
The previous chapters covered the target platforms (KVM/KubeVirt for OVE, Hyper-V for Azure Local) and the source platform (VMware vSphere/ESXi). This chapter covers the bridge between them: the formats, tools, and operational processes that move a running VM from the old world to the new world without losing data, breaking applications, or exhausting the team.
Three distinct migration paths exist for this evaluation:
- VMware to OVE: Uses virt-v2v and/or the Migration Toolkit for Virtualization (MTV). Converts VMDK to QCOW2 or raw, injects virtio drivers, imports into KubeVirt as VirtualMachine CRDs with DataVolumes.
- VMware to Azure Local: Uses Azure Migrate. Converts VMDK to VHD/VHDX, adjusts boot configuration, imports into Hyper-V.
- VMware to Swisscom ESC: VMware-to-VMware migration (vMotion, HCX, or OVA export/import). No hypervisor conversion required because ESC currently runs on VMware vSphere. This is the simplest path technically but does not solve the strategic goal of leaving VMware.
Each path has different risk profiles, throughput characteristics, and failure modes. This chapter covers all three in the depth required to plan a production migration.
Concepts
1. OVA / OVF / VMDK / QCOW2
OVF -- Open Virtualization Format
OVF is a DMTF standard (DSP0243) that describes the metadata of a virtual machine in an XML document. It is not a disk image -- it is a manifest that describes the virtual hardware configuration, references disk files, defines network connections, and carries product metadata. Think of it as the blueprint that tells the target hypervisor how to reconstruct the VM.
An OVF file contains these key sections:
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1"
xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/..."
xmlns:vmw="http://www.vmware.com/schema/ovf">
<!-- References: pointers to the disk files included in this package -->
<References>
<File ovf:href="webserver-disk1.vmdk" ovf:id="file1" ovf:size="8589934592"/>
<File ovf:href="webserver-disk2.vmdk" ovf:id="file2" ovf:size="53687091200"/>
</References>
<!-- DiskSection: logical disk descriptions (capacity, format, parent) -->
<DiskSection>
<Disk ovf:capacity="50" ovf:capacityAllocationUnits="byte * 2^30"
ovf:diskId="vmdisk1" ovf:fileRef="file1"
ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html"/>
<Disk ovf:capacity="200" ovf:capacityAllocationUnits="byte * 2^30"
ovf:diskId="vmdisk2" ovf:fileRef="file2"
ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html"/>
</DiskSection>
<!-- NetworkSection: logical network names -->
<NetworkSection>
<Network ovf:name="VM Network">
<Description>The production network</Description>
</Network>
</NetworkSection>
<!-- VirtualSystem: the VM definition -->
<VirtualSystem ovf:id="webserver-01">
<ProductSection>
<Product>Internal Web Server</Product>
<Vendor>Infrastructure Team</Vendor>
<Version>2.1</Version>
</ProductSection>
<OperatingSystemSection ovf:id="101">
<Description>Red Hat Enterprise Linux 9 (64-bit)</Description>
</OperatingSystemSection>
<VirtualHardwareSection>
<!-- CPU: 4 vCPUs -->
<Item>
<rasd:ElementName>4 virtual CPUs</rasd:ElementName>
<rasd:ResourceType>3</rasd:ResourceType>
<rasd:VirtualQuantity>4</rasd:VirtualQuantity>
</Item>
<!-- Memory: 16 GB -->
<Item>
<rasd:ElementName>16384 MB of memory</rasd:ElementName>
<rasd:ResourceType>4</rasd:ResourceType>
<rasd:VirtualQuantity>16384</rasd:VirtualQuantity>
</Item>
<!-- Disk Controller: SCSI -->
<Item>
<rasd:ElementName>SCSI Controller</rasd:ElementName>
<rasd:ResourceSubType>lsilogic</rasd:ResourceSubType>
<rasd:ResourceType>6</rasd:ResourceType>
</Item>
<!-- Network Adapter: VMXNET3 -->
<Item>
<rasd:ElementName>Network adapter 1</rasd:ElementName>
<rasd:ResourceSubType>VmxNet3</rasd:ResourceSubType>
<rasd:ResourceType>10</rasd:ResourceType>
<rasd:Connection>VM Network</rasd:Connection>
</Item>
</VirtualHardwareSection>
</VirtualSystem>
</Envelope>
Why OVF matters for migration: OVF carries the metadata needed to recreate a VM on any platform that understands the format. However, VMware extends OVF with proprietary vmw: namespace elements (ExtraConfig keys, boot options, vApp properties) that other hypervisors ignore. During migration, these VMware-specific elements are typically discarded, and their equivalent settings must be reconfigured on the target platform manually or through the migration tool.
OVA -- Open Virtual Appliance
An OVA file is simply an OVF file plus all referenced disk files (VMDKs), packaged together as a single TAR archive. The TAR is not compressed -- the individual VMDKs inside may be compressed (streamOptimized format), but the TAR envelope is a plain concatenation.
OVA File Structure (TAR Archive)
================================================================
+--------------------------------------------------------------+
| webserver-01.ova (TAR file) |
| |
| +--------------------------------------------------------+ |
| | webserver-01.ovf (XML manifest) | |
| | - Virtual hardware definition (CPU, RAM, NICs) | |
| | - Disk references (file1, file2) | |
| | - Network mappings | |
| | - Product metadata | |
| +--------------------------------------------------------+ |
| |
| +--------------------------------------------------------+ |
| | webserver-01.mf (SHA256 manifest) | |
| | - SHA256(webserver-01.ovf) = a3f8c1... | |
| | - SHA256(webserver-disk1.vmdk) = 7b2e4d... | |
| | - SHA256(webserver-disk2.vmdk) = e9f1a2... | |
| +--------------------------------------------------------+ |
| |
| +--------------------------------------------------------+ |
| | webserver-01.cert (optional code signing) | |
| +--------------------------------------------------------+ |
| |
| +--------------------------------------------------------+ |
| | webserver-disk1.vmdk (boot disk, 50 GB) | |
| | - streamOptimized format for portability | |
| | - Sparse: only allocated blocks are stored | |
| +--------------------------------------------------------+ |
| |
| +--------------------------------------------------------+ |
| | webserver-disk2.vmdk (data disk, 200 GB) | |
| | - streamOptimized format | |
| +--------------------------------------------------------+ |
+--------------------------------------------------------------+
Key: OVF must be the FIRST file in the TAR archive.
The .mf file contains checksums for integrity verification.
The .cert file is an optional X.509 signature.
Practical note: When exporting VMs from vSphere for migration, OVA is convenient for single-VM transfers but inefficient at scale. Each OVA must be fully written to disk before it can be imported. For 5,000+ VM migrations, direct disk-level access via the vSphere API (VDDK/nbdkit) is far more efficient than OVA export/import.
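For smaller batches where OVA export is still used, the layout and checksums described above can be validated before import. The sketch below uses only Python's standard library (tarfile, hashlib) and assumes manifest lines in the `SHA256(name)= digest` form shown in the diagram; the function name verify_ova is ours, not part of any tool:

```python
import hashlib
import tarfile

def verify_ova(path: str) -> list:
    """Check OVA layout and .mf checksums; return a list of problems found."""
    problems = []
    with tarfile.open(path, "r") as tar:
        members = tar.getmembers()
        # Per DSP0243, the .ovf descriptor must be the first member of the TAR.
        if not members or not members[0].name.endswith(".ovf"):
            problems.append("descriptor (.ovf) is not the first file in the archive")
        # Parse the manifest: lines like "SHA256(file)= hexdigest"
        expected = {}
        for m in members:
            if m.name.endswith(".mf"):
                for line in tar.extractfile(m).read().decode().splitlines():
                    algo_and_name, _, digest = line.partition("=")
                    name = algo_and_name[algo_and_name.index("(") + 1:algo_and_name.index(")")]
                    expected[name] = digest.strip()
        # Recompute each referenced file's digest and compare.
        for name, want in expected.items():
            h = hashlib.sha256()
            f = tar.extractfile(name)
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
            if h.hexdigest() != want:
                problems.append(f"checksum mismatch for {name}")
    return problems
```

Running this before import catches truncated transfers and tampered disks early, instead of failing hours into a deploy.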
VMDK -- Virtual Machine Disk
VMDK is VMware's virtual disk format. It is more complex than most people realize. A VMDK is not a single monolithic file -- it consists of a descriptor file (a small text file with metadata) and one or more extent files (the actual data blocks).
VMDK Descriptor File:
VMDK Descriptor File (webserver-disk1.vmdk)
================================================================
# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=ffffffff
createType="vmfs"
# Extent description
# access size-in-sectors type filename
RW 104857600 VMFS "webserver-disk1-flat.vmdk"
# The Disk Data Base (DDB) -- virtual geometry
ddb.virtualHWVersion = "19"
ddb.geometry.cylinders = "6527"
ddb.geometry.heads = "255"
ddb.geometry.sectors = "63"
ddb.adapterType = "lsilogic"
ddb.thinProvisioned = "1"
ddb.uuid = "60 00 C2 93 e4 d1 a7 bc-4d 8a 2c 73 45 9e 12 f7"
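The descriptor's structure (comments, key=value settings, extent lines) is simple enough to parse when auditing a fleet before migration. A minimal sketch, with a function name of our choosing; it handles the access keywords (RW, RDONLY, NOACCESS) defined in the VMDK format:

```python
def parse_vmdk_descriptor(text: str) -> dict:
    """Split a VMDK descriptor into key=value settings and extent lines."""
    info = {"settings": {}, "extents": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        if line.split()[0] in ("RW", "RDONLY", "NOACCESS"):
            # Extent line: access, size-in-sectors, type, quoted filename
            access, sectors, ext_type, filename = line.split(None, 3)
            info["extents"].append({
                "access": access,
                "sectors": int(sectors),
                "type": ext_type,
                "file": filename.strip('"'),
            })
        elif "=" in line:
            key, _, value = line.partition("=")
            info["settings"][key.strip()] = value.strip().strip('"')
    return info
```

From the parsed result, createType and ddb.thinProvisioned tell you which conversion path applies, and the extent sector counts let you size the target storage (sectors x 512 bytes).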
VMDK format variants and their migration implications:
| VMDK Type | createType | Description | Migration Implication |
|---|---|---|---|
| Thin | vmfs (thin-provisioned) | Allocates space on demand. The extent file grows as data is written. | Most common in production. Sparse on export. qemu-img can convert efficiently. |
| Thick Eager Zero | vmfs (eagerzeroedthick) | All space pre-allocated and zeroed at creation. | Full-size extent file. Conversion reads the entire file even if it is mostly zeros. Use qemu-img convert -O qcow2 to reclaim zero space. |
| Thick Lazy Zero | vmfs (zeroedthick) | All space pre-allocated but not zeroed until first write. | Similar to eager zero for migration -- the extent file is full size on disk. |
| Sparse | monolithicSparse | Single-file VMDK with embedded descriptor and sparse data. | Common for OVA exports. A single file simplifies handling. |
| Split Sparse | twoGbMaxExtentSparse | Split into 2 GB chunks for filesystems with file-size limits (e.g., FAT32). | Rare in enterprise. Must be reassembled before conversion. |
| Stream Optimized | streamOptimized | Compressed, read-only format designed for OVA distribution. | Not bootable directly. Must be converted to a flat or sparse format before use. |
| seSparse | seSparse | Space-efficient sparse. VMware's modern snapshot format. | Used for VM snapshots on VMFS 6+. Conversion must handle the grain directory structure. |
| vmfsSparse | vmfsSparse | Legacy sparse format for snapshots. | Snapshot delta disks. Must be consolidated before migration. |
VMDK Internal Structure (Sparse VMDK)
================================================================
+--------------------------------------------------------------+
| VMDK File |
| |
| +-- Header (512 bytes) --------------------------------+ |
| | Magic: KDMV (0x564d444b) | |
| | Version: 1 | |
| | Flags: 3 (valid new line detection + redundant GT) | |
| | Capacity: 104857600 sectors (50 GB) | |
| | Grain Size: 128 sectors (64 KB) | |
| | Descriptor Offset: 1 (sector) | |
| | Descriptor Size: 20 (sectors) | |
| | Num GTE per GT: 512 | |
| | GD Offset: sector of grain directory | |
| | Overhead: sectors before grain data starts | |
| +------------------------------------------------------+ |
| |
| +-- Embedded Descriptor ------+ |
| | (same as text descriptor | |
| | shown above) | |
| +-----------------------------+ |
| |
| +-- Grain Directory (GD) -----+ |
| | GD[0] -> GT sector 100 | Points to grain tables |
| | GD[1] -> GT sector 200 | |
| | GD[2] -> 0 (not allocated) | Zero = no data in range |
| | GD[3] -> GT sector 300 | |
| | ... | |
| +-----------------------------+ |
| |
| +-- Grain Tables (GT) --------+ |
| | GT[0]: | |
| | GTE[0] -> grain at 1000 | Points to 64KB data grains |
| | GTE[1] -> 0 (sparse) | Zero = unallocated (thin) |
| | GTE[2] -> grain at 1128 | |
| | ... | |
| | GT[1]: | |
| | GTE[0] -> grain at 2000 | |
| | ... | |
| +-----------------------------+ |
| |
| +-- Data Grains ----+----+----+----+ |
| | Grain 0 (64 KB) | G1 | G2 | G3 | ... |
| | Actual VM data | | | | |
| +-------------------+----+----+----+ |
+--------------------------------------------------------------+
Two-level lookup: GD -> GT -> Grain (data)
Unallocated regions (sparse) are represented by zero entries.
Snapshot chains: When a VMware snapshot is taken, the original VMDK becomes read-only, and a new delta VMDK (redo log) is created in vmfsSparse or seSparse format. Each subsequent snapshot adds another delta file. Reads traverse the chain from newest to oldest delta until an allocated grain is found. Before migration, consolidate all snapshot chains into a single flat VMDK: migrating a VM with active snapshots is unsupported by the conversion tools covered here and will typically fail or produce a corrupted disk.
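A pre-flight check for snapshot chains can be built on two real descriptor fields: a base disk carries parentCID=ffffffff, while a delta (redo-log) disk has a real parentCID and usually a parentFileNameHint. The helper name below is ours:

```python
def in_snapshot_chain(descriptor_text: str) -> bool:
    """Return True if this VMDK descriptor belongs to a snapshot delta disk."""
    fields = {}
    for line in descriptor_text.splitlines():
        if "=" in line and not line.lstrip().startswith("#"):
            k, _, v = line.partition("=")
            fields[k.strip()] = v.strip().strip('"')
    # ffffffff means "no parent"; anything else points at a parent disk.
    parent = fields.get("parentCID", "ffffffff").lower()
    return parent != "ffffffff" or "parentFileNameHint" in fields
```

In practice you would run this across an inventory export and consolidate every flagged VM in vSphere before scheduling its migration wave.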
QCOW2 -- QEMU Copy-On-Write Version 2
QCOW2 is QEMU's native disk image format and the format most commonly used for VM disks on OVE. It is far more sophisticated than a simple flat file. QCOW2 supports thin provisioning, snapshots, backing files (for template-based VMs), compression, encryption, and preallocation modes -- all managed through an internal two-level table structure.
QCOW2 Internal Structure
================================================================
+--------------------------------------------------------------+
| QCOW2 File |
| |
| +-- Header (variable, min 104 bytes) --+ |
| | Magic: QFI\xfb (0x514649fb) | |
| | Version: 3 | |
| | Backing file offset: 0 (or pointer) | |
| | Backing file size: 0 (or length) | |
| | Cluster bits: 16 (64 KB clusters) | |
| | Size: 53687091200 (50 GB virtual) | |
| | Crypt method: 0 (none) | |
| | L1 size: 100 entries | |
| | L1 table offset: 0x30000 | |
| | Refcount table offset: 0x10000 | |
| | Refcount table clusters: 1 | |
| | Nb snapshots: 0 | |
| | Header extensions: | |
| | - Feature name table | |
| | - Bitmap directory (dirty tracking) | |
| +---------------------------------------+ |
| |
| +-- L1 Table (top-level index) --------+ |
| | L1[0] -> L2 table at offset 0x40000 | |
| | L1[1] -> L2 table at offset 0x50000 | |
| | L1[2] -> 0 (unallocated range) | |
| | L1[3] -> L2 table at offset 0x60000 | |
| | ... | |
| | Each L1 entry covers: | |
| | L2_entries * cluster_size | |
| | = 8192 * 64KB = 512 MB per L1 | |
| +--------------------------------------+ |
| |
| +-- L2 Tables (second-level index) ----+ |
| | L2[0]: | |
| | Entry[0] -> cluster at 0x100000 | |
| | Entry[1] -> 0 (unallocated) | |
| | Entry[2] -> cluster at 0x110000 | |
| | Entry[3] -> compressed cluster* | |
| | ... | |
| | * Bit 62 marks a compressed cluster | |
| | (offset and size in the low bits) | |
| +---------------------------------------+ |
| |
| +-- Refcount Table + Refcount Blocks --+ |
| | Tracks how many references point | |
| | to each cluster (for snapshot COW): | |
| | | |
| | Refcount = 1: normal allocation | |
| | Refcount = 2+: shared by snapshots | |
| | Refcount = 0: free cluster | |
| | | |
| | When a snapshot-shared cluster is | |
| | written, QEMU copies it to a new | |
| | cluster (COW) and updates the L2 | |
| | entry. The old cluster's refcount | |
| | decrements. | |
| +--------------------------------------+ |
| |
| +-- Data Clusters ---+-------+-------+ |
| | Cluster 0 (64 KB) | C1 | C2 | ... |
| | Actual VM data | | | |
| +--------------------+-------+-------+ |
+--------------------------------------------------------------+
Address resolution:
Guest offset -> L1 index -> L2 table -> cluster offset
L1 index = guest_offset / (L2_entries * cluster_size)
L2 index = (guest_offset / cluster_size) % L2_entries
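The address-resolution formulas above are plain integer arithmetic and can be checked directly. A small sketch (the function name is ours) for the default 64 KiB clusters, where each 8-byte L2 entry means 8192 entries per L2 table and 512 MiB of guest address space per L1 entry:

```python
def qcow2_indices(guest_offset: int, cluster_bits: int = 16):
    """Map a guest byte offset to (L1 index, L2 index, offset within cluster)."""
    cluster_size = 1 << cluster_bits          # 64 KiB when cluster_bits=16
    l2_entries = cluster_size // 8            # 8-byte entries -> 8192 per table
    in_cluster = guest_offset % cluster_size
    cluster_index = guest_offset // cluster_size
    l2_index = cluster_index % l2_entries
    l1_index = cluster_index // l2_entries    # each L1 entry spans 512 MiB here
    return l1_index, l2_index, in_cluster
```

For the 50 GB disk in the header example, the highest L1 index is 99 -- which is why the L1 table needs 100 entries.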
QCOW2 backing files (template chains):
QCOW2 supports a backing_file pointer in the header. When a cluster is read but not present in the current file (L2 entry is zero), QEMU reads it from the backing file instead. This enables instant VM provisioning from templates: the template is a read-only base image, and each VM gets a thin overlay QCOW2 that only stores the differences.
QCOW2 Backing File Chain (Template Pattern)
================================================================
+----------------------------+
| golden-image.qcow2 | <-- Base image (read-only)
| (RHEL 9 template, 3 GB) | Contains full OS install
+----------------------------+
^ ^
| |
+------+-----+ +--+------------+
| vm-01.qcow2| | vm-02.qcow2 | <-- Overlay images (read-write)
| backing: | | backing: | Only store changed clusters
| golden- | | golden- | Initial size: ~200 KB
| image.qcow2| | image.qcow2 |
| (50 MB of | | (120 MB of |
| changes) | | changes) |
+-------------+ +--------------+
Read path for vm-01:
1. Guest reads sector X
2. QEMU checks L1/L2 in vm-01.qcow2
3. If cluster allocated -> return data from vm-01.qcow2
4. If cluster NOT allocated -> read from golden-image.qcow2
5. If also not in backing file -> return zeros
Write path for vm-01:
1. Guest writes sector X
2. QEMU allocates new cluster in vm-01.qcow2 (COW)
3. Writes data to vm-01.qcow2
4. Backing file is never modified
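The read and write paths above reduce to a small copy-on-write rule. This toy model (class name ours; clusters modeled as a dict, not real QCOW2 tables) captures the semantics:

```python
class Qcow2Overlay:
    """Toy model of a QCOW2 overlay with a read-only backing file."""

    def __init__(self, backing=None):
        self.clusters = {}            # cluster index -> data (the overlay)
        self.backing = backing or {}  # cluster index -> data (read-only base)

    def read(self, idx: int) -> bytes:
        if idx in self.clusters:      # allocated in the overlay
            return self.clusters[idx]
        if idx in self.backing:       # fall through to the backing file
            return self.backing[idx]
        return b"\x00" * 4            # unallocated everywhere -> zeros

    def write(self, idx: int, data: bytes) -> None:
        self.clusters[idx] = data     # COW: the backing file is never modified
```

This is why one golden image can back hundreds of VMs safely: writes only ever land in each VM's own overlay.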
QCOW2 preallocation modes:
| Mode | Behavior | Use Case |
|---|---|---|
| off (default) | Clusters allocated on first write. File starts small. | General workloads. Best storage efficiency. |
| metadata | L1/L2 tables pre-allocated; data clusters allocated on write. | Reduces metadata allocation overhead during I/O. |
| falloc | File space pre-allocated via fallocate(); data zeroed lazily. | Avoids fragmentation. Near-raw performance for sequential I/O. |
| full | File space pre-allocated and fully zeroed. | Maximum performance. No allocation overhead during writes. Same on-disk size as the virtual disk. |
Raw Format
A raw disk image is a byte-for-byte representation of a virtual disk. No headers, no metadata tables, no L1/L2 indirection. Offset 0 in the file is LBA 0 on the virtual disk. Offset N is LBA N.
Advantages:
- Maximum I/O performance: no metadata lookup on every read/write
- Simplest format: any tool can read it directly (dd, hexdump, mount via loopback)
- No format-specific bugs or corruption risks
Disadvantages:
- No thin provisioning at the image level (the file is always the full virtual disk size unless the filesystem supports sparse files or fallocate with hole-punching)
- No snapshots, no backing files, no compression within the format itself
When to use raw on OVE: When the underlying storage system handles thin provisioning and snapshots at the block device level (e.g., Ceph RBD, LVM thin), raw is the preferred format because the QCOW2 L1/L2 tables become redundant overhead. OVE with ODF (OpenShift Data Foundation, backed by Ceph RBD) typically uses raw images stored in PersistentVolumeClaims because Ceph itself provides thin provisioning, snapshots, and cloning at the RADOS level.
When to use QCOW2 on OVE: When the underlying storage does not provide native thin provisioning or snapshot capabilities (e.g., local disks, basic NFS), QCOW2's built-in features become necessary. QCOW2 is also useful when backing file chains (template patterns) are needed without storage-level COW.
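The raw-versus-QCOW2 rule above is simple enough to encode as policy for migration tooling. A sketch, with illustrative backend names and a function name of our choosing:

```python
def preferred_image_format(storage_backend: str) -> str:
    """Encode the rule above: raw when the storage backend provides thin
    provisioning and snapshots natively, QCOW2 otherwise.
    Backend identifiers here are illustrative, not from any real API."""
    native_cow = {"ceph-rbd", "lvm-thin"}  # thin/snapshots handled at block level
    return "raw" if storage_backend in native_cow else "qcow2"
```

Baking the decision into code keeps 5,000 conversions consistent instead of leaving the format choice to each operator.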
Format Conversion: qemu-img
qemu-img is the Swiss Army knife for disk format conversion. It is the core utility used by virt-v2v, MTV, and manual migration workflows.
# VMDK to QCOW2 (most common migration conversion)
qemu-img convert -f vmdk -O qcow2 source.vmdk target.qcow2
# VMDK to raw (for Ceph/RBD-backed storage)
qemu-img convert -f vmdk -O raw source.vmdk target.raw
# QCOW2 to raw (for performance-critical VMs)
qemu-img convert -f qcow2 -O raw source.qcow2 target.raw
# With progress output and parallel I/O
qemu-img convert -p -W -f vmdk -O qcow2 source.vmdk target.qcow2
#                 ^  ^
#                 |  +-- -W: allow out-of-order (parallel) writes to the target
#                 +----- -p: show progress percentage
# With QCOW2 options: preallocation, cluster size, compression
qemu-img convert -f vmdk -O qcow2 \
-o preallocation=metadata,cluster_size=65536,compat=1.1 \
source.vmdk target.qcow2
# Check image integrity after conversion
qemu-img check target.qcow2
# Show image info (format, virtual size, actual size, backing file)
qemu-img info --output=json target.qcow2
Conversion performance and space implications:
| Source Format | Target Format | Throughput (10 Gbps network, NVMe local) | Space Change |
|---|---|---|---|
| VMDK thin (50 GB virtual, 20 GB actual) | QCOW2 | ~500-800 MB/s with -W | ~20-22 GB (slight metadata overhead) |
| VMDK thin (50 GB virtual, 20 GB actual) | raw | ~500-800 MB/s | 50 GB (no thin provisioning in raw), or ~20 GB if the filesystem supports sparse files |
| VMDK thick (50 GB virtual, 50 GB actual) | QCOW2 | ~400-600 MB/s | ~20 GB (QCOW2 reclaims zero blocks) |
| VMDK streamOptimized | QCOW2 | ~200-400 MB/s (decompression overhead) | Varies by content |
Practical tip: For large-scale migrations, always use -W (out-of-order parallel writes) with qemu-img convert. Without it, writes are serialized and conversion is substantially slower. Also consider raising -m (the number of parallel coroutines, default 8) when the source and target storage can sustain more concurrent I/O.
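When driving many conversions from automation, it helps to build the qemu-img argv in one place so every VM gets the same flags. A sketch: the flags (-p, -W, -m, -O, -o) are real qemu-img convert options, but the wrapper function itself is hypothetical:

```python
def build_convert_cmd(src: str, dst: str, out_format: str = "qcow2",
                      coroutines: int = 8, options: str = "") -> list:
    """Assemble a qemu-img convert invocation with progress output,
    out-of-order writes (-W) and a configurable coroutine count (-m)."""
    cmd = ["qemu-img", "convert", "-p", "-W", "-m", str(coroutines),
           "-O", out_format]
    if options:  # e.g. "preallocation=metadata,cluster_size=65536,compat=1.1"
        cmd += ["-o", options]
    cmd += [src, dst]
    return cmd
```

The resulting list can be handed to subprocess.run(); centralizing it means a flag change (say, a different cluster_size) rolls out to the whole migration pipeline at once.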
vSphere API for Disk Export (VDDK, nbdkit)
In a production migration of 5,000+ VMs, you do not export OVA files manually. You use programmatic APIs to stream disk data directly from the vSphere storage layer.
VDDK (Virtual Disk Development Kit): VMware's proprietary C library for reading and writing VMDK files remotely. VDDK connects to vCenter, authenticates, opens a VM's disk, and reads blocks over the network. It supports Changed Block Tracking (CBT) -- the ability to read only the blocks that have changed since a specific snapshot, which is critical for warm migrations and incremental replication.
nbdkit: An open-source NBD (Network Block Device) server with a VDDK plugin. nbdkit acts as a bridge: it uses VMware's VDDK library to connect to vCenter and exposes the VM's disk as an NBD endpoint that open-source tools (qemu-img, virt-v2v) can read natively. This is how virt-v2v and MTV access VMware disks without needing direct VMFS access.
Disk Export Path: vCenter -> VDDK -> nbdkit -> qemu-img
================================================================
+----------+ HTTPS/SOAP +----------+
| vCenter | <----- API -------> | nbdkit |
| Server | | (with |
+----------+ | VDDK |
| | plugin) |
| VMkernel data path +----------+
| (NFC / NBDSSL) |
v | NBD protocol
+----------+ | (unix socket
| ESXi | | or TCP)
| Host | v
| +------+ | +----------+
| | VMDK | | ---- disk data ---> | qemu-img |
| | on | | (streamed) | convert |
| | VMFS | | +----------+
| +------+ | |
+----------+ v
+----------+
| target |
| .qcow2 |
| or .raw |
+----------+
CBT (Changed Block Tracking) flow for warm migration:
1. Initial snapshot: read ALL blocks via VDDK
2. VM continues running, CBT tracks changes
3. Delta sync: read ONLY changed blocks since last snapshot
4. Repeat delta syncs until change set is small
5. Final cutover: quiesce VM, read last delta, convert, boot on target
VDDK licensing: VDDK is freely downloadable from VMware but requires acceptance of VMware's SDK license agreement. The VDDK .so library must be placed on the migration host and is not redistributable. MTV ships a mechanism to mount the VDDK library into its conversion pods.
2. virt-v2v / Migration Toolkit for Virtualization (MTV)
virt-v2v: The Upstream Conversion Tool
virt-v2v is a command-line tool from the libguestfs project that converts virtual machines from foreign hypervisors (VMware, Hyper-V) to KVM. It is the upstream engine that MTV wraps in a Kubernetes-native workflow. Understanding virt-v2v's internals is essential because it is the tool that performs the actual conversion work -- even inside MTV, the conversion pod runs virt-v2v.
What virt-v2v does in a single conversion:
- Connects to the source -- vCenter (via VDDK/nbdkit) or local disk file
- Copies the disk(s) -- streams VMDK data through nbdkit, writes to the target format
- Inspects the guest OS -- uses libguestfs to mount the guest filesystem read-only and identify the OS type, version, installed drivers, and bootloader
- Removes source hypervisor artifacts -- uninstalls VMware Tools (open-vm-tools), removes VMware SVGA driver, removes VMware paravirtual SCSI driver references from boot configuration
- Injects target hypervisor drivers -- installs virtio drivers (vioscsi, viostor, NetKVM, balloon) for Windows; verifies virtio modules are present in initramfs for Linux
- Fixes the bootloader -- updates GRUB/BCD to reference virtio SCSI instead of VMware PVSCSI or LSI Logic
- Adjusts guest OS configuration -- fixes NIC naming (eth0 vs ens192 vs persistent names), updates fstab if needed, reconfigures network manager
- Outputs the result -- writes converted disk plus domain XML (for libvirt) or creates KubeVirt resources (when used with MTV)
virt-v2v Conversion Pipeline
================================================================
+---------+ +----------+ +----------+ +----------+
| Source | | Disk | | Guest OS | | Driver |
| Connect | --> | Copy & | --> | Inspect | --> | Inject |
| | | Convert | | | | |
| vCenter | | VMDK -> | | Mount FS | | Windows: |
| via VDDK | | QCOW2/ | | Detect | | virtio- |
| or local | | raw | | OS type | | win ISO |
| file | | | | Detect | | Linux: |
| | | | | drivers | | initramfs|
+---------+ +----------+ +----------+ | rebuild |
+----------+
|
v
+----------+ +----------+
| Boot | | Output |
| Fixup | --> | Write |
| | | |
| GRUB for | | QCOW2 + |
| Linux | | libvirt |
| BCD for | | XML, or |
| Windows | | kubevirt |
+----------+ | CRD |
+----------+
Input modes:
# Mode 1: Direct from VMware vCenter (most common for production)
virt-v2v -i libvirt \
-ic "vpx://vcenter.example.com/Datacenter/cluster/esxi-host?no_verify=1" \
-it vddk \
-io vddk-libdir=/opt/vmware-vddk-lib-7.0 \
-io vddk-thumbprint=AA:BB:CC:DD:... \
"vm-name" \
-o kubevirt \
-os /output/directory
# Mode 2: From local disk files (OVA or individual VMDK)
virt-v2v -i ova /path/to/exported.ova \
-o qemu \
-os /output/directory
# Mode 3: From a Hyper-V disk image (VHDX copied from the host)
virt-v2v -i disk /path/to/disk.vhdx \
-o local \
-os /output/directory
Driver Injection -- The Critical Step
Driver injection is where most conversions succeed or fail. A VM without the correct storage and network drivers for the target hypervisor cannot boot.
Linux VMs:
Linux VMs are generally straightforward. The virtio kernel modules (virtio_blk, virtio_scsi, virtio_net, virtio_balloon, virtio_rng) have been included in the mainline Linux kernel since version 2.6.25 (2008). For any RHEL 6+, SLES 12+, Ubuntu 14.04+, or Debian 8+ VM, the drivers are already in the kernel. virt-v2v's job for Linux is:
- Verify that the virtio modules exist in the kernel modules directory
- Rebuild the initramfs/initrd to include virtio modules (so the boot disk is accessible during early boot)
- Update GRUB configuration to remove VMware-specific kernel parameters
- Update /etc/fstab if disk device names change (e.g., /dev/sda stays the same with virtio-scsi, but becomes /dev/vda with virtio-blk)
- Remove VMware-specific udev rules that persist NIC names based on VMware MAC address prefixes
Windows VMs -- The Hard Case:
Windows does not include virtio drivers by default. They must be injected into the guest's driver store before the VM can boot on KVM. This is where the virtio-win driver package becomes critical.
The virtio-win ISO (or MSI installer) provides these drivers:
| Driver | Device | Purpose |
|---|---|---|
| viostor | VirtIO block device (virtio-blk) | Block storage access for the boot disk |
| vioscsi | VirtIO SCSI controller | Preferred over viostor for new installations |
| NetKVM | VirtIO network adapter | Network connectivity |
| Balloon | VirtIO balloon device | Memory ballooning for overcommitment |
| qxldod / qxl | QXL display adapter | Console display (replaces VMware SVGA) |
| pvpanic | PV Panic device | Crash notification to the hypervisor |
| vioinput | VirtIO input device | Keyboard/mouse for SPICE/VNC |
| viorng | VirtIO RNG device | Hardware random number generator |
| viofs | VirtIO FS (virtiofs) | Host-guest shared filesystem |
| vioserial | VirtIO serial device | QEMU Guest Agent communication channel |
| fwcfg | QEMU fw_cfg | Firmware configuration data access |
Windows Driver Injection Flow (virt-v2v)
================================================================
1. Mount Windows partition via libguestfs (read-write)
2. Detect Windows version and architecture:
- Read: \Windows\System32\config\SOFTWARE registry
- Determine: Windows Server 2019 x64, build 17763
3. Copy drivers from virtio-win ISO to guest:
Source: virtio-win.iso:/vioscsi/2k19/amd64/
Target: \Windows\System32\drivers\vioscsi.sys
\Windows\INF\vioscsi.inf
\Windows\INF\vioscsi.cat
4. Inject drivers into Windows driver store:
- Write registry entries to:
HKLM\SYSTEM\ControlSet001\Services\vioscsi
HKLM\SYSTEM\ControlSet001\Services\netkvm
HKLM\SYSTEM\ControlSet001\Services\viostor
- Mark drivers as boot-start (Start=0 for storage drivers)
- This ensures Windows loads the driver during boot,
before the filesystem is available
5. Update BCD (Boot Configuration Data):
- Ensure the boot disk reference uses the VirtIO SCSI
controller path instead of VMware PVSCSI
6. Remove VMware Tools:
- Delete VMware Tools service entries
- Remove VMware SVGA driver
- Remove VMware mouse driver
- Clean up "vm-tools-upgrader" scheduled tasks
7. Handle NIC renaming:
- Windows assigns NICs persistent names based on PCI slot
- New virtio NIC gets a new name ("Ethernet 2" or "Ethernet 3")
- Old VMware NIC configuration (IP, DNS, routes) is orphaned
- virt-v2v attempts to preserve network configuration,
but manual verification is recommended
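Step 4 of the flow above (marking storage drivers boot-start) reduces to a handful of registry values under the service key. The sketch below is illustrative data, not a real injection tool: the value names follow documented Windows service conventions, but the exact set virt-v2v writes is richer than this.

```python
def boot_start_service_entries(driver: str) -> dict:
    """Registry values that mark a storage driver boot-start (Start=0),
    mirroring step 4 above. Paths follow the HKLM\\SYSTEM hive layout;
    this is an illustrative subset, not virt-v2v's actual write set."""
    base = rf"SYSTEM\ControlSet001\Services\{driver}"
    return {
        rf"{base}\Type": 1,          # SERVICE_KERNEL_DRIVER
        rf"{base}\Start": 0,         # SERVICE_BOOT_START: load before filesystems
        rf"{base}\ErrorControl": 1,  # log-and-continue on failure
        rf"{base}\ImagePath": rf"system32\drivers\{driver}.sys",
    }
```

Start=0 is the critical value: with any later start type, Windows cannot load the storage driver early enough to find its own boot volume, which is exactly the 0x7B INACCESSIBLE_BOOT_DEVICE failure described below.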
Common Windows conversion failures:
| Failure | Symptom | Cause | Resolution |
|---|---|---|---|
| Blue Screen (BSOD) at boot | INACCESSIBLE_BOOT_DEVICE (0x7B) | Storage driver not injected or wrong version | Verify vioscsi/viostor driver matches Windows version. Boot from recovery media and inject manually. |
| No network after boot | VM boots but has no connectivity | NetKVM driver not installed or IP config lost | Install NetKVM driver from virtio-win ISO inside the guest. Reconfigure IP settings. |
| UEFI boot failure | VM does not POST, no bootloader found | Secure Boot enabled in source, OVMF does not have the same certificates | Disable Secure Boot in VM config or enroll the appropriate certificates in OVMF |
| Windows activation | "Windows is not activated" error | Hardware ID changed (motherboard UUID, BIOS serial) | Re-activate Windows. For KMS-activated VMs, ensure KMS server is reachable. For MAK/OEM, may need to contact Microsoft. |
| NIC ordering change | Applications bound to "Local Area Connection 2" cannot connect | VMware NIC removed, virtio NIC added with different name | Rename the NIC in Windows network settings, or update application configuration |
| Time zone / clock drift | Clock is wrong after migration | VMware uses UTC for BIOS clock, Windows expects local time, KVM defaults may differ | Set <clock offset="localtime"/> in libvirt XML for Windows VMs, or configure Windows to use UTC |
| Mouse/keyboard not working | Cannot interact with console | QXL/VirtIO input drivers not installed | Install full virtio-win driver set. Use tablet input device in VM config. |
| Hyper-V enlightenments | Poor performance on KVM | Windows detects KVM but Hyper-V enlightenments not enabled | Enable Hyper-V enlightenments in KubeVirt VM spec: hyperv: {relaxed, vapic, spinlocks, vpindex, runtime, synic, stimer, reset, frequencies} |
UEFI / Secure Boot considerations:
VMs with UEFI firmware (as opposed to legacy BIOS) require additional handling during conversion:
- The VM must be configured to use OVMF (Open Virtual Machine Firmware) on the target KVM host
- The EFI System Partition (ESP) must be preserved during disk conversion
- If Secure Boot was enabled on VMware, the VM's boot chain (bootloader, kernel) was signed with VMware's or Microsoft's certificates. OVMF includes Microsoft's UEFI CA certificates by default, so most Windows VMs with Secure Boot will work. Custom certificate chains require manual enrollment.
- virt-v2v detects UEFI firmware and sets the appropriate libvirt XML (<os firmware="efi"/>)
MTV -- Migration Toolkit for Virtualization
MTV is Red Hat's productized, Kubernetes-native migration tool built on top of virt-v2v. While virt-v2v is a single-VM command-line tool, MTV provides a multi-VM, workflow-driven migration platform with a web UI, REST API, provider inventory, network/storage mapping, migration plans, and wave execution. MTV is the tool that makes migrating 5,000 VMs operationally feasible -- virt-v2v alone would require 5,000 manual invocations.
MTV is based on the upstream open-source project Forklift (previously called Konveyor Forklift), which is part of the Konveyor community project for application modernization.
MTV Architecture:
MTV (Migration Toolkit for Virtualization) Architecture
================================================================
+=====================================================================+
| OpenShift / Kubernetes Cluster (Target) |
| |
| +---------------------------------------------------------------+ |
| | MTV Operator (Deployed via OLM) | |
| | | |
| | +-------------------------+ +----------------------------+ | |
| | | forklift-controller | | forklift-ui | | |
| | | | | (Web console plugin) | | |
| | | - Reconciles Migration | | | | |
| | | CRDs | | - Provider management | | |
| | | - Orchestrates plans | | - Plan creation wizard | | |
| | | - Manages conversion | | - Migration monitoring | | |
| | | pods | | - Network/storage mapping | | |
| | | - Handles rollback | | | | |
| | +-------------------------+ +----------------------------+ | |
| | | |
| | +-------------------------+ +----------------------------+ | |
| | | forklift-validation | | inventory-service | | |
| | | | | | | |
| | | - Pre-migration checks | | - Discovers VMs from | | |
| | | - OS compatibility | | vCenter provider | | |
| | | - Driver availability | | - Caches VM metadata | | |
| | | - Disk/NIC analysis | | - Tracks provider state | | |
| | | - Warm migration | | - Provides API for UI | | |
| | | feasibility | | and controller | | |
| | +-------------------------+ +----------------------------+ | |
| +---------------------------------------------------------------+ |
| |
| Per-VM Migration Execution: |
| +---------------------------------------------------------------+ |
| | Conversion Pod (per VM) | |
| | +----------------------+ +-------------------------------+ | |
| | | nbdkit + VDDK | | virt-v2v | | |
| | | | | | | |
| | | Connects to vCenter | | Converts disk format | | |
| | | Reads VMDK blocks | | Injects virtio drivers | | |
| | | Streams via NBD | | Fixes bootloader | | |
| | +----------------------+ +-------------------------------+ | |
| | | | |
| | v | |
| | +-------------------------------+ | |
| | | CDI (Containerized Data | | |
| | | Importer) | | |
| | | | | |
| | | Writes converted disk into | | |
| | | PVC (PersistentVolumeClaim) | | |
| | | as DataVolume | | |
| | +-------------------------------+ | |
| +---------------------------------------------------------------+ |
| |
| Result: |
| +---------------------------------------------------------------+ |
| | VirtualMachine CRD + DataVolume(s) | |
| | Ready to start on KubeVirt | |
| +---------------------------------------------------------------+ |
+=====================================================================+
Source:
+==========================+
| VMware vCenter |
| - Provides VM inventory |
| - VDDK disk access |
| - CBT for warm migration|
+==========================+
MTV Custom Resources:
MTV extends the Kubernetes API with several CRDs that define the migration workflow:
| CRD | Purpose |
|---|---|
| Provider | Defines a connection to a source (VMware vCenter, RHV, OpenStack) or target (KubeVirt) platform. Contains credentials, URL, and TLS configuration. |
| NetworkMap | Maps source networks (VMware port groups) to target networks (KubeVirt NADs or pod network). |
| StorageMap | Maps source datastores (VMFS, NFS) to target storage classes (e.g., ocs-storagecluster-ceph-rbd). |
| Plan | Defines a migration plan: which VMs to migrate, which network/storage maps to use, warm vs. cold mode, and pre/post-migration hooks. |
| Migration | Represents the execution of a Plan. Created when the user starts the migration. Tracks status per VM. |
| Hook | A pre- or post-migration hook (an Ansible playbook or container image) that runs at specific points in the migration lifecycle. |
Migration workflow: Provider to Plan to Migration
MTV Migration Workflow
================================================================
Step 1: Create Provider (connect to vCenter)
+----------------------------------------------------------+
| apiVersion: forklift.konveyor.io/v1beta1 |
| kind: Provider |
| metadata: |
| name: vmware-prod |
| spec: |
| type: vsphere |
| url: https://vcenter.example.com/sdk |
| secret: vcenter-credentials # vCenter username/pwd |
+----------------------------------------------------------+
|
| Inventory service discovers all VMs, networks,
| datastores, hosts, clusters from vCenter
v
Step 2: Create NetworkMap + StorageMap
+----------------------------------------------------------+
| NetworkMap: "VM Network" -> "br-prod" (Multus NAD) |
| "Backup LAN" -> "br-backup" (Multus NAD) |
| |
| StorageMap: "datastore-ssd" -> "ocs-rbd" (StorageClass) |
| "datastore-hdd" -> "ocs-rbd-bulk" |
+----------------------------------------------------------+
|
v
Step 3: Create Plan (select VMs, configure migration type)
+----------------------------------------------------------+
| Plan: "wave-01-webservers" |
| VMs: [web-01, web-02, web-03, ..., web-50] |
| NetworkMap: prod-network-map |
| StorageMap: prod-storage-map |
| Type: warm (pre-copy with CBT) |
| Hooks: |
| pre: run-pre-check-playbook |
| post: run-smoke-test-playbook |
+----------------------------------------------------------+
|
| Validation service checks each VM:
| - OS supported?
| - Disk format convertible?
| - Network mappings valid?
| - Snapshots consolidated?
| - VMware Tools installed? (needed for quiesce)
v
Step 4: Execute Migration (create Migration CR)
+----------------------------------------------------------+
| Migration: "wave-01-execution-1" |
| Plan: wave-01-webservers |
| Status per VM: |
| web-01: CopyingDisk (45%) |
| web-02: ConvertingGuest |
| web-03: Pending |
+----------------------------------------------------------+
|
| Controller creates conversion pods, manages
| parallelism, tracks progress, handles failures
v
Step 5: Result
+----------------------------------------------------------+
| VirtualMachine CRDs created in target namespace |
| DataVolumes with converted disks in PVCs |
| VMs ready to start |
+----------------------------------------------------------+
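The Step 2 mappings in the diagram above correspond to concrete CRs. A sketch against the forklift.konveyor.io/v1beta1 API; the openshift-mtv namespace, provider names, and NAD namespace are assumptions:

```yaml
apiVersion: forklift.konveyor.io/v1beta1
kind: NetworkMap
metadata:
  name: prod-network-map
  namespace: openshift-mtv
spec:
  provider:
    source:
      name: vmware-prod          # the vSphere Provider
      namespace: openshift-mtv
    destination:
      name: host                 # the local KubeVirt cluster provider
      namespace: openshift-mtv
  map:
    - source:
        name: VM Network         # VMware port group
      destination:
        type: multus
        name: br-prod            # Multus NetworkAttachmentDefinition
        namespace: vm-networks
---
apiVersion: forklift.konveyor.io/v1beta1
kind: StorageMap
metadata:
  name: prod-storage-map
  namespace: openshift-mtv
spec:
  provider:
    source:
      name: vmware-prod
      namespace: openshift-mtv
    destination:
      name: host
      namespace: openshift-mtv
  map:
    - source:
        name: datastore-ssd      # VMware datastore
      destination:
        storageClass: ocs-storagecluster-ceph-rbd
```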
Warm vs. Cold Migration
The choice between warm and cold migration directly determines the downtime per VM. Multiplied across 5,000+ VMs, that choice sets the total project timeline and the scale of business disruption.
Cold migration:
Cold Migration Timeline (per VM)
================================================================
Source (VMware) Target (KubeVirt)
| |
| VM running normally |
| ======================== |
| |
t=0 SHUT DOWN VM |
| | |
| | VM is offline |
| | |
| +-- Full disk copy --------->| Copy entire disk(s)
| | (100 GB @ 500 MB/s | via VDDK + nbdkit
| | = ~200 seconds) |
| | |
| +-- Convert (virt-v2v) ----->| Driver injection,
| | (~60-120 seconds) | bootloader fixup
| | |
| +-- Import to PVC --------->| CDI writes to DataVolume
| (~30-60 seconds) |
| |
| | START VM on KubeVirt
| | ======================
| |
| Total downtime: ~5-7 minutes |
| for a 100 GB VM |
| |
| For a 2 TB VM: |
| Full copy: ~70 minutes |
| Total downtime: ~75 minutes |
- Pros: Simple, reliable, no dependency on CBT or snapshot support
- Cons: Downtime equals the full disk copy time plus conversion time. For large VMs (500 GB to 2 TB+), downtime can be hours.
Warm migration (pre-copy with CBT):
Warm Migration Timeline (per VM)
================================================================
Source (VMware) Target (KubeVirt)
| |
| VM running normally |
| ======================== |
| |
t=0 Create snapshot S1 |
| Enable CBT |
| | |
| +-- Full disk copy --------->| Copy ALL blocks
| | (100 GB @ 500 MB/s | (initial sync)
| | = ~200 seconds) |
| | |
| VM continues running |
| CBT tracks changed blocks |
| | |
t+1h Create snapshot S2 |
| | |
| +-- Delta copy 1 ---------->| Copy ONLY changed blocks
| | (2 GB @ 500 MB/s | since S1 (via CBT query)
| | = ~4 seconds) |
| | |
| VM continues running |
| CBT tracks changes since S2 |
| | |
t+2h Create snapshot S3 |
| | |
| +-- Delta copy 2 ---------->| Copy changed blocks
| | (500 MB = ~1 second) | since S2
| | |
| VM continues running |
| | |
t+Xh CUTOVER (scheduled window) |
| | |
| +-- QUIESCE VM |
| | (graceful shutdown or |
| | freeze I/O) |
| | |
| +-- Final delta copy ------->| Copy last changes
| | (50 MB = <1 second) | since S3
| | |
| +-- Convert (virt-v2v) ----->| Driver injection,
| | (~60-120 seconds) | bootloader fixup
| | |
| +-- Import to PVC --------->| CDI writes to DataVolume
| (~30-60 seconds) |
| |
| | START VM on KubeVirt
| | ======================
| |
| Total downtime: ~2-4 minutes |
| regardless of disk size! |
| |
| Pre-copy phase (background): |
| hours/days before cutover |
- Pros: Downtime is nearly constant regardless of VM disk size. The bulk of the data is copied while the VM is still running. The cutover window is short and predictable.
- Cons: Requires CBT to be enabled on the source VM (VMware must support it, and the VM must not have snapshots that prevent CBT). Requires VDDK integration. More complex orchestration. If the VM's change rate is extremely high (e.g., a database with constant writes), the delta sets may not converge.
- When to use warm migration: For any VM with more than ~50 GB of disk data, or any VM where downtime must be minimized (production databases, trading systems, customer-facing applications).
MTV warm migration mechanics:
MTV implements warm migration through a series of snapshots and delta copies coordinated by the forklift-controller:
1. The controller creates an initial VMware snapshot on the source VM and enables CBT if not already active
2. A conversion pod copies all disk blocks via VDDK/nbdkit to a staging PVC
3. After the initial copy completes, the controller creates a new snapshot
4. The conversion pod queries CBT for changed blocks since the previous snapshot and copies only those blocks
5. Steps 3-4 repeat on a configurable interval (default: every 60 minutes)
6. When the operator triggers cutover, the controller:
   a. Quiesces the source VM (if VMware Tools is installed) or shuts it down
   b. Creates a final snapshot and copies the last delta
   c. Runs virt-v2v conversion on the staged disk
   d. Imports the converted disk into the target PVC via CDI
   e. Creates the VirtualMachine CRD
   f. Optionally starts the VM on KubeVirt
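Put together, a warm plan and its cutover look roughly like the following (forklift.konveyor.io/v1beta1; the namespaces, names, and cutover timestamp are placeholders):

```yaml
apiVersion: forklift.konveyor.io/v1beta1
kind: Plan
metadata:
  name: wave-01-webservers
  namespace: openshift-mtv
spec:
  warm: true                     # CBT-based pre-copy; cutover is triggered separately
  targetNamespace: migrated-vms
  provider:
    source:
      name: vmware-prod
      namespace: openshift-mtv
    destination:
      name: host
      namespace: openshift-mtv
  map:
    network:
      name: prod-network-map
      namespace: openshift-mtv
    storage:
      name: prod-storage-map
      namespace: openshift-mtv
  vms:
    - name: web-01
    - name: web-02
---
apiVersion: forklift.konveyor.io/v1beta1
kind: Migration
metadata:
  name: wave-01-execution-1
  namespace: openshift-mtv
spec:
  plan:
    name: wave-01-webservers
    namespace: openshift-mtv
  cutover: "2025-06-14T22:00:00Z"   # placeholder: scheduled final-delta + conversion window
```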
CDI -- Containerized Data Importer
CDI is the KubeVirt component that handles importing disk images into PersistentVolumeClaims. It is not specific to migration -- CDI is used for any disk import operation, including importing ISO images, container disk images, and HTTP-hosted disk files. During migration, CDI is the final step that writes the converted disk into the storage backend.
CDI works through DataVolume CRDs:
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: web-01-boot-disk
  namespace: migrated-vms
spec:
  source:
    # Option 1: Import from HTTP URL
    http:
      url: "https://migration-staging.internal/web-01-boot.qcow2"
    # Option 2: Import from container registry
    # registry:
    #   url: "docker://registry.internal/vm-disks/web-01:latest"
    # Option 3: Upload from local file (via CDI upload proxy)
    # upload: {}
    # Option 4: Clone from existing PVC
    # pvc:
    #   name: golden-image-rhel9
    #   namespace: templates
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 100Gi
    storageClassName: ocs-storagecluster-ceph-rbd
```
CDI creates an importer pod that downloads, decompresses (if needed), converts (if needed), and writes the disk data into the target PVC. The importer pod supports QCOW2, raw, VMDK, VHD, and VHDX input formats. It can also apply GZIP or XZ decompression on the fly.
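Once CDI has populated the DataVolume, a VirtualMachine CRD references it as a volume. A minimal sketch (kubevirt.io/v1; sizing and names are illustrative -- during an MTV migration this object is generated automatically):

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: web-01
  namespace: migrated-vms
spec:
  running: false                 # start explicitly after validation
  template:
    spec:
      domain:
        cpu:
          cores: 2
        memory:
          guest: 4Gi
        devices:
          disks:
            - name: boot
              disk:
                bus: virtio      # requires virtio drivers injected during conversion
      volumes:
        - name: boot
          dataVolume:
            name: web-01-boot-disk   # the imported DataVolume
```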
Scale Considerations for 5,000+ VMs
Migrating at scale is not just running the same tool 5,000 times. Network bandwidth, storage I/O, vCenter API load, and conversion pod resource consumption all become bottlenecks.
Parallel migration capacity:
| Bottleneck | Limit | Mitigation |
|---|---|---|
| vCenter API load | vCenter can handle ~20-30 concurrent VDDK connections per host before performance degrades | Stagger migrations across hosts. Use multiple vCenter sessions. |
| Network bandwidth | 10 Gbps link = ~1.1 GB/s theoretical. 20 concurrent VMs each copying at 500 MB/s = 10 GB/s = saturated. | Dedicated migration VLAN. 25 Gbps or bonded 10 Gbps links. QoS policies. |
| Target storage IOPS | Ceph cluster write throughput is finite. 20 concurrent imports at 500 MB/s each = 10 GB/s sustained write. | Size the Ceph cluster for migration burst. Use dedicated pool. |
| Conversion pod resources | Each virt-v2v conversion pod needs ~2 vCPU and 2-4 GB RAM. 20 concurrent conversions = 40 vCPU + 60 GB RAM. | Dedicate migration worker nodes. Set resource limits in MTV config. |
| CDI concurrent imports | CDI limits the number of concurrent importer pods (configurable in the CDI configuration). | Raise the CDI import concurrency and, for MTV, tune the controller's maximum in-flight VM setting. Dedicate resource quota to importer pods. |
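The MTV-side parallelism knob lives on the operator's CR. A hedged sketch of raising it via the ForkliftController resource (the field name follows MTV documentation; verify it and the default value against your MTV version):

```yaml
apiVersion: forklift.konveyor.io/v1beta1
kind: ForkliftController
metadata:
  name: forklift-controller
  namespace: openshift-mtv
spec:
  controller_max_vm_inflight: 40   # max concurrent VM migrations (default is typically 20)
```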
Bandwidth planning formula:
Migration Bandwidth Planning
================================================================
Inputs:
Total data to migrate: 200 TB (5,000 VMs, average 40 GB used per VM)
Available bandwidth: 10 Gbps dedicated = 1.1 GB/s
Utilization factor: 70% (protocol overhead, contention)
Effective throughput: 0.77 GB/s = ~66 TB/day
Cold migration (all downtime):
200 TB / 66 TB/day = ~3 days of continuous copying
But VMs are offline during copy, so this is 3 days of total downtime
NOT ACCEPTABLE for 5,000+ VMs
Warm migration (background pre-copy):
Phase 1 (background): 200 TB / 66 TB/day = ~3 days
Phase 2 (cutover): 5,000 VMs * 2-4 minute cutover window each
At 20 parallel cutover slots = 5,000 / 20 = 250 rounds
250 rounds * 4 minutes (worst case) = ~17 hours of total cutover time
Spread across 30 migration waves over 8 weeks
25 Gbps network:
Effective: 2.2 GB/s -> 190 TB/day
Phase 1: ~1 day
This is why dedicated 25 Gbps migration network is recommended
Migration validation: hooks and health checks
MTV supports pre- and post-migration hooks that run at specific lifecycle points. These are critical for validating that a migrated VM actually works.
Pre-migration hooks (run before disk copy starts):
- Verify VM snapshots are consolidated
- Check that CBT is available
- Verify application is in a consistent state (e.g., drain connections, flush caches)
- Tag the VM in vCenter as "migration-in-progress"
Post-migration hooks (run after VM starts on KubeVirt):
- Verify network connectivity (ping gateway, DNS resolution)
- Verify application health (HTTP health check, database connection test)
- Verify disk integrity (filesystem check, application-level data validation)
- Update DNS records to point to the new VM's IP
- Update load balancer backend pools
- Send notification to migration tracking system
```yaml
apiVersion: forklift.konveyor.io/v1beta1
kind: Hook
metadata:
  name: post-migration-smoke-test
spec:
  image: registry.internal/migration-tools/smoke-test:latest
  playbook: |
    ---
    - hosts: localhost
      tasks:
        - name: Wait for VM to be reachable
          wait_for:
            host: "{{ vm_ip }}"
            port: 22
            timeout: 300
        - name: Verify SSH access
          command: ssh -o StrictHostKeyChecking=no admin@{{ vm_ip }} "hostname"
          register: hostname_result
        - name: Verify application health
          uri:
            url: "http://{{ vm_ip }}:8080/health"
            status_code: 200
          when: app_type == "webserver"
        - name: Update DNS
          nsupdate:
            server: "dns.internal"
            zone: "example.com"
            record: "{{ vm_name }}"
            value: "{{ vm_ip }}"
            type: "A"
```
3. Azure Migrate
Azure Migrate is Microsoft's discovery, assessment, and migration platform for moving workloads into Azure (cloud) or Azure Local (on-premises). For this evaluation, the relevant scenario is migrating VMware VMs to Azure Local -- an on-premises Hyper-V-based platform managed through Azure Arc.
Azure Migrate Architecture
Azure Migrate Architecture for VMware-to-Azure Local
================================================================
+=====================================================================+
| Azure Portal (cloud control plane) |
| |
| +----------------------------+ +-------------------------------+ |
| | Azure Migrate Hub | | Azure Migrate: Server | |
| | | | Migration | |
| | - Project management | | | |
| | - Assessment reports | | - Replication management | |
| | - Dependency visualization| | - Test migration | |
| | - Cost estimation | | - Production migration | |
| +----------------------------+ +-------------------------------+ |
+=====================================================================+
| |
| HTTPS (443) | HTTPS (443)
| (metadata, control) | (control, replication mgmt)
| |
+====|====================================|============================+
| On-Premises Environment | |
| | |
| +----------------------------+ | |
| | Azure Migrate Appliance | | |
| | (Windows Server VM) | | |
| | | | |
| | +----------------------+ | | |
| | | Discovery Agent | | | |
| | | - Connects to vCenter| | | |
| | | - Inventories VMs | | | |
| | | - Collects perf data | | | |
| | +----------------------+ | | |
| | | | |
| | +----------------------+ | | |
| | | Assessment Agent | | | |
| | | - Readiness analysis | | | |
| | | - Sizing (CPU/RAM) | | | |
| | | - Cost estimation | | | |
| | +----------------------+ | | |
| | | | |
| | +----------------------+ | | |
| | | Replication Provider | | | |
| | | (for agentless mode) | | | |
| | | - Snapshot-based | | | |
| | | replication | | | |
| | | - Delta sync via CBT | | | |
| | +----------------------+ | | |
| +----------------------------+ | |
| | | |
| | vCenter API | |
| v | |
| +----------------------------+ | |
| | VMware vCenter | | |
| | - Source VMs | | |
| | - VMDK disks | | |
| +----------------------------+ | |
| | |
| v |
| +---------------------------------------------------+ |
| | Azure Local Cluster (Target) | |
| | | |
| | Hyper-V VMs created from converted disks | |
| | VMDK -> VHD/VHDX conversion | |
| | Managed through Azure Arc | |
| +---------------------------------------------------+ |
+=====================================================================+
Discovery
Azure Migrate's discovery phase is agentless -- no software is installed on the VMs being discovered. The Migrate Appliance connects to vCenter via the vSphere API and collects:
- VM inventory: names, IDs, OS type, power state, CPU/RAM configuration, disk sizes, network adapters, IP addresses
- Performance data: CPU utilization, memory utilization, disk IOPS, disk throughput, network throughput -- collected over a configurable period (default: 30 days) to establish baselines
- Dependency mapping: Optional agent-based (Microsoft Monitoring Agent) or agentless (via vCenter guest operations API) dependency analysis that maps which VMs communicate with which others. This is critical for wave planning -- you want to migrate VMs that depend on each other in the same wave.
- Application discovery: Identification of installed software, roles, and features on Windows/Linux VMs
Discovery runs continuously, updating the inventory every 15 minutes for new or removed VMs and every 5 minutes for performance data.
Assessment
The assessment phase analyzes discovered VMs and produces readiness reports:
Readiness analysis:
- Is the OS supported on Azure Local? (Windows Server 2012 R2+, RHEL 7+, Ubuntu 16.04+, etc.)
- Are the disk sizes within Azure Local limits?
- Are there unsupported features? (e.g., shared disks, pass-through disks, RDM)
- UEFI or BIOS? Generation 1 or Generation 2 VM on Hyper-V?
Sizing recommendations:
- Based on collected performance data, Azure Migrate recommends the VM size (CPU/RAM) on the target platform
- "As-on-premises" sizing: match the source VM's configuration exactly
- "Performance-based" sizing: right-size based on actual utilization (e.g., a VM with 16 vCPUs but average 10% utilization gets recommended at 4 vCPUs)
Cost estimation:
- Estimates compute, storage, and licensing costs on Azure or Azure Local
- Includes Windows Server licensing cost calculation (Azure Hybrid Benefit considerations)
Migration Methods
Azure Migrate supports two migration methods for VMware VMs:
Agentless migration (recommended):
- No software installed on source VMs
- Uses VMware vSphere APIs for snapshot-based replication
- Takes an initial snapshot, copies all blocks, then uses CBT for delta syncs
- Similar in concept to MTV's warm migration
- Converts VMDK to VHD/VHDX during the copy process
- Supports up to 500 concurrent replications per appliance
Agent-based migration:
- Installs the Azure Site Recovery Mobility Service agent on each source VM
- The agent streams write operations to a Process Server (on-premises), which forwards them to Azure/Azure Local
- Required for physical servers or scenarios where agentless is not supported
- More intrusive (requires agent installation and reboot)
Azure Migrate for Azure Local: Key Differences from Cloud Migration
Migrating to Azure Local (on-premises) differs from migrating to Azure (cloud) in several important ways:
| Aspect | Azure (Cloud) | Azure Local (On-Premises) |
|---|---|---|
| Network path | Replication data traverses WAN/internet to Azure datacenter | Replication data stays on-premises (LAN) |
| Bandwidth | Limited by WAN bandwidth (typically 1-10 Gbps) | Full LAN bandwidth available (10-25 Gbps) |
| Target VM type | Azure VM sizes (standardized) | Hyper-V VMs (custom sizing) |
| Disk format | Managed Disks (VHD) | VHD/VHDX on local storage (CSV, Storage Spaces Direct) |
| Networking | Azure VNet, NSG | Azure Local logical networks, SDN (optional) |
| Management | Azure portal, fully managed | Azure Arc + local admin (hybrid management) |
| Availability | Azure SLAs | Customer-managed HA (Hyper-V failover clustering) |
Azure Local specific considerations:
- Azure Local clusters must be registered with Azure Arc before Azure Migrate can target them
- The target storage must be configured as Cluster Shared Volumes (CSVs) backed by Storage Spaces Direct
- Network mapping from VMware port groups to Azure Local logical networks must be configured
- Generation selection: BIOS-based VMs become Generation 1 Hyper-V VMs, UEFI-based VMs become Generation 2
Hyper-V Conversion: VMDK to VHD/VHDX
Azure Migrate handles disk format conversion automatically, but understanding the process is important for troubleshooting:
VHD (Virtual Hard Disk):
- Fixed or dynamically expanding format
- Maximum size: 2 TB (for Generation 1 VMs)
- Used by: Hyper-V Generation 1 VMs, Azure managed disks
VHDX (Virtual Hard Disk v2):
- Maximum size: 64 TB
- Supports 4 KB logical sector sizes (for 4Kn drives)
- Built-in protection against power failure corruption (log-based metadata updates)
- Used by: Hyper-V Generation 2 VMs
Driver injection for Hyper-V:
- Hyper-V Integration Services (enlightenments) are built into the Windows kernel since Windows Server 2016 and Windows 10
- For older Windows versions (2012 R2 and earlier), the Hyper-V Integration Services must be installed
- Linux VMs: Hyper-V modules (hv_vmbus, hv_storvsc, hv_netvsc, hv_utils, hv_balloon) are included in the Linux kernel since version 3.4. Most modern distributions work without additional driver injection.
- Azure Migrate injects the necessary drivers and the Azure VM Agent (waagent for Linux, WindowsAzureGuestAgent for Windows) during conversion
Network and Storage Mapping
Similar to MTV's NetworkMap and StorageMap, Azure Migrate requires mapping source VMware constructs to target Azure Local constructs:
Network mapping:
- Source: VMware port groups (e.g., "VLAN-100-Production", "VLAN-200-Backup")
- Target: Azure Local logical networks or VM switches
- IP address handling: can be static (manually assigned on target), DHCP, or preserved from source
Storage mapping:
- Source: VMware datastores (VMFS, NFS, vSAN)
- Target: Azure Local storage paths on Cluster Shared Volumes
- Disk type selection: Fixed or dynamic VHDX
Migration Strategy for 5,000+ VMs
This section addresses the operational reality of migrating an estate of this size. The tooling covered above handles individual VM conversions. This section covers how to orchestrate thousands of conversions into a managed program.
Wave Planning
Migration waves group VMs into batches that are migrated together during a scheduled window. Wave composition is the most important planning decision in the migration program.
Wave Planning Model
================================================================
Wave 0: Proof of Concept (2-4 weeks)
+----------------------------------------------------------+
| 10-20 VMs |
| Purpose: Validate tooling, process, team readiness |
| Selection criteria: |
| - Non-production, low-risk VMs |
| - Mix of Linux + Windows |
| - Mix of disk sizes (small, medium, large) |
| - At least one UEFI VM |
| - At least one VM with multiple NICs |
| - At least one VM with multiple disks |
| Exit criteria: |
| - All VMs boot and pass health checks |
| - Migration runbook validated |
| - Rollback procedure tested |
| - Team trained on tooling |
+----------------------------------------------------------+
Wave 1: Development & Test (2-3 weeks)
+----------------------------------------------------------+
| 200-500 VMs |
| Purpose: Scale validation, process refinement |
| Selection criteria: |
| - Development and test environments |
| - Applications with known low criticality |
| - VMs with owners who can validate quickly |
| Exit criteria: |
| - Throughput baseline established |
| - 95%+ success rate on first attempt |
| - Failure patterns documented |
+----------------------------------------------------------+
Waves 2-N: Production (8-16 weeks)
+----------------------------------------------------------+
| Remaining ~4,500 VMs in groups of 100-200 |
| Grouping criteria (in priority order): |
| 1. Application dependency (all VMs of an application |
| migrate together) |
| 2. Criticality tier (Tier 3 first, Tier 1 last) |
| 3. Complexity (simple VMs first, complex last) |
| 4. Business unit readiness |
| 5. Maintenance window alignment |
| |
| Each wave: |
| - Pre-migration: discovery validation, hook tests |
| - Migration: warm pre-copy (days), cutover (hours) |
| - Validation: health checks, application sign-off |
| - Soak: 5-10 business days dual-monitoring |
| - Decommission: power off source VMs, archive disks |
+----------------------------------------------------------+
Final Wave: Critical Infrastructure
+----------------------------------------------------------+
| 50-100 VMs |
| Domain controllers, DNS servers, monitoring, |
| certificate authorities, backup infrastructure |
| Requires most careful planning, smallest batch sizes, |
| and longest soak periods. |
+----------------------------------------------------------+
Grouping by application dependency is the most critical criterion. Migrating half of an application's VMs to the new platform while the other half remains on VMware creates a "split-brain" scenario where cross-platform network latency, firewall rules, and DNS inconsistencies can cause application failures. Use the dependency mapping from Azure Migrate or a CMDB to identify application groups.
Complexity classification:
| Complexity | Characteristics | Expected Migration Effort |
|---|---|---|
| Simple | Linux, single disk < 100 GB, single NIC, no special hardware, stateless | Fully automated. 5-10 minutes downtime. |
| Medium | Windows Server, 1-3 disks, multiple NICs, domain-joined, basic services | Automated with manual validation. 10-30 minutes downtime. |
| Complex | Large disks (500 GB+), clustered applications (Oracle RAC, SQL Always-On), GPU, UEFI + Secure Boot, custom drivers, RDM disks | Semi-automated. May require manual conversion steps. 1-4 hours downtime. |
| Special | Physical-to-virtual legacy systems, appliances with locked OS, VMs with USB passthrough, VMs with SR-IOV | Manual migration or rebuild. May not be convertible -- may require re-platforming. |
Migration Factory
A "migration factory" is the operational model for executing waves at a sustained pace. It is a dedicated team, with defined roles, tools, runbooks, and escalation paths, that operates like a production line.
Migration Factory Operating Model
================================================================
+-------------------+ +-------------------+ +-------------------+
| INTAKE | | EXECUTE | | VALIDATE |
| | | | | |
| - Receive wave | --> | - Run pre-copy | --> | - Health checks |
| manifest | | (warm mode) | | - App owner |
| - Validate VM | | - Schedule | | sign-off |
| readiness | | cutover window | | - Performance |
| - Check snapshot | | - Execute | | comparison |
| consolidation | | cutover | | - Soak period |
| - Verify network | | - Monitor | | monitoring |
| mappings | | conversion | | |
| - Assign to | | - Handle | | |
| migration slot | | failures | | |
+-------------------+ +-------------------+ +-------------------+
| | |
v v v
+-------------------+ +-------------------+ +-------------------+
| Roles: | | Roles: | | Roles: |
| Migration Lead | | Migration Eng. | | App Owner |
| App Owner | | Platform Eng. | | QA Engineer |
| (intake form) | | Network Eng. | | Migration Lead |
+-------------------+ +-------------------+ +-------------------+
Throughput target: 100-200 VMs per week at steady state
Team size: 4-6 migration engineers + 1 lead + shared platform/network
Key metrics for the migration factory:
| Metric | Target | Why |
|---|---|---|
| VMs migrated per week | 100-200 | Determines total project duration. At 150/week, 5,000 VMs = ~33 weeks. |
| First-attempt success rate | >95% | Failed migrations consume 3-5x the effort of successful ones. |
| Average cutover downtime | <15 minutes (warm), <60 minutes (cold) | Business tolerance for per-VM downtime. |
| Rollback rate | <5% | VMs that must revert to VMware after cutover. |
| Mean time to validate | <4 hours | Time from VM-start-on-target to application owner sign-off. |
Rollback Strategy
Every migration must have a tested rollback path. The source VMware environment must remain operational until the migrated VM is validated and the soak period is complete.
Rollback approach for warm migration:
- Do NOT decommission the source VM immediately after cutover. Power it off but retain the disks.
- If the migrated VM fails validation, power off the target VM, power on the source VM on VMware.
- DNS and load balancer changes must be reversible (use short TTLs during migration, keep old configuration ready).
- Set a soak period (5-10 business days) after which the source VM can be archived and eventually deleted.
- Archive source VMDKs to cold storage (object store, tape) for a defined retention period (90-180 days) as a safety net.
Rollback approach for cold migration:
Same as above -- the source VM is still on VMware, just powered off. Power it back on to roll back.
What makes rollback hard:
- If the migrated VM has been running in production and receiving new data (database writes, user uploads), reverting to the source VM loses that data. This is why soak periods should be kept as short as possible for write-heavy workloads, and why some organizations implement data synchronization in both directions during the soak period.
- DNS changes may have propagated to clients with long TTLs. Clients may cache the new IP address.
- If shared infrastructure (file servers, databases) has been updated to reference the new VM's address, rollback requires reverting those changes too.
Parallel Running (Dual-Stack Period)
During migration, both the VMware platform and the target platform (OVE or Azure Local) run simultaneously. This "dual-stack" period requires:
- Network connectivity between platforms: VMs on VMware must communicate with VMs on the target platform. This requires routing between the VMware network segments and the target platform's networks, or a shared L2 segment.
- Shared services: DNS, Active Directory, monitoring, backup, and logging must serve both platforms simultaneously.
- Capacity planning: The organization needs enough hardware to run both platforms at the same time. The target platform must be fully deployed before migration starts. Hardware from VMware can only be reclaimed after VMs are migrated and decommissioned.
- Monitoring: Unified monitoring that covers VMs on both platforms, with dashboards showing migration progress and comparative health metrics.
Acceptance Criteria
A migrated VM is "done" when all of the following are true:
| Criterion | Verification |
|---|---|
| VM boots successfully | Console access confirms OS boot, login prompt |
| Network connectivity | Ping gateway, resolve DNS, reach dependent services |
| Correct IP configuration | IP address, subnet, gateway, DNS match the pre-migration configuration |
| Storage accessible | All disks mounted, file systems intact, data readable |
| Application functional | Application-specific health check passes (HTTP 200, DB connection, etc.) |
| Performance acceptable | CPU, memory, disk I/O, network throughput within 20% of baseline |
| Monitoring integrated | VM appears in monitoring system, alerts fire correctly |
| Backup configured | Backup agent/schedule active on the new platform |
| Application owner sign-off | Written confirmation from the application team |
| Soak period complete | N business days without incidents |
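At 5,000+ VM scale, this checklist should be evaluated by tooling rather than by hand. A minimal sketch of encoding the criteria as an automated gate follows; the evidence schema and check names are illustrative, and only the 20% performance tolerance comes from the table above.

```python
# Acceptance-gate sketch: encode the per-VM "done" criteria as checks over
# collected evidence. The evidence dict schema is a hypothetical example.

PERF_TOLERANCE = 0.20  # "within 20% of baseline" from the acceptance table

def within_baseline(measured, baseline, tol=PERF_TOLERANCE):
    return baseline > 0 and abs(measured - baseline) / baseline <= tol

def acceptance_report(evidence):
    """Map each criterion to pass/fail; all must pass before sign-off."""
    perf = evidence["performance"]
    checks = {
        "boots": evidence["console_login_ok"],
        "network": evidence["gateway_ping_ok"] and evidence["dns_resolve_ok"],
        "ip_config": evidence["ip_config"] == evidence["baseline_ip_config"],
        "storage": all(evidence["disks_mounted"].values()),
        "application": evidence["app_health_ok"],
        "performance": all(
            within_baseline(perf[m]["measured"], perf[m]["baseline"])
            for m in ("cpu", "disk_iops")
        ),
        "monitoring": evidence["in_monitoring"],
        "backup": evidence["backup_scheduled"],
        "owner_signoff": evidence["owner_signoff"],
    }
    checks["done"] = all(checks.values())
    return checks

# Example evidence for one migrated VM (values are made up for illustration).
evidence = {
    "console_login_ok": True,
    "gateway_ping_ok": True,
    "dns_resolve_ok": True,
    "ip_config": {"ip": "10.1.2.3", "gw": "10.1.2.1"},
    "baseline_ip_config": {"ip": "10.1.2.3", "gw": "10.1.2.1"},
    "disks_mounted": {"sda": True, "sdb": True},
    "app_health_ok": True,
    "performance": {
        "cpu": {"measured": 0.55, "baseline": 0.50},          # +10%, within 20%
        "disk_iops": {"measured": 9000, "baseline": 10000},   # -10%, within 20%
    },
    "in_monitoring": True,
    "backup_scheduled": True,
    "owner_signoff": True,
}
print(acceptance_report(evidence)["done"])
```

The report can feed the wave tracker directly: a wave closes only when every VM in it reports `done`, and the per-criterion breakdown tells the team where a failing VM needs attention.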
Timeline Estimation
A realistic end-to-end estimate for the full estate (throughput on either target platform is broadly comparable):
Timeline Estimation for 5,000 VM Migration
================================================================
Assumptions:
- 25 Gbps dedicated migration network
- Average VM: 2 disks, 80 GB used data
- 70% Linux, 30% Windows
- 80% simple/medium, 20% complex/special
- Warm migration for all VMs > 50 GB
- 20 parallel migration slots
Phase 1: Setup & Wave 0 (4-6 weeks)
+-----------+----------------------------------------------+
| Week 1-2 | Deploy target platform, configure MTV/Azure |
| | Migrate, set up monitoring, build runbooks |
| Week 3-4 | Wave 0: 20 VMs (PoC validation) |
| Week 5-6 | Process review, runbook refinement |
+-----------+----------------------------------------------+
Phase 2: Dev/Test Waves (3-4 weeks)
+-----------+----------------------------------------------+
| Week 7-8 | Wave 1: 200-500 dev/test VMs |
| Week 9-10 | Validate, measure, optimize throughput |
+-----------+----------------------------------------------+
Phase 3: Production Waves (16-24 weeks)
+-----------+----------------------------------------------+
| Week 11+ | Waves 2-N: 200 VMs/wave, 1-2 waves/week |
| | ~200-280 VMs/week sustained throughput |
| | 4,500 VMs / 200-280 per week = 16-23 weeks |
| | With ramp-up and delays: plan for up to 24 weeks |
+-----------+----------------------------------------------+
Phase 4: Critical Infrastructure & Cleanup (4-6 weeks)
+-----------+----------------------------------------------+
| Final | Domain controllers, DNS, monitoring |
| weeks | Decommission VMware, reclaim hardware |
+-----------+----------------------------------------------+
Total estimated duration: 6-10 months
(assuming dedicated migration team and no major blockers)
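A quick sanity check of the assumptions above shows that raw network transfer is not the limiting factor; the wave process is. The sketch below uses the stated inputs (25 Gbps, 5,000 VMs, 80 GB average used data); the 60% effective-utilisation factor is an assumption covering protocol overhead, conversion time, and vCenter API limits, not a measured value.

```python
# Back-of-envelope check of the timeline assumptions above.
# EFFICIENCY is an assumed effective utilisation of the migration network.

GBPS = 25                 # dedicated migration network
EFFICIENCY = 0.60         # assumed effective utilisation (not measured)
VMS = 5000
AVG_USED_GB = 80          # average used data per VM

total_tb = VMS * AVG_USED_GB / 1000                 # total data to move
effective_gb_per_s = GBPS * EFFICIENCY / 8          # bits -> bytes per second
transfer_days = (total_tb * 1000) / effective_gb_per_s / 86400

vms_per_week = 250                                  # sustained wave throughput
migration_weeks = -(-4500 // vms_per_week)          # production VMs, ceiling

print(f"Total data: {total_tb:.0f} TB")
print(f"Raw transfer at {GBPS} Gbps x {EFFICIENCY:.0%}: {transfer_days:.1f} days")
print(f"Production waves at {vms_per_week} VMs/week: {migration_weeks} weeks")
```

Moving 400 TB takes only a few days of pure wire time, yet the program spans months: validation, driver injection, cutover windows, soak periods, and team capacity dominate the schedule, which is why adding bandwidth beyond 25 Gbps buys little.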
How the Candidates Handle This
| Aspect | OVE (MTV) | Azure Local (Azure Migrate) | Swisscom ESC |
|---|---|---|---|
| Primary tool | Migration Toolkit for Virtualization (MTV / Forklift) | Azure Migrate: Server Migration | Swisscom Professional Services (VMware HCX or vMotion) |
| Source platforms | VMware vSphere, Red Hat Virtualization, OpenStack, oVirt | VMware vSphere, Hyper-V, physical servers | VMware vSphere (VMware-to-VMware) |
| Disk format conversion | VMDK to QCOW2 or raw (via qemu-img / virt-v2v) | VMDK to VHD/VHDX (handled by migration service) | No conversion needed (VMware-to-VMware) |
| Warm migration | Yes. CBT-based delta sync via VDDK. Cutover downtime: 2-5 minutes. | Yes. Snapshot-based replication with CBT. Similar cutover window. | Yes (vMotion for same-vCenter, HCX for cross-vCenter). Near-zero downtime. |
| Cold migration | Yes. Full disk copy + convert + import. | Yes. Full disk copy + convert. | Yes (OVA export/import). |
| Driver injection | virt-v2v: virtio-win for Windows, initramfs rebuild for Linux. Automated. | Azure Migrate: Hyper-V Integration Services injection. Automated for supported OS. | Not needed (same hypervisor). |
| Parallel migrations | Configurable. Typically 10-20 concurrent. Limited by vCenter API and storage write IOPS. | Up to 500 concurrent replications per appliance. | Limited by network bandwidth and vMotion/HCX capacity. |
| UI | OpenShift Console plugin (MTV UI). Web-based. | Azure Portal. Cloud-based. Requires internet. | Swisscom portal + professional services. |
| API / Automation | Kubernetes CRDs. Fully automatable via kubectl, Ansible, Terraform. | Azure REST API, PowerShell, Azure CLI. | API availability unclear. Migration managed by Swisscom. |
| Pre-migration validation | Forklift-validation: automated OS/disk/network checks. | Azure Migrate Assessment: readiness, sizing, cost. | Swisscom professional services assessment. |
| Post-migration hooks | Yes. Ansible playbooks or container hooks at Plan level. | Limited. Azure Automation runbooks. | Not self-service. Managed by Swisscom. |
| Dependency mapping | Not built into MTV. Use third-party or manual CMDB. | Yes. Built-in agentless or agent-based dependency analysis. | Swisscom assessment. |
| Rollback | Manual. Keep source VM powered off, re-start if needed. | Built-in. Can resume replication to source (for Azure cloud). For Azure Local: manual. | VMware-to-VMware rollback is straightforward. |
| Assessment / right-sizing | Not built into MTV. Manual or use third-party tools. | Yes. Performance-based sizing recommendations. Cost estimation. | Included in Swisscom assessment. |
| Scale proven | Proven at 1,000+ VM scale in Red Hat customer deployments. Scaling to 5,000+ requires careful planning. | Proven at large scale (1,000+ VMs per appliance). Microsoft's most mature migration tool. | Swisscom manages migration; scale depends on engagement model. |
| VDDK dependency | Yes. VDDK library required for VMware source. Not redistributable -- must be obtained from VMware/Broadcom. | No. Uses vSphere APIs directly (no VDDK dependency). | N/A (VMware native tools). |
| Windows support | Full. virtio-win drivers for all supported Windows versions. UEFI supported. | Full. Hyper-V enlightenments built into Windows. UEFI Generation 2 supported. | Full (no conversion needed). |
| Linux support | Full. virtio modules in kernel. initramfs rebuild automated. | Full. Hyper-V modules in kernel. | Full (no conversion needed). |
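MTV's CRD-based workflow referenced in the table means a migration wave is declared as Kubernetes objects rather than clicked through a UI. The sketch below builds a warm-migration Plan manifest as a plain Python dict; the field names follow the `forklift.konveyor.io/v1beta1` Plan API, but all names, namespaces, and VM IDs are placeholders for illustration.

```python
# Sketch of an MTV (Forklift) Plan manifest built as a plain dict.
# Provider, map, namespace, and VM identifiers below are hypothetical
# placeholders; only the overall field layout follows the Plan CRD.

def mtv_plan(name, vm_ids, warm=True):
    return {
        "apiVersion": "forklift.konveyor.io/v1beta1",
        "kind": "Plan",
        "metadata": {"name": name, "namespace": "openshift-mtv"},
        "spec": {
            "warm": warm,  # CBT-based pre-copy with delta syncs before cutover
            "provider": {
                "source": {"name": "vmware-prod", "namespace": "openshift-mtv"},
                "destination": {"name": "host", "namespace": "openshift-mtv"},
            },
            "map": {
                "network": {"name": "wave2-netmap", "namespace": "openshift-mtv"},
                "storage": {"name": "wave2-storagemap", "namespace": "openshift-mtv"},
            },
            "targetNamespace": "migrated-vms",
            "vms": [{"id": vm_id} for vm_id in vm_ids],
        },
    }

plan = mtv_plan("wave-2", ["vm-1042", "vm-1043"])
print(plan["metadata"]["name"], len(plan["spec"]["vms"]), "VMs")
```

Serialized to YAML, such a manifest can live in Git and be applied with `kubectl apply`, which is what makes wave definitions reviewable and repeatable; actually executing the plan additionally requires a Migration resource referencing it.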
Key Takeaways
- Warm migration is non-negotiable at 5,000+ VM scale. Cold migration's downtime (proportional to disk size) is unacceptable for production VMs. Both MTV and Azure Migrate support warm migration with CBT-based delta sync, reducing cutover downtime to minutes regardless of disk size. The migration network should be 25 Gbps dedicated to support the sustained data transfer volume.
- VMDK snapshot chains must be consolidated before migration. No conversion tool can migrate a VM with active VMware snapshots. The pre-migration validation phase must verify that every VM's snapshot chain is fully consolidated into a single flat VMDK. This is a common source of migration failures that can be entirely prevented with upfront validation.
- Windows VM conversion is the highest-risk area. Linux VMs almost always convert cleanly because virtio drivers are in the kernel. Windows VMs require driver injection (virtio-win for OVE, Integration Services for Azure Local), bootloader reconfiguration, and frequently have NIC ordering and Windows activation issues. Budget extra time and manual validation for every Windows VM.
- MTV's Kubernetes-native approach is both a strength and a complexity. MTV's CRD-based workflow (Provider, NetworkMap, StorageMap, Plan, Migration) integrates naturally into GitOps and automation pipelines. However, it requires Kubernetes expertise that a VMware-trained team may not have. Azure Migrate's Azure Portal-based workflow is more accessible to traditional infrastructure teams but less automatable.
- VDDK is a licensing and supply-chain dependency for MTV. MTV requires VMware's proprietary VDDK library to access VMware disks. VDDK must be obtained from VMware/Broadcom under their SDK license. In a scenario where the organization is leaving VMware due to licensing concerns, maintaining a VDDK dependency during migration is an awkward but necessary reality. Azure Migrate does not have this dependency.
- The migration is a 6-10 month program, not a one-time event. Wave planning, migration factory operations, validation, soak periods, and rollback handling require a dedicated team operating at sustained capacity. The total elapsed time depends on parallel migration capacity (constrained by network bandwidth, target storage throughput, and team size) and the tolerance for downtime windows.
- Swisscom ESC migration is technically trivial but strategically questionable. VMware-to-VMware migration avoids all conversion risks (no format conversion, no driver injection, no bootloader changes). However, it does not address the strategic objective of leaving VMware. If the long-term goal is VMware independence, ESC migration is a lateral move that defers the conversion problem.
- Disk format selection on OVE depends on the storage backend. If OVE uses ODF (Ceph RBD), raw format is preferred because Ceph provides thin provisioning and snapshots at the RADOS layer, making QCOW2's features redundant overhead. If the storage backend lacks these capabilities, QCOW2 provides them at the image level. This decision should be made before migration begins, as converting between formats after migration adds unnecessary work.
- Post-migration validation must be application-aware, not just infrastructure-aware. A VM that boots, has network connectivity, and shows healthy CPU/memory metrics can still have a broken application. Migration hooks (MTV) or automation runbooks (Azure Migrate) should include application-specific health checks: HTTP endpoints, database connectivity tests, queue processing verification, and end-to-end transaction tests.
- Rollback readiness is a planning requirement, not an afterthought. Retain source VMware VMs (powered off) through the soak period. Use short DNS TTLs during cutover. Keep load balancer configurations reversible. Archive VMDKs to cold storage for a defined retention period. The cost of maintaining rollback capability is small compared to the cost of a failed migration with no way back.
Discussion Guide
Use these questions when engaging with vendors, Red Hat/Microsoft/Swisscom field teams, or internal subject matter experts.
Migration Tooling and Process
- Demonstrate a warm migration of a Windows Server 2022 VM with 500 GB of disk data, domain-joined, running IIS with a SQL Server backend. Walk through every step: pre-copy, delta syncs, cutover, driver injection, boot on target, application validation. What is the total cutover downtime? Does Windows activation survive the migration? Why this matters: Windows VMs with large disks and domain membership represent the most common complex migration scenario. The demonstration must prove that the full pipeline works end-to-end, including driver injection and post-conversion functionality.
- Show how MTV/Azure Migrate handles a VM with multiple NICs on different VLANs. After migration, are the NIC-to-VLAN mappings preserved? Is the NIC ordering inside the guest OS preserved? What happens to static IP configurations? Why this matters: Multi-NIC VMs (e.g., management + data + backup networks) are common in enterprise environments. NIC ordering changes can break applications that bind to specific interface names.
- What is the maximum number of concurrent migrations you have demonstrated in a production customer environment? What were the bottlenecks? How was the migration network sized? Why this matters: The vendor's answer reveals real-world scale limitations that may not appear in lab tests. The number must be compared against the organization's throughput requirements for the migration timeline.
- How does the tool handle a conversion failure mid-flight? If virt-v2v / Azure Migrate fails during driver injection on VM number 47 out of 200 in a wave, what happens to VM 47? What happens to the remaining 153 VMs in the wave? Is there automatic retry? Why this matters: At scale, failures are statistical certainties. The tool's failure handling (retry, skip, abort-wave) determines how much manual intervention is needed per wave.
- Demonstrate rollback: migrate a VM to the new platform, let it run for 24 hours with active data changes, then roll back to VMware. How is data handled? Is there any data loss? How long does rollback take? Why this matters: Rollback is the safety net. If rollback is painful or lossy, the team will hesitate to proceed with production migrations, slowing the entire program.
Disk Formats and Conversion
- What disk format is recommended on the target platform (QCOW2 vs. raw for OVE, fixed vs. dynamic VHDX for Azure Local), and why? What is the performance difference? Can the format be changed after migration without re-migrating the VM? Why this matters: The disk format choice affects I/O performance, storage efficiency, and snapshot capabilities. Making the wrong choice pre-migration may require costly re-conversion later.
- How does the conversion tool handle a VMDK with seSparse snapshots that have not been consolidated? Does it detect this condition and warn the user, or does it attempt conversion and fail? Why this matters: Unconsolidated snapshots are the most common pre-migration blocker. The tool should fail fast with a clear error message, not attempt a conversion that produces a corrupted disk.
Wave Planning and Scale
- For an estate of 5,000+ VMs, what is the recommended migration team size, wave cadence, and total project duration? What are the key assumptions behind that estimate? Provide reference customers of comparable scale. Why this matters: The vendor's answer calibrates timeline expectations against real-world experience. Reference customers provide validation that the tool and process have been proven at this scale.
- How does your tooling support wave planning? Can we group VMs by application, tag them, and execute them as a unit? Can we define dependencies between waves (e.g., "do not start Wave 3 until Wave 2 validation is complete")? Why this matters: Wave orchestration at scale requires tooling support, not just spreadsheets. The ability to define, track, and gate waves determines how safely the migration progresses.
- What happens if the VMware license expires or VMware support is terminated during the migration? Does the migration tool still function? Are there any VDDK or vCenter API dependencies that would break? Why this matters: The migration timeline may overlap with VMware contract expiration. If the migration tool depends on active VMware licensing (VDDK, vCenter), a license lapse could halt the migration program. This is a genuine risk that must be contractually and technically mitigated.