NSX-T -- Current Networking Baseline
Why This Matters
NSX-T (now branded "VMware NSX" under Broadcom) is the SDN platform that the organization has operated for years. Every firewall rule, every overlay segment, every load-balanced virtual service, and every distributed routing decision currently runs through NSX. Before evaluating OVN (OVE), Azure SDN/Network Controller (Azure Local), or VLAN-based networking (Swisscom ESC), the team must deeply understand what NSX-T actually does -- not at a marketing level, but at the data-plane-packet-path level.
This understanding serves three purposes:
- Gap analysis: Which NSX capabilities are essential to replicate, and which are VMware-specific implementation details that any competent SDN replaces differently?
- Migration risk identification: Which NSX features have created hidden dependencies in the current environment (e.g., DFW rules referencing vCenter objects, Tier-1 gateways providing DHCP for specific segments)?
- Operational baseline: What does "normal" look like today -- latency, rule counts, upgrade cadence, incident patterns -- so the replacement can be measured against real experience rather than vendor promises?
NSX-T is not a single product. It is a layered system of management, control, and data planes, each with its own failure modes, scaling limits, and operational characteristics. This document dissects each layer.
Concepts
1. NSX-T Architecture -- Management, Control, and Data Planes
NSX-T separates its functionality into three planes, each with distinct components and failure characteristics.
+=====================================================================+
| MANAGEMENT PLANE (MP) |
| |
| +---------------------+ +---------------------+ +-----------+ |
| | NSX Manager Node 1 | | NSX Manager Node 2 | | NSX Mgr 3 | |
| | (UI, API, Policy) | | (UI, API, Policy) | | (standby) | |
| +----------+----------+ +----------+----------+ +-----+-----+ |
| | | | |
| +------------------------+--------------------+ |
| VIP (Virtual IP) |
| Corfu distributed datastore |
+============================+========================================+
|
MP-to-CCP channel
(gRPC / protobuf)
|
+============================v========================================+
| CONTROL PLANE (CCP) |
| |
| Central Control Plane (runs inside NSX Manager nodes) |
| - Computes forwarding tables from desired-state config |
| - Pushes flow rules to transport nodes |
| - Resolves logical-to-physical bindings |
| |
| Local Control Plane (LCP) -- runs on EVERY transport node |
| - nsx-proxy daemon on ESXi hosts / KVM hosts / Edge nodes |
| - Receives rules from CCP, programs the local data plane |
| - Reports upward: port status, tunnel health, statistics |
+============================+========================================+
|
CCP-to-LCP channel
(RPC / protobuf over TCP)
|
+============================v========================================+
| DATA PLANE (DP) |
| |
| ESXi hosts: N-VDS (NSX Virtual Distributed Switch) |
| or VDS 7.0+ with NSX integration |
| - GENEVE encap/decap |
| - DFW rule enforcement (per-vNIC kernel module) |
| - Distributed routing (DR component) |
| |
| Edge Nodes: SR (Service Router) for north-south traffic |
| - NAT, VPN, Gateway Firewall |
| - BGP/OSPF peering with physical routers |
| - Bare-metal or VM form factor |
| |
| KVM hosts: OVS-based data plane with NSX agent |
+=====================================================================+
Management Plane (MP) -- NSX Manager Cluster
The NSX Manager cluster consists of three nodes (for production environments) running the Corfu distributed datastore. Corfu is a shared-log-based database that replaces the vPostgres database used in older NSX versions.
Key responsibilities:
- Policy API (the /policy/api/v1/ endpoint) -- the primary interface for configuring NSX. All modern configurations (segments, gateways, firewall rules) go through the Policy API (a request sketch follows this list).
- Manager API (the /api/v1/ endpoint) -- the legacy "imperative" API. Still used by some integrations but deprecated for new configurations.
- UI -- the NSX Manager web console, accessible via the VIP or individual node IPs.
- Configuration persistence -- all desired-state configuration is stored in Corfu and replicated across the three nodes.
- RBAC -- role-based access control, integrated with vIDM (VMware Identity Manager) or LDAP.
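The declarative pattern of the Policy API is worth internalizing before comparing it to OVN's or Azure Local's configuration models: the client PUTs the desired state of an object at a stable URL, and the management/control planes realize it asynchronously. Below is a minimal sketch using Python and the requests library; the manager address, credentials, segment name, and transport zone path are placeholders, and field names should be verified against the API documentation for the deployed NSX version.

```python
import requests

NSX_MANAGER = "https://nsx-mgr.example.internal"   # placeholder VIP
AUTH = ("admin", "REPLACE_ME")                      # placeholder credentials

# Desired state for an overlay segment; the CCP/LCP realize it asynchronously.
segment = {
    "display_name": "web-tier",
    "transport_zone_path": "/infra/sites/default/enforcement-points/default"
                           "/transport-zones/OVERLAY-TZ",   # placeholder TZ id
    "subnets": [{"gateway_address": "10.10.1.1/24"}],
}

# PUT is idempotent: re-sending the same body leaves the object unchanged.
resp = requests.put(
    f"{NSX_MANAGER}/policy/api/v1/infra/segments/web-tier",
    json=segment,
    auth=AUTH,
    verify=False,  # lab-only shortcut; use proper CA validation in production
)
resp.raise_for_status()
print(resp.json().get("path"))   # e.g. /infra/segments/web-tier
```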
Operational pain point -- MP availability: If the NSX Manager cluster loses quorum (2 of 3 nodes down), no new configuration can be written. Existing data-plane forwarding continues (the data plane is autonomous once programmed), but no new VMs can be connected to NSX segments, no firewall rules can be modified, and no new logical routers can be created. This is a critical distinction: MP failure is a management failure, not a forwarding failure. However, in a 5,000+ VM environment where provisioning happens continuously, an MP outage effectively halts operations.
Operational pain point -- MP upgrades: NSX Manager upgrades are rolling (one node at a time), but each upgrade step requires the cluster to re-form quorum. In practice, major version upgrades (e.g., 3.2 to 4.1) can take 4-8 hours and require a maintenance window. During the upgrade, mixed-version states can cause API inconsistencies.
Central Control Plane (CCP)
The CCP runs as a process within the NSX Manager nodes (it is not a separate appliance). Its job is to take the desired-state configuration from the MP and compute the realized-state forwarding tables that each transport node needs.
Example: When an administrator creates a DFW rule "allow TCP/443 from Group-A to Group-B," the CCP must:
- Resolve Group-A and Group-B memberships (which VMs, which IPs, which vNICs).
- Compute the specific flow rules for every host that has members of either group.
- Push those rules (via protobuf messages) to the LCP agent on each relevant host.
Operational pain point -- CCP split-brain: In rare failure scenarios (network partition between NSX Manager nodes), the CCP can enter a split-brain state where two nodes believe they are the active controller. This results in conflicting rules being pushed to transport nodes. Recovery requires manual intervention -- identify the stale controller, restart it, and force re-synchronization. In a 5,000+ VM environment, a CCP split-brain during a DFW rule change can cause transient connectivity loss for affected VMs.
Local Control Plane (LCP)
The LCP is the nsx-proxy daemon running on every ESXi host, KVM host, and Edge node. It maintains a persistent connection to the CCP and programs the local data plane (N-VDS or VDS) based on the rules it receives.
Key behaviors:
- Stateful connection to CCP: If the CCP connection is lost, the LCP continues operating with its last-known rules. New configuration cannot be applied until the connection is restored.
- Local caching: The LCP caches all rules locally. If the ESXi host reboots, the LCP can restore the data plane from its local cache without waiting for the CCP -- assuming the cache has not been corrupted.
- Heartbeat and health reporting: The LCP sends periodic heartbeats to the CCP. If heartbeats stop, the CCP marks the transport node as "degraded" in the UI.
2. Transport Zones and Transport Nodes
A transport zone defines the scope of a logical network -- which hosts can participate in a given overlay or VLAN-backed segment.
Transport Zone Types:
+-------------------------------------------------------+
| OVERLAY TRANSPORT ZONE |
| |
| - Uses GENEVE encapsulation |
| - Hosts communicate via TEP (Tunnel Endpoint) IPs |
| - Segments are identified by VNI (24-bit) |
| - BUM traffic handled via head-end replication |
| or multicast (rare in production) |
| |
| Scope: All hosts that need overlay connectivity |
| Typical: One overlay TZ spanning the entire cluster |
| |
| +-----------+ GENEVE tunnel +-----------+ |
| | ESXi-01 |<------------------->| ESXi-02 | |
| | TEP: .10 | UDP/6081 | TEP: .11 | |
| +-----------+ +-----------+ |
| | | |
| [VM-A] [VM-B] |
| Segment: web-tier Segment: web-tier|
| VNI: 72000 VNI: 72000 |
+-------------------------------------------------------+
+-------------------------------------------------------+
| VLAN TRANSPORT ZONE |
| |
| - No encapsulation -- direct 802.1Q VLAN tagging |
| - Used for: |
| - Edge node uplinks (Tier-0 peering with physical) |
| - Bridging between overlay and physical VLANs |
| - Hosts that need direct VLAN access |
| |
| Scope: Hosts that need specific VLAN connectivity |
| Typical: One VLAN TZ per site or per function |
| |
| +-----------+ 802.1Q trunk +-----------+ |
| | Edge-01 |<------------------->| Leaf SW | |
| | VLAN 100 | | VLAN 100 | |
| +-----------+ +-----------+ |
+-------------------------------------------------------+
Transport Nodes
A transport node is any host or Edge appliance that participates in NSX networking. When a host is added as a transport node:
- The NSX VIBs (vSphere Installation Bundles) are installed on ESXi, or the NSX agent is installed on KVM.
- The host is assigned to one or more transport zones.
- A TEP (Tunnel Endpoint) IP is assigned from a TEP IP pool. This IP is used as the outer source/destination IP for GENEVE tunnels.
- The N-VDS (or VDS 7.0+) is configured with the appropriate uplink profiles (teaming, MTU, VLAN for TEP traffic).
TEP communication: Every transport node establishes GENEVE tunnels (UDP port 6081) to every other transport node in the same overlay transport zone. In a cluster of 100 ESXi hosts, this means up to 4,950 tunnel pairs (n*(n-1)/2). The tunnels are established on-demand (when VMs on different hosts need to communicate) and are torn down when idle.
TEP Communication Model (100-host cluster):
TEP IP Pool: 192.168.250.0/24
ESXi-01 (TEP: 192.168.250.10)
|--- GENEVE tunnel ---> ESXi-02 (TEP: 192.168.250.11)
|--- GENEVE tunnel ---> ESXi-03 (TEP: 192.168.250.12)
|--- GENEVE tunnel ---> ESXi-04 (TEP: 192.168.250.13)
| ...
|--- GENEVE tunnel ---> ESXi-100 (TEP: 192.168.250.109)
GENEVE Packet on the wire:
+------------------+------------------+-----------------+
| Outer Ethernet | Outer IP | Outer UDP |
| Dst: next-hop MAC | Src: 192.168.250.10 | Src: (flow hash) |
| Src: ESXi-01 MAC  | Dst: 192.168.250.11 | Dst: 6081        |
+------------------+------------------+-----------------+
| GENEVE Header |
| VNI: 72000 (identifies the logical segment) |
| Options: (variable, e.g., security tag, trace flag) |
+-----------------+------------------------------------+
| Inner Ethernet | Inner IP | Payload |
| (original VM | (original VM | (original VM |
| frame) | packet) | data) |
+-----------------+-------------------+-----------------+
Total overhead: 50-74 bytes (depending on GENEVE options)
--> Physical MTU MUST be >= inner MTU + 74 bytes
--> For inner MTU 1500: physical MTU >= 1574 (use 1600 or 9000)
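The two numbers above (full-mesh tunnel count and minimum physical MTU) are plain arithmetic and can be sanity-checked with a few lines of Python; this is not an NSX API call.

```python
def tunnel_pairs(hosts: int) -> int:
    """Full-mesh GENEVE tunnel pairs between TEPs: n*(n-1)/2."""
    return hosts * (hosts - 1) // 2

def min_physical_mtu(inner_mtu: int, geneve_overhead: int = 74) -> int:
    """Worst-case GENEVE overhead per the figures above (50-74 bytes)."""
    return inner_mtu + geneve_overhead

print(tunnel_pairs(100))          # 4950
print(min_physical_mtu(1500))     # 1574 -> configure 1600 or 9000 in practice
```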
N-VDS vs. VDS Integration
Historically, NSX-T required its own virtual switch -- the N-VDS (NSX-managed Virtual Distributed Switch), which was separate from the vDS (vSphere Distributed Switch) used by vCenter. This created operational complexity: two separate switches on the same host, two sets of uplink configurations, and potential for misconfiguration.
Starting with NSX-T 3.0 and VDS 7.0, NSX can be installed on top of the standard vDS, eliminating the need for a separate N-VDS. This is now the recommended configuration and is mandatory for VCF (VMware Cloud Foundation) deployments.
Migration note: If the current environment still uses N-VDS, migrating from N-VDS to VDS should be considered before the platform migration -- not during it. Migrating two things simultaneously (NSX switch type + platform) compounds risk.
3. Logical Switching -- Segments
An NSX segment is the Layer-2 broadcast domain where VMs connect. It is the equivalent of a VLAN in traditional networking but with overlay encapsulation providing isolation and scalability.
Segment Types
| Type | Backing | Encapsulation | Use Case |
|---|---|---|---|
| Overlay Segment | Overlay Transport Zone | GENEVE (VNI) | Default for VM-to-VM L2 connectivity across hosts |
| VLAN Segment | VLAN Transport Zone | 802.1Q (VLAN ID) | Edge uplinks, bridging to physical, specific legacy needs |
GENEVE Encapsulation
NSX-T uses GENEVE (Generic Network Virtualization Encapsulation, RFC 8926) as its overlay protocol. GENEVE was chosen over VXLAN because:
- Extensible options field: GENEVE supports variable-length options in the header, allowing NSX to embed metadata (security tags, tracing flags, QoS markings) without modifying the encapsulation protocol.
- 24-bit VNI: Like VXLAN, GENEVE uses a 24-bit Virtual Network Identifier, supporting up to 16,777,216 segments (vs. 4,094 VLANs).
- UDP-based: GENEVE uses UDP port 6081 (VXLAN uses 4789). Both benefit from ECMP hashing on the outer UDP header in the physical fabric.
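The fixed GENEVE header is only 8 bytes. The sketch below packs one in Python following the RFC 8926 layout (version, option length, O/C flags, protocol type, 24-bit VNI) purely to show where the VNI sits on the wire; it is not how NSX builds packets -- the ESXi data plane does this in the kernel.

```python
import struct

def geneve_header(vni: int, opt_len_words: int = 0, critical: bool = False) -> bytes:
    """Pack the 8-byte fixed GENEVE header (RFC 8926), no options."""
    ver_optlen = (0 << 6) | (opt_len_words & 0x3F)   # version 0, option length in 4-byte words
    flags = 0x40 if critical else 0x00               # O=0; C bit set if critical options present
    protocol_type = 0x6558                           # Transparent Ethernet Bridging (inner frame)
    vni_and_rsvd = (vni & 0xFFFFFF) << 8             # 24-bit VNI followed by 8 reserved bits
    return struct.pack("!BBHI", ver_optlen, flags, protocol_type, vni_and_rsvd)

hdr = geneve_header(vni=72000)
print(hdr.hex())   # 0000655801194000 -> VNI 72000 = 0x011940
```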
BUM Traffic Handling
BUM (Broadcast, Unknown Unicast, Multicast) traffic in overlay networks requires special handling because there is no physical switch to flood frames across all ports.
NSX-T supports two modes:
- Head-End Replication (default): The source TEP replicates the BUM frame and sends a unicast copy inside a GENEVE tunnel to every other TEP in the segment. This is simple but scales poorly -- a broadcast from one VM in a segment with 200 VMs on 50 hosts generates 49 unicast copies from the source TEP.
- Two-Tier Replication: The source TEP sends the BUM frame to a set of designated "replication proxy" TEPs (typically one per rack or per cluster). Each proxy then replicates to the TEPs in its scope. This reduces the replication burden on the source but adds latency.
Head-End Replication (default):
VM-A broadcasts an ARP request on Segment "web-tier" (VNI 72000)
VM-A is on ESXi-01 (TEP .10)
Segment has VMs on ESXi-02 (.11), ESXi-05 (.14), ESXi-12 (.21)
ESXi-01 TEP .10 ---GENEVE(VNI=72000)---> ESXi-02 TEP .11
ESXi-01 TEP .10 ---GENEVE(VNI=72000)---> ESXi-05 TEP .14
ESXi-01 TEP .10 ---GENEVE(VNI=72000)---> ESXi-12 TEP .21
3 unicast copies sent by ESXi-01 for 1 broadcast frame.
With 100 hosts in the segment: 99 copies per broadcast frame.
ARP Suppression
To reduce BUM traffic (which is mostly ARP in IPv4 networks), NSX-T implements ARP suppression at the data plane level. When a VM sends an ARP request:
- The local N-VDS/VDS intercepts the ARP request before it enters the overlay.
- The data plane checks a local ARP suppression table (populated by the CCP from learned MAC/IP bindings).
- If the target IP/MAC is known, the data plane generates a proxy ARP reply locally -- the ARP request never reaches the overlay.
- If the target is unknown, the ARP request is forwarded into the overlay as a BUM frame (head-end replication).
ARP suppression dramatically reduces BUM traffic in large segments. In a segment with 500 VMs, it can reduce ARP broadcast traffic by 90%+ after the initial learning phase.
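A conceptual model of the suppression logic, in plain Python: the suppression table is a per-segment IP-to-MAC map pushed down by the CCP, a hit is answered locally, and a miss falls back to head-end replication. This is an illustration of the behavior described above, not NSX code; all names and addresses are invented.

```python
# Per-segment ARP suppression table (conceptually populated by the CCP).
arp_table = {
    ("web-tier", "10.10.1.11"): "00:50:56:aa:bb:02",
    ("web-tier", "10.10.1.12"): "00:50:56:aa:bb:03",
}

# Remote TEPs that host VMs on this segment (excluding the local TEP).
remote_teps = {"web-tier": ["192.168.250.11", "192.168.250.14", "192.168.250.21"]}

def handle_arp_request(segment: str, target_ip: str) -> str:
    mac = arp_table.get((segment, target_ip))
    if mac:
        # Hit: answer locally -- nothing enters the overlay.
        return f"proxy ARP reply: {target_ip} is-at {mac}"
    # Miss: flood as BUM via head-end replication (one copy per remote TEP).
    copies = remote_teps.get(segment, [])
    return f"flood ARP request in GENEVE to {len(copies)} TEPs: {copies}"

print(handle_arp_request("web-tier", "10.10.1.11"))  # suppressed locally
print(handle_arp_request("web-tier", "10.10.1.99"))  # head-end replicated
```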
4. Logical Routing -- Tier-0 and Tier-1 Gateways
NSX-T uses a two-tier routing model that separates north-south routing (Tier-0) from tenant/application routing (Tier-1). Both tiers have two components: a Distributed Router (DR) running on every transport node and a Service Router (SR) running on Edge nodes.
Two-Tier Routing Architecture:
Physical Network
(BGP peers, ISP,
branch offices)
|
| BGP / OSPF / Static
|
+--------v--------+
| TIER-0 SR | <-- Runs on Edge Node(s)
| (Service Router)| Handles: BGP/OSPF peering,
| | NAT, VPN, Gateway Firewall
| Active-Standby |
| or Active-Active|
+--------+--------+
|
Transit Segment (100.64.x.x/31)
(auto-created by NSX)
|
+--------v--------+
| TIER-0 DR | <-- Runs on EVERY host
| (Distributed | Handles: first-hop routing
| Router) | for segments directly on T0
+--------+--------+
|
Tier-0-to-Tier-1 transit
(100.64.x.x/31, auto-created)
|
+--------v--------+
| TIER-1 SR | <-- Runs on Edge Node(s)
| (Service Router)| Handles: NAT, LB, GW FW
| | for this T1 (optional --
| Only needed if | only if services are
| services are | configured on this T1)
| configured |
+--------+--------+
|
Transit Segment
|
+--------v--------+
| TIER-1 DR | <-- Runs on EVERY host that has
| (Distributed | VMs connected to this T1's
| Router) | segments
+--------+--------+
| |
+------+ +------+
| |
+-----------+ +-----------+
| Segment A | | Segment B |
| 10.10.1.0 | | 10.10.2.0 |
| /24 | | /24 |
+-----------+ +-----------+
| | | |
VM-1 VM-2 VM-3 VM-4
Distributed Router (DR) -- The Performance Engine
The DR is the key performance feature of NSX-T routing. It runs as a kernel module on every ESXi host, making routing decisions locally without sending traffic to a centralized router.
How DR routing works for east-west traffic:
VM-A (10.10.1.10, on Segment A, ESXi-01)
wants to reach
VM-B (10.10.2.20, on Segment B, ESXi-03)
1. VM-A sends packet to its default gateway: 10.10.1.1
(This is the Tier-1 DR on ESXi-01)
2. The Tier-1 DR on ESXi-01 performs a route lookup:
- 10.10.2.0/24 is connected to Segment B on this same Tier-1
- DR resolves VM-B's MAC via ARP (or ARP suppression cache)
3. The DR rewrites:
- Src MAC = Tier-1 DR's MAC for Segment B interface
- Dst MAC = VM-B's MAC
- Decrements TTL
4. The packet is encapsulated in GENEVE (VNI for Segment B)
and sent from ESXi-01's TEP to ESXi-03's TEP
5. ESXi-03 decapsulates, delivers to VM-B on Segment B.
CRITICAL: The packet NEVER left ESXi-01 for routing.
The routing decision was made locally in the kernel.
Only the final delivery required crossing the overlay.
Traffic between different subnets on the same host
is routed ENTIRELY locally -- it never hits the wire.
This distributed routing model means that east-west traffic between subnets on the same Tier-1 gateway does not hairpin through an Edge node. The routing happens at wire speed in the ESXi kernel. This is the single most important NSX-T capability to understand when evaluating replacements -- any replacement that centralizes inter-subnet routing will introduce latency and create a bandwidth bottleneck.
Service Router (SR) -- The Services Engine
The SR runs on Edge nodes (bare-metal or VM) and handles services that cannot be distributed: NAT, VPN (IPsec, L2VPN), Gateway Firewall (stateful), and load balancing. Traffic only flows through the SR when it needs one of these services.
Tier-0 SR: Handles north-south traffic leaving the NSX domain. Peers with the physical network via BGP or OSPF. Supports Active-Standby (one SR active, one standby with stateful failover) or Active-Active (ECMP across multiple SRs, but stateful services like NAT are distributed across SRs per-flow).
Tier-1 SR: Only instantiated when the Tier-1 gateway has services configured (NAT, LB, GW Firewall). If a Tier-1 has no services, all routing is handled by the DR, and no SR is created -- this is the optimal configuration for pure east-west routing.
Route Redistribution
NSX-T supports route redistribution between Tier-0 and Tier-1, and between Tier-0 and the physical network:
Route Redistribution Flow:
Physical Network
^
| BGP advertisements
| (controlled by route redistribution rules on Tier-0)
|
Tier-0 Gateway
| Redistributes:
| - Tier-1 connected subnets (10.10.1.0/24, 10.10.2.0/24)
| - Tier-1 NAT IPs (if configured)
| - Tier-1 LB VIPs (if configured)
| - Tier-0 static routes
| - Tier-1 static routes
|
Tier-1 Gateway
| Redistributes to Tier-0:
| - Connected segments (automatic if "Route Advertisement" enabled)
| - NAT rules (SNAT/DNAT IPs)
| - LB VIPs
| - Static routes
|
Segments
(10.10.1.0/24, 10.10.2.0/24)
BGP configuration on Tier-0:
- Tier-0 SR peers with physical leaf/spine switches via eBGP.
- Typical configuration: two Edge nodes in Active-Standby, each peering with a different leaf switch.
- The Tier-0 announces redistributed routes (Tier-1 subnets, NAT IPs, VIPs) to the physical fabric.
- The physical fabric announces default routes or specific external routes to the Tier-0.
- BFD (Bidirectional Forwarding Detection) is strongly recommended for sub-second failover.
OSPF support: NSX-T supports OSPF on Tier-0 for environments that use OSPF on the physical fabric. However, BGP is recommended for new deployments due to better ECMP support and simpler multi-tenancy.
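For the baseline inventory it helps to know where the BGP neighbor configuration lives in the Policy API tree. The sketch below is hedged: the Tier-0 id, locale-services name ("default"), neighbor id, peer address, AS number, and exact field names (for example, remote_as_num being a string) are assumptions to verify against the deployed API version.

```python
import requests

NSX_MANAGER = "https://nsx-mgr.example.internal"    # placeholder VIP
AUTH = ("admin", "REPLACE_ME")

neighbor = {
    "display_name": "leaf1-peer",
    "neighbor_address": "192.0.2.1",     # leaf switch peering address (example)
    "remote_as_num": "65001",            # eBGP peer AS (example)
    "bfd": {"enabled": True},            # sub-second failover, as recommended above
}

# Assumed Policy API path: Tier-0 -> locale service -> BGP -> neighbors
url = (f"{NSX_MANAGER}/policy/api/v1/infra/tier-0s/T0-GW"
       "/locale-services/default/bgp/neighbors/leaf1-peer")
resp = requests.put(url, json=neighbor, auth=AUTH, verify=False)
resp.raise_for_status()
```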
Routing Between Tiers
The Tier-0-to-Tier-1 connection uses automatically created transit segments with /31 link addresses from the 100.64.0.0/10 range (RFC 6598). These transit segments are internal to NSX and not visible to VMs or the physical network.
Inter-tier transit (automatic):
Tier-0 DR: 100.64.0.0/31 (interface .0)
|
| Transit Segment (auto-created)
|
Tier-1 DR: 100.64.0.1/31 (interface .1)
Multiple Tier-1s get separate transit segments:
T0 <-> T1-A: 100.64.0.0/31
T0 <-> T1-B: 100.64.0.2/31
T0 <-> T1-C: 100.64.0.4/31
5. Distributed Firewall (DFW)
The DFW is NSX-T's most critical security feature and the single hardest capability to replicate in a migration. It enforces firewall rules at the vNIC level of every VM, directly in the ESXi kernel, before traffic enters the virtual switch.
DFW Enforcement Point:
+---------+
| VM |
| vNIC |
+----+----+
|
+----v----+
| DFW | <-- Rules evaluated HERE, per-packet
| filter | In the ESXi kernel (vmkernel module)
+---------+ Before traffic reaches the N-VDS/VDS
|
+----v----+
| N-VDS / |
| VDS |
+---------+
|
(to overlay / physical network)
DFW processes BOTH directions:
- Egress: packets leaving the VM
- Ingress: packets arriving at the VM
The filter sits between the vNIC and the virtual switch.
Rule Processing Order
DFW rules are organized into categories (also called "sections" in the Manager API). Rules are evaluated top-to-bottom within a category, and categories are evaluated in a fixed order:
DFW Rule Processing Order:
1. ETHERNET (L2 rules -- EtherType, MAC-based)
|
v
2. EMERGENCY (break-glass rules -- highest priority L3/L4)
|
v
3. INFRASTRUCTURE (rules for infrastructure services -- DNS,
| NTP, AD, backup agents)
v
4. ENVIRONMENT (zone-based rules -- prod vs. dev,
| PCI vs. non-PCI)
v
5. APPLICATION (application-specific rules -- allow
| web-to-app, app-to-db, etc.)
v
6. DEFAULT (catch-all -- typically "deny all" for
zero-trust, or "allow all" for legacy)
Within each category, rules are evaluated top-to-bottom.
First match wins -- no further rules are evaluated.
+----------------------------------------------------+
| Category: APPLICATION |
| |
| Rule 1: Allow | Group-Web --> Group-App | TCP/8080 | <-- match? stop
| Rule 2: Allow | Group-App --> Group-DB | TCP/5432 | <-- match? stop
| Rule 3: Drop | Any --> Group-DB | Any | <-- match? stop
| Rule 4: Allow | Group-Mgmt--> Any | TCP/22 | <-- match? stop
+----------------------------------------------------+
| |
| If no rule matches in APPLICATION, move to DEFAULT |
+----------------------------------------------------+
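The category ordering and first-match semantics can be expressed as a small evaluation loop. This is a conceptual model of what the kernel module does per packet, not actual NSX code; the matchers loosely mirror the APPLICATION rules in the table above.

```python
# Categories in fixed evaluation order, each holding ordered (match, action) rules.
CATEGORY_ORDER = ["ETHERNET", "EMERGENCY", "INFRASTRUCTURE",
                  "ENVIRONMENT", "APPLICATION", "DEFAULT"]

policy = {
    "APPLICATION": [
        (lambda p: p["src"] == "web" and p["dst"] == "app" and p["dport"] == 8080, "ALLOW"),
        (lambda p: p["src"] == "app" and p["dst"] == "db" and p["dport"] == 5432, "ALLOW"),
        (lambda p: p["dst"] == "db", "DROP"),
    ],
    "DEFAULT": [(lambda p: True, "DROP")],   # zero-trust catch-all
}

def evaluate(packet: dict) -> str:
    for category in CATEGORY_ORDER:
        for match, action in policy.get(category, []):
            if match(packet):
                return action          # first match wins, evaluation stops
    return "DROP"

print(evaluate({"src": "web", "dst": "app", "dport": 8080}))   # ALLOW
print(evaluate({"src": "web", "dst": "db",  "dport": 5432}))   # DROP (web may not reach db)
```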
Applied-To Scope
Every DFW rule has an "Applied-To" field that determines which vNICs the rule is actually pushed to. This is critical for performance:
- Applied-To: DFW (default) -- the rule is pushed to every vNIC on every host. In a 5,000-VM environment, this means every host evaluates this rule for every packet on every VM, even if the rule only applies to 5 VMs. This is the most common misconfiguration and the primary cause of DFW performance problems at scale.
- Applied-To: specific group -- the rule is only pushed to hosts that have VMs matching the group. This dramatically reduces the number of rules each host must evaluate.
Applied-To Impact (5,000 VMs across 100 hosts):
Scenario A: 1,000 rules, all Applied-To: DFW
--> Every host receives all 1,000 rules
--> Every packet on every VM is evaluated against 1,000 rules
--> Total rule evaluations per packet: up to 1,000
Scenario B: 1,000 rules, Applied-To: specific groups
--> Each host receives only rules for its local VMs
--> A host with 50 VMs from 5 application groups
receives ~50 rules (only those for its groups)
--> Total rule evaluations per packet: up to 50
Performance difference: 20x fewer rule evaluations per packet
Operational pain point -- Applied-To neglect: Many NSX deployments start with all rules Applied-To: DFW because it is the default and "just works." Over time, as rules accumulate (500, 1000, 2000+ rules), hosts begin to show increased CPU utilization for DFW processing. The fix is to retroactively add proper Applied-To scoping, which requires understanding which groups each rule actually targets -- a time-consuming audit.
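In the Policy API, the Applied-To field surfaces as the rule's scope attribute: "ANY" corresponds to "Applied-To: DFW", while a list of group paths scopes the rule to hosts that have members of those groups. A hedged sketch follows; the group paths and service path are placeholders, and field names should be checked against the deployed version.

```python
# Two variants of the same rule body; only "scope" differs.

rule_applied_to_dfw = {
    "display_name": "web-to-app-8080",
    "source_groups": ["/infra/domains/default/groups/Web-Production"],
    "destination_groups": ["/infra/domains/default/groups/App-Production"],
    "services": ["/infra/services/HTTP-8080"],     # placeholder service path
    "action": "ALLOW",
    "scope": ["ANY"],                               # pushed to every vNIC (the default)
}

rule_applied_to_groups = dict(rule_applied_to_dfw)
rule_applied_to_groups["scope"] = [                 # pushed only to hosts with group members
    "/infra/domains/default/groups/Web-Production",
    "/infra/domains/default/groups/App-Production",
]
```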
DFW Rule Capacity
NSX-T has published limits for DFW:
- Maximum rules per transport node: 100,000 (but practical limit is much lower due to CPU impact)
- Maximum rules per vNIC: 10,000
- Maximum security groups: 10,000
- Maximum members per group: 500,000 IP addresses (in IP set form)
In practice, performance degrades noticeably above 2,000-5,000 rules per host, depending on traffic volume and rule complexity.
Identity Firewall (IDFW)
The Identity Firewall extends DFW rules to match on Active Directory user identity rather than just IP/MAC. When a user logs into a VM (via Remote Desktop, for example), the IDFW maps the user's AD identity to the VM's IP address and applies identity-based rules.
Use case: "Allow users in the AD group 'Finance-Users' to access the finance database on TCP/1433, regardless of which VM they log into."
IDFW requires:
- Active Directory integration (via LDAP or the NSX Guest Introspection framework)
- Guest Introspection agents on the VMs (Windows only, via VMware Tools)
- NSX logging infrastructure to track user-to-IP mappings
Migration note: IDFW is a VMware-specific feature. OVN (OVE) and Azure SDN do not have native equivalents. Replicating this requires an external identity-aware proxy or re-implementing the logic at the application layer.
6. Gateway Firewall
The Gateway Firewall runs on the SR (Service Router) of Tier-0 and Tier-1 gateways. Unlike the DFW, which is distributed across all hosts, the Gateway Firewall is centralized on Edge nodes.
DFW vs. Gateway Firewall:
DFW (Distributed): Gateway Firewall (Centralized):
+-------+ +-------+ +-------+ +-------+
| VM-A | | VM-B | | VM-A | | VM-B |
| [DFW] | | [DFW] | | | | |
+---+---+ +---+---+ +---+---+ +---+---+
| | | |
+----+-------+ +-----+------+
| |
East-West traffic +----v----+
filtered at source | Tier-0 |
and destination | SR |
| [GW FW] |
+----+----+
|
North-South traffic
filtered at the gateway
The Gateway Firewall handles:
- North-south stateful firewalling -- all traffic entering/leaving the NSX domain through the Tier-0 gateway.
- NAT -- SNAT (source NAT for outbound VM traffic), DNAT (destination NAT for inbound traffic to VMs), and reflexive NAT.
- VPN -- IPsec (site-to-site and remote access) and L2VPN (extending Layer-2 segments across sites).
NAT on Tier-0 and Tier-1
NAT Processing Order:
Inbound traffic (from physical to overlay):
1. Gateway Firewall (pre-NAT rules)
2. DNAT (destination IP rewritten: public IP --> private VM IP)
3. Gateway Firewall (post-NAT rules)
4. Routing to Tier-1 / Segment
5. DFW on destination VM
Outbound traffic (from overlay to physical):
1. DFW on source VM
2. Routing to Tier-0
3. Gateway Firewall (pre-NAT rules)
4. SNAT (source IP rewritten: private VM IP --> public/external IP)
5. Gateway Firewall (post-NAT rules)
6. Physical network
Tier-1 NAT is used for per-tenant address translation (e.g., each Tier-1 has its own SNAT IP for outbound traffic). Tier-0 NAT is used for global address translation.
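Per-tenant SNAT on a Tier-1 maps to a NAT rule object under that gateway in the Policy API. The sketch below is an assumption-laden example: the Tier-1 id, rule id, addresses, the USER NAT section name, and the field names should all be confirmed against the API documentation for the deployed version.

```python
import requests

NSX_MANAGER = "https://nsx-mgr.example.internal"    # placeholder VIP
AUTH = ("admin", "REPLACE_ME")

snat_rule = {
    "display_name": "tenant-a-outbound",
    "action": "SNAT",
    "source_network": "10.10.0.0/16",        # tenant address space (example)
    "translated_network": "198.51.100.10",   # SNAT IP advertised via Tier-0 (example)
    "enabled": True,
}

url = (f"{NSX_MANAGER}/policy/api/v1/infra/tier-1s/T1-TENANT-A"
       "/nat/USER/nat-rules/tenant-a-outbound")
requests.put(url, json=snat_rule, auth=AUTH, verify=False).raise_for_status()
```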
IPsec VPN
NSX-T supports:
- Route-based IPsec VPN: Creates a virtual tunnel interface (VTI) and routes traffic into it. Supports dynamic routing (BGP) over the tunnel. Preferred for site-to-site VPN.
- Policy-based IPsec VPN: Traffic is matched against a policy (source/destination subnets) and encrypted. Does not support dynamic routing. Simpler but less flexible.
The VPN terminates on the Tier-0 SR (Edge node). In Active-Standby mode, VPN sessions fail over to the standby Edge with session state preserved (stateful failover).
L2VPN
L2VPN extends a Layer-2 segment across two sites over an IPsec-encrypted GRE tunnel. Use case: live migration of VMs between sites while maintaining the same IP address and broadcast domain.
Migration note: L2VPN is commonly used in VMware environments for disaster recovery and site migration. OVN does not have a native L2VPN equivalent. If the organization relies on L2VPN for cross-site L2 extension, this must be addressed in the migration architecture (e.g., using a dedicated L2VPN appliance, or re-architecting to use L3 routing between sites).
7. Load Balancing
NSX-T historically included a built-in load balancer (NSX-T LB) running on Edge nodes. From the NSX-T 3.x releases onward, VMware steered customers toward NSX Advanced Load Balancer (NSX ALB), formerly Avi Networks, and the built-in LB was subsequently deprecated.
Built-in NSX-T Load Balancer (Deprecated)
- Ran on the Tier-1 SR (Edge node).
- Supported L4 (TCP/UDP) and L7 (HTTP/HTTPS) load balancing.
- Limited scale: Small (10 VIPs), Medium (100 VIPs), Large (1000 VIPs) per Edge.
- No auto-scaling, no advanced WAF, no GSLB.
- Status: deprecated in NSX 4.x, scheduled for removal. Not available in new VCF deployments.
NSX ALB (Avi)
NSX ALB is a distributed, software-defined load balancer that runs as Service Engines (SEs) -- dedicated VMs that process load-balanced traffic.
NSX ALB Architecture:
+-------------------+
| Avi Controller | <-- Management, analytics, policy
| (3-node cluster) | REST API, UI
+--------+----------+
|
Control channel
|
+--------v----------+ +-------------------+
| Service Engine 1 | | Service Engine 2 |
| (VM on ESXi) | | (VM on ESXi) |
| - VIP: 10.0.0.100 | | - VIP: 10.0.0.100 |
| - Pool members: | | - Pool members: |
| 10.10.1.10:8080 | | 10.10.1.10:8080 |
| 10.10.1.11:8080 | | 10.10.1.11:8080 |
| 10.10.1.12:8080 | | 10.10.1.12:8080 |
+-------------------+ +-------------------+
Virtual Service: frontend.example.com
--> VIP: 10.0.0.100:443 (HTTPS)
--> SSL termination on SE
--> Pool: backend-servers (10.10.1.10-12:8080, HTTP)
--> Health check: HTTP GET /healthz every 5s
--> LB algorithm: least-connections
Key features:
- L4 and L7 load balancing with SSL termination, content switching, HTTP header manipulation.
- WAF (Web Application Firewall) -- integrated with the LB for L7 security.
- Auto-scaling Service Engines -- SEs can be scaled out automatically based on traffic.
- Analytics -- per-request logging, latency distribution, error rate tracking.
- GSLB (Global Server Load Balancing) -- DNS-based load balancing across sites.
- Integration with NSX-T -- SEs can be placed on NSX segments and use NSX security groups.
Migration note: NSX ALB (Avi) is a separate product with its own licensing. Under Broadcom, Avi licensing has been bundled into VCF but is expensive as a standalone. In an OVE migration, the load balancing function is typically replaced by a combination of MetalLB (L4), HAProxy/Nginx Ingress (L7), and potentially F5 BIG-IP or similar if advanced WAF/GSLB is required.
8. Micro-Segmentation Model
Micro-segmentation is the practice of enforcing firewall rules at the individual workload level rather than at the network perimeter. NSX-T's DFW is the enforcement engine; the micro-segmentation model defines how workloads are grouped and policies are applied.
Security Groups and Dynamic Membership
NSX-T security groups define sets of workloads that share a security posture. Groups can be defined by:
| Membership Criteria | Example | Dynamic? |
|---|---|---|
| VM name pattern | Name contains 'web-prod' | Yes |
| NSX Tag | Tag = 'environment:production' | Yes |
| Segment membership | All VMs on Segment 'dmz-segment' | Yes |
| IP address / range | 10.10.1.0/24 | No (static) |
| AD group (with IDFW) | AD Group = 'Finance-Users' | Yes |
| OS name | OS contains 'Windows Server 2019' | Yes |
| vCenter object | VM in folder 'Production/Web-Tier' | Yes |
Dynamic membership is the key differentiator. When a new VM is deployed and tagged environment:production + app:web-frontend, it is automatically added to every security group that matches those criteria. The DFW rules for those groups are automatically pushed to the host. No manual firewall rule update is required.
Dynamic Membership Example:
Security Group: "Web-Production"
Criteria: Tag = "environment:production" AND Tag = "tier:web"
DFW Rule: Allow | Web-Production --> App-Production | TCP/8080
Day 1: 10 VMs match Web-Production
--> Rule is pushed to 10 hosts
Day 2: A new VM "web-prod-11" is deployed with tags:
environment:production, tier:web
--> VM automatically joins Web-Production group
--> DFW rule is automatically pushed to the new VM's host
--> No manual intervention required
Day 3: VM "web-prod-05" is re-tagged to tier:app
--> VM automatically leaves Web-Production group
--> VM automatically joins App-Production group
--> DFW rules are updated on the VM's host
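The tag-based criteria correspond to a Group with a Condition expression in the Policy API; NSX encodes a scope:tag pair as "scope|tag" in the value field. Below is a hedged sketch of the "Web-Production" group from the example above -- the domain ("default") and the exact expression encoding should be verified against the deployed version.

```python
web_production_group = {
    "display_name": "Web-Production",
    "expression": [
        {"resource_type": "Condition", "member_type": "VirtualMachine",
         "key": "Tag", "operator": "EQUALS", "value": "environment|production"},
        {"resource_type": "ConjunctionOperator", "conjunction_operator": "AND"},
        {"resource_type": "Condition", "member_type": "VirtualMachine",
         "key": "Tag", "operator": "EQUALS", "value": "tier|web"},
    ],
}
# PUT to /policy/api/v1/infra/domains/default/groups/Web-Production (assumed path).
# New VMs tagged environment:production + tier:web join automatically;
# the CCP then pushes the matching DFW rules to their hosts.
```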
Zero-Trust Model
The NSX micro-segmentation model enables a zero-trust architecture:
Zero-Trust Firewall Policy Structure:
Category: EMERGENCY
Rule: Allow | Admin-Jump-Hosts --> Any | TCP/22,3389
(break-glass access for emergency troubleshooting)
Category: INFRASTRUCTURE
Rule: Allow | All-VMs --> DNS-Servers | UDP/53, TCP/53
Rule: Allow | All-VMs --> NTP-Servers | UDP/123
Rule: Allow | All-VMs --> AD-DCs | TCP/389,636,88,445
Rule: Allow | All-VMs --> SCCM | TCP/8530,8531
Rule: Allow | All-VMs --> Backup-Agents | TCP/9000-9010
Category: ENVIRONMENT
Rule: Drop | Production-VMs --> Development-VMs | Any
Rule: Drop | PCI-Zone --> Non-PCI-Zone | Any
(environment isolation -- prod can't reach dev, PCI is isolated)
Category: APPLICATION
Rule: Allow | Web-Tier --> App-Tier | TCP/8080
Rule: Allow | App-Tier --> DB-Tier | TCP/5432
Rule: Allow | App-Tier --> Cache-Tier | TCP/6379
Rule: Drop | Web-Tier --> DB-Tier | Any
(web cannot directly access database -- must go through app)
Category: DEFAULT
Rule: Drop | Any --> Any | Any
(zero-trust: everything not explicitly allowed is denied)
This model provides defense-in-depth: even if an attacker compromises a web-tier VM, they cannot directly access the database because the DFW on the web-tier VM's vNIC blocks all traffic except TCP/8080 to the app tier.
Tagging Strategy
The effectiveness of micro-segmentation depends entirely on a consistent tagging strategy. Best practice:
Tag Taxonomy (example):
Scope: environment Values: production, staging, development, test
Scope: tier Values: web, app, db, cache, messaging
Scope: application Values: trading-platform, risk-engine, portal
Scope: compliance Values: pci, sox, gdpr, none
Scope: owner Values: team-alpha, team-beta, platform
VM: trading-web-prod-01
Tags:
environment:production
tier:web
application:trading-platform
compliance:pci
owner:team-alpha
Operational pain point -- tag drift: Over time, VMs are cloned, migrated between clusters, or re-purposed without updating their tags. A VM originally tagged environment:development gets promoted to production but keeps its development tag, causing it to receive development-tier firewall rules in production. Regular tag audits are essential.
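Tag audits can be partially automated: export the VM inventory with its tags (for example via the NSX search API or an RVTools-style dump) and check each VM against the taxonomy. A minimal sketch, assuming the inventory is already in hand as a list of dicts; the taxonomy values mirror the example above and the inventory records are invented.

```python
TAXONOMY = {
    "environment": {"production", "staging", "development", "test"},
    "tier": {"web", "app", "db", "cache", "messaging"},
    "compliance": {"pci", "sox", "gdpr", "none"},
}
REQUIRED_SCOPES = set(TAXONOMY)

def audit_vm(vm: dict) -> list[str]:
    """Return findings for one VM record of the form {'name': ..., 'tags': {scope: value}}."""
    findings = []
    for scope in REQUIRED_SCOPES - vm["tags"].keys():
        findings.append(f"{vm['name']}: missing tag scope '{scope}'")
    for scope, value in vm["tags"].items():
        if scope in TAXONOMY and value not in TAXONOMY[scope]:
            findings.append(f"{vm['name']}: invalid value '{value}' for scope '{scope}'")
    return findings

# Hypothetical inventory export.
inventory = [
    {"name": "trading-web-prod-01",
     "tags": {"environment": "production", "tier": "web", "compliance": "pci"}},
    {"name": "risk-db-dev-03",
     "tags": {"environment": "developmnt", "tier": "db"}},   # typo + missing compliance tag
]

for vm in inventory:
    for finding in audit_vm(vm):
        print(finding)
```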
Operational pain point -- DFW rule sprawl: Without governance, DFW rules accumulate. Teams add "temporary" allow rules that become permanent. Allow-any rules are added for troubleshooting and never removed. Over 12-24 months, a 5,000-VM environment can easily accumulate 2,000-5,000 DFW rules, many of which are redundant, overlapping, or no longer needed. NSX Intelligence (analytics) can help identify unused rules, but many organizations do not have NSX Intelligence deployed.
9. NSX Networking for VMs
Port Binding
When a VM is powered on and its vNIC connects to an NSX segment, the following sequence occurs:
- vCenter notifies NSX Manager that a VM vNIC has been connected to a segment port.
- NSX Manager creates a logical port on the segment and assigns it to the VM's vNIC.
- The CCP computes the forwarding rules for this port (segment membership, DFW rules, DHCP, SpoofGuard).
- The CCP pushes the rules to the LCP on the host where the VM is running.
- The LCP programs the local data plane (N-VDS/VDS) with the rules.
- The VM's vNIC is now connected to the overlay segment with all policies enforced.
This process typically completes in 1-5 seconds. However, under heavy load (e.g., 100 VMs powering on simultaneously during a disaster recovery test), the CCP can become a bottleneck, and port binding can take 30-60 seconds.
SpoofGuard
SpoofGuard prevents VMs from using IP or MAC addresses that have not been assigned to them. This is a critical security feature that prevents:
- IP spoofing: A compromised VM sending traffic with a forged source IP.
- MAC spoofing: A VM impersonating another VM's MAC address to intercept traffic.
- ARP poisoning: A VM sending gratuitous ARP replies to redirect traffic.
SpoofGuard modes:
- Automatic (default): NSX learns the VM's IP/MAC from the first ARP/DHCP packet and locks it. Additional IPs can be learned automatically up to a configurable limit.
- Manual: Administrator explicitly configures allowed IP/MAC pairs. Stricter but more operationally demanding.
- Disabled: No enforcement. Not recommended.
DHCP in NSX
NSX-T provides two DHCP mechanisms:
- DHCP Server: An NSX-managed DHCP server running on the segment or on the Tier-1 gateway. Assigns IPs from a configured pool with options (DNS, gateway, domain, etc.). Useful for overlay segments where external DHCP infrastructure cannot reach (because DHCP broadcasts do not cross GENEVE tunnels without relay).
- DHCP Relay: Forwards DHCP requests from VMs on an NSX segment to an external DHCP server (e.g., Windows DHCP, Infoblox). The relay agent runs on the Tier-1 DR or Tier-0 DR and converts DHCP broadcasts into unicast messages to the configured DHCP server.
DHCP Relay on Tier-1:
VM (10.10.1.x) sends DHCP Discover (broadcast)
|
v
Tier-1 DR (DHCP relay configured)
|
| Converts to unicast, adds relay agent info (option 82)
| Src: Tier-1 DR IP, Dst: External DHCP Server IP
|
v
External DHCP Server (10.1.1.53)
|
| DHCP Offer (unicast to Tier-1 DR)
|
v
Tier-1 DR
|
| Forwards to VM (unicast or broadcast depending on flags)
|
v
VM receives DHCP Offer
Metadata Proxy
For cloud-init-based VM provisioning, NSX-T provides a metadata proxy that serves instance metadata (hostname, SSH keys, network configuration) to VMs via the well-known metadata IP (169.254.169.254). This is similar to the cloud metadata service in AWS/Azure/GCP.
The metadata proxy runs on the Tier-1 SR and intercepts HTTP requests to 169.254.169.254 from VMs on that Tier-1's segments.
Migration note: This is a VMware-specific feature. In OVE, cloud-init metadata is provided via the KubeVirt cloudInitNoCloud or cloudInitConfigDrive mechanisms, which inject metadata directly into the VM's virtual disk or via a configuration drive. The metadata IP approach is not used.
10. Monitoring and Troubleshooting
Traceflow
Traceflow is NSX-T's packet-tracing tool. It injects a synthetic packet into the data plane at a source VM and traces its path through the overlay, showing every hop, every rule evaluation, and every encapsulation/decapsulation step.
Traceflow Example:
Source: VM-A (10.10.1.10) on ESXi-01
Destination: VM-B (10.10.2.20) on ESXi-03
Trace output:
+---------+---------------------------------------------------+
| Hop | Action |
+---------+---------------------------------------------------+
| ESXi-01 | VM-A vNIC: packet injected |
| ESXi-01 | DFW: Rule 1042 matched, action: ALLOW |
| ESXi-01 | SpoofGuard: IP/MAC validated |
| ESXi-01 | DR (Tier-1): route lookup, next-hop: Segment B |
| ESXi-01 | GENEVE encap: VNI=72001, TEP src=.10, dst=.12 |
| ESXi-03 | GENEVE decap: VNI=72001 |
| ESXi-03 | DR (Tier-1): deliver to Segment B port |
| ESXi-03 | DFW: Rule 1050 matched, action: ALLOW |
| ESXi-03 | VM-B vNIC: packet delivered |
+---------+---------------------------------------------------+
Traceflow is invaluable for diagnosing connectivity issues because it shows exactly where a packet is dropped and which rule caused the drop. In a complex environment with thousands of DFW rules across multiple categories, manually determining which rule blocks a given flow is impractical without Traceflow.
Migration note: OVN provides ovn-trace, which serves a similar purpose but with different syntax and output format. Azure Local has limited packet-tracing capability through Network Controller diagnostics. Neither provides the same polished UI experience as NSX Traceflow, but the underlying capability exists.
Port Mirroring
NSX-T supports SPAN/RSPAN-equivalent port mirroring on overlay segments. Traffic from a VM's vNIC can be mirrored to a monitoring VM or an external IDS/IPS.
Mirroring types:
- Logical port mirroring: Mirror all traffic from a specific VM's vNIC to a destination port.
- Segment mirroring: Mirror all traffic on a segment (expensive -- captures everything).
Flow Monitoring
NSX-T exports flow records via IPFIX (IP Flow Information Export) to external collectors (e.g., vRealize Network Insight, now "Aria Operations for Networks," or third-party tools like Splunk, Elastic).
IPFIX records include:
- Source/destination IP, port, protocol
- Byte and packet counts
- DFW rule ID that matched the flow
- NSX segment and logical port IDs
NSX Intelligence
NSX Intelligence is an analytics engine that provides:
- Traffic flow visualization: Real-time and historical maps of which VMs communicate with which other VMs, at which ports, with which protocols.
- Rule recommendation: Analyzes actual traffic flows and recommends DFW rules to enforce the observed communication patterns. This is the primary tool for moving from "allow all" to zero-trust.
- Unused rule detection: Identifies DFW rules that have not matched any traffic in a configurable time window.
Operational reality: NSX Intelligence requires significant resources (dedicated appliance cluster, substantial storage for flow data). Many organizations have NSX-T but do not have NSX Intelligence deployed, which means they lack visibility into actual traffic patterns and cannot easily determine which DFW rules are needed vs. redundant.
11. Licensing and Broadcom Impact
Pre-Broadcom Licensing (Legacy)
NSX-T was available in multiple editions:
- NSX-T Standard: Basic switching and routing (no DFW, no advanced LB).
- NSX-T Professional: Added DFW, micro-segmentation.
- NSX-T Advanced: Added Gateway Firewall, VPN, advanced LB integration.
- NSX-T Enterprise Plus: Added NSX Intelligence, N-VDS to VDS migration tools, federation.
Licensed per-CPU (per physical processor socket on ESXi hosts).
Post-Broadcom Licensing (Current)
Broadcom eliminated standalone NSX licensing in most cases. NSX is now bundled into:
- VMware Cloud Foundation (VCF): The only supported VMware deployment model for new customers. VCF includes vSphere, vSAN, NSX, and Aria Suite as a single license. NSX is no longer available as a standalone product for new customers.
- VMware vSphere Foundation (VVF): A smaller bundle for existing customers who do not need the full VCF stack. Includes basic NSX networking features but not all advanced features.
Key impacts:
- Price increases: VCF licensing has been reported as 2-5x more expensive than previous per-CPU NSX licensing, depending on the configuration and the customer's negotiating leverage.
- Feature bundling: Features that were previously available in lower tiers (e.g., DFW in Professional) now require the full VCF license.
- Perpetual license elimination: Broadcom has moved to subscription-only licensing. Existing perpetual licenses are honored but cannot be renewed.
- Support changes: Support tiers have been restructured. Many organizations report longer support response times and reduced access to Level-3 engineering support.
Impact on migration decision: The Broadcom acquisition and licensing changes are a primary driver for evaluating alternative platforms. Even if NSX-T is technically excellent, the licensing costs and vendor uncertainty make it a risk for a 5-10 year infrastructure commitment.
What to Preserve vs. What to Leave Behind
Must Preserve (Essential Capabilities)
| Capability | Why Essential | Replacement Approach |
|---|---|---|
| Distributed Firewalling (DFW) | Zero-trust micro-segmentation is a regulatory requirement for financial enterprises. East-west traffic cannot be sent through a central chokepoint. | OVE: OVN ACLs via Kubernetes NetworkPolicies and AdminNetworkPolicies. Azure Local: Datacenter Firewall via VFP. |
| Distributed Routing (DR) | East-west inter-subnet routing at kernel speed without hairpinning through an Edge. At 5,000+ VMs, centralizing routing would be a performance disaster. | OVE: OVN distributed logical router on every node. Azure Local: Hyper-V distributed routing via VFP. |
| Dynamic Security Groups | VMs must automatically receive correct firewall policies when deployed. Manual per-VM rule management does not scale. | OVE: Kubernetes labels + label selectors in NetworkPolicies. Azure Local: Network Security Groups with dynamic membership (limited compared to NSX). |
| Overlay Networking | Multi-tenancy, address space isolation, and segment scalability beyond 4,094 VLANs. | OVE: GENEVE via OVN. Azure Local: VXLAN via Microsoft SDN. |
| North-South Routing with BGP | The platform must announce service VIPs and VM subnets to the physical fabric. | OVE: MetalLB (BGP mode) + OVN gateway. Azure Local: RAS Gateway with BGP. |
| SpoofGuard | IP/MAC spoofing prevention is a security baseline requirement. | OVE: OVN port security (automatic). Azure Local: VFP port ACLs (automatic). |
Nice-to-Have (Valuable but Not Essential)
| Capability | Why Nice-to-Have | Migration Notes |
|---|---|---|
| NSX Intelligence (flow analytics) | Useful for rule optimization and traffic visibility but can be replaced with third-party tools. | OVE: Network Observability Operator + eBPF-based flow collection. Azure Local: Azure Monitor Network Insights. |
| Traceflow | Excellent troubleshooting tool, but ovn-trace and tcpdump provide equivalent (if less polished) capability. | Team must learn CLI-based troubleshooting instead of UI-driven Traceflow. |
| NSX ALB (Avi) L7 LB | Advanced L7 LB with WAF and analytics. Valuable but replaceable with dedicated LB products. | OVE: HAProxy/Nginx Ingress + F5 BIG-IP for WAF. Azure Local: Azure LB or third-party. |
| Identity Firewall (IDFW) | Useful for environments where users roam between VMs (RDP). Not widely used in most deployments. | No direct replacement. Requires re-architecture at the application layer or an identity-aware proxy. |
| L2VPN | Cross-site L2 extension for DR and migration. Useful during migration but not needed long-term if the target architecture uses L3 routing between sites. | Dedicated VPN appliance or re-architect for L3 inter-site. |
Leave Behind (VMware-Specific Implementation Details)
| Capability | Why Leave Behind | Conceptual Replacement |
|---|---|---|
| N-VDS | VMware-proprietary virtual switch. Any SDN has its own data plane. | OVE: OVS (br-int). Azure Local: Hyper-V vSwitch + VFP. |
| GENEVE with NSX-specific options | The specific GENEVE option fields used by NSX (security tags, tracing) are NSX-internal. The concept of overlay encapsulation is universal. | OVE uses GENEVE with OVN-specific options. Azure Local uses VXLAN. The encapsulation protocol is an implementation detail, not a capability. |
| Corfu datastore | NSX's internal distributed database. Any SDN controller has its own state store. | OVE: OVN Northbound/Southbound DBs (OVSDB protocol). Azure Local: Network Controller SQL database. |
| TEP pools and TEP IP management | NSX-specific mechanism for assigning tunnel endpoint IPs to hosts. | OVE: OVN manages tunnel endpoints automatically using node IPs. Azure Local: Provider addresses managed by Network Controller. |
| NSX Manager VIP and cluster | NSX-specific management cluster. | OVE: Kubernetes API server (HA by design). Azure Local: Network Controller (3-node cluster). |
| vCenter integration (VM inventory sync) | NSX syncs VM metadata from vCenter for security group membership. This coupling disappears when vCenter is decommissioned. | OVE: Kubernetes is the source of truth for workload metadata (labels, annotations). Azure Local: SCVMM or Azure Arc provides VM inventory. |
Key Takeaways
- NSX-T is three systems in one -- and each has different failure characteristics. The Management Plane can fail without affecting forwarding (but blocks provisioning). The Central Control Plane can split-brain (causing conflicting rules). The Data Plane is autonomous but cannot be reconfigured if the CCP is down. Understanding which plane is failing is the first step in any NSX outage diagnosis.
- Distributed Firewall is the crown jewel -- and the hardest to migrate. DFW enforcement at the vNIC level, with dynamic group membership and category-based rule ordering, is NSX-T's most differentiated capability. Any replacement must provide equivalent distributed enforcement. Centralizing firewalling (e.g., routing all traffic through a central firewall appliance) is architecturally unacceptable at 5,000+ VMs.
- Distributed Routing eliminates the Edge as a bottleneck for east-west traffic. The DR component on every host means that inter-subnet traffic between VMs on the same host never touches the wire, and inter-subnet traffic between VMs on different hosts traverses only one overlay hop. This is the performance baseline that replacements must match.
- The micro-segmentation model depends on consistent tagging and group governance. NSX's dynamic security groups are powerful, but their value is directly proportional to the quality of the tagging strategy. If the current environment has tag drift, redundant groups, or poorly scoped Applied-To fields, those problems will transfer to any replacement unless they are addressed during migration.
- DFW rule sprawl is a technical debt that should be cleaned up before migration, not during it. Many NSX deployments accumulate hundreds of unnecessary rules. Migrating 2,000 rules to a new platform when only 500 are needed wastes engineering effort and recreates technical debt. Use NSX Intelligence (or manual traffic analysis) to identify and remove unused rules before migration.
- The Broadcom licensing model is a forcing function. The shift to VCF-only bundling, subscription-only licensing, and reported price increases of 2-5x make continued investment in NSX-T a financial risk. This is not a technical argument against NSX -- it is a business argument for migration.
- NSX operational complexity is non-trivial. NSX Manager upgrades, CCP health monitoring, TEP connectivity troubleshooting, DFW rule management at scale, and Edge node sizing are all specialized skills. The replacement platform will have its own operational complexity, but it will be different complexity. The team must plan for a learning curve regardless of which platform is chosen.
- Not everything needs to be replicated 1:1. Features like Identity Firewall, L2VPN, and NSX Intelligence are valuable in the VMware ecosystem but may not be needed in the new architecture. The migration is an opportunity to simplify -- carry forward the essential capabilities, replace the nice-to-haves with simpler alternatives, and leave behind the VMware-specific implementation details.
Discussion Guide
These questions are designed to be asked internally -- to your own network and platform teams -- to understand the current NSX-T deployment deeply before evaluating replacements. The answers will reveal hidden dependencies, operational pain points, and the true scope of the migration.
Current DFW State and Hygiene
"How many DFW rules do we have in production today? How many are in each category (Emergency, Infrastructure, Environment, Application, Default)? What percentage of rules have the Applied-To field scoped to a specific group vs. the default 'DFW' (all vNICs)? When was the last time we audited rules for unused entries?"
Why this matters: The DFW rule count and quality directly determine migration effort. 500 well-scoped rules migrate cleanly. 3,000 rules with broad Applied-To scoping and no usage data require months of analysis before migration.
Security Group Membership and Tagging
"How many security groups do we maintain? What criteria drive membership -- NSX tags, VM names, vCenter folders, IP sets, or a combination? Is there a documented tagging standard, and how well is it enforced? How many VMs have incorrect or missing tags?"
Why this matters: Security groups map to Kubernetes labels (OVE) or NSG membership (Azure Local). If the current groups are based on vCenter-specific objects (folders, resource pools), those criteria will not exist in the new platform and must be re-mapped.
Tier-0 and Tier-1 Topology
"How many Tier-0 gateways do we operate? How many Tier-1 gateways? Which Tier-1 gateways have services configured (NAT, LB, Gateway Firewall), and which are pure routing (DR-only)? What are the BGP peering details for each Tier-0 -- peer IPs, AS numbers, advertised prefixes, BFD timers?"
Why this matters: The Tier-0/Tier-1 topology must be mapped to the new platform's routing model. OVN uses a flat routing model (one distributed router per cluster with multiple subnets), not a two-tier hierarchy. Tier-1 gateways with services (NAT, LB) require explicit replacement in the new architecture.
NAT and VPN Dependencies
"Which applications depend on NSX NAT (SNAT or DNAT)? Are there any applications with hardcoded IP addresses that rely on DNAT to redirect traffic to the actual VM? How many IPsec VPN tunnels terminate on our Tier-0 gateways, and who are the remote peers? Do we use L2VPN for any cross-site L2 extensions?"
Why this matters: NAT and VPN configurations are often undocumented and discovered only when they break during migration. Every NAT rule and VPN tunnel must be explicitly accounted for in the migration plan.
Edge Node Sizing and Utilization
"How many Edge nodes do we operate, and what is their form factor (VM or bare-metal)? What is the current CPU and memory utilization on the Edge nodes? Are we hitting any Edge throughput limits for north-south traffic or NAT/VPN processing? Have we ever had an Edge node failover in production, and how long did it take?"
Why this matters: Edge node sizing determines north-south bandwidth limits. If the current Edge nodes are undersized (common), the replacement architecture should be sized based on actual traffic needs, not on the current (potentially inadequate) deployment.
Load Balancing Inventory
"What load balancing solution are we using -- the built-in NSX-T LB, NSX ALB (Avi), or a third-party appliance? How many virtual services (VIPs) do we have? Which are L4 vs. L7? Do any virtual services use advanced features like WAF, SSL offloading, or content-based routing? What are the throughput and connection-rate requirements?"
Why this matters: Load balancing is the most fragmented capability across the replacement platforms. The requirements determine whether MetalLB + Ingress Controller is sufficient or whether a dedicated L7 LB (F5, HAProxy Enterprise) is needed.
NSX Upgrade History and Pain Points
"How many NSX-T upgrades have we performed in the last three years? How long did each upgrade take? Were there any incidents during upgrades (CCP split-brain, DFW rule inconsistency, Edge failover during upgrade)? What is the current NSX-T version, and how far behind the latest release are we?"
Why this matters: NSX upgrade pain is a common driver for migration. If the team has experienced upgrade-related incidents, this is a data point for evaluating operational complexity. It also reveals the current version, which determines API compatibility for rule export.
Monitoring and Visibility
"Do we have NSX Intelligence deployed? If yes, do we use the rule recommendations and flow analytics regularly? If no, how do we determine which DFW rules are actively matching traffic vs. unused? What IPFIX collectors do we use, and are they integrated with our SIEM? Can we generate a complete map of VM-to-VM communication flows today?"
Why this matters: If the team cannot produce a communication flow map, migrating DFW rules becomes a guessing game. The ability to export current flow data is a prerequisite for designing the replacement security policy.
DHCP and Metadata Dependencies
"Which NSX segments use NSX's built-in DHCP server vs. DHCP relay to an external server? Are there any VMs that depend on the NSX metadata proxy (169.254.169.254) for cloud-init provisioning? How are DHCP scopes managed -- through NSX Manager UI, API automation, or IPAM integration?"
Why this matters: DHCP and metadata dependencies are easily overlooked. If 50 segments use NSX's built-in DHCP, those DHCP scopes must be recreated in the new platform or migrated to an external DHCP server.
Hidden SpoofGuard and Port Security Issues
"Have we ever encountered SpoofGuard-related connectivity issues (e.g., a VM that added a secondary IP address was blocked because SpoofGuard did not learn it)? Are there VMs with SpoofGuard in manual mode that have explicitly configured IP/MAC bindings? Do any VMs require promiscuous mode or forged transmits (which SpoofGuard would normally block)?"
Why this matters: Some workloads (nested virtualization, network appliances, containers running inside VMs) require relaxed port security settings. These must be identified and accommodated in the new platform.
Team Skill Assessment
"How many team members are NSX-certified or have deep NSX operational experience? Is there a single person who is the 'NSX expert' (single point of failure)? Has the team operated OVN/OVS, Microsoft SDN, or any non-VMware SDN platform? What is the team's comfort level with CLI-based troubleshooting vs. UI-based troubleshooting?"
Why this matters: The migration will succeed or fail based on the team's ability to learn and operate the new platform. If the team is heavily dependent on NSX Manager UI for troubleshooting and has no Linux/CLI experience, OVE's OVN/OVS stack will require significant training. This does not mean OVE is the wrong choice -- it means training must be budgeted and started early.