Is it possible to use the NSX bridge within NSX Federation to extend a single global segment to a VLAN in multiple locations?
VCF, nested lab, vmware, network, nsx, Multi Site, NSX Federation, NSX bridge
2025-02-12 12:20 +0100
Introduction
NSX Federation is a multi-site solution for NSX that has been available since NSX 3.0. When Federation was first released, the feature set was very limited and many features like VRF and the NSX bridge were not supported with Federation.
Overview of some enhancements with NSX 4.x:
- Support of Physical Servers in Federated Environments (4.0.0.1)
- Support of DFW Exclusion List (4.0.1.1)
- DFW Time-based rules (4.0.1.1)
- Overall Enable/Disable of Location DFW (4.0.1.1)
- L2 Bridge (4.1.0.0)
- Higher latency allowed between locations: Increased from 150 msec to 500 msec (4.1.0.0)
- VRF Lite feature can now be used with global T0 routers as parent T0 (4.2.1)
In this article we will focus on the NSX bridge feature and validate whether it is possible to stretch the same layer 2 network simultaneously in both locations.
This article requires a basic understanding of NSX and the difference between a VLAN and an overlay network, as well as an understanding of NSX Federation.
What is an NSX Bridge?
A bridge is used to connect separate layer 2 networks into one single layer 2 broadcast domain. It can build a connection between VLAN backed and overlay backed networks, as well as between two dedicated overlay networks in NSX-V and NSX-T. Why should we connect two separate layer 2 networks into a single broadcast domain? The most common use case is migrations, where you have the requirement to migrate VMs between different technologies or platforms, for example NSX-V to NSX-T migrations or the migration of VMs from legacy VLAN backed networks to overlay backed networks. As soon as you have a combined broadcast domain, it is very easy to migrate the VMs one by one. Based on this procedure you have the following advantages:
- The gateway is available for VMs that are already migrated as well as for VMs that are not migrated yet
- The gateway can be changed after all workloads are migrated
- Migrated and not yet migrated VMs are still able to communicate with each other without the need for IP renumbering
A second use case is to implement NSX including all overlay and routing capabilities while you still have some physical servers in the same layer 2 network. The VMs can be migrated and added to the overlay network, but for the physical servers this might be either very complex or not possible. This second use case is supported just like the first one, but you need to keep in mind that the use of NSX bridges also has some disadvantages. The most important disadvantage is that the physical server becomes dependent on the virtual environment, not only from an availability point of view, but also regarding the available throughput. The NSX bridge is implemented on the NSX edge nodes and is therefore dependent on the sizing of the specific edge node. Most customers are using VM type edges, and the throughput differs based on the available memory and CPU, as shown in the list below.
- Small Edge VM (2 vCPUs and 4 GiB Memory): < 2 Gbps
- Medium Edge VM (4 vCPUs and 8 GiB Memory): 2 Gbps
- Large Edge VM (8 vCPUs and 32 GiB Memory): 2-10 Gbps
- Extra Large Edge VM (16 vCPUs and 64 GiB Memory): > 10 Gbps (the maximum I have seen is around 20 Gbps)
- Bare metal edges: much higher throughput, as mentioned in the VMware blog “VMware NSX Bare Metal Edge Performance”
Based on the expected load created by the physical server, you should decide whether it is the right choice to use the NSX bridge as a long term solution. As an alternative you can also validate whether a service interface connected to a VLAN backed segment is the better solution. This could be the case if you want to use a T1 router as a gateway, but the physical server is only used within the same layer 2 network and will not create load over the T1 router.
There are still more options to discuss, but those will not be covered in this article.
Overview of the Lab environment
The lab environment is based on two VCF instances with a single availability zone per instance. In addition, NSX Federation is implemented to have a multi-site network setup with a single pane of glass for the security groups and firewall rules. The following list gives an overview of the components available in each site.
- SDDC Manager: 5.2.0
- NSX Manager: 4.2.0.0
- ESXi: 8u3
- vCenter: 8u3
- Number of Hosts: 4 per site
- Number of local NSX Managers: 1 per site
- Number of Global NSX Managers: 1
- NSX Segments (overlay backed):
- seg-global-test: 172.16.0.0/24
- seg-global-test2: 172.16.1.0/24
- Distributed port groups (VLAN backed):
- bridge-vlan29 (Site-A): 172.16.0.0/24
- bridge-vlan30 (Site-B): 172.16.0.0/24
- bridge-trunk (Site-A and Site-B): Used as a trunk to be connected to edge nodes to forward the VLANs that should be bridged
In addition the following table shows an overview of the VMs used for the tests.
| VM Name | Location | Network | IP | MAC |
|---|---|---|---|---|
| a-alpine1 | Site-A | seg-global-test | 172.16.0.10 | 00:50:56:af:1b:7a |
| a-alpine2 | Site-A | seg-global-test2 | 172.16.1.10 | 00:50:56:af:98:87 |
| a-alpine3 | Site-A | bridge-vlan29 | 172.16.0.11 | 00:50:56:af:c6:c4 |
| b-alpine1 | Site-B | seg-global-test | 172.16.0.100 | 00:50:56:8e:1f:35 |
| b-alpine2 | Site-B | seg-global-test2 | 172.16.1.100 | 00:50:56:8e:59:22 |
| b-alpine3 | Site-B | bridge-vlan30 | 172.16.0.101 | 00:50:56:8e:ff:cf |
To give you a better overview of the planned setup, I created the following drawing. It shows two sites with a bridge configured on each site and a connection between the sites through Federation using Remote Tunnel Endpoints (RTEP).
Creating the NSX bridge
For NSX bridges it is required to use either an NSX segment of type VLAN or a vSphere distributed port group, which is used as a trunk and will be connected to the edge nodes later on. I decided to use a vSphere port group, since the setup is slightly easier and there is no advantage in using an NSX segment of type VLAN for this use case.
As a requirement, the security settings of the trunk port group need to be configured as shown in the picture below.
It is also possible to set promiscuous mode and forged transmits to Accept and keep MAC learning disabled, but this created some packet loss in my tests. Using promiscuous mode also has the disadvantage that a connected VM will receive all the frames sent within the port group, which has a negative impact from a security and performance point of view.
A good blog article which explains this behavior is “MAC Learning is your friend”.
The trunk can either be configured for all VLANs with the range “0-4094”, or you can add VLAN by VLAN for each required bridge in the future. I started with a single VLAN and the VLAN type set to VLAN Trunking, which adds the flexibility to add more VLANs without any downtime.
For Site-A I am using VLAN 29 and for Site-B VLAN 30. You could also use the same VLAN on both sites, but in my nested lab I was forced to use dedicated VLANs. Otherwise there would already be a connection over the VLAN IDs, since all of the nested components are running on a single physical host with a single VDS.
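If you want to double check these port group settings from a script instead of the vSphere Client, the following minimal pyVmomi sketch reads the relevant settings of the trunk port group. The vCenter FQDN and credentials are placeholders from my lab, and the property names assume vSphere 6.7 or later (where the MAC management policy is exposed), so please verify them against the pyVmomi documentation for your version.

```python
# Minimal verification sketch (pyVmomi) - assumes vSphere 6.7+ where the
# MAC management policy (including MAC learning) is exposed on the DVPG.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical lab vCenter and credentials - adjust to your environment.
# Newer pyVmomi versions may prefer disableSslCertValidation=True instead.
si = SmartConnect(host="vcenter-a.lab.local",
                  user="administrator@vsphere.local",
                  pwd="VMware1!",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)

for pg in view.view:
    if pg.name != "bridge-trunk":
        continue
    # macManagementPolicy can be None on older DVS versions.
    policy = pg.config.defaultPortConfig.macManagementPolicy
    print("Port group       :", pg.name)
    print("Promiscuous mode :", policy.allowPromiscuous)           # expected: False
    print("Forged transmits :", policy.forgedTransmits)            # expected: True
    print("MAC learning     :", policy.macLearningPolicy.enabled)  # expected: True

Disconnect(si)
```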
Further, it is recommended to create a dedicated transport zone of type VLAN. This limits the visibility of the networks that will be initialized for the bridges.
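If you prefer the API over the UI, the following Python sketch shows how such a VLAN transport zone could be created via the NSX Policy API of a Local Manager. The manager FQDN, credentials and the transport zone name are placeholders from my lab; verify the exact path and payload against the NSX API guide for your version.

```python
# Hypothetical sketch: create a dedicated VLAN transport zone via the
# Policy API of a Local Manager (verify path/payload for your NSX version).
import requests

NSX = "https://nsx-a.lab.local"                  # hypothetical Local Manager FQDN
AUTH = ("admin", "VMware1!VMware1!")             # lab credentials

tz = {
    "display_name": "tz-vlan-bridge",
    "tz_type": "VLAN_BACKED",
}

r = requests.put(
    f"{NSX}/policy/api/v1/infra/sites/default/enforcement-points/default/"
    "transport-zones/tz-vlan-bridge",
    json=tz,
    auth=AUTH,
    verify=False,                                # lab only - self-signed certificate
)
r.raise_for_status()
print(r.json()["path"])
```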
After the trunk and the transport zone are created, it is time to add them to the edge nodes which should be used as bridge nodes. I decided to use the dedicated fastpath interface fp-eth2 to have the communication isolated on a second NVDS.
The adjustment of the edges has to be done in the location managers (System -> Fabric -> Nodes -> Edge Transport Nodes) for all edges where the bridges should be used.
Under the mentioned menu, just select the desired edge node and hit the edit button; afterwards you should select Add Switch.
The screenshot below shows the required configuration for the dedicated NVDS, including the dedicated transport zone and the DVPG (Distributed vSphere Port Group) bridge-trunk.
These settings should be done in both locations!
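To verify that the edge nodes in both locations now carry the additional NVDS, you can also query the Manager API of each Local Manager. This is a minimal sketch assuming the /api/v1/transport-nodes endpoint and lab credentials; field names may differ slightly between NSX versions.

```python
# Sketch: list the host switches configured on each edge transport node
# via the NSX Manager API of both Local Managers (lab credentials, verify=False).
import requests

for nsx in ("https://nsx-a.lab.local", "https://nsx-b.lab.local"):   # hypothetical FQDNs
    r = requests.get(f"{nsx}/api/v1/transport-nodes",
                     auth=("admin", "VMware1!VMware1!"), verify=False)
    r.raise_for_status()
    for node in r.json().get("results", []):
        # Only look at edge nodes, not host transport nodes.
        if node.get("node_deployment_info", {}).get("resource_type") != "EdgeNode":
            continue
        switches = [hs.get("host_switch_name")
                    for hs in node.get("host_switch_spec", {}).get("host_switches", [])]
        print(node["display_name"], "->", switches)
```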
After the preparations are done, we can jump back to the global NSX manager and start the configuration of the bridge profiles under Networking -> Segments -> Profiles -> Edge Bridge Profiles.
By using the button Add Edge Bridge Profile you can create the edge bridge profile for the specific locations.
Here you select for which location the specific profile should be created. Based on this you can select the desired primary and backup edge node, as well as the edge cluster and the failover mode.
If you select Preemptive, the original primary edge node becomes primary again after an outage, as soon as the failed edge node is available again. If you select Non Preemptive, the backup edge node stays primary after the primary edge node has failed for any reason, even once the failed edge node is back online.
The HA mode is always Active/Standby and cannot be Active/Active.
The following screenshot shows the example configuration for two locations.
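For completeness, the following Python sketch shows how such an edge bridge profile could be created via the Policy API of a Local Manager. All names, IDs and edge node paths are placeholders from my lab; in a Federation setup the Global Manager uses a site-scoped path under /global-manager/api/v1/global-infra instead, so please verify the exact path and payload against the NSX API guide.

```python
# Hypothetical sketch: create an edge bridge profile via the Policy API of a
# Local Manager. Edge node paths, cluster IDs and credentials are placeholders.
import requests

NSX = "https://nsx-a.lab.local"                  # hypothetical Local Manager FQDN
AUTH = ("admin", "VMware1!VMware1!")

profile = {
    "display_name": "bridge-profile-site-a",
    "failover_mode": "PREEMPTIVE",               # or "NON_PREEMPTIVE"
    # Policy paths of the primary and backup edge node (placeholders - copy the
    # real paths from your environment).
    "edge_paths": [
        "/infra/sites/default/enforcement-points/default/edge-clusters/ec-01/edge-nodes/edge-01",
        "/infra/sites/default/enforcement-points/default/edge-clusters/ec-01/edge-nodes/edge-02",
    ],
}

r = requests.put(
    f"{NSX}/policy/api/v1/infra/sites/default/enforcement-points/default/"
    "edge-bridge-profiles/bridge-profile-site-a",
    json=profile, auth=AUTH, verify=False,       # lab only
)
r.raise_for_status()
print(r.json()["path"])
```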
As a last step, the overlay segment which should be connected to the VLANs in both locations needs to be configured with the bridge profiles per location.
This configuration needs to be done in the global NSX manager. Therefore you should switch to the segments under Networking -> Segments and select the desired segment.
In my case this is the segment seg-global-test, and the configuration for the bridges is located under the section Additional Settings.
Here you need to select the specific site and define the previously created edge bridge profiles as shown in the screenshot below.
It is important to configure the VLAN which is applicable for the VLAN backed network on the specific site.
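The same configuration could also be pushed via the Global Manager API. The following sketch is only an illustration with placeholder paths, site IDs and VLAN transport zone IDs from my lab; copy the real object paths from your environment and verify the payload against the NSX API guide before using something like this.

```python
# Hypothetical sketch: attach the per-site edge bridge profiles to the global
# segment seg-global-test via the Global Manager Policy API. All paths below
# are placeholders - replace them with the paths shown in your environment.
import requests

GM = "https://nsx-gm.lab.local"                  # hypothetical Global Manager FQDN
AUTH = ("admin", "VMware1!VMware1!")

patch = {
    "bridge_profiles": [
        {   # Site-A: bridge the segment to VLAN 29 (placeholder paths)
            "bridge_profile_path": "/global-infra/sites/site-a/enforcement-points/default/"
                                   "edge-bridge-profiles/bridge-profile-site-a",
            "vlan_ids": ["29"],
            "vlan_transport_zone_path": "/infra/sites/default/enforcement-points/default/"
                                        "transport-zones/tz-vlan-bridge",
        },
        {   # Site-B: bridge the segment to VLAN 30 (placeholder paths)
            "bridge_profile_path": "/global-infra/sites/site-b/enforcement-points/default/"
                                   "edge-bridge-profiles/bridge-profile-site-b",
            "vlan_ids": ["30"],
            "vlan_transport_zone_path": "/infra/sites/default/enforcement-points/default/"
                                        "transport-zones/tz-vlan-bridge",
        },
    ]
}

r = requests.patch(f"{GM}/global-manager/api/v1/global-infra/segments/seg-global-test",
                   json=patch, auth=AUTH, verify=False)   # lab only
r.raise_for_status()
```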
Testing the communication
The first ICMP test, which you can follow in the screenshot below, is from VM a-alpine3 to VM a-alpine1 and includes the ARP table as well as the local IP configuration to show you the MAC addresses mentioned in the VM overview table.
The second and third ICMP tests show the communication from VM a-alpine3 to VM b-alpine1 and VM b-alpine3.
As shown in the screenshot below, communication from VM a-alpine3 to VM b-alpine2, which is connected to a dedicated segment routed over the T1 router, is also possible. This proves that the VM connected to the VLAN backed network is able to use the T1 router through the bridge. You can see that by inspecting the added ARP table, where the MAC address of the NSX T1 router gateway IP is highlighted in red, as well as the default route configured and shown below.
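If you want to repeat these connectivity tests without clicking through every console, a small Python helper like the following can run the same ICMP checks from a-alpine3 against the IP addresses from the VM overview table (it simply wraps the ping binary available in Alpine).

```python
# Small helper to repeat the ICMP tests from a-alpine3: ping every test VM
# a few times and report whether the target answered (requires the ping binary).
import subprocess

TARGETS = {
    "a-alpine1 (overlay, Site-A)": "172.16.0.10",
    "b-alpine1 (overlay, Site-B)": "172.16.0.100",
    "b-alpine3 (VLAN 30, Site-B)": "172.16.0.101",
    "b-alpine2 (routed via T1)":   "172.16.1.100",
}

for name, ip in TARGETS.items():
    result = subprocess.run(["ping", "-c", "5", ip],
                            capture_output=True, text=True)
    status = "OK" if result.returncode == 0 else "FAILED"
    print(f"{name:30} {ip:15} {status}")
```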
In the last test I show you the packet loss you might see when setting promiscuous mode and forged transmits to Accept instead of enabling MAC Learning and setting forged transmits to Accept. Furthermore, you will see duplicated packets, which are also caused by setting promiscuous mode to Accept.
Summary
The NSX bridge feature is a great enhancement within NSX Federation and adds new opportunities to fulfill different customer requirements. It might be a new option to stretch VLANs between two locations in case you have NSX in place, but the underlay is not capable of stretching VLANs between the available locations. This could be the case if you have two locations which are connected through the internet instead of an MPLS or any other layer 2 connection.
Besides this pretty cool feature and all the options it adds, you need to keep an eye on the throughput requirements compared to the limitations the NSX bridge has. It is very important to validate these requirements and to ensure the sizing is sufficient to satisfy them.
I wish you all happy bridging!