AVI NSX Cloud - Is it possible to preserve the client IP for traffic forwarded by a One-Arm LoadBalancer?


Introduction

LoadBalancing is a crucial feature to increase the availability and reduce the time to complete a failover for IT services. At first glance it does not seem too complex: you have multiple instances of your service and you forward the traffic to all of those instances to share the load. But there are dozens of different applications and services with totally different requirements. Some of those applications are very simple and depend on local sessions or states inside a database. Other applications are stateless or the state is synced between the different instances. Based on those requirements, the following LoadBalancing features are a subset of the important features to understand.

  • Session Persistency based on Source IP or HTTP cookies
  • Content Switching to manipulate L7 information like the HTTP header
  • Active Monitors to track the availability of an application and mark the application as UP or DOWN
  • NAT to hide the IP of the User (Client) behind the IP of the LoadBalancer
  • Forwarding the real client IP of the User
  • Port Translation to forward traffic to a different port in the backend than the port used by the user

Furthermore, a LoadBalancer is an active component to which all the traffic from the clients is sent before the traffic is forwarded to the application instances in the backend. To get this done, the LoadBalancer requires network connectivity in at least one subnet, but it can also be the case that the LoadBalancer has multiple subnets assigned. Therefore the following two deployment options of a LoadBalancer are commonly used.

  • One-Arm Loadbalancer (one subnet is assigned)
  • Two-Arm LoadBalancer (two or more subnets are assigned)

If you are using a One-Arm LoadBalancer, the packets forwarded to the backend applications should be translated, so the IP of the LoadBalancer is visible in the IP header of the packet and the responses are sent back over the LoadBalancer. Otherwise the response traffic would be sent to the default gateway and the LoadBalancer would be unable to track the state of the communication streams. Instead of translating the source IP of the communication, it would be an option to use the LoadBalancer as default gateway for the backend servers hosting the applications. For a One-Arm LoadBalancer this is commonly not possible, since the applications are running in different subnets and the gateway cannot be set to the LoadBalancer IP.

One solution for this challenge is to assign multiple subnets to the LoadBalancer, where one subnet is used for the communication coming from the clients and a second interface is placed in the same subnet as the backend applications.

For AVI, the solution of a Two-Arm LoadBalancer can be used for integrations with the vCenter cloud, but not if the integration is done by using the NSX cloud. The only supported setup for AVI NSX cloud implementations is using One-Arm LoadBalancers. Besides this limitation, the NSX Cloud has multiple advantages and the following list shows some examples.

  • Automated static route creation to forward traffic from a T1 router to the desired Service Engine.
  • Easier implementation of Active/Active LoadBalancers
    • vCenter Cloud scaling limit (native scaleout): 4 Service Engines
    • vCenter Cloud scaling limit (BGP for AVI): 64 Service Engines –> high complexity
    • NSX Cloud scaling limit: 8 Service Engines –> automated by using static routes

Since AVI with the NSX cloud is mandatory for the integration of specific VMware solutions like Cloud Director, it is important to have the option to forward the client IP to the backend applications. There are other features like X-Forwarded-For which can be used to make the client IP visible in log files, but sometimes forwarding the original client IP as part of the IP header is simply required.
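To make this difference tangible: X-Forwarded-For only lives inside the HTTP header, while the IP header still carries the SNAT address of the Service Engine, and it obviously only works for HTTP based services. The following minimal sketch (plain Python 3, meant to run on a backend instance behind an HTTP virtual service; port 8080 is just an example) prints both values side by side.

# Minimal sketch: compare the TCP source address (taken from the IP header,
# i.e. the SNAT IP of the Service Engine without preserve client IP) with the
# X-Forwarded-For header inserted by the LoadBalancer for HTTP traffic.
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        tcp_source = self.client_address[0]             # address from the IP header
        xff = self.headers.get("X-Forwarded-For", "-")  # client IP, HTTP only
        body = f"tcp source: {tcp_source}\nx-forwarded-for: {xff}\n".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), EchoHandler).serve_forever()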

Good news: there is a solution for this challenge by using the feature “Preserve Client IP for NSX-T Overlay”, which will be the focus of this blog post. This blog post does not explain how to set up AVI with the NSX Cloud from scratch; it just shows you how to implement the preserve client IP feature.

If you are interested in how the integration of the NSX cloud is done, you can check the blog post “vSphere with Tanzu AKO integration for Ingress”. That post is about the AKO integration for Tanzu, but it also describes how to configure the NSX cloud for AVI.

Lab environment

  • Three nested ESXi hosts, version 8.0.3 build 24022510
  • vCenter Server, version 8.0.3 build 24022515
  • NSX Datacenter version 4.2.1.1
  • AVI version 22.1.7
  • VyOS Router
  • Alpine test VM

The following drawing shows an overview of the networks to which the AVI Service Engines and the backend applications are connected. The drawing shows two backend services, but I am using a single Alpine VM as backend, which is sufficient to validate that the real client IP is forwarded.

architecture-overview

How does preserve client IP work for the NSX Cloud?

The challenge is to get the real client IP forwarded to the backend application and to force the response traffic to be sent back to the LoadBalancer. Under normal circumstances this will not work for a One-Arm LoadBalancer, since the LoadBalancer is not in the datapath of the backend applications. Instead, the backend application uses the T1 router as default gateway and the T1 router forwards the traffic to the T0 router. Based on this behavior the AVI Service Engines will see the packets sent from the client, but not the response. As a solution, the T1 router will be forced to send the response traffic to a Floating IP configured on the AVI Service Engines, which is taken from the AVI Data segment (NSX ALB Data in the drawing above). This enforcement is done by the service insertion framework of NSX and ensures that the response traffic coming from the defined backend application is sent to the Floating IP of the active Service Engine.

Requirements to configure preserve client IP for NSX-T Overlay

The usage of preserve client IP depends on very specific requirements, which are shown in the following list.

  • The Service Engine Group used has to be configured for HA mode Active/Standby
  • The NSX-T user for configuring NSX-T cloud should have additional permissions of Netx Partner Admin and Security Admin for the preserve client IP functionality apart from the Network Admin requirement for other use cases (See Configuring NSX-T Roles for more details).
  • Set URPF Mode to None for the VIP data segments in which the preserve client IP feature will be enabled.
  • Ensure that Virtual Services (for which Preserve Client IP is configured) have Pools defined using NSX Network Security Groups. The Network Security Group will be configured as the Source match in the redirect rule. This ensures consistency between the redirect rule in NSX and the Pool membership in Avi Load Balancer. It is not supported to use IP addresses, ranges, IP Groups or DNS names for these Pools.
  • The AVI Data Segment and the backend application need to be connected to the same T1 router

The requirements are also visible under the official documentation “Pre-Requisites to Configure Preserve Client IP for NSX-T Overlay”
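If you want to double check the first requirement via the AVI API instead of the UI, a short query of the Service Engine Group is enough. The following is a minimal sketch, assuming Python 3 with the requests library; the controller address is taken from the CLI prompt used later in this post, the SE group name from the network service output, and the credentials are placeholders.

# Minimal sketch: verify that the SE group runs in Active/Standby HA mode,
# which is a prerequisite for preserve client IP.
import requests
import urllib3

urllib3.disable_warnings()  # lab setup with a self-signed certificate

CONTROLLER = "https://10.0.1.18"
SE_GROUP = "seg-preserve-clientip"

session = requests.Session()
session.verify = False
session.post(f"{CONTROLLER}/login", json={"username": "admin", "password": "<password>"})
session.headers.update({"X-Avi-Version": "22.1.7"})  # match the controller version

resp = session.get(f"{CONTROLLER}/api/serviceenginegroup", params={"name": SE_GROUP})
resp.raise_for_status()
print(resp.json()["results"][0]["ha_mode"])  # expected: HA_MODE_LEGACY_ACTIVE_STANDBY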

Configuring Preserve Client IP

To enable the service insertion, you need to gather the free IPs of the NSX ALB data segment first. Those IPs should not be used in any static IP or DHCP pool. The following command can be used to discover the DHCP range used for this segment.

[admin:10-0-1-18]: > show nsxt segment seg-preserveclientip-sedata
+--------------------+---------------------------------------------+
| Field              | Value                                       |
+--------------------+---------------------------------------------+
| uuid               | segmentruntime-0897fb3696cf                 |
| segment_id         | /infra/segments/seg-preserveclientip-sedata |
| name               | seg-preserveclientip-sedata                 |
| subnet             | 192.168.255.16/28                           |
| dhcp_enabled       | True                                        |
| nw_ref             | seg-preserveclientip-sedata                 |
| nw_name            | seg-preserveclientip-sedata                 |
| vrf_context_ref    | t1-preserveclientip-test                    |
| tier1_id           | /infra/tier-1s/t1-preserveclientip-test     |
| opaque_network_id  | d509b42e-5db6-4458-9543-17a582fe1e40        |
| segment_gw         | 192.168.255.17/28                           |
| dhcp_ranges[1]     | 192.168.255.19-192.168.255.30               |
| segname            | seg-preserveclientip-sedata                 |
| tenant_ref         | admin                                       |
| cloud_ref          | nsx.lab.home                                |
| security_only_nsxt | False                                       |
+--------------------+---------------------------------------------+
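Based on this output, any address of the subnet that is neither the gateway nor part of the DHCP range is a candidate for the floating IP, which leaves 192.168.255.18 in this lab. The following short sketch (plain Python 3, values taken from the output above) just makes this calculation explicit.

# Minimal sketch: list the addresses of the SE data segment that are outside
# the gateway and DHCP range, i.e. candidates for the floating IP.
import ipaddress

subnet = ipaddress.ip_network("192.168.255.16/28")
gateway = ipaddress.ip_address("192.168.255.17")
dhcp_start = ipaddress.ip_address("192.168.255.19")
dhcp_end = ipaddress.ip_address("192.168.255.30")

candidates = [
    ip for ip in subnet.hosts()
    if ip != gateway and not (dhcp_start <= ip <= dhcp_end)
]
print([str(ip) for ip in candidates])  # ['192.168.255.18']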

The next output shows the desired network service configuration, where 192.168.255.18 is the floating IP. Other parameters like the SE Group reference, the VRF reference and the Cloud reference need to be configured as well for the network service.

[admin:10-0-1-18]: > show networkservice nsxtpreserveclientip_sedata
+--------------------------------+-----------------------------------------------------+
| Field                          | Value                                               |
+--------------------------------+-----------------------------------------------------+
| uuid                           | networkservice-26edbb45-5524-41a9-8abb-87250ab31494 |
| name                           | nsxtpreserveclientip_sedata                         |
| se_group_ref                   | seg-preserve-clientip                               |
| vrf_ref                        | t1-preserveclientip-test                            |
| service_type                   | ROUTING_SERVICE                                     |
| routing_service                |                                                     |
|   enable_routing               | False                                               |
|   routing_by_linux_ipstack     | False                                               |
|   floating_intf_ip[1]          | 192.168.255.18                                      |
|   enable_vmac                  | False                                               |
|   enable_vip_on_all_interfaces | True                                                |
|   advertise_backend_networks   | False                                               |
|   graceful_restart             | False                                               |
|   enable_auto_gateway          | False                                               |
| tenant_ref                     | admin                                               |
| cloud_ref                      | nsx.lab.home                                        |
+--------------------------------+-----------------------------------------------------+

To apply this configuration, you need to switch to the configuration context of the network service by using the following command.

configure networkservice <somename>

There you can set the desired configuration parameters.
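If you prefer the REST API over the CLI, the same network service can be created with a single POST. This is a hedged sketch reusing the session from the SE group example above; the object and field names follow the CLI output shown before, and the referenced objects are resolved via the by-name reference syntax of the API.

# Hedged sketch: create the network service carrying the floating IP via the
# REST API. Field names mirror the "show networkservice" output above.
payload = {
    "name": "nsxtpreserveclientip_sedata",
    "se_group_ref": "/api/serviceenginegroup?name=seg-preserve-clientip",
    "vrf_ref": "/api/vrfcontext?name=t1-preserveclientip-test",
    "cloud_ref": "/api/cloud?name=nsx.lab.home",
    "service_type": "ROUTING_SERVICE",
    "routing_service": {
        "floating_intf_ip": [{"addr": "192.168.255.18", "type": "V4"}]
    },
}

# Write calls with session authentication need the CSRF token and a Referer header.
session.headers.update({
    "X-CSRFToken": session.cookies.get("csrftoken", ""),
    "Referer": CONTROLLER,
})
resp = session.post(f"{CONTROLLER}/api/networkservice", json=payload)
resp.raise_for_status()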

After the network service is created, you can proceed with configuring the LoadBalancing service to enable the preserve client IP feature. For this, it is required to configure a new application profile with preserve client IP enabled. You are also allowed to adjust an existing application profile, but I would recommend creating a dedicated profile. For the application profile you need to decide whether you want to enable preserve client IP for an L4 or an L7 service: for an L4 service, configure an L4 application profile; for an L7 service, create an L7 application profile.

The following screenshot visualizes the configuration parameters for an L4 application profile.

application-profile
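For completeness, the same L4 application profile can also be created via the API. This is a hedged sketch (same session as before, including the CSRF header); the profile name is just an example, and preserve_client_ip is, to my understanding, the API field behind the UI toggle shown in the screenshot.

# Hedged sketch: create an L4 application profile with preserve client IP enabled.
payload = {
    "name": "l4-preserve-clientip",           # example name
    "type": "APPLICATION_PROFILE_TYPE_L4",
    "preserve_client_ip": True,               # assumption: API field behind the UI toggle
}
resp = session.post(f"{CONTROLLER}/api/applicationprofile", json=payload)
resp.raise_for_status()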

After the application profile is created, you need to assign it to the specific virtual service where preserve client IP should be enabled.

virtual-service
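Done via the API, this assignment is a simple read-modify-write on the virtual service. A hedged sketch, with vs-ssh-preserveclientip used as a placeholder name for the virtual service.

# Hedged sketch: point an existing virtual service to the new L4 profile.
resp = session.get(f"{CONTROLLER}/api/virtualservice", params={"name": "vs-ssh-preserveclientip"})
vs = resp.json()["results"][0]
vs["application_profile_ref"] = "/api/applicationprofile?name=l4-preserve-clientip"
resp = session.put(f"{CONTROLLER}/api/virtualservice/{vs['uuid']}", json=vs)
resp.raise_for_status()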

As a last step you should validate that your backend pool is configured with an NSX security group, which discovers the logical ports of the backend applications. If the IP for the pool member is added manually in AVI, the service insertion cannot be applied. The following screenshot shows an example of a well-configured pool assignment.

virtual-service
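From the API side this can be verified by checking whether the pool carries an NSX security group reference instead of manually maintained members. A hedged sketch; the pool name is a placeholder and nsx_securitygroup is, to my understanding, the field that corresponds to the NSX Security Group selection in the UI.

# Hedged sketch: check that the pool is backed by an NSX security group and
# not by manually added member IPs, otherwise the redirect rule cannot be built.
resp = session.get(f"{CONTROLLER}/api/pool", params={"name": "pool-alpine-backend"})
pool = resp.json()["results"][0]
print("nsx_securitygroup:", pool.get("nsx_securitygroup"))  # assumption: NSX group reference field
print("servers          :", pool.get("servers"))            # should be discovered, not static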

Testing

For my tests I have the following setup:

  • LoadBalancer VIP: 10.100.11.10
  • Alpine IP: 10.100.12.2
  • ClientIP: 192.168.178.226
  • Test Protocol: SSH
  • Validating which IP arrives at the backend application: tcpdump -i eth0 port 22
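As an alternative to tcpdump, a small listener on the backend can print the peer address it sees for every connection, which is exactly the address from the IP header. A minimal sketch in plain Python 3; the port 2222 is just a placeholder and would have to match the pool/service port of a test virtual service.

# Minimal sketch: a tiny TCP listener on the backend VM that prints the peer
# address of each incoming connection. With preserve client IP enabled, the
# printed address is the real client IP instead of the Service Engine IP.
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 2222))  # placeholder port
    srv.listen()
    while True:
        conn, peer = srv.accept()
        print("connection from", peer[0])
        conn.close()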

The following screenshot proves that the client IP is forwarded to the Alpine VM.

virtual-service

Furthermore, the following command shows the SSH connection in the session table of the T1 service router.

nsxe01(tier1_sr[3])> get firewall connection state
Sat Jan 11 2025 UTC 02:27:58.672
Connection count: 1
192.168.178.26:59870  -> 10.100.11.10:22  dir in protocol tcp state ESTABLISHED:ESTABLISHED f-8168 n-0

Troubleshooting

Under normal circumstances the configuration should work if all of the requirements are fulfilled, but at some point you might run into a situation where it does not. In such a case it is important to understand how the setup can be analyzed to check whether the service insertion is working or not.

I spent some time figuring out how to get insights into the packet hits and where a packet capture would be helpful. So far, I have only found the following command to see the packet hits for the configured service insertion.

nsxe01> get service-insertion
Sat Jan 11 2025 UTC 02:14:11.958
Service Insertion Policy:

Policy UUID                                  : ab15ceaf-5c0c-4aeb-9efb-1b28d760fe2b
Transport type                               : L3_ROUTED
Is EW policy                                 : 0
Redirected packet count                      : 10
Nexthop IP                                   : 192.168.255.18

As you can see, there is a Redirected packet count parameter which increases if the connection is working as expected. If the backend pool member is not configured properly, the packet count will not increase after testing the communication.

Observations

As soon as preserve client IP is configured for a LoadBalancing service based on service insertion for an NSX-T Overlay implementation with the NSX cloud, the backend application is no longer directly reachable over the same protocol and port. In my example I configured the LoadBalancing service for SSH and the command ssh 10.100.12.2 runs into a timeout, while ICMP is still working.