Cisco Fabric Extenders (FEX) have been a tremendous success since their introduction more than 7 years ago. In essence, a FEX acts as a remote linecard or slot that is configured and controlled from another switch. For the purposes of this post, we will call that controlling switch the FEX Parent. This allows a network administrator to combine the management benefits of a single switch with the simplified cabling of a Top of Rack environment.
FEX is standardized
As a technology, FEX came to Cisco through the acquisition of Nuova Systems. It is one of the many cool products that stem from the project that brought us the Virtual Interface Card (VIC), Nexus 5000 and the Unified Computing System (UCS). For years this technology was available only on Cisco-manufactured products. Like many Cisco innovations, the concept was ratified into a standard through the IEEE, and its components are now part of the 802.1BR and 802.1Qbh standards. You can now find similar technologies on switches made using Cisco components or components from other ASIC manufacturers, e.g. Broadcom.
FEX across Cisco products
I will spare you my take on the internal operations of a FEX and its parent switch. If you would like to review them, there are multiple blogs out there covering the internal operations. Here is a good example of that. Over the years, many Cisco products have added the software and hardware necessary to control FEXes, including the Nexus 5X00, Nexus 7X00, and most recently the flagship Nexus 9000 family of products.
Over my years supporting Cisco Data Center products, the question of which FEX parent to use has been a key discussion topic with our customers. The question of managing the FEX from a modular switch or a fixed platform always comes up first. Below is an excerpt from an email received from one of the lead architects at a customer:
“Is there a value to purchasing all the extra Nexus 9300’s rather than simply attaching all the FEXes to the Nexus 9500’s?”
Let’s review the different FEX parent models and go through the thought process of how I answered this particular question.
FEX Parent models
Broadly, I consider two models for managing your FEXes: centralized or distributed. What I call the centralized model (below) collapses the FEX management layer onto a modular device that might be serving as the aggregation or core for the Data Center.
The distributed model (below) maintains the typical Cisco layered design approach and virtualizes the access layer using a combination of fixed port parent switches like a Nexus 5X00 or a Nexus 9200/9300. This virtualized switch then connects to an aggregation layer that typically serves as your L2/L3 demarcation point.
Considerations for selecting your FEX Parent
When discussing pros and cons of each solution with my customers, I always boil down the discussion to the points below:
1. Number of devices to manage
2. Failure Domain
3. Control plane implications
4. Cost per FEX Network Interface (NIF) on Parent switch
Number of devices to manage
The modular or centralized FEX parent model has the benefit of reducing the number of devices to manage to potentially two across the whole data center. Traditionally we see this with our smaller customers, where there is a set limit of scale that they are designing to. We also see this model in Disaster Recovery Data Centers, which are typically smaller in size. In the end you would be limited by the number of FEXes and server interfaces supported by a modular platform. This number is posted in the verified scalability guides for the platform that you are implementing. If we expect to go beyond those numbers at full scale, we should only consider the distributed model, since we would end up moving in that direction anyway.
For reference, below is a table with the current supported maximums by platform.
Disclaimer 1: Always validate the supported number of Fabric Extenders and Interfaces based on the software release you are running. Below are links to the verified scalability guides:
Disclaimer 2: When running the Nexus 9000 in ACI mode, the system is built using a spine/leaf distributed model. All Fabric Extenders connect to a leaf in the environment. This enforces the distributed model.
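To make the scale check concrete, here is a minimal Python sketch of how I would sanity check a design against the published limits. The limit values (32 FEXes, 1536 FEX server ports) and the 48 ports per FEX are placeholders I made up for illustration, not Cisco numbers; replace them with the figures from the verified scalability guide for your platform and release.

```python
import math
from dataclasses import dataclass


@dataclass
class ParentLimits:
    """Scale limits for one FEX parent (or parent pair). Placeholder values only."""
    max_fex: int
    max_fex_server_ports: int


def parents_needed(fex_count: int, ports_per_fex: int, limits: ParentLimits) -> int:
    """Return the minimum number of parents required to stay within both limits."""
    if fex_count < 1 or ports_per_fex < 1:
        raise ValueError("fex_count and ports_per_fex must be positive")
    by_fex = math.ceil(fex_count / limits.max_fex)
    by_ports = math.ceil(fex_count * ports_per_fex / limits.max_fex_server_ports)
    # The tighter of the two constraints decides.
    return max(by_fex, by_ports)


# Hypothetical limits; read the real ones from the verified scalability guide.
limits = ParentLimits(max_fex=32, max_fex_server_ports=1536)

for fex_count in (24, 40):
    needed = parents_needed(fex_count, ports_per_fex=48, limits=limits)
    verdict = "centralized is feasible" if needed == 1 else "plan for distributed"
    print(f"{fex_count} FEXes -> {needed} parent(s) needed: {verdict}")
```

If the full-scale design needs more than one parent (or parent pair), the centralized model is already off the table and the rest of the considerations only reinforce that.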
Failure Domain
From a failure domain standpoint, any outage or maintenance performed when using the centralized model will impact most if not all of your workloads and could halve your capacity. This is one of the main reasons for the industry push to a distributed model with smaller spine/cores and leafs being ToR or ToR + FEX. In the distributed model, only a subset of your workloads would be impacted if the parent switch goes down.
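A rough way to put a number on the failure domain is to ask what share of your FEXes sits behind the parent pair that is being touched. The Python sketch below is only back-of-the-envelope arithmetic: it assumes FEXes are spread as evenly as possible across parent pairs, and the 48 FEX total is a made-up example.

```python
def blast_radius(total_fex: int, parent_pairs: int) -> float:
    """Fraction of FEXes behind the most heavily loaded parent pair.

    Assumes FEXes are distributed as evenly as possible across the pairs.
    """
    if total_fex < 1 or parent_pairs < 1:
        raise ValueError("need at least one FEX and one parent pair")
    # Ceiling division gives the FEX count on the fullest pair.
    fex_on_worst_pair = -(-total_fex // parent_pairs)
    return fex_on_worst_pair / total_fex


TOTAL_FEX = 48  # hypothetical data center size

for pairs in (1, 4, 8):
    share = blast_radius(TOTAL_FEX, pairs)
    print(f"{pairs} parent pair(s): {share:.1%} of FEXes affected by work on one pair")
```

With a single pair (the centralized model), every FEX is in the same failure domain. Each additional parent pair shrinks the share of workloads exposed to any one maintenance window.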
Control plane implications
When using the centralized model, I am always concerned about collapsing all of the services onto two switches from a control plane standpoint. Pushing access/leaf services onto the Core/Spine would definitely add scalability implications, both in hardware (TCAM) resource usage for things like ACLs and QoS and in software, in terms of the number of port-channels or virtual port-channels supported. You might run into scenarios where you are not able to enable some features due to TCAM space exhaustion. I have witnessed this firsthand at a recent customer that decided to go this path.
Cost per FEX Network Interface (NIF) on Parent switch
IT departments typically make their hardware buying decisions under a set or restricted budget. One element to take into consideration is the per port cost of the interface on the parent switch that the FEX would connect to. At 10/40G, this cost is lower on the fixed platforms that are typically used in the distributed model. Once you amortize the investment in the redundant components (power supplies, supervisors and fan trays) for platforms that are typically used in the centralized model, the per port cost tends to be higher than on the fixed switches used in the distributed model. It is not until you fill most slots on that modular platform that you see the cost approach the per port cost of fixed platforms.
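The amortization argument is easy to see with a few lines of Python. Every price and port count below is invented purely for illustration; plug in your own quotes. The sketch spreads the cost of the shared chassis components over however many linecard ports are installed and compares the result with a fixed switch.

```python
def modular_cost_per_port(common_cost: float, linecard_cost: float,
                          ports_per_card: int, cards: int) -> float:
    """Per port cost once shared chassis parts are amortized over installed ports."""
    if cards < 1 or ports_per_card < 1:
        raise ValueError("need at least one linecard with at least one port")
    total_cost = common_cost + linecard_cost * cards
    return total_cost / (ports_per_card * cards)


# Hypothetical prices, for illustration only.
COMMON_COST = 60_000    # chassis, supervisors, power supplies, fan trays
LINECARD_COST = 30_000  # one 48-port linecard
FIXED_PRICE = 25_000    # one 48-port fixed switch
PORTS = 48

fixed = FIXED_PRICE / PORTS
print(f"fixed switch: {fixed:,.2f} per port")
for cards in (1, 2, 4, 8):
    modular = modular_cost_per_port(COMMON_COST, LINECARD_COST, PORTS, cards)
    print(f"modular with {cards} card(s): {modular:,.2f} per port")
```

With these made-up numbers, the modular per port cost drops steadily as slots are filled but only approaches the fixed switch figure when the chassis is close to full.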
In the end, there is no right or wrong FEX parent model. I always tend to lead with the distributed model, as it provides a great balance between distributing the control plane and offering a cost-effective option for our customers. Your use case might lend itself better to the centralized model. When the question is which parent to select for your FEX, the answer, as for many technology questions, is IT DEPENDS!