Basic Router Hardware Concept - Switch Fabric
Contents
Switch Fabric
1.1 Overview
1.2 Switch Fabric Indicators
1.2.1 Backplane Capacity
1.2.2 Switching Capacity
1.2.3 Speedup Factor
1.2.4 Backup Mode of SFUs
1.2.5 Fabric Throughput
1.2.6 Fabric Latency
1.2.7 Fabric Scalability
1.2.8 Unicast, Multicast, and Backpressure
1.2.9 Switch Fabric Performance Indicators
1.3 Switch Fabric Classification
1.3.1 Based on the Type of Packets Sent to the Switch Fabric
1.3.2 Based on the Location of the Memory on the Switch Fabric
1.3.3 Based on the Times of Data Exchanges
1.3.4 Based on the Mode in Which Data Packets Pass the Switch Fabric
1.4 History and Trend of Switch Fabric
1.4.1 Shared Bus Switch (1st Generation)
1.4.2 Shared Memory Switch (2nd Generation)
1.4.3 Crossbar Switch (3rd Generation)
1.4.4 Trend of Switch Fabric
1.5 Introduction to Huawei Switch Fabric
1.5.1 NE80E/NE40E SFU
1.5.2 NE5000E SFU
1.6 FAQ
1.6.1 How to Calculate the LPU Capacity Based on the SFU Capacity?
1.6.2 Can the SFUs on the NE40E, NE80E, and NE5000E Be Installed on the Same Device?
1.6.3 Can the SFUs not Be Fully Configured on an NE40E/NE80E/NE5000E?
1.6.4 Do the SFUs on an NE40E/NE80E/NE5000E Support Hot Swapping?
1.6.5 Does an NE40E-X3 house SFUs?
2014-08-26
Huawei confidential. No spreading without permission.
Page 1 of 22
1.7 Document Change History
Switch Fabric
1.1 Overview
Switch fabric, one of the most important modern communication technologies, is used to transmit information from the sender to the receiver. The simplest method of transmitting information from one point to another is to connect the two points with a single link. When multiple terminals need to communicate with each other in point-to-point mode, every pair of terminals must be connected. As the number of terminals to be interconnected increases, the number of required links multiplies. If N terminals need to be interconnected, the number of required links is N x (N-1) / 2. For example, if 100 terminals need to be interconnected, 4950 links are required.
To reduce the work of installing links, a device that can automatically connect terminals was introduced. After a terminal is connected to the device, the device automatically connects the terminal to other terminals. This device is called a switch. A switch reduces the number of required links from N x (N-1) / 2 to N, greatly reducing the cost of installing links.
Routers are the core devices on IP networks. The switch fabric unit (SFU) is the core component that determines the performance of a router. Generally, the switch fabric technology is determined before a new router is designed. Over the more than twenty years of router history, switch fabric technology has played a significant role in expanding router capacity and improving router performance. Switch fabric technology has gone through three phases: shared bus switch, shared memory switch, and crossbar switch. Correspondingly, routers have gone through three phases: shared bus router, shared memory router, and crossbar router.
With the information explosion on networks, core routers are required to have ever larger capacity. To meet this requirement, the single router has evolved into a cluster, and the single-stage switch has evolved into the multi-stage switch.
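The link-count comparison in the overview can be checked with a few lines of Python. This is only an illustrative sketch; the function names are hypothetical and not part of any product.

```python
def full_mesh_links(n: int) -> int:
    """Links needed to connect every pair of n terminals directly."""
    return n * (n - 1) // 2


def switched_links(n: int) -> int:
    """Links needed when each terminal connects once to a central switch."""
    return n


for terminals in (10, 100, 1000):
    print(terminals, full_mesh_links(terminals), switched_links(terminals))
```

The full-mesh count grows with the square of N, whereas the switched count grows linearly, which is why the saving becomes larger as more terminals are added.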
The word fabric itself means a cloth produced especially by knitting, weaving, or felting fibers. Fabric in the switch fabric technology refers to the switch unit or chip. The words switching, switching fabric, switch fabric, and fabric in this document have the same meaning and all refer to the SFU on routers.
NE80Es/NE40Es are used as an example in this document to illustrate the principles of the switch fabric. The principles of the SFUs on the CX600-8 and NE40E-8 are similar, so are those on the CX600-16 and NE80E, on the CX600-X3/ME60-X3 and NE40E-X3, on the CX600-X8/ME60-X8 and NE40E-X8, and on the CX600-X16/ME60-X16 and NE40E-X16.
This document describes the switch fabric on routers from the following aspects:
Switch fabric indicators
Switch fabric classification
Introduction to various switch fabric technologies
Introduction to Huawei switch fabric
History and trend of switch fabric
FAQs
1.2 Switch Fabric Indicators
This section describes the switch fabric indicators to help you better understand switch fabric.
1.2.1 Backplane Capacity
The backplane is an important component that connects the internal units on a router. The backplane capacity refers to the total bandwidth of the channels between service slots and switching units. It is generally greater than the capacity calculated from the router throughput and actual performance test results. The backplane capacity reflects the manufacturer's technical capability and the router's potential for future capacity expansion. The backplane capacity cannot be directly tested.
A router with a 400G backplane supports 400 Gbit/s of bandwidth on each slot of the backplane. At the beginning, the backplane may be used as a 100G or 200G backplane on the live network. After the technology is improved, if a 400G backplane is required, engineers do not need to replace the original backplane; they upgrade the bandwidth of each slot to 400 Gbit/s on the original backplane. In actual design, to provide 400G access capability, the backplane capacity needs to be much greater than 400G.
Just as the main road of a highway carries vehicles in both directions, the backplane carries services in both directions. Therefore, the backplane must provide sufficient scalability. Planning 8 lanes for a 4-lane requirement is reasonable: as traffic grows, the 4 reserved lanes will eventually be used.
Currently, SerDes links are used in the industry to transmit data on the backplane. The rate of SerDes links varies with the design, for example, 2.5 Gbit/s, 3.125 Gbit/s, 6.25 Gbit/s, or 12.5 Gbit/s. Due to engineering restrictions, engineers cannot freely add physical links on the backplane. To extend the link capacity, the link rate is upgraded instead.
Backplane capacity = Number of SerDes links between LPUs and SFUs x Rate of each SerDes link
In Figure 1-1, the backplane houses 16 LPUs and four SFUs. Each LPU is connected to each SFU through 18 SerDes links (nine for receiving data and nine for sending data). The rate of each SerDes link is 6.25 Gbit/s. The backplane capacity is calculated as [2 x (9 x 4 x 16)] x 6.25 Gbit/s = 7.2 Tbit/s. The value 2 indicates the bidirectional (receiving and sending) capacity. The value 9 indicates the number of SerDes links in each direction between each LPU and each SFU. The value 4 indicates the number of SFUs. The value 16 indicates the number of LPUs. The value 6.25 Gbit/s indicates the rate of each SerDes link.
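The backplane capacity formula can be expressed as a short calculation. The sketch below uses the Figure 1-1 values and also shows how the capacity scales when only the SerDes link rate is upgraded; the function and parameter names are hypothetical.

```python
def backplane_capacity_gbps(links_per_direction, num_sfus, num_lpus, link_rate_gbps):
    """Backplane capacity = number of SerDes links x rate of each link.

    The factor 2 counts the receiving and the sending direction.
    """
    total_links = 2 * links_per_direction * num_sfus * num_lpus
    return total_links * link_rate_gbps


# Same physical links (9 per direction, 4 SFUs, 16 LPUs), different link rates.
for rate in (2.5, 3.125, 6.25, 12.5):
    tbps = backplane_capacity_gbps(9, 4, 16, rate) / 1000
    print(f"{rate} Gbit/s per link -> {tbps} Tbit/s backplane capacity")
```

With 6.25 Gbit/s links the result is 7.2 Tbit/s, matching the figure above. Doubling the link rate doubles the capacity without adding any physical link.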
Figure 1-1 Backplane capacity/switching capacity
(Diagram: LPU0 through LPU15 each connect across the backplane to SFUN0 through SFUN3 over (9+9) SerDes links.)
The backplane connects the LPUs to the SFUs, carries signals along the various control channels, and delivers power from the power modules to the LPUs.
A Serializer/Deserializer (SerDes pronounced sir-deez) is a pair of functional blocks commonly used in high speed communications to compensate for limited input/output. These blocks convert data between serial data and parallel interfaces in each direction. The term "SerDes" generically refers to interfaces used in various technologies and applications. The primary use of a SerDes is to provide data transmission over a single/differential line in order to minimize the number of I/O pins and interconnects.
1.2.2 Switching Capacity
The switching capacity refers to the maximum capacity of an SFU on a router. The switching capacity of a non-blocking SFU is equal to the sum of the capacities of all its interfaces. After a packet enters a router through an LPU, the router adds additional information (overhead) to the packet for protocol switchover. Therefore, the volume of traffic processed by the SFU is greater than that processed by the LPU. The switching capacity is not directly associated with the user bandwidth; it is an indicator that combines the user bandwidth, overhead, and speedup factor to reflect the overall performance of an SFU. The switching capacity of a router is equal to the sum of the switching capacities of all its SFUs.
As an analogy, the SFU on the backplane functions like a toll gate on a highway. The toll gate stops vehicles, collects tolls, and then gives them the green light, which effectively relieves traffic congestion.
Switching capacity of a device = Number of interfaces on SFUs x Rate of each SerDes link x SerDes coding efficiency
In Figure 1-1, the switching capacity is calculated as [2 x (9 x 4 x 16)] x 6.25 Gbit/s x 0.8 = 5.76 Tbit/s. The value 0.8 is the SerDes coding efficiency.
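As a quick check of the switching capacity formula, the following sketch applies the coding efficiency to the raw SerDes bandwidth from Figure 1-1. The names are hypothetical, and the values are the ones quoted in the text.

```python
import math


def switching_capacity_gbps(serdes_links, link_rate_gbps, coding_efficiency):
    """Switching capacity = SerDes links x link rate x coding efficiency."""
    return serdes_links * link_rate_gbps * coding_efficiency


serdes_links = 2 * (9 * 4 * 16)   # both directions, 9 links, 4 SFUs, 16 LPUs
raw = serdes_links * 6.25         # raw backplane bandwidth in Gbit/s
usable = switching_capacity_gbps(serdes_links, 6.25, 0.8)

print(f"raw: {raw / 1000:.2f} Tbit/s, usable: {usable / 1000:.2f} Tbit/s")
assert math.isclose(usable, 5760.0)
```

The difference between the two numbers is the share of the line rate consumed by SerDes coding rather than by payload.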
1.2.3 Speedup Factor
The speedup factor, S for short, is an important indicator for evaluating switch fabric performance. The speedup factor is calculated as follows:
S = SFU interface bandwidth / Physical interface bandwidth = (Number of SerDes links on the SFU x Rate of each SerDes link x SerDes coding efficiency) / (Number of physical interfaces x Rate of each physical interface)
A greater speedup factor indicates less internal blocking or egress blocking on the SFU. Therefore, a greater speedup factor usually indicates higher SFU performance. An SFU with better performance provides better support for multicast and QoS operations.
In Figure 1-2, a router houses four SFUs and one LPU with 100 GE interfaces. The LPU is connected to each SFU through nine SerDes links at the rate of 6.25 Gbit/s. The SerDes coding efficiency is 0.8. Therefore, the speedup factor is calculated as follows:
S = (6.25G x 9 x 4 x 0.8) / (1G x 100) = 1.8
Figure 1-2 Speedup factor
(Diagram: an LPU with interfaces GE0 through GE99, a packet processor, and a fabric interface connects across the backplane to SFU0 through SFU3 over 9 x 6.25G SerDes links.)
A greater speedup factor does not necessarily indicate higher SFU performance. Sometimes a large speedup factor merely results from an overly simple switching algorithm.
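The speedup factor calculation for Figure 1-2 can be reproduced as follows. This is a minimal sketch with hypothetical names; the inputs are the values given in the text.

```python
import math


def speedup_factor(serdes_links, link_rate_gbps, coding_efficiency,
                   num_ports, port_rate_gbps):
    """S = SFU interface bandwidth / physical interface bandwidth."""
    fabric_bandwidth = serdes_links * link_rate_gbps * coding_efficiency
    physical_bandwidth = num_ports * port_rate_gbps
    return fabric_bandwidth / physical_bandwidth


# One LPU, 9 SerDes links to each of 4 SFUs, 100 GE ports.
s = speedup_factor(9 * 4, 6.25, 0.8, 100, 1.0)
print(f"S = {s:.2f}")
assert math.isclose(s, 1.8)
```

A value above 1 means the fabric side of the LPU has more bandwidth than the physical interfaces can offer to it.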
1.2.4 Backup Mode of SFUs
SFUs are core components on routers. All LPUs connect to SFUs through the backplane. If an SFU fails, services are interrupted on all the LPUs connected to that SFU. Therefore, SFU backup is designed to improve router reliability. The common backup modes are the master/slave mode and the load balancing mode.
In master/slave mode, while the master SFU is working, the slave SFU does not work but stands by. Once the system detects that the master SFU has failed, the faulty SFU becomes the slave SFU, and the original slave SFU becomes the master SFU. The master/slave mode is also called the N:M mode. N indicates the number of master SFUs, and M indicates the number of slave SFUs. The process of switching between the master and slave SFUs is called the master/slave SFU switchover.
In load balancing mode, all SFUs work at the same time. If one SFU fails, the other SFUs automatically take over the services on the faulty SFU to prevent service interruption. The load balancing mode is also called the N+M mode. M indicates the maximum number of faulty SFUs that the system can tolerate. N indicates the number of SFUs that can take over the services on the faulty SFUs to prevent service interruption. If the number of faulty SFUs exceeds M, the switch fabric performance of the system deteriorates.
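The effect of SFU failures in N+M load balancing mode can be sketched numerically. The model below rests on two assumptions that the text does not state: all SFUs have equal capacity, and N working SFUs are exactly enough to carry the full load. The function name is hypothetical.

```python
def remaining_capacity_ratio(n, m, failed):
    """Fraction of the required fabric capacity still available.

    Assumption: N+M equal-capacity SFUs share the load, and N of them
    are sufficient for full performance.
    """
    if not 0 <= failed <= n + m:
        raise ValueError("failed must be between 0 and n + m")
    working = n + m - failed
    return min(1.0, working / n)


# Example: 3+1 load balancing.
for failed in range(0, 5):
    print(failed, round(remaining_capacity_ratio(3, 1, failed), 2))
```

Under these assumptions, performance stays at 100% while the number of faulty SFUs does not exceed M and drops proportionally afterwards.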
1.2.5 Fabric Throughput
The fabric throughput indicates the number of cells that a cell-based SFU can process per second, or the number of packets that a packet-based SFU can process per second. The fabric throughput, expressed in million cells per second (Mcps) or million packets per second (Mpps), is greater than the router forwarding capability.
1.2.6 Fabric Latency
The fabric latency indicates the period from the time when an SFU applies for data exchange to the time when the destination interface outputs the data. SFUs are shared components on a router, and the LPUs on the router have to compete for SFU resources. Therefore, the fabric latency of the router refers to the sum of the periods on all LPUs. The scheduling algorithm of the SFU affects the system bandwidth allocation, so the fabric latency affects the QoS. The shorter the fabric latency, the higher the switch fabric performance.
1.2.7 Fabric Scalability
The number of interfaces on an SFU is usually recorded as N x M. N indicates the number of inbound interfaces, and M indicates the number of outbound interfaces. Generally N is equal to M on an SFU. The fabric scalability is generally evaluated in terms of the interface rate, system capacity, and service scalability.
1.2.8 Unicast, Multicast, and Backpressure
Unicast
Unicast means that each data packet sent from a server is forwarded to only one client. An SFU transmits unicast traffic from one interface to one specified interface.
Multicast
Multicast allows a router to copy data packets to multiple channels. In multicast mode, a server can send one data packet to a large number of clients that request the data packet at the same time. Because a single data packet can reach all of these clients, the number of data packets transmitted on networks is greatly reduced. Therefore, multicast improves network usage and reduces transmission costs. Multicast in chips on an SFU is also called spatial multicast. In spatial multicast mode, the SFU copies one data packet from one interface to multiple interfaces based on the multicast group ID.
Backpressure
Backpressure is a method of unidirectional flow control. By notifying the upstream interface of traffic congestion on the downstream interface, backpressure prevents the congestion from worsening. For example, interfaces A and B are communicating with each other. If interface A detects traffic congestion in its memory, interface A sends a special data frame, a backpressure frame, to interface B. After receiving the backpressure frame, interface B stops sending data packets to interface A until the memory resources of interface A are available again.
As a shared component on a router, the SFU is prone to traffic congestion. Different internal backpressure mechanisms are designed for SFUs with different implementations. In addition, the backpressure mechanism applies to the upstream or downstream LPUs connected to the SFU.
Backpressure cannot prevent traffic congestion; it is a response to traffic congestion. When backpressure occurs on a router, traffic congestion has already occurred on the router. Backpressure keeps the congestion from worsening and helps the upstream interfaces process traffic based on the traffic status on the downstream interface. As an analogy, fever is a response to a viral infection. When a person has a fever, the virus has already invaded. The fever suppresses the virus by means of high temperature and helps the person recover.
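The interaction between interfaces A and B described above can be illustrated with a small simulation. This is a sketch only: the buffer size, the resume threshold, and the tick-based timing are assumptions chosen for illustration and do not describe any specific SFU implementation.

```python
from collections import deque


class Receiver:
    """Interface A: buffers packets and asserts backpressure when its memory fills."""

    def __init__(self, capacity, resume_below):
        self.buffer = deque()
        self.capacity = capacity          # assumed buffer size, in packets
        self.resume_below = resume_below  # assumed level at which sending may resume
        self.backpressure = False

    def accept(self, packet):
        self.buffer.append(packet)
        if len(self.buffer) >= self.capacity:
            self.backpressure = True      # equivalent to sending a backpressure frame

    def drain(self, count):
        for _ in range(min(count, len(self.buffer))):
            self.buffer.popleft()
        if len(self.buffer) < self.resume_below:
            self.backpressure = False     # memory is available again


def simulate(ticks, send_per_tick, drain_per_tick, capacity=8, resume_below=4):
    """Interface B sends to interface A and pauses while backpressure is asserted."""
    rx = Receiver(capacity, resume_below)
    sent = 0
    max_depth = 0
    for _ in range(ticks):
        for _ in range(send_per_tick):
            if rx.backpressure:
                break
            rx.accept(sent)
            sent += 1
        max_depth = max(max_depth, len(rx.buffer))
        rx.drain(drain_per_tick)
    return sent, max_depth


sent, max_depth = simulate(ticks=50, send_per_tick=3, drain_per_tick=1)
print(f"packets sent: {sent}, deepest buffer occupancy: {max_depth}")
```

Although the sender offers three packets per tick and the receiver drains only one, the buffer never grows beyond its capacity. The congestion still occurs, but backpressure keeps it from getting worse, as the text explains.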
1.2.9 Switch Fabric Performance Indicators
The switch fabric performance indicators include the following:
Fabric throughput
Fabric latency
Fabric scalability (interface number, interface rate, and service types)
Speedup factor
Costs
Multi-stage switching
QoS scalability
1.3 Switch Fabric Classification
Switch fabrics can be classified along different dimensions: the type of packets sent to the switch fabric, the location of the memory on the switch fabric, the times of data exchanges, and the mode in which data packets pass the switch fabric.
1.3.1 Based on the Type of Packets Sent to the Switch Fabric
Based on the type of packets sent to the switch fabric, switch fabrics can be classified as cell switch fabric and packet switch fabric.
Like ATM switching, cell switching fragments data packets into cells of a specific length. After cell headers are added, the cells are sent to the switch fabric. The switch fabric uses high-speed hardware units to process the fixed-length cells. After the cells reach the outbound interface, the outbound interface reassembles them into data packets. Cell switching is prevalent on routers.
Packet switching allows the switch fabric to forward data packets directly, with additional fields attached, without fragmenting or reassembling them. Packet switching is generally used for shared memory switches and is generally applied to Ethernet switches.
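The fragmentation and reassembly steps of cell switching can be sketched as follows. The cell payload size and the header fields are hypothetical and chosen only for illustration; real fabrics define their own cell formats.

```python
CELL_PAYLOAD_BYTES = 64  # assumed fixed payload size per cell


def segment(packet: bytes, packet_id: int, payload_size: int = CELL_PAYLOAD_BYTES):
    """Fragment a packet into fixed-length cells, each with a small header."""
    chunks = [packet[i:i + payload_size]
              for i in range(0, len(packet), payload_size)] or [b""]
    cells = []
    for seq, chunk in enumerate(chunks):
        header = {
            "packet_id": packet_id,
            "seq": seq,                       # position of the cell in the packet
            "last": seq == len(chunks) - 1,   # marks the final cell
            "length": len(chunk),             # valid bytes before padding
        }
        # Pad the last chunk so that every cell has the same length.
        cells.append((header, chunk.ljust(payload_size, b"\x00")))
    return cells


def reassemble(cells):
    """Rebuild the packet at the outbound interface, whatever the arrival order."""
    ordered = sorted(cells, key=lambda cell: cell[0]["seq"])
    return b"".join(payload[:header["length"]] for header, payload in ordered)


packet = bytes(range(200))
cells = segment(packet, packet_id=1)
print(len(cells), "cells of", len(cells[0][1]), "bytes each")
assert reassemble(list(reversed(cells))) == packet
```

Fixed-length cells are what allow the fabric hardware to schedule every transfer in equal time slots, at the cost of header and padding overhead.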
Figure 1-3 Switch fabric classification based on the type of packets sent to the switch fabric
(Diagram: the classification branches into cell switch and packet switch.)
1.3.2 Based on the Location of the Memory on the Switch Fabric
The switch fabric is a shared component on a router, and all LPUs have to compete for switch fabric resources. The switch fabric uses a specific algorithm to schedule switch fabric resources for LPUs. Data packets are cached on the inbound or outbound interface of a switch fabric and wait to be scheduled. Based on the location of the memory on a switch fabric, switch fabrics can be classified as output queuing (OQ) fabric, input queuing (IQ) fabric, and combined input and output queuing (CIOQ) fabric.
OQ fabric
All cells are cached in the memory of the outbound interface on the SFU and wait to be forwarded. On the OQ fabric, the memory of the outbound interface must be allocated sufficient bandwidth so that the outbound interface can handle the traffic burst when all traffic is sent to the same outbound interface.
IQ fabric
All cells are cached in the memory of the inbound interface on the SFU. Once the cells are scheduled, they are sent to the outbound interface for forwarding. The inbound interfaces on the IQ fabric are prone to head-of-line (HOL) blocking, in which a cell that cannot be forwarded at the head of a queue holds up the cells behind it.
Virtual output queuing: Cells destined for different outbound interfaces are put in different queues on the inbound interface, which protects them against HOL blocking. Virtual output queuing is not a new caching method but an improvement on IQ.
CIOQ fabric
Cells are cached partly on the inbound interface and partly on the outbound interface of an SFU. The CIOQ fabric resolves the HOL problem on the inbound interface and does not place high demands on the memory of the outbound interface. Therefore, the CIOQ fabric is widely used.
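The difference between plain input queuing and virtual output queuing can be seen in a small slot-by-slot model. This sketch assumes a fixed-priority scheduler (lowest inbound interface first) and one cell per inbound and per outbound interface in each time slot; both are simplifications for illustration.

```python
from collections import deque


def fifo_slots(inputs):
    """Plain IQ: one FIFO per inbound interface, only the head cell is eligible."""
    queues = [deque(q) for q in inputs]
    slots = 0
    while any(queues):
        busy_outputs = set()
        for q in queues:
            if q and q[0] not in busy_outputs:
                busy_outputs.add(q.popleft())
        slots += 1
    return slots


def voq_slots(inputs, num_outputs):
    """VOQ: each inbound interface keeps a separate queue per outbound interface."""
    voq = [[q.count(out) for out in range(num_outputs)] for q in inputs]
    slots = 0
    while any(any(row) for row in voq):
        busy_outputs = set()
        for row in voq:
            for out in range(num_outputs):
                if row[out] and out not in busy_outputs:
                    row[out] -= 1
                    busy_outputs.add(out)
                    break  # at most one cell per inbound interface per slot
        slots += 1
    return slots


# Each list holds the destination outbound interface of the queued cells, head first.
traffic = [[0, 0], [0, 1]]
print("FIFO input queuing:", fifo_slots(traffic), "slots")
print("Virtual output queuing:", voq_slots(traffic, num_outputs=2), "slots")
```

In the FIFO case, the cell destined for outbound interface 1 waits behind a blocked cell for interface 0 even though interface 1 is idle. With virtual output queues, that cell is forwarded in the first slot, and the whole batch finishes one slot earlier.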
Figure 1-4 Based on the location of the memory on the switch fabric
(Diagram: the classification branches into OQ, IQ, and CIOQ.)
1.3.3 Based on the Times of Data Exchanges
Based on the times of data exchanges, switch fabrics can be classified as single-stage switch fabric and multi-stage switch fabric. The multi-stage switch fabric can be further classified based on different implementations, as shown in Figure 1-5.
Figure 1-5 Based on the times of data exchanges
(Diagram: a classification tree. The single switch fabric branch covers shared memory and crossbar. The multi-stage switch fabric branch covers the two-stage switch, the three-or-more-stage switch, the multi-stage switch with buffer, the multi-stage switch without buffer, the multi-stage switch for dynamic routes, and the multi-stage switch for static routes, with the Benes switch, Clos switch, and Tandem Banyan as leaf nodes.)
1.3.4 Based on the Mode in Which Data Packets Pass the Switch Fabric
Based on the mode in which data packets pass the switch fabric, switch fabrics can be classified as cut through switching fabric and store and forward switching fabric.
The cut through switching fabric does not wait until a data packet (a cell of specific length or a packet with additional fields) is completely received; it starts sending the received data to the outbound interface immediately. Theoretically, the cut through switching fabric offers a high forwarding rate and low switching latency.
The store and forward switching fabric sends received data packets to the outbound interface only after verifying them, and therefore offers excellent fault tolerance. The cut through switching fabric has defects in interface rate adaptation and fault tolerance. Therefore, the store and forward switching fabric is more commonly used.
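The latency difference between the two modes can be estimated with a simple serialization-delay model. This is a rough sketch under stated assumptions: store and forward must receive the whole packet before sending, whereas cut through is assumed to start sending once a header of a given size has arrived. The header size and link rate are hypothetical, and processing and queuing delays are ignored.

```python
import math


def store_and_forward_wait_us(packet_bytes, link_rate_gbps):
    """Time to receive the whole packet before forwarding starts, in microseconds."""
    return packet_bytes * 8 / (link_rate_gbps * 1000)


def cut_through_wait_us(header_bytes, link_rate_gbps):
    """Time to receive only the header before forwarding starts, in microseconds."""
    return header_bytes * 8 / (link_rate_gbps * 1000)


saf = store_and_forward_wait_us(1500, 10)   # 1500-byte packet on a 10 Gbit/s link
ct = cut_through_wait_us(64, 10)            # assumed 64-byte header
print(f"store and forward: {saf:.4f} us, cut through: {ct:.4f} us")
assert math.isclose(saf, 1.2) and math.isclose(ct, 0.0512)
```

The model shows why cut through is faster in theory. It also hints at the fault tolerance trade-off: a packet that has not been fully received cannot be verified before it is forwarded.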
1.4 History and Trend of Switch Fabric
1.4.1 Shared Bus Switch (1st Generation)
The shared bus switch, the first-generation switch architecture, was developed based on the shared bus technology in the computing industry. All inbound and outbound interfaces are located on the same shared bus, and a specific algorithm allows only one pair of inbound and outbound interfaces at a time to use the bus for communication. Specifically, the inbound and outbound interfaces apply for the use of the bus, and the CPU allocates the resources to interfaces to prevent conflicts. Figure 1-6 shows the shared bus switch architecture.
Figure 1-6 Shared bus switch architecture
(Diagram: a CPU with a routing table unit, and LPU 1, LPU 2, through LPU N, all attached to one shared bus.)
In a shared bus switch, non-blocking switching requires that the sum of all interface bandwidths be smaller than the shared bus bandwidth. In other words, the switching performance of a router is determined by the shared bus bandwidth. It is also affected by the CPU capability. In Figure 1-6, when LPU 1 is communicating with LPU N, shared bus resources are occupied. As a result, LPU 2 cannot communicate with other LPUs.
First-generation routers, such as the Huawei NE16E, generally use the shared bus switch technology.
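The two properties of the shared bus described above, the non-blocking condition and the one-pair-at-a-time rule, can be sketched as follows. The first-come-first-served grant order is an assumption made for illustration; the text only says that the CPU allocates the bus according to a specific algorithm.

```python
from collections import deque


def is_non_blocking(interface_rates_gbps, bus_rate_gbps):
    """Non-blocking condition: total interface bandwidth is below the bus bandwidth."""
    return sum(interface_rates_gbps) < bus_rate_gbps


def grant_bus(requests):
    """Grant the bus to one (source, destination) pair per time slot.

    Assumption: requests are served first come, first served.
    """
    pending = deque(requests)
    schedule = []
    slot = 0
    while pending:
        schedule.append((slot, pending.popleft()))
        slot += 1
    return schedule


requests = [("LPU 1", "LPU N"), ("LPU 2", "LPU 3")]
for slot, (src, dst) in grant_bus(requests):
    print(f"slot {slot}: {src} -> {dst}")
```

While LPU 1 holds the bus in slot 0, the transfer requested by LPU 2 has to wait for slot 1, even though it involves entirely different LPUs.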
1.4.2 Shared Memory Switch (2nd Generation)
As requirements for interface bandwidth increased, the shared bus switch technology could no longer meet them. Firstly, the shared bus switch technology cannot resolve internal conflicts. Secondly, the load on the shared bus increases the design difficulty. In the 1990s, the shared-memory-based switch architecture emerged.
The shared-memory-based switch architecture uses a separate memory controller to control access to each memory. The memory controller stores data in the memory of the inbound interface on the SFU based on a specific algorithm, and instructs the outbound interface to obtain the data from a specific memory. This completes a data switch. In the shared memory switch, a data switch therefore consists of one memory write operation and one memory read operation.
Figure 1-7 shows a typical shared memory switch architecture. The system houses K memory controllers, each of which controls a memory. Each memory controller connects to all the inbound and outbound interfaces. After an inbound interface receives data, the memory controller writes the data into the shared memory, reads the data from the shared memory when the destination outbound interface is available, and then sends the data to the outbound interface. This is a data switch process.
Figure 1-7 Shared memory switch architecture
(Diagram: each memory has its own memory controller, and each memory controller connects to the inbound interfaces and the outbound interfaces.)
Figure 1-8 shows a typical shared memory router architecture. After LPU 1 receives data, the distributed memory management unit writes the data into the distributed memory. Based on the result of searching the routing table for a destination outbound interface, LPU 1 sends the data to LPU N.
Figure 1-8 Shared memory router architecture
(Diagram: LPU 1, LPU 2, through LPU N, each with a distributed memory, connect through the backplane, with data receiving and data sending paths, a distributed memory management unit, and a routing table.)
For a non-blocking shared memory switch architecture, the bandwidth for writing data into the memory must be greater than the total bandwidth of all inbound interfaces, and the bandwidth for reading data from the memory must be greater than the total bandwidth of all outbound interfaces.
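The non-blocking condition translates directly into lower bounds on memory bandwidth. The sketch below computes those bounds for a hypothetical switch with 16 inbound and 16 outbound interfaces at 10 Gbit/s each; the function name and the port counts are illustrative only.

```python
def memory_bandwidth_bounds_gbps(inbound_rates, outbound_rates):
    """Return the bounds that the write, read, and combined bandwidth must exceed."""
    write_bound = sum(inbound_rates)    # every inbound interface may write at once
    read_bound = sum(outbound_rates)    # every outbound interface may read at once
    return write_bound, read_bound, write_bound + read_bound


write_bound, read_bound, total_bound = memory_bandwidth_bounds_gbps([10] * 16, [10] * 16)
print(f"write > {write_bound} Gbit/s, read > {read_bound} Gbit/s, "
      f"combined > {total_bound} Gbit/s")
```

Because the required memory bandwidth grows with the total interface bandwidth, memory access speed limits how far a shared memory switch can scale.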
1.4.3 Crossbar Switch (3rd Generation)
At present, the crossbar switch technology is widely used on core routers in the industry.
Single-Stage Crossbar Switch Fabric
The crossbar switch uses a switch matrix to connect N inbound interfaces to N outbound interfaces. In each scheduling period, the crossbar controller checks the status of the queues on the inbound interfaces, determines the connections between inbound and outbound interfaces, and enables or disables the switches to control the access of inbound interfaces to outbound interfaces. The crossbar controller is the core component of the crossbar switch, and its processing rate determines the crossbar switch performance. The crossbar switch can also connect one inbound interface to multiple outbound interfaces, so multicast is easily implemented.
Crossbar switch advantages: Compared with other switch architectures, the crossbar switch can concurrently transmit N cells within a cell period and offers high throughput. In addition, the cable routing is rather simple. Therefore, the crossbar switch is widely applied.
Crossbar switch disadvantages: The crossbar controller complexity increases with the number of interfaces. When the number of inbound interfaces is smaller than or equal to 64, the crossbar switch is a good choice. As the number of inbound interfaces increases, the crossbar controller complexity grows in proportion to N².
Figure 1-9 Single-stage crossbar switch
(Diagram: an N x N switch matrix of cross-points, a switch matrix controller that reconfigures the switch matrix, and a crossbar controller that handles the control signals for application and authorization.)
The crossbar switch technology is generally used on single-stage switch fabrics, especially in a single chassis, such as the Huawei NE80E.
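What the crossbar controller decides in one scheduling period can be sketched with a simple greedy matching. This is an illustrative assumption, not the algorithm of any particular product; real crossbar controllers use more elaborate matching algorithms. The sketch also counts the cross-points of an N x N matrix.

```python
def crosspoints(n_ports):
    """An N x N crossbar needs one cross-point per inbound/outbound pair."""
    return n_ports * n_ports


def schedule_period(requests):
    """Pick the connections for one scheduling period.

    requests[i] is the set of outbound interfaces for which inbound
    interface i has queued cells. Each inbound interface is connected to
    at most one outbound interface, and each outbound interface to at
    most one inbound interface (unicast only in this sketch).
    """
    config = {}
    taken = set()
    for inbound, wanted in enumerate(requests):
        for outbound in sorted(wanted):
            if outbound not in taken:
                config[inbound] = outbound   # enable this cross-point
                taken.add(outbound)
                break
    return config


config = schedule_period([{0, 1}, {0}, {2}])
print("connections this period:", config)
print("cross-points for 64 ports:", crosspoints(64))
```

In this example the greedy pass connects inbound interface 0 to outbound interface 0 and leaves inbound interface 1 waiting, although connecting 0 to 1 and 1 to 0 would serve both. Finding good matchings quickly for many interfaces is what makes the controller complex.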
Multi-Stage Crossbar Switch Fabric
The multi-stage crossbar switch consists of multiple single-stage crossbar switches. Each switch has its own inbound and outbound interfaces and provides connections between inbound and outbound interfaces. Multiple single-stage crossbar switches are connected to form a large-scale multi-stage crossbar switch. Figure 1-10 shows a common multi-stage switch fabric that houses Am inbound interfaces and Bm outbound interfaces.
Figure 1-10 Multi-stage crossbar switch
(Diagram: Stage 1, Stage 2, through Stage N, joined by inter-stage connections.)
The single-stage crossbar switch is simple in design and reduces costs, but cannot meet requirements for next-generation Internet expansion. The multi-stage crossbar switch is complex in operation, but supports thousands of interfaces. The multi-stage crossbar is necessary for multi-chassis routers. Multi-stage crossbar switch fabrics differ in the single-stage crossbar switch fabrics they use and in their inter-stage connections. The most common multi-stage crossbar switch fabrics are Benes and Clos, both of which are named after their inventors.
Benes Switch Fabric
The Benes switch fabric was invented by Benes in 1964. In the Benes switch architecture, each single-stage switch fabric uses an N x N matrix, and N/d connections are provided between inbound and outbound interfaces. The middle stage can be maintained without interrupting services. The Benes switch fabric, however, cannot ensure that cells are transmitted in sequence, so additional packet sequencing control is required. On a common 3-stage Benes switch fabric, N = d². The first stage fragments packets into cells, and the second and third stages send the cells to the destination interfaces.
Figure 1-11 Multi-stage Benes switch (each switch unit uses a d x d matrix)
Clos Switch Fabric
The Clos switch fabric was invented by Charles Clos in 1953. Clos proved mathematically the conditions under which the fabric is strictly non-blocking or rearrangeably non-blocking, two important features of the Clos switch. Unlike the Benes switch fabric, the Clos switch fabric uses square switch units in stage 2 and non-square switch units in stages 1 and 3. In a non-square switch unit, the number of inbound interfaces can differ from the number of outbound interfaces. To build a switch of a given capacity, the Clos switch needs fewer cross-points than the crossbar switch and scales better. For example, a 100 x 100 non-blocking crossbar switch needs 10000 cross-points, whereas the Clos switch needs only 5700 cross-points. The saving in cross-points grows as the network grows. For these reasons, the Clos switch is widely used.
Figure 1-12 Multi-stage Clos switch fabric (labels: plane algorithm; non-square switch matrix; square switch matrix)
The Clos switch fabric is applied to Huawei NE5000Es.
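The cross-point comparison quoted for the 100 x 100 switch can be reproduced with a short calculation. The sketch below assumes a symmetric 3-stage Clos fabric that is strictly non-blocking (m = 2n - 1 middle-stage units, the condition Clos proved) and assumes n = 10 inbound interfaces per stage-1 unit. The document does not state these parameters; they are chosen because they yield the figure of 5700. The function names are hypothetical.

```python
def crossbar_crosspoints(ports: int) -> int:
    """Cross-points of a single-stage ports x ports crossbar."""
    return ports * ports


def clos_crosspoints(ports: int, n: int) -> int:
    """Cross-points of a symmetric, strictly non-blocking 3-stage Clos.

    ports: total inbound (and outbound) interfaces.
    n:     inbound interfaces per stage-1 unit (assumed parameter).
    """
    if ports % n != 0:
        raise ValueError("ports must be a multiple of n")
    r = ports // n            # number of units in stage 1 and in stage 3
    m = 2 * n - 1             # middle-stage units for strict non-blocking
    edge = 2 * r * (n * m)    # stages 1 and 3: r non-square units of n x m / m x n
    middle = m * (r * r)      # stage 2: m square units of r x r
    return edge + middle


print(crossbar_crosspoints(100))   # 10000
print(clos_crosspoints(100, 10))   # 5700
```

Stages 1 and 3 contribute 3800 cross-points and stage 2 contributes 1900, which also shows why the edge units are non-square (10 x 19) while the middle units are square (10 x 10).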
1.4.4 Trend of Switch Fabric
As network services demand ever higher physical interface capacity, routers require SFU interfaces to be upgraded as follows:
The rate of each interface is upgraded from 3.125 Gbit/s to 6.25 Gbit/s, then from 6.25 Gbit/s to 10 Gbit/s, and even higher.
The number of interfaces is extended from 16 to 64, and beyond.
Various switch fabric technologies are also being combined. For example, the shared memory switch and the crossbar switch are combined to develop new crossbar switches with internal memories. Crossbar switches without internal memories include the Tandem Banyan. A crossbar switch without internal memories needs only a small amount of logic, but requires a large number of intra-chip or inter-chip connections to achieve the same performance as a crossbar switch with internal memories.
Figure 1-13 Trend of switch fabric (labels: extend the interface rate; extend the interface quantity; single-plane switching; multi-plane switching; single-stage switching; multi-stage switching; single-stage single-plane switching; single-stage multi-plane switching, NE40E/NE80E; multi-stage multi-plane switching, NE5000E)
1.5 Introduction to Huawei Switch Fabric
Huawei routers use dedicated SFUs. The SFUs work in N+M load balancing mode. Before packets are sent to the SFU, the Fabric Interface Controller (FIC) fragments them into cells.
1.5.1 NE80E/NE40E SFU
The NE80E/NE40E uses the single-stage multi-plane crossbar architecture. The crossbar is implemented on the SFU, and the FIC is implemented on the LPU. The switch fabric consists of eight planes, which are located on four SFUs. A data packet passes an SFU as follows:
1. After an IP data packet reaches a physical interface on an LPU, the FIC on the LPU fragments the packet into cells, caches the cells in memory, and sends them to the crossbar switch unit on the SFU. Each FIC is connected to all switch planes so that cells can be evenly allocated to the switch planes. This facilitates load balancing and also improves system fault tolerance.
2. After the cells reach the crossbar unit, the crossbar schedules the cells to the outbound interfaces of the SFU and sends them to the FIC on the destination LPU. The cell switching is then complete.
3. After the cells reach the FIC, the FIC reassembles the cells into IP data packets and sends them to the outbound interface on the LPU. The single-stage switching of the IP data packet on the router is then complete.
On the NE80E/NE40E, the FIC is a chip on an LPU.
The NE40E-8 houses only two separate SFUs; the other two SFUs are located on the two switch and routing units (SRUs). The NE40E-8 therefore has four SFUs in total, which work in 3+1 load balancing mode.
Figure 1-14 NE40E/NE80E SFU architecture
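The fragment, spray, and reassemble steps described for the NE80E/NE40E can be illustrated with a small sketch. It is not the FIC's actual cell format or scheduling logic: the 64-byte cell payload, the per-cell sequence number, and the round-robin plane selection are assumptions introduced for the example. Only the number of planes (eight) comes from the text.

```python
from typing import List, Tuple

CELL_PAYLOAD = 64   # bytes per cell; assumed value, not taken from the document
NUM_PLANES = 8      # the NE80E/NE40E switch fabric consists of eight planes

Cell = Tuple[int, int, bytes]  # (sequence number, plane index, payload)


def fragment(packet: bytes) -> List[Cell]:
    """Ingress FIC: cut a packet into cells and spread them over the planes."""
    cells: List[Cell] = []
    for seq, offset in enumerate(range(0, len(packet), CELL_PAYLOAD)):
        plane = seq % NUM_PLANES  # round-robin spraying (assumed policy)
        cells.append((seq, plane, packet[offset:offset + CELL_PAYLOAD]))
    return cells


def reassemble(cells: List[Cell]) -> bytes:
    """Egress FIC: restore the packet from cells, whatever order they arrive in."""
    ordered = sorted(cells, key=lambda cell: cell[0])
    return b"".join(payload for _, _, payload in ordered)


packet = bytes(range(256)) * 4          # a 1024-byte packet
cells = fragment(packet)
per_plane = [sum(1 for c in cells if c[1] == p) for p in range(NUM_PLANES)]
print(len(cells), per_plane)            # 16 [2, 2, 2, 2, 2, 2, 2, 2]

# Cells may arrive in a different order after crossing different planes.
print(reassemble(list(reversed(cells))) == packet)  # True
```

Because every cell carries its position, the loss of one plane only changes which planes carry the cells, which is one way to read the fault-tolerance remark in step 1.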
1.5.2 NE5000E SFU
The NE5000E uses the CIOQ fabric with an internal memory. A single NE5000E and the NE5000E CCC-0 system use the single-stage multi-plane switch fabric. The NE5000E CCC-1 and CCC-2 systems use the multi-stage multi-plane switch fabric, as shown in Figure 1-15. A 3-stage switch fabric, with the switch units named S1, S2, and S3, processes data packets as follows:
Figure 1-15 NE5000E switch fabric architecture
1. After a data packet reaches a physical interface of the LPU on CLC A, the data processing unit on the LPU processes the packet and sends it to the traffic manager (TM). The TM fragments the packet into cells. After being cached and scheduled in queues, the cells are sent to S1 (the SFU on CLC A). Each TM is connected to the switch planes through one or more connections so that cells can be evenly allocated to the switch planes.
2. After the cells reach S1, the switch fabric evenly allocates the cells to S2 (the SFU on the CCC). S2 sends the cells to S3, the SFU on the destination chassis CLC B. After receiving the cells, S3 sends them to the destination LPU.
3. After the cells reach the TM on the destination LPU, the TM reassembles the cells into IP data packets and sends them to the outbound interface on the LPU. The 3-stage switching of the IP data packet on the router is then complete.
The principle of processing data packets on the single-stage switch is similar to that on the multi-stage switch. With the multi-stage switch, however, it is easier to build a large-capacity switching network and to improve the switching performance of the system.
1.6 FAQ
1.6.1 How to Calculate the LPU Capacity Based on the SFU Capacity?
Generally the backplane capacity of a router is greater than the SFU capacity, and the SFU capacity is greater than the LPU capacity. The ratio of the SFU capacity to the LPU capacity is the speedup factor.
To reduce costs and improve system scalability, the backplane capacity is generally planned to be large enough for further LPU expansion. The SFU capacity is usually sized to just cover the current LPU specifications and further LPU expansion.
For example, if the NE40E-8 houses two SFUDs and two SRUs, the SFU capacity is 327.68 Gbit/s, calculated as 4 (number of SFUs) x 8 (number of interfaces on each SFU) x 4 (number of SerDes links on each interface) x 3.2 Gbit/s (rate of each SerDes link) x 0.8 (SerDes coding efficiency). The capacity allocated to each LPU is 327.68 Gbit/s / 8 = 40.96 Gbit/s. With a speedup factor of 2, this gives 20.48 Gbit/s, so each slot on the NE40E-8 supports an LPU with a maximum capacity of 20 Gbit/s.
As another example, if the NE40E-8 houses four SFUGs, the SFU capacity is 1.31 Tbit/s, calculated as 4 (number of SFUs) x 16 (number of interfaces on each SFU) x 8 (number of SerDes links on each interface) x 3.2 Gbit/s (rate of each SerDes link) x 0.8 (SerDes coding efficiency). The capacity allocated to each LPU is 1.31 Tbit/s / 16 = 81.92 Gbit/s. With a speedup factor of 2, this gives 40.96 Gbit/s, so each slot on the NE40E-8 supports an LPU with a maximum capacity of 40 Gbit/s.
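The two capacity calculations above follow the same formula, which can be written out as a short script. The function and parameter names are hypothetical; the numeric inputs (SFU count, interfaces per SFU, SerDes links per interface, 3.2 Gbit/s link rate, 0.8 coding efficiency, divisors 8 and 16, speedup factor 2) are the ones quoted in the examples.

```python
def sfu_capacity_gbps(num_sfus: int, ifs_per_sfu: int, serdes_per_if: int,
                      serdes_rate_gbps: float = 3.2,
                      coding_efficiency: float = 0.8) -> float:
    """Total SFU capacity in Gbit/s, following the product used in the text."""
    return (num_sfus * ifs_per_sfu * serdes_per_if
            * serdes_rate_gbps * coding_efficiency)


def lpu_capacity_gbps(total_gbps: float, shares: int, speedup: float) -> float:
    """Capacity usable by one LPU: per-LPU share divided by the speedup factor."""
    return total_gbps / shares / speedup


# Two SFUDs plus two SRUs: 4 SFUs, 8 interfaces each, 4 SerDes links each.
sfud_total = sfu_capacity_gbps(4, 8, 4)
print(f"{sfud_total:.2f}")                                # 327.68
print(f"{sfud_total / 8:.2f}")                            # 40.96
print(f"{lpu_capacity_gbps(sfud_total, 8, 2):.2f}")       # 20.48

# Four SFUGs: 4 SFUs, 16 interfaces each, 8 SerDes links each.
sfug_total = sfu_capacity_gbps(4, 16, 8)
print(f"{sfug_total:.2f}")                                # 1310.72
print(f"{sfug_total / 16:.2f}")                           # 81.92
print(f"{lpu_capacity_gbps(sfug_total, 16, 2):.2f}")      # 40.96
```

The results of 20.48 and 40.96 Gbit/s correspond to the 20G and 40G LPU limits stated in the examples.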
In general usage, capacity refers to bidirectional capacity. For example, a common GE interface receives at 1 Gbit/s and sends at 1 Gbit/s. Its bidirectional capacity is therefore 2 Gbit/s, and its unidirectional capacity is 1 Gbit/s. Unless otherwise specified, capacity in this document refers to unidirectional capacity.
If the unidirectional capacity is to be calculated, the backplane, SFU, and LPU capacity that will be used in calculation must be unidirectional. If the bidirectional capacity is to be calculated, the backplane, SFU, and LPU capacity that will be used in calculation must be bidirectional. The capacity standard must be consistent.
1.6.2 Can the SFUs on the NE40E, NE80E, and NE5000E Be Installed on the Same Device?
The SFUs on the NE80E and NE5000E are almost the same in appearance. As described earlier, however, the switch architectures of the NE40E, NE80E, and NE5000E are different, and the SFUs of the three models cannot be installed on the same device.
1.6.3 Can the SFUs Be Partially Configured on an NE40E/NE80E/NE5000E?
The SFUs on an NE40E-8/NE80E/NE5000E work in 3+1 load balancing mode. If one SFU fails or is removed, the other three SFUs automatically take over the services of the faulty SFU. If two or more SFUs fail or are removed, the system can still work, but the switching performance deteriorates. Therefore, a full configuration of four SFUs in 3+1 load balancing mode is recommended on an NE40E-8/NE80E/NE5000E. The SFUs on an NE40E-X8 work in 2+1 load balancing mode.
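The effect of missing SFUs in an N+M configuration can be pictured with a simple model. The sketch assumes that every SFU contributes an equal share of switching capacity and that N working SFUs are enough for full performance; the document states only that performance deteriorates when two or more SFUs are missing in 3+1 mode, so the exact fractions below are illustrative, and the function name is hypothetical.

```python
def relative_switching_capacity(working_sfus: int, n: int = 3, m: int = 1) -> float:
    """Fraction of full switching performance in N+M load balancing mode.

    Assumes equal-capacity SFUs and that N of them carry the full load.
    """
    if not 0 <= working_sfus <= n + m:
        raise ValueError("working_sfus must be between 0 and n + m")
    return min(1.0, working_sfus / n)


for working in (4, 3, 2, 1):
    print(working, f"{relative_switching_capacity(working):.2f}")
# 4 1.00
# 3 1.00
# 2 0.67
# 1 0.33

# NE40E-X8 in 2+1 mode with one SFU removed
print(f"{relative_switching_capacity(2, n=2, m=1):.2f}")  # 1.00
```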
1.6.4 Do the SFUs on an NE40E/NE80E/NE5000E Support Hot Swapping?
The SFUs on an NE40E/NE80E/NE5000E support hot swapping in both software and hardware design. To ensure system reliability when an SFU is removed or installed and to reduce packet loss during switching, an OFFLINE button and an OFFLINE indicator are provided on the SFU. Before removing an SFU, press the OFFLINE button to instruct the system to start the necessary reliability mechanisms. After the OFFLINE indicator comes on, you can remove the SFU.
1.6.5 Does an NE40E-X3 House SFUs?
An NE40E-X3 uses the full-mesh architecture and houses no separate SFUs. The LPUs and subcards on an NE40E-X3 are fully meshed through high-speed buses. Due to the limitations of the full-mesh architecture, routers in full-mesh mode support only a limited number of LPUs.
1.7 Document Change History
Version    Published Date    Change History
01         2013-09-12        Initial Release.
02         2014-08-26        5.76Gbps is corrected to 5.76Tbps.