Performance Metrics
PLX Technology, Inc.
8.2 Non-Blocking Switch
In switch literature, the term non-blocking indicates that a packet can be routed from an ingress port to
an egress port, provided that no more than one packet is received by the same ingress port and no more
than one packet is destined for the same egress port. A non-blocking switch is expected to fully route all
packets for independent ingress traffic streams whose destinations are uniformly distributed.
The PEX 8532 is a non-blocking switch.
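
The condition in the definition above can be expressed as a small check. The following is a minimal sketch for illustration only; the function name `is_conflict_free` and the representation of a packet as an (ingress, egress) pair are assumptions, not part of the PEX 8532 design.

```python
from collections import Counter


def is_conflict_free(packets):
    """Return True if no ingress port and no egress port is used more than once.

    `packets` is a list of (ingress_port, egress_port) pairs offered in one
    time slot. This representation is hypothetical, chosen for illustration.
    """
    ingress_counts = Counter(src for src, _ in packets)
    egress_counts = Counter(dst for _, dst in packets)
    return (all(count <= 1 for count in ingress_counts.values())
            and all(count <= 1 for count in egress_counts.values()))


# Three packets, each on its own ingress and egress port: routable together.
print(is_conflict_free([(0, 2), (1, 3), (2, 0)]))  # True
# Two packets destined for egress port 2: outside the non-blocking condition.
print(is_conflict_free([(0, 2), (1, 2)]))  # False
```

A non-blocking switch is expected to route every packet in a set for which this check returns True.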
8.2.1 Queuing Topology
Three major queuing topologies are used in switch architecture:
• Output Queuing (OQ) – When a packet arrives at an ingress port, it is immediately placed into a
buffer that resides in the corresponding egress port. If, in the worst case, there are N ingress ports
simultaneously attempting to transmit packets to the same egress port, the output buffer is required
to enqueue traffic N times faster than the egress port’s dequeuing rate.
• Input Queuing (IQ) – In this architecture, each ingress port holds its packets in a set of Virtual
Output Queues (VOQs), one for each egress port. During a given time slot, one packet, selected from
among the head packets of the VOQs, is allowed to be scheduled out of that ingress port. The key
factor in achieving high performance using VOQs is the global scheduling algorithm, which is
responsible for selecting the packets to transmit from the ingress ports to the egress ports in each
time slot. The complexity of such a scheduling algorithm is O(N²).
• Combined Input-Output Queuing (CIOQ) – This approach combines input and output queuing. It
provides VOQ buffers at the ingress side, and also provides O(1) bandwidth buffers at the egress
side. The design goal is to achieve the same level of throughput and non-blocking behavior as an OQ
switch, without requiring the O(N) times buffer bandwidth of an OQ switch and without building the
centralized scheduler of an IQ switch, whose complexity is O(N²). To achieve this goal, the CIOQ
approach requires moderate internal fabric speedup, to compensate for transient conflict.
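
The VOQ structure described in the Input Queuing item can be sketched as follows. This is a toy model under stated assumptions: the class name `VoqSwitch`, its methods, and the greedy selection policy are hypothetical and are not the scheduling algorithm used by the PEX 8532. The scheduler examines up to N × N (ingress, egress) pairs per time slot, which illustrates where the O(N²) cost arises.

```python
from collections import deque


class VoqSwitch:
    """Toy input-queued switch: each ingress port holds one VOQ per egress port."""

    def __init__(self, num_ports):
        self.n = num_ports
        # voq[i][j] holds packets waiting at ingress i for egress j.
        self.voq = [[deque() for _ in range(num_ports)]
                    for _ in range(num_ports)]

    def enqueue(self, ingress, egress, packet):
        self.voq[ingress][egress].append(packet)

    def schedule_slot(self):
        """Select at most one packet per ingress and per egress for one slot.

        Greedy scan over the (ingress, egress) pairs; a hypothetical policy
        chosen only to keep the sketch short.
        """
        busy_egress = set()
        sent = []
        for i in range(self.n):
            for j in range(self.n):
                if j not in busy_egress and self.voq[i][j]:
                    sent.append((i, j, self.voq[i][j].popleft()))
                    busy_egress.add(j)
                    break  # this ingress port has used its slot
        return sent


sw = VoqSwitch(3)
sw.enqueue(0, 2, "a")
sw.enqueue(1, 2, "b")  # contends with "a" for egress port 2
sw.enqueue(1, 0, "c")  # queued behind "b" at ingress 1, but in a different VOQ
first = sw.schedule_slot()
second = sw.schedule_slot()
print(first)   # [(0, 2, 'a'), (1, 0, 'c')]
print(second)  # [(1, 2, 'b')]
```

In the first slot, packet "b" loses the contention for egress port 2, yet ingress port 1 still sends "c" from another VOQ. With a single FIFO per ingress port, "c" would have waited behind "b".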
The PEX 8532 uses CIOQ as its internal switching topology to process traffic arriving from different
stations. Packets from one or more ports are aggregated first into a station, whose data path is
sufficiently wide to accommodate traffic from all ports within it at any time. The PEX 8532
implementation includes two stations. In the future, this architecture will be directly scaled up, to deal
with more than two stations.
For independent ingress traffic, it is possible for the CIOQ approach to achieve complete
egress throughput with an internal fabric speedup of only 2 – 1/N. That is, for the two-station
PEX 8532 implementation (N = 2), an internal speedup factor of 1.5 is sufficient to achieve
non-blocking status.
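
The speedup expression 2 – 1/N can be evaluated for a few values of N. The sketch below is illustrative only; the function name `cioq_speedup_bound` is an assumption, and N is taken to be the number of stations, as in the paragraph above.

```python
from fractions import Fraction


def cioq_speedup_bound(n):
    """Return the speedup 2 - 1/N as an exact fraction (n assumed >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 2 - Fraction(1, n)


for n in (2, 4, 8):
    print(n, float(cioq_speedup_bound(n)))
# 2 1.5
# 4 1.75
# 8 1.875
```

The value approaches, but never reaches, 2 as N grows.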
After extensive simulation of standard switching performance factors (input traffic distribution,
packet size distribution, output throughput, port-to-port latency, latency jitter, and
egress-to-ingress backpressure), as well as PCI Express-specific performance factors (Physical Layer
and Data Link Layer overhead, and packet-to-packet dependency caused by PCI ordering), PLX determined
that an internal speedup factor of 1.25 allows the PEX 8532 to be non-blocking.
ExpressLane PEX 8532AA/BA/BB/BC 8-Port/32-Lane Versatile PCI Express Switch Data Book
Copyright © 2007 by PLX Technology, Inc. All Rights Reserved – Version 1.6