Performance Metrics
PLX Technology, Inc.
8.4.5 Data Link Layer Considerations
8.4.5.1 Arbitration between DLLP and TLP
To reduce DLLP overhead on the wire, the PEX 8532 uses the following fixed-priority scheme to
determine what transmits next:
1. Completion of a TLP or DLLP currently in transmission.
2. Initialization Flow Control (FC-Init) DLLPs.
3. NAK DLLP.
4. ACK DLLP, due to receipt of a duplicate TLP or ACK Latency Timer expiration. *
5. Update FC DLLP, due to the FC Update Pending Timer expiration. *
6. Retry Buffer TLP, due to received NAK or Retry timeout.
7. New TLPs. *
8. Update FC DLLP, due to change in available credits. *
9. Power Management DLLP.
10. ACK DLLP for the last received TLP. *
Among these ten categories, the five most frequently seen new packets are noted with an asterisk (*).
Update FC and ACK DLLPs each appear twice in the list: once at a higher priority than TLPs, and once
at a lower priority than TLPs. A regular DLLP turns into a higher-priority DLLP based on a programmable
timer. The basic idea is to reduce the number of DLLPs: the timers provide the opportunity to collapse
multiple DLLPs into one. The timers are discussed in Sections 8.4.5.2 and 8.4.5.3.
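The fixed-priority scheme above can be modeled in a few lines. The following Python sketch is
illustrative only: the Source enumeration, next_to_transmit(), and escalate_ack() are hypothetical
names chosen for this example, and the model says nothing about how the switch hardware tracks
pending packets.

```python
from enum import IntEnum


class Source(IntEnum):
    """Transmit categories from the list above; a lower value wins arbitration."""
    IN_FLIGHT = 1         # completion of the TLP or DLLP currently in transmission
    FC_INIT = 2           # Initialization Flow Control DLLPs
    NAK = 3
    ACK_URGENT = 4        # duplicate TLP received, or ACK Latency Timer expired
    UPDATE_FC_TIMER = 5   # FC Update Pending Timer expired
    RETRY_TLP = 6         # Retry buffer TLP after a NAK or Retry timeout
    NEW_TLP = 7
    UPDATE_FC_CREDIT = 8  # available credits changed
    POWER_MGMT = 9
    ACK_LAST_TLP = 10     # ACK for the last received TLP


def next_to_transmit(pending):
    """Return the highest-priority pending source, or None when nothing is pending."""
    return min(pending, default=None)


def escalate_ack(pending, timer_expired):
    """Promote a low-priority ACK to the high-priority slot once its timer expires."""
    if timer_expired and Source.ACK_LAST_TLP in pending:
        return (set(pending) - {Source.ACK_LAST_TLP}) | {Source.ACK_URGENT}
    return set(pending)
```

In this model, a pending ACK loses to a new TLP until the timer expires, after which the same ACK
wins. Any additional TLPs received in the meantime are covered by that single ACK, which is the
collapse effect described above.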
8.4.5.2 DLLP ACK Frequency Control
The ACK Transmission Latency Limit register (offset 1F8h) specifies the minimum amount of time
(in 4 ns clocks) that the switch waits before prioritizing an ACK. When this register is set to the
minimum effective value of 2 (refer to the note below), ACKs are nearly always transmitted with high
priority, which produces the most DLLP traffic and allows the smallest possible Retry buffer in the
other device on the link.
Note: 2 is the minimum value that has an effect; 0 or 1 wait for 255 clocks.
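The note above can be expressed as a small helper that converts a register value into the wait it
actually produces. This is a sketch, not driver code: the function names are hypothetical, and it
assumes the field holds values 0 through 255, based on the maximum of 255 stated in this section.

```python
CLOCK_NS = 4     # one clock is 4 ns, per the register description
MAX_LIMIT = 255  # maximum register value stated in this section


def effective_ack_wait_clocks(reg_value):
    """Clocks the switch waits before prioritizing an ACK (hypothetical helper)."""
    if not 0 <= reg_value <= MAX_LIMIT:
        raise ValueError("register value must be in the range 0..255")
    # Per the note, 0 and 1 do not shorten the wait; both behave as 255 clocks.
    return MAX_LIMIT if reg_value < 2 else reg_value


def effective_ack_wait_ns(reg_value):
    """The same wait expressed in nanoseconds."""
    return effective_ack_wait_clocks(reg_value) * CLOCK_NS
```

For example, a value of 2 yields an 8 ns wait, whereas 0, 1, and 255 all yield 1,020 ns.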
The larger the number written into this CSR, the larger the chance of ACK collapse, and the more
efficient the outgoing TLP throughput can be.
However, setting the ACK Transmission Latency Limit to the maximum (255) allows 255 symbol times
(4 ns each) to elapse before an ACK is prioritized. If the Retry buffer in the external device is not
sufficiently deep, this delay can slow the incoming TLP rate. On a x4 link, 1,020B can be transmitted
in 255 symbol times, which is 51 packets of 20B each. The external device would need a Retry buffer
that can store more than 51 TLPs, so as not to impact a back-to-back burst of incoming TLPs.
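The Retry buffer estimate above follows from simple arithmetic, which the sketch below reproduces
so that it can be repeated for other link widths, timer values, and packet sizes. It assumes one
byte per lane per symbol time, as the x4 figure implies, and the function names are hypothetical.

```python
def bytes_in_window(symbol_times, lanes):
    """Bytes that can cross the link during the ACK wait (one byte per lane per symbol time)."""
    return symbol_times * lanes


def tlps_in_window(symbol_times, lanes, tlp_bytes):
    """Whole packets of tlp_bytes each that fit in the same window."""
    return bytes_in_window(symbol_times, lanes) // tlp_bytes
```

With symbol_times=255, lanes=4, and tlp_bytes=20, the functions return 1,020 bytes and 51 packets,
matching the figures above.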
Because programming a smaller value into this CSR decreases egress TLP throughput but can increase
the ingress TLP throughput, a tradeoff must be addressed.
If there is no TLP traffic, an ACK can be transmitted as a low-priority DLLP earlier than the timer
indicates.
The initial value depends on the programmed link width. However, the value can be overwritten by
serial EEPROM or a regular CSR Write.
ExpressLane PEX 8532AA/BA/BB/BC 8-Port/32-Lane Versatile PCI Express Switch Data Book
Copyright © 2007 by PLX Technology, Inc. All Rights Reserved – Version 1.6