Signal Description
registers are overridden with the transaction log information for the fatal error, and the fatal error is
logged. However, a non-fatal error cannot override a previous fatal error. Note also that a single
set of transaction-log registers is shared between the non-fatal and first-error-fatal
errors.
The Intel® 6700PXH 64-bit PCI Hub can signal a message on the PCI Express* bus for any of the
first or next error conditions. Software (either from platform BIOS or a system management
controller using SMBus) must deal with fatal errors that override non-fatal errors. The following
algorithm is suggested:
• Check the non-fatal status bits in the first error register (RAS_FEPCI) to see if it is a non-fatal
error.
• If one of these bits is set, set a variable to remember that this error could be overridden.
Hardware will ensure that only one of these bits is set.
• Check the fatal status bits (in RAS_FEPCI) to see if it is a fatal error.
• If one of these bits is set, clear the “could be overridden” variable. This implies that between
the time software read the non-fatal status bits and the fatal status bits, a fatal error occurred
that overrode the non-fatal error. Hardware will ensure that only one of these bits is set.
• Read the RAS registers to determine the address and data of the error, based upon the status
bits.
• If the “could be overridden” variable is still set, read the fatal error status bits again. If one of
these is now set, a fatal error occurred between the time software started reading the RAS
registers and now, so the values already read cannot be trusted because they could have been
overwritten. Re-read the RAS registers.
• Clear the status bit that recorded the error by writing a ‘1’ to it.
• If the first error register is all clear (neither fatal nor non-fatal), then check the next error
register (RAS_NEPCI).
• If one of the fatal or non-fatal bits is set, then clear the error by writing a 1. No log
registers are available for the next error; only the status bits can be read.
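The steps above can be sketched in software roughly as follows. This is a minimal illustration rather than a reference implementation: the bit masks, the log register names (RAS_ADDR, RAS_DATA), the read_reg/write_reg callables, and the FakeHub test double are all assumptions made for the example, since this section does not give the bit layout or the register access method. The sketch also treats the next error register as a fallback that is consulted only when the first error register is clear.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical layout: bit positions and log register names are placeholders,
# not values taken from the datasheet.
NONFATAL_MASK = 0x0F
FATAL_MASK = 0xF0
LOG_REGS = ("RAS_ADDR", "RAS_DATA")


@dataclass
class ErrorReport:
    source: str                      # "first", "next", or "none"
    fatal: bool
    log: Optional[Tuple[int, ...]]   # None when no log registers apply


def service_pci_errors(read_reg: Callable[[str], int],
                       write_reg: Callable[[str, int], None]) -> ErrorReport:
    # Check the non-fatal bits; remember that this error could be overridden.
    nonfatal = read_reg("RAS_FEPCI") & NONFATAL_MASK
    could_be_overridden = bool(nonfatal)

    # Check the fatal bits; if set, the log already belongs to the fatal error.
    fatal = read_reg("RAS_FEPCI") & FATAL_MASK
    if fatal:
        could_be_overridden = False

    if not fatal and not nonfatal:
        # First error register is clear: check the next error register.
        nxt = read_reg("RAS_NEPCI") & (FATAL_MASK | NONFATAL_MASK)
        if not nxt:
            return ErrorReport("none", False, None)
        write_reg("RAS_NEPCI", nxt)          # write 1 to clear; no log to read
        return ErrorReport("next", bool(nxt & FATAL_MASK), None)

    # Read the transaction log (address and data of the error).
    log = tuple(read_reg(name) for name in LOG_REGS)

    # If a fatal error arrived while the log was being read, read it again.
    if could_be_overridden:
        fatal = read_reg("RAS_FEPCI") & FATAL_MASK
        if fatal:
            log = tuple(read_reg(name) for name in LOG_REGS)

    # Clear the status bit for the error being reported by writing 1.
    write_reg("RAS_FEPCI", fatal if fatal else nonfatal)
    return ErrorReport("first", bool(fatal), log)


class FakeHub:
    """Simulated register file (assumption: status bits are write-1-to-clear)."""

    def __init__(self, regs, fatal_on_read=None):
        self.regs = dict(regs)
        self.reads = 0
        self.fatal_on_read = fatal_on_read   # inject a fatal error at this read

    def read(self, name):
        self.reads += 1
        if self.reads == self.fatal_on_read:
            # Assumed behavior: the fatal error replaces the non-fatal status
            # bit and overwrites the shared log registers.
            self.regs.update(RAS_FEPCI=0x10, RAS_ADDR=0xDEAD0000, RAS_DATA=0xBEEF)
        return self.regs.get(name, 0)

    def write(self, name, value):
        self.regs[name] = self.regs.get(name, 0) & ~value
```

Passing FakeHub(...).read and FakeHub(...).write to service_pci_errors exercises the race: with fatal_on_read=4 the fatal error lands between the address and data reads, and the second check of the fatal bits causes the log to be re-read so that the reported address and data belong to the same error.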
RAS logging can be summarized with two terms and three rules. The terms are:
• Context Data: The address/data of the cycle that caused the error. For example, on a cycle that
is split, the context address is the address of the cycle on the original request, not on the
completion.
• Live Data: The value of the pins (address, data, byte enables, header) that caused the error.
The rules are:
• Cycle Errors: Target Abort and Master Abort are cycle errors. In these types of errors, the
context data is stored along with the error indication. This is stored as opposed to live data
because there is nothing fundamentally wrong with the live data – it is the context data that
resulted in the error.
• Address Parity Errors: Live data is stored in these types of errors, because the Intel®
6700PXH 64-bit PCI Hub cannot determine what the intended address was, and the live data
is needed to decode the parity error.
• Data Parity Errors: Live data is stored for the erroneous data, and context address is stored
for the address. The live data is needed to decode the parity error, and the context address is
needed in case software can recover.
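The three rules can be restated as a small lookup table, which may help when writing software that interprets the log. This is only an illustrative model: the error-type names, the Capture type, and the logged_values function are hypothetical, and the table reads "context data" for cycle errors and "live data" for address parity errors as covering both the address and the data.

```python
from typing import Dict, NamedTuple, Tuple


class Capture(NamedTuple):
    address: int
    data: int


# (address source, data source) for each error type, per the three rules.
LOG_SOURCES: Dict[str, Tuple[str, str]] = {
    "target_abort":   ("context", "context"),   # cycle error
    "master_abort":   ("context", "context"),   # cycle error
    "address_parity": ("live", "live"),
    "data_parity":    ("context", "live"),
}


def logged_values(error: str, context: Capture, live: Capture) -> Capture:
    """Return the address/data pair that the rules say would be logged."""
    addr_src, data_src = LOG_SOURCES[error]
    pick = {"context": context, "live": live}
    return Capture(pick[addr_src].address, pick[data_src].data)
```

For a data parity error on a split completion, for example, this model returns the address of the original request together with the data value seen on the pins.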
Intel® 6700PXH 64-bit PCI Hub Datasheet