Intel SL6NQ Specification Update - Page 35



Intel® Xeon® Processor Specification Update
Errata
P24
Bus invalidate line requests that return unexpected data may result in L1 cache corruption
Problem:
When a bus invalidate line (BIL) request receives unexpected data from a deferred reply, and a
store operation write-combines to the same address, there is a small window in which the L1 cache
is corrupt and loads can retire with the corrupted data. This erratum occurs in the following
scenario:
• An RFO transaction is issued by the processor and hits a line in shared state in the L2 cache.
• The RFO is then issued on the system bus as a zero-length read-invalidate (BIL), since it does
  not need data, just ownership of the cache line.
• This transaction is deferred by the chipset.
• At some later point, the chipset sends a deferred reply for this transaction with an implicit
  write-back response. For this erratum to occur, no snoop of this cache line can be issued
  between the BIL and the deferred reply.
• The processor issues a write-combining store to the same cache line while data is returning to
  the processor. This store straddles an 8-byte boundary.
Note:
Due to an internal boundary condition, a time window exists where the L1 cache contains corrupt
data, which could be accessed by a load.
Implication:
The L1 cache may contain corrupted data. No known commercially available chipsets trigger the
failure conditions.
Workaround:
The chipset could issue a BIL (snoop) to the deferred processor to eliminate the failure conditions.
Status:
For the steppings affected, see the Summary Table of Changes.
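
One of the P24 conditions above is that the write-combining store straddles an 8-byte boundary. The following is a minimal sketch of that address check only. It does not reproduce the erratum, and the function name and parameters are hypothetical, chosen for illustration.

```python
def straddles_boundary(address: int, size: int, boundary: int = 8) -> bool:
    """Return True if the byte range [address, address + size) crosses a
    multiple of `boundary`. The default of 8 matches the 8-byte boundary
    named in P24; the helper itself is an illustrative assumption."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_block = address // boundary               # block holding the first byte
    last_block = (address + size - 1) // boundary   # block holding the last byte
    return first_block != last_block


if __name__ == "__main__":
    # An 8-byte store at an address 4 bytes past an 8-byte boundary crosses it.
    print(straddles_boundary(0x1004, 8))
    # The same store at an 8-byte-aligned address does not.
    print(straddles_boundary(0x1000, 8))
```

A store meets this particular condition only when its first and last bytes fall in different 8-byte blocks; the other conditions in the list above must also hold for the erratum to occur.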
P25
Multiprocessor boot protocol may not complete with an IOQ depth of one
Problem:
When the chipset manages the in-order queue (IOQ) depth to be one entry deep, the system may hang
during the multiprocessor boot protocol. The hang occurs when the chipset drives BNR# in such a
way that the processors are continually throttled off the bus and then released to access the bus
in alternating cycles, which never allows the multiprocessor boot protocol to complete
execution.
Implication:
The system may hang during the multiprocessor boot protocol.
Workaround:
If the chipset drives BNR# in such a way that the processors are continually throttled off the bus
then released to access the bus in alternating cycles, do not use IOQ de-pipelining.
Status:
For the steppings affected, see the Summary Table of Changes.
P26
The processor signals page-fault exception (#PF) instead of alignment check exception (#AC) on an unlocked CMPXCHG8B instruction
Problem:
If a page-fault exception (#PF) and an alignment check exception (#AC) both occur for an unlocked
CMPXCHG8B instruction, then #PF will be flagged.
Implication:
Software that depends on #AC being signaled before #PF will be affected, since #PF is signaled in this case.
Workaround:
Remove the software's dependency on the fact that #AC has precedence over #PF. Alternatively,
correct the page fault in the page-fault handler and then restart the faulting instruction.
Status:
For the steppings affected, see the Summary Table of Changes.
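
The second P26 workaround (correct the page fault, then restart the instruction) can be illustrated with a toy model. This is a sketch under stated assumptions, not processor behavior: the `Fault` enum, both function names, and the restart loop are hypothetical. It only shows that once the handler resolves #PF, the restarted instruction still reports #AC.

```python
from enum import Enum
from typing import List, Optional, Set


class Fault(Enum):
    PF = "#PF"  # page-fault exception
    AC = "#AC"  # alignment check exception


def signaled_fault(pending: Set[Fault], erratum_present: bool) -> Optional[Fault]:
    """Fault reported for an unlocked CMPXCHG8B with the given pending faults.
    With both pending, P26 says #PF is flagged where #AC would take precedence."""
    if Fault.PF in pending and Fault.AC in pending:
        return Fault.PF if erratum_present else Fault.AC
    if pending:
        return next(iter(pending))
    return None


def run_with_restart(pending: Set[Fault], erratum_present: bool) -> List[Fault]:
    """Toy restart loop (assumption): the #PF handler corrects the page fault
    and restarts the instruction. Returns the faults seen, in order, stopping
    once #AC is delivered or no fault remains."""
    remaining = set(pending)
    seen: List[Fault] = []
    while remaining:
        fault = signaled_fault(remaining, erratum_present)
        seen.append(fault)
        if fault is Fault.AC:
            break                      # #AC reaches software; the model stops here
        remaining.discard(Fault.PF)    # handler fixed the page; instruction restarts
    return seen


if __name__ == "__main__":
    both = {Fault.PF, Fault.AC}
    print([f.value for f in run_with_restart(both, erratum_present=True)])
    print([f.value for f in run_with_restart(both, erratum_present=False)])
```

In this model the affected case delivers #PF first and #AC after the restart, so software that does not rely on #AC arriving first still observes the alignment fault.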