(e) Flow dependency
Zero-cycle latency
The following instruction, ADD, is not stalled when executed after an instruction with zero-cycle latency, even if there is a dependency.
MOV R0,R1
ADD R2,R1
1-cycle latency
ADD and MOV.L are not executed in parallel, since MOV.L references the result of ADD as its destination address.
ADD R2,R1
MOV.L @R1,R1
next
1 stall cycle
2-cycle latency
Because MOV.L and ADD are not fetched simultaneously in this example, ADD is stalled for only 1 cycle even though the latency of MOV.L is 2 cycles.
MOV.L @R1,R1
ADD R0,R1
next
1 stall cycle
2-cycle latency
1-cycle increase
Due to the flow dependency between the load and the SHAD/SHLD shift amount, the latency of the load is increased to 3 cycles.
MOV.L @R1,R1
SHAD R1,R2
next
2 stall cycles
4-cycle latency for FPSCR
FADD FR1,FR2
STS FPUL,R1
STS FPSCR,R2
2 stall cycles
7-cycle latency for lower FR
8-cycle latency for upper FR
FADD DR0,DR2
FMOV FR3,FR5
FMOV FR2,FR4
Diagram annotations: FR3 write, FR2 write
3-cycle latency for upper/lower FR
FLOAT FPUL,DR0
FMOV.S FR0,@-R15
Diagram annotations: FR1 write, FR0 write
Zero-cycle latency
3-cycle increase
FLDI1 FR3
FIPR FV0,FV4
3 stall cycles
2-cycle latency
1-cycle increase
FMOV @R1,XD14
FTRV XMTRX,FV0
3 stall cycles
Figure 8.3 Examples of Pipelined Execution (cont)
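The stall counts labelled in the flow-dependency examples can be related to the stated latencies with a simplified model: the dependent instruction waits for whatever part of the producer's latency (plus any latency increase) is not already covered by the distance between the two issue slots. The Python sketch below is only an illustration under that assumption. The function `stall_cycles` and the `issue_gap` values are hypothetical and are not given in the figure; only the latency, increase, and stall figures come from the examples.

```python
def stall_cycles(latency, issue_gap, increase=0):
    """Simplified stall estimate for a flow-dependent instruction pair.

    latency:   producer latency in cycles, as labelled in the figure
    increase:  extra latency for special producer/consumer pairs
    issue_gap: ASSUMPTION, not stated in the figure: the number of cycles
               between the producer's issue and the slot in which the
               consumer would issue if there were no dependency
               (0 means both could issue in the same cycle)
    """
    return max(0, latency + increase - issue_gap)


# (producer, consumer, latency, increase, assumed issue_gap)
examples = [
    ("MOV R0,R1",     "ADD R2,R1",      0, 0, 0),
    ("ADD R2,R1",     "MOV.L @R1,R1",   1, 0, 0),
    ("MOV.L @R1,R1",  "ADD R0,R1",      2, 0, 1),
    ("MOV.L @R1,R1",  "SHAD R1,R2",     2, 1, 1),
    ("FADD FR1,FR2",  "STS FPSCR,R2",   4, 0, 2),
    ("FLDI1 FR3",     "FIPR FV0,FV4",   0, 3, 0),
    ("FMOV @R1,XD14", "FTRV XMTRX,FV0", 2, 1, 0),
]

for producer, consumer, latency, increase, gap in examples:
    stalls = stall_cycles(latency, gap, increase)
    print(f"{producer:15} -> {consumer:15} stall cycles: {stalls}")
```

With these assumed issue gaps, the sketch gives 0, 1, 1, 2, 2, 3 and 3 stall cycles, which agrees with the stall labels in the corresponding examples. It does not model the pipeline stages themselves or the parallel-issue rules.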
Rev. 6.0, 07/02, page 208 of 986