(a) Serial execution: non-parallel-executable instructions
EX-group SHAD and EX-group ADD cannot be executed in parallel. Therefore, SHAD is issued first, and the following ADD is recombined with the next instruction.
    SHAD R0,R1    I D EX NA S                    (1 issue cycle)
    ADD  R2,R3    I (1 stall cycle) D EX NA S
    next          I D ...
(b) Parallel execution: parallel-executable and no dependency
EX-group ADD and LS-group MOV.L can be executed in parallel. Overlapping of stages in the 2nd instruction is possible.
    ADD   R2,R1     I D EX NA S    (1 issue cycle)
    MOV.L @R4,R5    I D EX MA S
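The pairing decision behind panels (a) and (b) can be sketched as a small check. This is a simplified, hypothetical model, not the processor's issue logic: the table holds only the two group pairings shown in the figure, the dictionary layout and the name `can_issue_together` are illustrative, and the dependency test (the second instruction must not touch a register the first one writes) is an assumption made for the sketch.

```python
# Simplified, hypothetical model of the pairing decision in panels (a) and (b).
# Only the two group pairings drawn in the figure are encoded.
PARALLEL_EXECUTABLE = {
    frozenset({"EX"}): False,        # EX + EX: panel (a), SHAD then ADD
    frozenset({"EX", "LS"}): True,   # EX + LS: panel (b), ADD with MOV.L
}


def can_issue_together(first, second):
    """Return True if the two instructions may share one issue cycle."""
    pair = frozenset({first["group"], second["group"]})
    if pair not in PARALLEL_EXECUTABLE:
        raise KeyError(f"pairing {sorted(pair)} is not covered by this sketch")
    if not PARALLEL_EXECUTABLE[pair]:
        return False
    # Assumed dependency test: the second instruction must not read or
    # write a register that the first instruction writes.
    return not (first["writes"] & (second["reads"] | second["writes"]))


shad = {"group": "EX", "reads": {"R0", "R1"}, "writes": {"R1"}}   # SHAD R0,R1
add_a = {"group": "EX", "reads": {"R2", "R3"}, "writes": {"R3"}}  # ADD R2,R3
add_b = {"group": "EX", "reads": {"R2", "R1"}, "writes": {"R1"}}  # ADD R2,R1
mov_l = {"group": "LS", "reads": {"R4"}, "writes": {"R5"}}        # MOV.L @R4,R5

print(can_issue_together(shad, add_a))   # panel (a): False
print(can_issue_together(add_b, mov_l))  # panel (b): True
```

Any pairing outside the two shown raises an error rather than guessing, since the figure alone does not define the remaining group combinations.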
(c) Issue rate: multi-step instruction
AND.B and MOV are fetched simultaneously, but MOV is stalled due to resource locking. After the lock is released, MOV is refetched together with the next instruction.
    AND.B #1,@(R0,GBR)    4 issue cycles
    MOV   R1,R2           4 stall cycles
    next                  ...
[Pipeline diagram not reproduced. AND.B is drawn as four successive D, SX steps, each followed by MA or NA and then S; MOV R1,R2 shows a second fetch stage, drawn as a lowercase i, before its D stage.]
(d) Branch
No stall occurs if the branch is not taken.
    BT/S L_far    I D EX NA S
    ADD  R0,R1    I D EX NA S
    SUB  R2,R3    I D EX NA S
If the branch is taken, the I-stage of the branch destination is stalled for the period of latency. This stall can be covered with a delay slot instruction which is not parallel-executable with the branch instruction.
    BT/S L_far    I D EX NA S
    ADD  R0,R1    I D EX NA S
    L_far         I D ...
    Annotations: 2-cycle latency for I-stage of branch destination; 1 stall cycle.
Even if the BT/BF branch is taken, the I-stage of the branch destination is not stalled if the displacement is zero.
    BT  L_skip    I D EX NA S
    ADD #1,R0     I D - - -
    L_skip:       I D ...
    Annotation: No stall.
Figure 8.3 Examples of Pipelined Execution
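The three branch cases in panel (d) can be condensed into a lookup. This is only a summary of the stall annotations drawn in the figure, under the assumption that the taken, non-zero-displacement case is the one shown (BT/S with ADD in the delay slot); the function name and parameters are hypothetical, and no other branch form is modeled.

```python
def annotated_stall_cycles(branch_taken, zero_displacement=False):
    """Stall annotation from panel (d) for the three drawn cases only."""
    if not branch_taken:
        return 0  # "No stall occurs if the branch is not taken."
    if zero_displacement:
        return 0  # taken BT/BF with zero displacement: "No stall"
    return 1      # taken BT/S L_far with ADD in the delay slot: "1 stall cycle"


print(annotated_stall_cycles(branch_taken=False))                         # 0
print(annotated_stall_cycles(branch_taken=True))                          # 1
print(annotated_stall_cycles(branch_taken=True, zero_displacement=True))  # 0
```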
Rev. 6.0, 07/02, page 207 of 986