Pipeline Timing

ALU Instructions

ALU-type instructions (add, addi, nand, lui) have the following timing:

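[Figure: ALU-instruction pipeline. Each stage takes one cycle. A stall reason in square brackets applies on entry to the stage it is listed with.]

    Stage   Action                    Stall reason
    Fe      I-fetch into fetchbuf
    iQ      Enqueue                   [IQ ENTRY]
    ID      Schedule & issue          [OPERANDS]
    Ex      Execute, latch result
    Cm      Commit result             [HEAD PTR]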

The words in square brackets represent potential reasons for stalling. In the enqueue phase (second cycle), an instruction can wait arbitrarily long for an open instruction-queue entry, and, once an entry is available (tagged as invalid), the enqueue process takes one cycle. In the issue phase (third cycle), an instruction can wait arbitrarily long for its operands, and, once the operands are valid, the issue process takes one cycle. In the commit phase (last cycle), an instruction can wait arbitrarily long for the head pointer to come around, signifying that the instruction is in the next block of instructions to commit, and, once the instruction is marked “done” and all preceding instructions have committed, the commit process takes one cycle.
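
The stall rules above reduce to simple arithmetic. The following Python sketch is a minimal model of that arithmetic, not the processor's actual logic: the function name `alu_timeline` and its parameters are hypothetical, and each wait is assumed to be a whole number of extra cycles spent before the stage in question.

```python
def alu_timeline(fetch_cycle, iq_wait=0, operand_wait=0, head_wait=0):
    """Return the cycle in which an ALU-type instruction passes each stage.

    Every stage takes one cycle. Each *_wait argument is the number of
    extra cycles the instruction stalls before that stage (an assumption
    of this sketch, not a detail taken from the design).
    """
    fe = fetch_cycle                 # fetch into a fetch buffer
    iq = fe + 1 + iq_wait            # enqueue, after waiting for a free IQ entry
    id_ = iq + 1 + operand_wait      # schedule & issue, after waiting for operands
    ex = id_ + 1                     # execute, latch result
    cm = ex + 1 + head_wait          # commit, after waiting for the head pointer
    return {"Fe": fe, "iQ": iq, "ID": id_, "Ex": ex, "Cm": cm}


print(alu_timeline(0))
print(alu_timeline(0, iq_wait=1, operand_wait=2, head_wait=3))
```

With no stalls, an instruction fetched in cycle 0 commits in cycle 4. With the waits shown in the second call, it commits in cycle 10.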

These are simple instructions, requiring a single cycle of execution in an ALU. They all update the register file. They can have two register operands (add, nand), one register operand (addi), or no register operands (lui). As with any other type of instruction, more than one may be executed and committed simultaneously if there are no inter-instruction dependencies.
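
One way to picture the independence condition is a read-after-write check between two instructions. The sketch below is illustrative only: the tuple encoding `(op, dest, src1, src2)` and the helper names are hypothetical, and it checks only true (read-after-write) dependencies, whereas the real scheduling logic may track other hazards as well.

```python
# Number of register source operands per opcode, as listed in the text above.
NUM_SOURCES = {"add": 2, "nand": 2, "addi": 1, "lui": 0}


def sources(instr):
    """Return the set of source registers that an instruction actually reads."""
    op, _dest, *srcs = instr
    return set(srcs[:NUM_SOURCES[op]])


def depends_on(later, earlier):
    """True if `later` reads the register that `earlier` writes."""
    return earlier[1] in sources(later)


i1 = ("add", 1, 2, 3)     # writes r1, reads r2 and r3
i2 = ("addi", 4, 1, 0)    # writes r4, reads r1 (last slot unused)
i3 = ("lui", 5, 0, 0)     # writes r5, reads no registers (slots unused)

print(depends_on(i2, i1))   # i2 reads r1, which i1 writes
print(depends_on(i3, i1))   # lui has no register sources
```

Here `i2` must wait for `i1`, while `i3` is independent of `i1` and could, in principle, execute alongside it.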

Memory Instructions

Load instructions have the following timing:



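[Figure: load-instruction pipeline. Each stage takes one cycle. A stall reason in square brackets applies on entry to the stage it is listed with. After the MQ stage the path splits: the instruction idles in the instruction queue while the memory system services the request.]

    Stage   Instruction queue                       Memory system   Stall reason
    Fe      Fetch into fetchbuf
    iQ      Enqueue                                                 [IQ ENTRY]
    ID      Schedule & issue                                        [OPERANDS]
    Ex      Execute, latch addr in a1, set g bit
    (none)  (no action)                                             [CHECK FOR ADDRESS CONFLICTS]
    MQ      Send to memq
    W1      Idle                                    Wait 1
    W2      Idle                                    Wait 2
    MR      Idle                                    Memory read
    LR      Latch result, set d bit
    Cm      Commit result                                           [HEAD PTR]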

After the target address is generated, the memory-scheduling logic compares the target address to that of every instruction earlier in the queue. This phase (“check for address conflicts”) can take an arbitrary amount of time until the instruction queue is free of conflicts. When conflicts have been resolved, the request is sent to the memory queue, illustrated by a separation of the two instruction paths. The memory system returns the requested data three cycles later, to be latched on the fourth cycle. When the data returns, the done bit is set and the instruction can commit.
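
A rough software analogue of the conflict check is sketched below. It is a simplification under stated assumptions: the queue is modelled as a Python list ordered oldest first, each entry is a hypothetical dict with `mem` and `addr` fields, and an earlier memory instruction whose address is not yet known (`None`) is conservatively treated as a conflict. None of these names or policies are taken from the actual design.

```python
def has_address_conflict(queue, index):
    """Check the memory instruction at queue[index] against all earlier entries.

    `queue` is ordered oldest first. Each entry is a dict with:
      "mem"  - True for loads and stores
      "addr" - target address, or None if not yet generated
    An earlier memory entry with an unknown address is treated as a
    conflict (a conservative assumption of this sketch).
    """
    target = queue[index]["addr"]
    for earlier in queue[:index]:
        if not earlier["mem"]:
            continue                      # ALU-type entries have no address
        if earlier["addr"] is None or earlier["addr"] == target:
            return True
    return False


queue = [
    {"mem": True,  "addr": 0x10},   # older memory instruction
    {"mem": False, "addr": None},   # ALU instruction
    {"mem": True,  "addr": 0x10},   # same address as entry 0: must wait
    {"mem": True,  "addr": 0x20},   # different address: free to proceed
]
print(has_address_conflict(queue, 2))
print(has_address_conflict(queue, 3))
```

In this model the instruction at index 2 stays in the conflict-check phase until entry 0 leaves the queue, while the instruction at index 3 can be sent to the memory queue right away.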

Store instructions have slightly different timing:



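[Figure: store-instruction pipeline. Each stage takes one cycle. A stall reason in square brackets applies on entry to the stage it is listed with. After the MQ stage the memory system handles the request on a separate path: Wait 1, Wait 2, Memory write.]

    Stage   Instruction queue                       Stall reason
    Fe      Fetch into fetchbuf
    iQ      Enqueue                                 [IQ ENTRY]
    ID      Schedule & issue                        [OPERANDS]
    Ex      Execute, latch addr in a1, set g bit
    (none)  (no action)                             [CHECK FOR ADDRESS CONFLICTS]
    MQ      Send to memq                            [HEAD PTR]
    ..      Set d bit
    Cm      Commit result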

The primary difference is that stores wait to send anything to the memory system until it is known that they will definitely commit. Therefore, the wait for the head pointer to come around happens earlier in the life cycle. Once the request has been handed off to the memory queue, there is no reason for the instruction to remain in the instruction queue, and so it is committed immediately.
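
The difference in where the head-pointer wait falls can be made concrete with a small timeline model. This is a sketch under assumptions, not a description of the hardware: the function names are hypothetical, the stage labels `check` and `d` are placeholders for the stages the figures leave unnamed, and every stage, including the conflict check, is assumed to cost one cycle plus any stall.

```python
def timeline(stages, fetch_cycle=0):
    """Map each stage name to the cycle in which it completes.

    `stages` is a list of (name, stall_cycles) pairs. Each stage takes
    one cycle; stall_cycles is extra waiting time charged to that stage.
    """
    cycles = {}
    cycle = fetch_cycle - 1
    for name, stall in stages:
        cycle += 1 + stall
        cycles[name] = cycle
    return cycles


def load_stages(conflict_wait=0, head_wait=0):
    # Loads wait for the head pointer at the very end, just before commit.
    return [("Fe", 0), ("iQ", 0), ("ID", 0), ("Ex", 0),
            ("check", conflict_wait), ("MQ", 0),
            ("W1", 0), ("W2", 0), ("MR", 0), ("LR", 0),
            ("Cm", head_wait)]


def store_stages(conflict_wait=0, head_wait=0):
    # Stores wait for the head pointer before anything is sent to memory.
    return [("Fe", 0), ("iQ", 0), ("ID", 0), ("Ex", 0),
            ("check", conflict_wait), ("MQ", head_wait),
            ("d", 0), ("Cm", 0)]


print(timeline(load_stages(head_wait=2)))
print(timeline(store_stages(head_wait=2)))
```

With a two-cycle head-pointer wait, the load in this model reaches the memory queue in cycle 5 and commits in cycle 12, whereas the store holds back until cycle 7 before reaching the memory queue and then commits two cycles later, in cycle 9.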

Branches and Jumps

Conditional branches that are predicted not-taken look just like ALU-type instructions: they collect their operands and are issued to the functional units to verify the prediction. Meanwhile, the program counter simply increments as with a regular instruction. Assuming the prediction is correct, pipeline timing is not affected.

Conditional branches that are predicted taken are resolved in the enqueue phase. While the beq instruction is sitting in one of the fetch buffers, the predicted target address is generated and placed into the program counter. Instruction fetch down the predicted path begins on the following cycle. Thus, there is a one-cycle penalty for predicted-taken branches. This is pretty standard for architectures without branch-target buffers.
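
The cost of the one-cycle penalty can be estimated with a back-of-the-envelope count. The sketch below assumes, purely for simplicity, that one instruction is fetched per cycle and that every prediction is correct; the function name and input format are hypothetical.

```python
def fetch_cycles(predicted_taken):
    """Count fetch cycles for a stream of instructions.

    `predicted_taken` holds one boolean per fetched instruction: True for a
    conditional branch predicted taken, False for everything else.
    Each instruction costs one fetch cycle (a simplifying assumption); a
    predicted-taken branch adds one bubble while the predicted target
    address is placed in the program counter.
    """
    return sum(1 + (1 if taken else 0) for taken in predicted_taken)


# Three instructions, the second of which is a predicted-taken branch.
print(fetch_cycles([False, True, False]))
```

Under these assumptions the three-instruction stream takes four fetch cycles instead of three.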
