Pipeline Phases

Fetch Phase

During the fetch phase, exactly two instructions are read from the instruction memory and placed into one of the two fetch buffers, each of which is wide enough to hold an instruction-fetch packet of two 16-bit instructions, plus the associated PC values for the instructions fetched. If no fetch buffer is available (e.g., if both of them are full or partially full), then fetch stalls. As in the Alpha’s fetch/enqueue mechanism, there are two buffers, each as wide as a fetch packet, and the enqueue mechanism does not move on to the second fetch buffer until the first is completely drained.
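The stall condition can be illustrated with a small behavioral model. This is only a sketch, not the IP core's logic: the names FetchBuffer, fetch, and PACKET_WIDTH are hypothetical, and the model assumes a word-addressed PC that advances by one per instruction.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

PACKET_WIDTH = 2  # instructions per fetch packet, as stated in the text


@dataclass
class FetchBuffer:
    # Each slot holds (instruction_word, pc), or None once the slot is invalid.
    slots: List[Optional[Tuple[int, int]]] = field(
        default_factory=lambda: [None] * PACKET_WIDTH
    )

    def is_empty(self) -> bool:
        return all(s is None for s in self.slots)


def fetch(buffers: List[FetchBuffer], imem: List[int], pc: int) -> Tuple[int, bool]:
    """Try to fetch one packet at `pc`. Returns (next_pc, stalled)."""
    for buf in buffers:
        if buf.is_empty():
            # Assumption: the PC is word-addressed, so consecutive
            # instructions sit at pc and pc + 1.
            buf.slots = [(imem[pc + i] & 0xFFFF, pc + i) for i in range(PACKET_WIDTH)]
            return pc + PACKET_WIDTH, False
    # Both buffers are full or partially full: fetch stalls this cycle.
    return pc, True
```

A partially drained buffer still counts as unavailable here, which matches the rule that a buffer is only refilled once both of its slots are invalid.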

Enqueue Phase

During the enqueue phase, up to two instructions are taken from one of the fetch buffers and placed at the tail of the instruction queue. The slots are designated tail0 and tail1 in the queue. If the first instruction in the buffer is a BEQ that is predicted taken (in this implementation, the predictor is a simple backward-taken/forward-not-taken algorithm), the second instruction is squashed. If there are instructions in the other fetch buffer (the one that is currently not being considered for enqueue), those instructions are squashed as well. If the second instruction in the fetch buffer is a predicted-taken branch, only instructions in the alternate fetch buffer are squashed.

When an instruction in one of the fetch buffers is enqueued or squashed (by a predict-taken branch), it is marked as invalid in the fetch buffer. Once both instructions are invalid, the enqueue mechanism looks at the alternate fetch buffer.
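The squash rules above can be sketched as follows. The slot layout (opcode, signed branch offset, pc) and the function names are assumptions made for illustration, and the sketch ignores the case where the instruction queue has no free tail slots.

```python
from typing import List, Optional, Tuple

# A fetch-buffer slot is (opcode, signed_branch_offset, pc), or None once invalid.
Slot = Optional[Tuple[str, int, int]]


def predicted_taken(opcode: str, offset: int) -> bool:
    """Backward-taken/forward-not-taken: only a BEQ with a negative offset."""
    return opcode == "BEQ" and offset < 0


def enqueue(current: List[Slot], alternate: List[Slot], queue: list) -> None:
    """Move valid instructions from `current` to the queue tail, squashing as needed."""
    for i, slot in enumerate(current):
        if slot is None:
            continue
        opcode, offset, _pc = slot
        queue.append(slot)   # lands in tail0 or tail1
        current[i] = None    # enqueued, so now invalid in the buffer
        if predicted_taken(opcode, offset):
            # Squash everything that follows the branch in fetch order:
            # the rest of this buffer and all of the alternate buffer.
            for j in range(i + 1, len(current)):
                current[j] = None
            for j in range(len(alternate)):
                alternate[j] = None
            break
```

When the branch sits in the second slot, the inner loop over `current` has nothing left to squash, so only the alternate buffer is cleared.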

Data-Incoming Phase

When an instruction is enqueued, it may or may not have all of its operands. If it does, it can be sent to the appropriate functional unit on the following cycle. Otherwise, it must wait in the instruction queue for its operands. During this phase, instructions scan the various result buses for their data. These buses include the ALU0/ALU1 buses, which carry values returning from the ALUs; the Memory Bus, which carries load values returning from data memory; and the Commit Bus, which carries the values that are currently being written to the register file. If an instruction sees that one or more of its operands is invalid and the operand’s src (Source) tag matches the ID of the data on one of these buses, it gates the associated data into its local operand storage.
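A minimal sketch of this bus snooping, for one operand, might look like the following. The Operand class, the snoop function, and the dictionary of buses are hypothetical names; each bus is assumed to carry either nothing or a (producer ID, data) pair in a given cycle.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Operand:
    value: int = 0
    valid: bool = False
    src: Optional[int] = None  # ID of the queue entry that will produce the value


def snoop(operand: Operand, buses: Dict[str, Optional[Tuple[int, int]]]) -> None:
    """Gate data from a result bus into an operand that is still waiting for it."""
    if operand.valid:
        return  # already has its value; ignore the buses
    for payload in buses.values():
        if payload is None:  # nothing driven on this bus this cycle
            continue
        producer_id, data = payload
        if producer_id == operand.src:
            operand.value = data
            operand.valid = True
            return
```

For example, an operand with src equal to 5 picks up the value from whichever bus currently carries ID 5, and an operand whose producer is not on any bus is left unchanged.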

Issue Phase

Once all of an instruction’s operands are valid, its operands, opcode, and ID are sent to the input-registers of the appropriate functional unit. The instruction’s opcode directs it to the appropriate bus: LW/SW instructions go to the memory queue; all other opcodes go to an ALU.

Memory operations are a special case of issue, because they actually issue twice: once to an ALU to generate the target address, and then to the memory queue when the address is known. Therefore, the issue logic considers an instruction ready to issue if all of its operands are valid, or if it is a memory operation and its ARG1 operand is valid. In the latter case, the instruction is issued to an ALU as an addi instruction, and its result is gated into the ARG1 operand upon completion, thereby overwriting the previous valid contents. Upon completion of the addi instruction, the g (Address generated) bit in the instruction-queue entry is set, indicating that there is no need to re-issue the address-generation portion of the instruction. For load-word operations, the instruction is then ready to be issued to the data memory immediately. For store-word operations, the ARG2 operand must also be valid, as it contains the value to be stored. Also, as described below, a store instruction must be issued at the time of commit.
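The two-step readiness test for memory operations can be written out as a pair of predicates. This is a sketch under simplifying assumptions: Entry, ready_for_alu, and ready_for_memory are hypothetical names, the o (out) bit is not modeled, and "at commit" is passed in as a flag rather than derived from the queue position.

```python
from dataclasses import dataclass

MEMORY_OPS = {"LW", "SW"}


@dataclass
class Entry:
    op: str
    a1_v: bool = False  # ARG1 valid
    a2_v: bool = False  # ARG2 valid
    g: bool = False     # address generated


def ready_for_alu(e: Entry) -> bool:
    """True if the entry may issue to an ALU."""
    if e.op in MEMORY_OPS:
        # Address generation needs only ARG1, and is done at most once.
        return e.a1_v and not e.g
    return e.a1_v and e.a2_v


def ready_for_memory(e: Entry, at_commit: bool = False) -> bool:
    """True if a memory operation may be sent on to the data memory."""
    if e.op == "LW":
        return e.g
    if e.op == "SW":
        # A store also needs its data operand and waits until commit.
        return e.g and e.a2_v and at_commit
    return False
```

A load with only ARG1 valid is therefore ready for the ALU but not for memory, and after its g bit is set the situation reverses.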

Execute Phase

Issue and execute do not happen in the same cycle: it is a two-cycle process to obtain a result once all of an instruction’s operands are valid. As described in the previous section, the instruction’s operands, opcode, and ID are moved to the ALU’s input registers during the issue phase. On the following cycle the instruction is executed, and the resulting values are sent out on the various result buses to be latched at the end of the cycle. Memory operations take longer than a single cycle, so load results appear on the memory bus several cycles after they are initiated (this latency is an arbitrary choice and is not inherent to the architecture).

Table 1. Fields of an instruction-queue entry

Field      Description
v          valid bit: signifies whether or not the entry’s contents are valid
d          done bit: signifies whether or not the instruction has completed execution
o          out bit: signifies whether or not the instruction has been sent to a functional unit
g          address generated bit: signifies that the instruction’s memory address has been generated (only applies to LW/SW operations)
res        the register-file value that the instruction produces, if any
op         instruction opcode
bt         branch bit: signifies that the instruction was predicted taken
tgt        the instruction’s register target, or zero if the instruction has no target (BEQ, SW)
a0         the instruction’s extended/shifted immediate value
a1/a2      the instruction’s first/second register operand
a1_v/a2_v  valid bit for a1/a2: indicates whether the value in a1/a2 is valid
a1_s/a2_s  source bit for a1/a2: specifies the data source for a1/a2 (the location in the instruction queue of the instruction that will produce the value)
im         the instruction’s third register operand or its immediate value
pc         the address of the instruction, to be used if the instruction causes an exception or a branch mispredict
exc        the exception code, or zero if the instruction has caused no exception
it         the instruction’s type, which indicates whether or not it touches memory (a or m) and whether or not it can cause a change in control flow (b); since this can be deduced from the opcode, it is not strictly necessary
I          issue bit: signifies that the instruction is ready to be issued
F          flush bit: signifies that the instruction will be flushed
S          source bit (see iqentry_n_livetarget in the IP core)
Mr         memory-ready bit: signifies that the memory instruction could be sent to the memory queue
Mi         memory-issue bit: signifies that the instruction is ready to be sent to the data memory

In the instruction-queue entries, results are gated into the various operand registers as described above, but they are also gated into the res (Result) registers of those instructions whose IDs are on the result buses. When this happens, the instruction is marked done. The exception to this rule is the ALU result for the effective-address generation component of a memory operation: when this result appears on the bus, the instruction is not yet done.
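The done-marking rule and its address-generation exception can be sketched like this. The dictionary keys mirror the field names in Table 1, but the function capture_result, the "id" key, and the bus names are assumptions made for the example.

```python
def capture_result(entry: dict, bus_name: str, producer_id: int, data: int) -> None:
    """Latch a result-bus value into the entry that produced it."""
    if producer_id != entry["id"]:
        return  # the value on this bus belongs to some other instruction
    is_mem = entry["op"] in ("LW", "SW")
    if is_mem and bus_name.startswith("alu") and not entry["g"]:
        # Effective-address result: overwrite ARG1 and set g,
        # but the instruction is not done yet.
        entry["a1"] = data
        entry["g"] = True
        return
    entry["res"] = data
    entry["d"] = True
```

A load thus sees its own ID twice: first on an ALU bus, which only sets g, and later on the memory bus, which fills res and sets d.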

Commit Phase

When an instruction has completed, it gates the contents of one of the various result buses into its res register and sets the d (done) bit. This signifies that the instruction is ready to commit its result to the “permanent” machine state: the register file and/or data memory.

As described in the section above on reorder buffers, this mechanism protects the machine from instructions that would be squashed because of exceptions or mispredicted branches: if an instruction commits its result to the register file and is later found to have immediately followed an instruction that causes an exception or a branch instruction that was mispredicted, it is very difficult to un-do the register-file update. It is even more difficult to un-do changes to main memory. Therefore, changes of this nature to the “permanent” machine state (as opposed to the contents of the instruction queue) are only allowed to occur once it is known that the instruction is non-speculative and definitely causes no exceptions.

On every cycle, the commit logic considers the top two instructions in the instruction queue: those at the head of the queue, labeled head0 and head1. If there is a mispredicted branch in the machine, commit does not proceed, unless the mispredicted branch is in the head1 slot or later. If head0 is ready to commit, it does so. If head0 and head1 are both ready to commit, both do so. Otherwise, nothing happens.

When an instruction commits, its result is sent to the register file and made available to other instructions needing operands. If the instruction is a SW, it is sent to the memory system.
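In-order commit of up to two instructions per cycle can be sketched as below. This is an illustration only: the commit function and the list/dictionary representation are hypothetical, a store is assumed to be marked done once its address (a1) and data (a2) are available, and mispredicted branches and exceptions are left out.

```python
from typing import Dict, List


def commit(queue: List[dict], regfile: List[int], memory: Dict[int, int]) -> int:
    """Commit head0, then head1, stopping at the first entry that is not done.

    Returns the number of instructions committed this cycle (0, 1, or 2).
    """
    committed = 0
    for _ in range(2):  # head0 and head1
        if not queue or not queue[0]["d"]:
            break  # head1 can never commit ahead of head0
        entry = queue.pop(0)
        if entry["op"] == "SW":
            # Stores update data memory only at commit time.
            memory[entry["a1"]] = entry["a2"]
        elif entry["tgt"] != 0:
            # tgt is zero for instructions with no register target.
            regfile[entry["tgt"]] = entry["res"]
        committed += 1
    return committed
```

Because the loop stops at the first entry whose d bit is clear, a finished head1 behind an unfinished head0 leaves the machine state untouched.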

If an instruction that would otherwise be allowed to commit causes an exception, indicated by a non-zero value in the exc field of its instruction-queue entry, the machine reacts just like a branch-mispredict event: the program counter is redirected; the exceptional instruction’s program counter, held in the instruction-queue entry, is saved in a hardware register; the pipeline is flushed from the exceptional instruction to just before tail0; and execution begins with the first instruction in the exception handler.
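The exception sequence can be sketched in a few lines. HANDLER_PC is a hypothetical address, since the text does not say where the handler lives, and the sketch assumes that all older instructions have already committed and left the queue, so the excepting instruction is at the head.

```python
from typing import List, Optional, Tuple

HANDLER_PC = 0x0010  # hypothetical handler address, not taken from the text


def handle_head_exception(queue: List[dict], fetch_pc: int) -> Tuple[int, Optional[int]]:
    """Return (new_fetch_pc, saved_pc); saved_pc is None if no exception is taken."""
    if not queue or queue[0]["exc"] == 0:
        return fetch_pc, None
    saved_pc = queue[0]["pc"]  # kept in a hardware register for the handler
    queue.clear()              # flush from the excepting entry through the tail
    return HANDLER_PC, saved_pc
```

The return value stands in for the redirected program counter and the saved-PC register; a real design would also invalidate the fetch buffers, which this sketch does not model.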
