Speculative Execution Support
Introduction: Memory loads are one type of instruction that can benefit greatly from speculative execution. Memory loads are quite common, of course. They have relatively long execution latencies, addresses used in the loads are commonly available in advance, and the result can be stored in a new temporary variable without destroying the value of any other variable. Unfortunately, memory loads can raise exceptions if their addresses are illegal, so speculatively accessing illegal addresses may cause a correct program to halt unexpectedly. Besides, mispredicted memory loads can cause extra cache misses and page faults, which are extremely costly.
Example: In the fragment
i f (p != n u l l)
q = *P;
dereferencing p speculatively will cause this correct program to halt in error if p is null .
Many high-performance processors provide special features to support speculative memory accesses. We mention the most important ones next.
Prefetching: The prefetch instruction was invented to bring data from memory to the cachebefore it is used. A prefetch instruction indicates to the processor that theprogram is likely to use a particular memory word in the near future. If thelocation specified is invalid or if accessing it causes a page fault, the processorcan simply ignore the operation. Otherwise, the processor will bring the datafrom memory to the cache if it is not already there.
Poison Bits: Another architectural feature called poison bits was invented to allow speculativeload of data from memory into the register file. Each register on themachine is augmented with a poison bit. If illegal memory is accessed or theaccessed page is not in memory, the processor does not raise the exception immediatelybut instead just sets the poison bit of the destination register. Anexception is raised only if the contents of the register with a marked poison bitare used.
Predicated Execution: Because branches are expensive, and mispredicted branches are even more so(see Section 10.1), predicated instructions were invented to reduce the numberof branches in a program. A predicated instruction is like a normal instructionbut has an extra predicate operand to guard its execution; the instruction isexecuted only if the predicate is found to be true.
As an example, a conditional move instruction CMOVZ R2,R3,R1 has the semantics that the contents of register R3 are moved to register R2 only if register Rl is zero. Code such as
i f (a == 0)
b = c d;
can be implemented with two machine instructions, assuming that a, b, c, and d are allocated to registers Rl, R2, R4, R5, respectively, as follows:
ADD R3, R4, R5
CMOVZ R2, R3, Rl
This conversion replaces a series of instructions sharing a control dependence with instructions sharing only data dependences. These instructions can then be combined with adjacent basic blocks to create a larger basic block. More importantly, with this code, the processor does not have a chance to mispredict, thus guaranteeing that the instruction pipeline will run smoothly. Predicated execution does come with a cost.
Predicated instructions are fetched and decoded, even though they may not be executed in the end. Static schedulers must reserve all the resources needed for their execution and ensure that all the potential data dependences are satisfied. Predicated execution should not be used aggressively unless the machine has many more resources than can possibly be used otherwise.