Partial-Redundancy Elimination
Introduction: In this section, we consider in detail how to minimize the number of expression evaluations. That is, we want to consider all possible execution sequences in a flow graph, and look at the number of times an expression such as x y is evaluated. By moving around the places where x y is evaluated and keeping the result in a temporary variable when necessary, we often can reduce the number of evaluations of this expression along many of the execution paths, while not increasing that number along any path. Note that the number of different places in the flow graph where x y is evaluated may increase, but that is relatively unimportant, as long as the number of evaluations of the expression x y is reduced.
Applying the code transformation developed here improves the performance of the resulting code, since, as we shall see, an operation is never applied unless it absolutely has to be. Every optimizing compiler implements something like the transformation described here, even if it uses a less "aggressive" algorithm than the one of this section. However, there is another motivation for discussing the problem. Finding the right place or places in the flow graph at which to evaluate each expression requires four different kinds of data-flow analyses. Thus, the study of "partial-redundancy elimination," as minimizing the number of expression evaluations is called, will enhance our understanding of the role data-flow analysis plays in a compiler.
Redundancy in programs exists in several forms. As discussed in Section 9.1.4, it may exist in the form of common subexpressions, where several evaluations of the expression produce the same value. It may also exist in the form of a loop-invariant expression that evaluates to the same value in every iteration of the loop. Redundancy may also be partial, if it is found along some of the paths, but not necessarily along all paths. Common subexpressions and loop invariant expressions can be viewed as special cases of partial redundancy; thus a single partial-redundancy-elimination algorithm can be devised to eliminate all the various forms of redundancy.
In the following, we first discuss the different forms of redundancy, in order to build up our intuition about the problem. We then describe the generalized redundancy-elimination problem, and finally we present the algorithm. This algorithm is particularly interesting, because it involves solving multiple dataflow problems, in both the forward and backward directions.
The Sources of Redundancy: Figure 9.30 illustrates the three forms of redundancy: common subexpressions,loop-invariant expressions, and partially redundant expressions. The figureshows the code both before and after each optimization.
Global Common Subexpressions: In Fig. 9.30(a), the expression b c computed in block B4 is redundant; it has already been evaluated by the time the flow of control reaches B4 regardless of the path tsEken to get there. As we observe in this example, the value of the expression may be different on different paths. We can optimize the code by storing the result of the computations of b c in blocks B2 and B3 in the same temporary variable, say t, and then assigning the value of t to the variable e in block B4, instead of reevaluating the expression. Had there been an assignment to either b or c after the last computation of 6 c but before block B4, the expression in block B4 would not be redundant.
Formally, we say that an expression b c is (fully) redundant at point p, if it is an available expression, in the sense of Section 9.2.6, at that point. That is, the expression b c have been computed along all paths reaching p, and the variables b and c were not redefined after the last expression was evaluated. The latter condition is necessary, because even though the expression b c is textually executed before reaching the point p, the value of b c computed at point p would have been different, because the operands might have changed.
Loop-Invariant Expressions: Fig. 9.30(b) shows an example of a loop-invariant expression. The expressionb c is loop invariant assuming neither the variable b nor c is redefined withinthe loop. We can optimize the program by replacing all the re-executions ina loop by a single calculation outside the loop. We assign the computation toa temporary variable, say t, and then replace the expression in the loop by t.There is one more point we need to consider when performing "code motion"optimizations such as this. We should not execute any instruction that wouldnot have executed without the optimization. For example, if it is possible toexit the loop without executing the loop-invariant instruction at all, then weshould not move the instruction out of the loop. There are two reasons.