Meaning of a Data-Flow Solution
Introduction: We now know that the solution found using the iterative algorithm is the maximum fixedpoint, but what does the result represent from a program-semantics point of view? To understand the solution of a data-flow framework (D, F, V, A), let us first describe what an ideal solution to the framework would be. We show that the ideal cannot be obtained in general, but that Algorithm 9.25 approximates the ideal conservatively.
The Ideal Solution: Without loss of generality, we shall assume for now that the data-flow frameworkof interest is a forward-flowing problem. Consider the entry point of a basicblock B. The ideal solution begins by finding all the possible execution pathsleading from the program entry to the beginning of B. A path is "possible"only if there is some computation of the program that follows exactly that path.
The ideal solution would then compute the data-flow value at the end of each possible path and apply the meet operator to these values to find their greatest lower bound. Then no execution of the program can produce a smaller value for that program point. In addition, the bound is tight; there is no greater data-flow value that is a gib for the value computed along every possible path to B in the flow graph.
We now try to define the ideal solution more formally. For each block B in a flow graph, let fB be the transfer function for B. Consider any path
P — ENTRY ->• B\ ->• B2 -»• • • • -> B k - l ->• Bk
from the initial node ENTRY to some block Bk. The program path may have cycles, so one basic block may appear several times on the path P. Define the transfer function for P, / p , to be the composition of fBl ,fB2--- > /B«,_I Note that / B H is not part of the composition, reflecting the fact that this path is taken to reach the beginning of block Bk, not its end. The data-flow value created by executing this path is thus /P(VENTRY) , where GENTRY is the result of the constant transfer function representing the initial node ENTRY. The ideal result for block B is thus
We claim that, in terms of the lattice-theoretic partial order < for the framework in question,
• Any answer that is greater than IDEAL is incorrect.
• Any value smaller than or equal to the ideal is conservative, i.e., safe.
Intuitively, the closer the value to the ideal the more precise it is.8 To see why solutions must be < the ideal solution, note that any solution greater than IDEAL for any block could be obtained by ignoring some execution path that the program could take, and we cannot be sure that there is not some effect along that path to invalidate any program improvement we might make based on the greater solution. Conversely, any solution less than IDEAL can be viewed as including certain paths that either do not exist in the flow graph, or that exist but that the program can never follow. This lesser solution will allow only transformations that are correct for all possible executions of the program, but may forbid some transformations that IDEAL would permit.
The Meet-Over-Paths Solution: However, as discussed in Section 9.1, finding all possible execution paths isundecidable. We must therefore approximate. In the data-flow abstraction, weassume that every path in the flow graph can be taken. Thus, we can definethe meet-over-paths solution for B to be
Note that, as for IDEAL, the solution MOP[B] gives values for IN[B] in forwardflow frameworks. If we were to consider backward-flow frameworks, then we would think of MOP [B] as a value for OUT[B].
The paths considered in the MOP solution are a superset of all the paths that are possibly executed. Thus, the MOP solution meets together not only the data-flow values of all the executable paths, but also additional values associated with the paths that cannot possibly be executed. Taking the meet of the ideal solution plus additional terms cannot create a solution larger than the ideal. Thus, for all B we have MOP[B] < IDEAL[B], and we will simply say that MOP < IDEAL.
The Maximum Fixedpoint versus the MOP Solution: Notice that in the M O P solution, the number of paths considered is still unboundedif the flow graph contains cycles. Thus, the M O P definition does notlend itself to a direct algorithm. The iterative algorithm certainly does not firstfind all the paths leading to a basic block before applying the meet operator.Rather,
1. The iterative algorithm visits basic blocks, not necessarily in the order of execution.
2. At each confluence point, the algorithm applies the meet operator to the data-flow values obtained so far. Some of these values used were introduced artificially in the initialization process, not representing the result of any execution from the beginning of the program.