The Dynamic Programming Algorithm
Introduction: The dynamic programming algorithm proceeds in three phases (suppose the target machine has r registers):
1. Compute bottom-up for each node n of the expression tree T an array C of costs, in which the zth component C[i] is the optimal cost of computing the subtree S rooted at n into a register, assuming i registers are available for the computation, for 1 < i < r.
2. Traverse T, using the cost vectors to determine which subtrees of T must be computed into memory.
3. Traverse each tree using the cost vectors and associated instructions to generate the final target code. The code for the subtrees computed into memory locations is generated first.
Each of these phases can be implemented to run in time linearly proportional to the size of the expression tree.
The cost of computing a node n includes whatever loads and stores are necessary to evaluate S in the given number of registers. It also includes the cost of computing the operator at the root of S. The zeroth component of the cost vector is the optimal cost of computing the subtree S into memory. The contiguous evaluation property ensures that an optimal program for S can be generated by considering combinations of optimal programs only for the subtrees of the root of S. This restriction reduces the number of cases that need to be considered.
In order to compute the costs C[i] at node n, we view the instructions as tree-rewriting rules, as in Section 8.9. Consider each template E that matches the input tree at node n. By examining the cost vectors at the corresponding descendants of n, determine the costs of evaluating the operands at the leaves of E. For those operands of E that are registers, consider all possible orders in which the corresponding subtrees of T can be evaluated into registers. In each ordering, the first subtree corresponding to a register operand can be evaluated using i available registers, the second using i -1 registers, and so on. To account for node n, add in the cost of the instruction associated with the template E. The value C[i] is then the minimum cost over all possible orders.
The cost vectors for the entire tree T can be computed bottom up in time linearly proportional to the number of nodes in T. It is convenient to store at each node the instruction used to achieve the best cost for C[i] for each value of i. The smallest cost in the vector for the root of T gives the minimum cost of evaluating T.
Example: Consider a machine having two registers RO and Rl, and the following instructions, each of unit cost:
In these instructions, Ri is either RO or Rl, and Mi is a memory location. The operator op corresponds to arithmetic operators.
Let us apply the dynamic programming algorithm to generate optimal code for the syntax tree in Fig 8.26. In the first phase, we compute the cost vectors shown at each node. To illustrate this cost computation, consider the cost vector at the leaf a. C[0], the cost of computing a into memory, is 0 since it is already there. C[l], the cost of computing a into a register, is 1 since we can load it into a register with the instruction LD RO, a. C[2], the cost of loading a into a register with two registers available, is the same as that with one register available. The cost vector at leaf a is therefore (0,1,1).
Consider the cost vector at the root. We first determine the minimum cost of computing the root with one and two registers available. The machine instruction ADD RO, RO, M matches the root, because the root is labeled with the operator . Using this instruction, the minimum cost of evaluating the root with one register available is the minimum cost of computing its right subtree into memory, plus the minimum cost of computing its left subtree into the register, plus 1 for the instruction. No other way exists. The cost vectors at the right and left children of the root show that the minimum cost of computing the root with one register available is 5 2 1 = 8.
Now consider the minimum cost of evaluating the root with two registers available. Three cases arise depending on which instruction is used to compute the root and in what order the left and right subtrees of the root are evaluated.
Dynamic programming techniques have been used in a number of compilers, including the second version of the portable C compiler, PCC2. The technique facilitates retargeting because of the applicability of the dynamic programming technique to a broad class of machines.