Optimal Code Generation for Expressions
Introduction: We can choose registers optimally when a basic block consists of a single expression evaluation, or if we accept that it is sufficient to generate code for a block one expression at a time. In the following algorithm, we introduce a numbering scheme for the nodes of an expression tree (a syntax tree for an expression) that allows us to generate optimal code for an expression tree when there is a fixed number of registers with which to evaluate the expression.
We begin by assigning to the nodes of an expression tree a number that tells how many registers are needed to evaluate that node without storing any temporaries. These numbers are sometimes called Ershov numbers, after A. Ershov, who used a similar scheme for machines with a single arithmetic register. For our machine model, the rules are:
1. Label any leaf 1.
2. The label of an interior node with one child is the label of its child.
3. The label of an interior node with two children is
a) The larger of the labels of its children, if those labels are different.
b) One plus the label of its children if the labels are the same.
Example: In Fig. 8.23 we see an expression tree (with operators omitted) that might be the tree for expression (a — b) e x (c d) or the three-address code:
t l = a - b
t 2 = c d
t 3 = e * t2
t 4 = t l t3
Each of the five leaves is labeled 1 by rule (1). Then, we can label the interior node for tl = a - b , since both of its children are labeled. Rule (3b) applies, so it gets label one more than the labels of its children, that is, 2. The same holds for the interior node for t2 = c d.
Now, we can work on the node for t3 = e * t 2 . Its children have labels 1 and 2, so the label of the node for t3 is the maximum, 2, by rule (3a). Finally, the root, the node for t4 = tl t 3 , has two children with label 2, and therefore it gets label 3.
Generating Code from Labeled Expression Trees : It can be proved that, in our machine model, where all operands must be in registers, and registers can be used by both an operand and the result of anoperation, the label of a node is the fewest registers with which the expressioncan be evaluated using no stores of temporary results. Since in this model, weare forced to load each operand, and we are forced to compute the result correspondingto each interior node, the only thing that can make the generatedcode inferior to the optimal code is if there are unnecessary stores of temporaries.
The argument for this claim is embedded in the following algorithm for generating code with no stores of temporaries, using a number of registers equal to the label of the root.
Algorithm:Generating code from a labeled expression tree
INPUT: A labeled tree with each operand appearing once (that is, no common subexpressions).
OUTPUT: An optimal sequence of machine instructions to evaluate the root into a register.
METHOD: The following is a recursive algorithm to generate the machine code. The steps below are applied, starting at the root of the tree. If the algorithm is applied to a node with label k, then only k registers will be used. However, there is a "base" 6 > 1 for the registers used so that the actual registers used are Rb,Rb i,- --Rb k-i- The result always appears in Rb k-i-
1. To generate machine code for an interior node with label k and two children with equal labels (which must be k — 1) do the following:
a. Recursively generate code for the right child, using base 6 1. The result of the right child appears in register Rb k.
b. Recursively generate code for the left child, using base 6; the result appears in Rb k-i-
c. Generate the instruction OP Rb k,Rb k-i,Rb k, where OP is the appropriate operation for the interior node in question.
2. Suppose we have an interior node with label k and children with unequal labels. Then one of the children, which we'll call the "big" child, has label k, and the other child, the "little" child, has some label m < k. Do the
3. following to generate code for this interior node, using base 6:
a. Recursively generate code for the big child, using base b; the result appears in register Rb k-i-
b. Recursively generate code for the small child, using base 6; the result appears in register Rb m-\. Note that since m < k, neither Rb k-I nor any higher-numbered register is used.
c. Generate the instruction OP Rb k-i, Rb m-i, Rb k-i or the instruction OP Rb k-i, Rb k-i, Rb m-i5 depending on whether the big child is the right or left child, respectively.
4. For a leaf representing operand x, if the base is 6 generate the instruction LD Rb, x.