Introduction: In three-address code, there is at most one operator on the right side of an instruction; that is, no built-up arithmetic expressions are permitted. Thus a source-language expression like x y*z might be translated into the sequence of three-address instructions
t i = y * z
t 2 = x ti
where ti and t2 are compiler-generated temporary names. This unraveling of multi-operator arithmetic expressions and of nested flow-of-control statements makes three-address code desirable for target-code generation and optimization, as discussed in Chapters 8 and 9. The use of names for the intermediate values computed by a program allows three-address code to be rearranged easily.
Example: Three-address code is a linearized representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph. The DAG in Fig. 6.3 is repeated in Fig. 6.8, together with a corresponding three-address code sequence.
Addresses and Instructions: Three-address code is built from two concepts: addresses and instructions. In object-oriented terms, these concepts correspond to classes, and the various kinds of addresses and instructions correspond to appropriate subclasses. Alternatively, three-address code can be implemented using records with fields for the addresses; records called quadruples and triples are discussed briefly in Section 6.2.2.
An address can be one of the following:
- A name. For convenience, we allow source-program names to appear as addresses in three-address code. In an implementation, a source name is replaced by a pointer to its symbol-table entry, where all information about the name is kept.
- A constant. In practice, a compiler must deal with many different types of constants and variables. Type conversions within expressions are considered in Section 6.5.2.
- A compiler-generated temporary. It is useful, especially in optimizing compilers, to create a distinct name each time a temporary is needed. These temporaries can be combined, if possible, when registers are allocated to variables.
We now consider the common three-address instructions used in the rest of this book. Symbolic labels will be used by instructions that alter the flow of control. A symbolic label represents the index of a three-address instruction in the sequence of instructions. Actual indexes can be substituted for the labels, either by making a separate pass or by "back patching," discussed in Section 6.7.
Here is a list of the common three-address instruction forms:
1. Assignment instructions of the form x = y op z, where op is a binary arithmetic or logical operation, and x, y, and z are addresses.
2.Assignments of the form x = op y, where op is a unary operation. Essential unary operations include unary minus, logical negation, shift operators, and conversion operators that, for example, convert an integer to a floating-point number.
3. Copy instructions of the form x = y, where x is assigned the value of y.
4. An unconditional jump goto L. The three-address instruction with label L is the next to be executed.
5. Conditional jumps of the form if x goto L and if F a l s e x goto L. These instructions execute the instruction with label L next if x is true and false, respectively. Otherwise, the following three-address instruction in sequence is executed next, as usual.
6. Conditional jumps such as if x relop y goto L, which apply a relational operator (<, ==, >=, etc.) to x and y, and execute the instruction with label L next if x stands in relation relop to y. If not, the three-address instruction following if x relop y goto L is executed next, in sequence.
7. Procedure calls and returns are implemented using the following instructions: param x for parameters; call p , n and y = call p , n for procedure and function calls, respectively; and return y, where y, representing a returned value, is optional. Their typical use is as the sequence of three address instructions
call p, n
Generated as part of a call of the procedure p(x\,x2, . . . ,xn). The integer n, indicating the number of actual parameters in " call p, n," is not redundant because calls can be nested. That is, some of the first param statements could be parameters of a call that comes after p returns
- its value; that value becomes another parameter of the later call.
8. Indexed copy instructions of the form x = y[il and xlil = y. The instruction x = y lil sets x to the value in the location i memory units beyond location y. The instruction x[i"] =y sets the contents of the location I units beyond x to the value of y.
9. Address and pointer assignments of the form x = &y, x = * y, and * x - y. The instruction x = ky sets the r-value of x to be the location (/-value) of y? Presumably y is a name, perhaps a temporary, that denotes an expression with an /-value such as A[i] [ j ] , and # is a pointer name or temporary. In the instruction x = *y, presumably y is a pointer or a temporary whose r-value is a location. The r-value of x is made equal to the contents of that location. Finally, * x = y sets the r-value of the object pointed to by x to the r-value of y.