The Code-Generation Algorithm
Introduction: An essential part of the algorithm is a function getReg(I), which selects registers for each memory location associated with the three-address instruction I. Function getReg has access to the register and address descriptors for all the variables of the basic block, and may also have access to certain useful data-flow information such as the variables that are live on exit from the block. We shall discuss getReg after presenting the basic algorithm. While we do not know the total number of registers available for local data belonging to a basic block, we assume that there are enough registers so that, after freeing all available registers by storing their values in memory^ there are enough registers to accomplish any three-address operation.
In a three-address instruction such as x = y z, we shall treat as a generic operator and ADD as the equivalent machine instruction. We do not, therefore, take advantage of commutatively of . Thus, when we implement the operation, the value of y must be in the second register mentioned in the ADD instruction, never the third. A possible improvement to the algorithm is to generate code for both x = y z and x = z y whenever is a commutative operator, and pick the better code sequence.
Machine Instructions for Operations: For a three-address instruction such as x = y z, do the following:
1. Use getReg(x = y z) to select registers for x, y, and z. Call these Rx, Ry, and Rz.
2. If y is not in Ry (according to the register descriptor for Ry), then issue an instruction LD Ry,y', where y' is one of the memory locations for y (according to the address descriptor for y).
3. Similarly, if z is not in Rz, issue and instruction LD Rz,z', where z' is a location for z.
4. Issue the instruction ADD Rx,Ry,Rz.
Machine Instructions for Copy Statements
There is an important special case: a three-address copy statement of the form x = y. We assume that getReg will always choose the same register for both x and y. If y is not already in that register Ry, then generate the machine instruction LD Ry, y. If y was already in Ry, we do nothing. It is only necessary that we adjust the register description for Ry so that it includes x as one of the values found there.
Ending the Basic Block : As we have described the algorithm, variables used by the block may wind upwith their only location being a register. If the variable is a temporary usedonly within the block, that is fine; when the block ends, we can forget aboutthe value of the temporary and assume its register is empty. However, if thevariable is live on exit from the block, or if we don't know which variables arelive on exit, then we need to assume that the value of the variable is neededlater. In that case, for each variable x whose location descriptor does not saythat its value is located in the memory location for x, we must generate theinstruction ST x, R, where R is a register in which #'s value exists at the end ofthe block.
Managing Register and Address Descriptors: As the code-generation algorithm issues load, store, and other machine instructions,it needs to update the register and address descriptors. The rules are asfollows:
1. For the instruction LD R:x
(a) Change the register descriptor for register R so it holds only x.
(b) Change the address descriptor for x by adding register R as an additional location.
2. For the instruction ST x, R, change the address descriptor for x to include its own memory location.
3. For an operation such as ADD Rx,Ry,Rz implementing a three-address instruction x — y z
(a) Change the register descriptor for Rx so that it holds only x.
(b) Change the address descriptor for x so that its only location is Rx. Note that the memory location for x is not now in the address descriptor for x.
(c) Remove Rx from the address descriptor of any variable other than x.
4. When we process a copy statement x = y, after generating the load for y into register Ry, if needed, and after managing descriptors as for all load statements (per rule 1):
(a) Add x to the register descriptor for Ry.
(b) Change the address descriptor for x so that its only location is Ry.