Register Allocation and Code Generation
Introduction: Let us begin by discussing register allocation for the software-pipelined loop in Example 10.14.
Example: In Example, the result of the multiply operation in the first iteration is produced at clock 3 and used at clock 6. Between these clock cycles, a new result is generated by the multiply operation in the second iteration at clock 5; this value is used at clock 8. The results from these two iterations must be held in different registers to prevent them from interfering with each other. Since interference occurs only between adjacent pairs of iterations, it can be avoided with the use of two registers, one for the odd iterations and one for the even iterations. Since the code for odd iterations is different from that for the even iterations, the size of the steady-state loop is doubled. This code can be used to execute any loop that has an odd number of iterations greater than or equal to 5.
if (N >= 5)
N2 = 3 2 * floor((N-3)/2);
N2 = 0;
for (i = 0; i < N2; i )
D[i] = A[i]* B[i] c;
for (i = N2; i < N; i )
D[i] = A[i]* B[i] c;
Figure 10.21: Source-level unrolling of the loop from Example 10.12
To handle loops that have fewer than 5 iterations and loops with an even number of iterations, we generate the code whose source-level equivalent is shown in Fig. 10.21. The first loop is pipelined, as seen in the machine-level equivalent of Fig. 10.22. The second loop of Fig. 10.21 need not be optimized, since it can iterate at most four times.
Do-Across Loops: Software pipelining can also be applied to loops whose iterations share data dependences. Such loops are known as do-across loops.
Goals and Constraints of Software Pipelining: The primary goal of software pipelining is to maximize the throughput of along-running loop. A secondary goal is to keep the size of the code generatedreasonably small. In other words, the software-pipelined loop should have asmall steady state of the pipeline. We can achieve a small steady state byrequiring that the relative schedule of each iteration be the same, and that theiterations be initiated at a constant interval. Since the throughput of the loop issimply the inverse of the initiation interval, the objective of software pipeliningis to minimize this interval.
A software-pipeline schedule for a data-dependence graph G — (N, E) can be specified by
1. An initiation interval T and
2. A relative schedule 5" that specifies, for each operation, when that operation is executed relative to the start of the iteration to which it belongs.
Thus, an operation n in the ith iteration, counting from 0, is executed at clock i x T S(n). Like all the other scheduling problems, software pipelining has two kinds of constraints: resources and data dependences. We discuss each kind in detail below.
Modular Resource Reservation: Let a machine's resources be represented by R = [ri,r2, • . . ] , where ri is thenumber of units of the ith kind of resource available. If an iteration of a looprequires ni units of resource i, then the average initiation interval of a pipelinedloop is at least m.axi(rii/ri) clock cycles. Software pipelining requires that theinitiation intervals between any pair of iterations have a constant value. Thus,the initiation interval must have at least maxi\ni/ri] clocks. If max^(rii/ri) isless than 1, it is useful to unroll the source code a small number of times.