Importance of Interprocedural Analysis
Introduction: Given how hard interprocedural analysis is, let us now address the important problem of why and when we wish to use interprocedural analysis. Although we used constant propagation to illustrate interprocedural analysis, this interprocedural optimization is neither readily applicable nor particularly beneficial when it does occur. Most of the benefits of constant propagation can be obtained simply by performing intraprocedural analysis and inlining procedure calls of the most frequently executed sections of code.
However, there are many reasons why interprocedural analysis is essential. Below, we describe several important applications of interprocedural analysis.
Virtual Method Invocation: As mentioned above, object-oriented programs have many small methods. If we only optimize one method at a time, then there are few opportunities for optimization. Resolving method invocation enables optimization. A language like Java dynamically loads its classes. As a result, we do not know at compile time to which of (perhaps) many methods named m a use of "m" refers in an invocation such as x.mQ.
Many Java implementations use a just-in-time compiler to compile its bytecodes at run time. One common optimization is to profile the execution and determine which the common receiver types are. We can then inline the methods that are most frequently invoked. The code includes a dynamic check on the type and executes the inlined methods if the run-time object has the expected type.
Another approach to resolving uses of a method name m is possible as long as all the source code is available at compile time. Then, it is possible to perform an interprocedural analysis to determine the object types. If the type for a variable x turns out to be unique, then a use of x.mQ can be resolved. We know exactly what method m refers to in this context. In that case, we can in-line the code for this m, and the compiler does not even have to include a test for the type of x.
Pointer Alias Analysis: Even if we do not wish to perform interprocedural versions of the common dataflowanalyses like reaching definitions, these analyses can in fact benefit frominterprocedural pointer analysis. All the analyses presented in Chapter 9 applyonly to local scalar variables that cannot have aliases. However, use of pointersis common, especially in languages like C.
Example: Consider the following sequence of three statements, which might form a basic block:
Without knowing if p and q can point to the same location — that is, whether they can be aliases — we cannot conclude that x is equal to 1 at the end of the block.
Parallelization: The most effective way to parallelize an application is to find the coarsest granularity of parallelism, such as that found in the outermost loops of a program. For this task, interprocedural analysis is of great importance. There is a significant difference between scalar optimizations (those based on values of simple variables, as discussed in Chapter 9) and parallelization. In parallelization, just one spurious data dependence can render an entire loop not parallelizable, and greatly reduce the effectiveness of the optimization. Such amplification of inaccuracies is not seen in scalar optimizations. In scalar optimization, we only need to find the majority of the optimization opportunities. Missing one opportunity or two seldom makes much of a difference.
Detection of Software Errors and Vulnerabilities: Interprocedural analysis is not only important for optimizing code. The sametechniques can be used to analyze existing software for many kinds of codingerrors. These errors can render software unreliable; coding errors that hackerscan exploit to take control of, or otherwise damage, a computer system canpose significant security vulnerability risks.
Static analysis is useful in detecting occurrences of many common error patterns. For example, a data item must be guarded by a lock. As another example, disabling an interrupt in the operating system must be followed by a re-enabling of the interrupt. Since a significant source of errors is the inconsistencies that span procedure boundaries, interprocedural analysis is of great importance. PREfix and Metal are two practical tools that use interprocedural analysis effectively to find many programming errors in large programs. Such tools find errors statically and can improve software reliability greatly. However, these tools are both incomplete and unsound, in the sense that they may not find all errors, and not all reported warnings are real errors. Unfortunately, the interprocedural analysis used is sufficiently imprecise that, were the tools to report all potential errors, the large number of false warnings would render the tools unusable. Nevertheless, even though these tools are not perfect, their systematic use has been shown to greatly improve software reliability.
When it comes to security vulnerabilities, it is highly desirable that we find all the potential errors in a program. In 2006, two of the "most popular" forms of intrusions used by hackers to compromise a system were
1. Lack of input validation on Web applications: SQL injection is one of the most popular forms of such vulnerability whereby hackers gain control of a database by manipulating inputs accepted by web applications.
2. Buffer overflows in C and C programs. Because C and C do not check if accesses to arrays are in bounds, hackers can write well-crafted strings into unintended areas and hence gain control of the program's execution.
In the next section, we shall discuss how we can use interprocedural analysis to protect programs against such vulnerabilities.