## Datalog Implementation by BDD's

**Introduction:** Binary Decision Diagrams (BDD's) are a method for representing boolean functions by graphs. Since there are 22 " boolean functions of n variables, no representation method is going to be very succinct on all boolean functions. However, the boolean functions that appear in practice tend to have a lot of regularity. It is thus common that one can find a succinct BDD for functions that one really wants to represent.

It turns out that the boolean functions that are described by the Datalog programs that we have developed to analyze programs are no exception. While succinct BDD's representing information about a program often must be found using heuristics and/or techniques used in commercial BDD-manipulating packages, the BDD approach has been quite successful in practice. In particular, it outperforms methods based on conventional database-management systems, because the latter are designed for the more irregular data patterns that appear in typical commercial data.

It is beyond the scope of this book to cover all of the BDD technology that has been developed over the years. We shall here introduce you to the BDD notation. We then suggest how one represents relational data as BDD's and how one could manipulate BDD's to reflect the operations that are performed to execute Datalog programs by algorithms such as Algorithm 12.18. Finally, we describe how to represent the exponentially many contexts in BDD's, the key to the success of the use of BDD's in context-sensitive analysis.

**Binary Decision Diagrams: **A BDD represents a boolean function by a rooted DAG. The interior nodes ofthe DAG are each labeled by one of the variables of the represented function.At the bottom are two leaves, one labeled 0 the other labeled 1. Each interiornode has two edges to children; these edges are called "low" and "high." Thelow edge is associated with the case that the variable at the node has value 0,and the high edge is associated with the case where the variable has value 1.

Given a truth assignment for the variables, we can start at the root, and at each node, say a node labeled x, follow the low or high edge, depending on whether the truth value for x is 0 or 1, respectively. If we arrive at the leaf labeled 1, then the represented function is true for this truth assignment; otherwise it is false.

**Transformations on BDD's: **We alluded, in the discussion above, to two simplifications on BDD's that helpmake them more succinct:

1. Short-Circuiting: If a node N has both its high and low edges go to the same node M, then we may eliminate N. Edges entering N go to M instead.

2. Node-Merging: If two nodes N and M have low edges that go to the same node and also have high edges that go to the same node, then we may merge N with M. Edges entering either N or M go to the merged node.

It is also possible to run these transformations in the opposite direction. In particular, we can introduce a node along an edge from N to M. Both edges from the introduced node go to M, and the edge from TV" now goes to the introduced node. Note, however, that the variable assigned to the new node must be one of those that lies between the variables of N and M in the order. Figure 12.32 shows the two transformations schematically.

**Representing Relations by BDD's: **The relations with which we have been dealing have components that are takenfrom "domains." A domain for a component of a relation is the set of possiblevalues that tuples can have in that component. For example, the relationpts(V, H) has the domain of all program variables for its first component andthe domain of all object-creating statements for the second component. If adomain has more than 2 n _ 1 possible values but no more than 2n values, thenit requires n bits or boolean variables to represent values in that domain.

A tuple in a relation may thus be viewed as a truth assignment to the variables that represent values in the domains for each of the components of the tuple. We may see a relation as a boolean function that returns the value true for all and only those truth assignments that represent tuples in the relation. An example should make these ideas clear.

**Example:** Consider a relation r(A, B) such that the domains of both A and B are {a, b, c, d}. We shall encode a by bits 00, b by 01, c by 10, and d by 11. Let the tuples of relation r be:

Let us use boolean variables wx to encode the first (A) component and variables yz to encode the second (B) component. Then the relation r becomes: