Execution of Datalog Programs
Introduction: Every set of Datalog rules defines relations for its IDB predicates, as a function of the relations that are given for its EDB predicates. Start with the assumption that the IDB relations are empty (i.e., the IDB predicates are false for all possible arguments). Then, repeatedly apply the rules, inferring new facts whenever the rules require us to do so. When the process converges, we are done, aijid the resulting IDB relations form the output of the program. This process is formalized in the next algorithm, which is similar to the iterative algorithms discussed in Chapter 9.
Algorithm: Simple evaluation of Datalog programs.
INPUT: A Datalog program and sets of facts for each EDB predicate.
OUTPUT: Sets of facts for each IDB predicate.
METHOD: For each predicate p in the program, let Rp be the relation of facts that are true for that predicate. If p is an EDB predicate, then Rp is the set of facts given for that predicate. If p is an IDB predicate, we shall compute Rp. Execute the algorithm in Fig. 12.16.
Example: The program in Example 12.12 computes paths in a graph. To apply Algorithm 12.15, we start with EDB predicate edge holding all the edges of the graph and with the relation for path empty. On the first round, rule (2) Vields nothing, since there are no path facts. But rule (1) causes all the edge facp to become path facts as well. That is, after the first round, we know path(a, 6) if and only if there is an edge from a to 6.
On the second round, rule (1) yields no new paths facts, because the EDB relation edge never changes. However, now rule (2) lets us put together two paths of length 1 to make paths of length 2. That is, after the second round, path(a, b) is true if and only if there is a path of length 1 or 2 from a to b. Similarly, on the third round, we can combine paths of length 2 or less to discover all paths of length 4 or less. On the fourth round, we discover paths of length up to to 8, and in general, after the ith round, path(a, b) is true if and only if there is a path from a to 6 of length 2l~1 or less.
Incremental Evaluation of Datalog Programs: There is an efficiency enhancement of Algorithm 12.15 possible. Observe that anew IDB fact can only be discovered on round i if it is the result of substitutingconstants in a rule, such that at least one of the subgoals becomes a fact thatwas just discovered on round i — The proof of that claim is that if all the factsamong the subgoals were known at round i — 2, then the "new" fact would havebeen discovered when we made the same substitution of constants on round
To take advantage of this observation, introduce for each IDB predicate p a predicate newP that will hold only the newly discovered p-facts from the previous round. Each rule that has One or more IDB predicates among its subgoals is replaced by a collection of rules. Each rule in the collection is formed by replacing exactly one occurrence of some IDB predicate q in the body by newQ. Finally, for all rules, we replace the head predicate h by newH. The resulting rules are said to be in incremental form.
The relations for each IDB predicate p accumulates all the p-facts, as in Algorithm 12.15. In one round, we
1. Apply the rules to evaluate the newP predicates.
2. Then, subtract p from newP, to make sure the facts in newP are truly new.
3. Add the facts in newP to p.
4. Set all the newX relations to 0 for the next round.
These ideas will be formalized in Algorithm 12.18. However, first, we shall give an example.
Example: The incremental form of the rules is given in Fig. 12.17. Rule (1) does not change, except in the head because it has no IDB subgoals in the body. However, rule (2), with two IDB subgoals, becomes two different rules. In each rule, one of the occurrences oipath in the body is replaced by newPath. Together, these rules enforce the idea that at least one of the two paths concatenated by the rule must have been discovered on the previous round.
Algorithm:Incremental evaluation of Datalog programs.
INPUT:A Datalog program and sets of facts for each EDB predicate.
OUTPUT:Sets of facts for each IDB predicate.
METHOD:For each predicate p in the program, let Rp be the relation of facts that are true for that predicate. If p is an EDB predicate, then Rp is the set of facts given for that predicate. If p is an IDB predicate, we shall compute Rp. In addition, for each IDB predicate p, let Rnewp be a relation of "new" facts for predicate p.
1. Modify the rules into the incremental form described above.
2. Execute the algorithm in Fig. 12.18.