Algorithm for Unification
Introduction: Unification is the problem of determining whether two expressions s and t can be made identical by substituting expressions for the variables in s and t. Testing equality of expressions is a special case of unification; if s and t have constants but no variables, then s and t unify if and only if they are identical.
The unification algorithm in this section extends to graphs with cycles, so it can be used to test structural equivalence of circular types.
We shall implement a graph-theoretic formulation of unification, where types are represented by graphs. Type variables are represented by leaves and type constructors are represented by interior nodes. Nodes are grouped into equivalence classes; if two nodes are in the same equivalence class, then the type expressions they represent must unify. Thus, all interior nodes in the same class must be for the same type constructor, and their corresponding children must be equivalent.
Example: Consider the two type expressions
((ai a2) x Ust(a3)) - Ust(ct2)
((as -»• 0:4) x list(a3)) -»• 0:5
The following substitution 5 is the most general unifier for these expressions
This substitution maps the two type expressions to the following expression
((ax -» 0:2) x /«si(cKi)) —y list(a2)
The two expressions are represented by the two nodes labeled -»: 1 in Fig. 6.31. The integers at the nodes indicate the equivalence classes that the nodes belong to after the nodes numbered 1 are unified.
Algorithm: Unification of a pair of nodes in a type graph.
INPUT: A graph representing a type and a pair of nodes m and n to be unified.
OUTPUT: Boolean value true if the expressions represented by the nodes m and n unify; false, otherwise.
METHOD: A node is implemented by a record with fields for a binary operator and pointers to the left and right children. The sets of equivalent nodes are maintained using the set field. One node in each equivalence class is chosen to be the unique representative of the equivalence class by making its set field contains a null pointer. The set fields of the remaining nodes in the equivalence class will point (possibly indirectly through other nodes in the set) to the representative. Initially, each node n is in an equivalence class by itself, with n as its own representative node.