## Paths in the Graphs

**Introduction:** A path is a sequence of edges that begins at a vertex of a graph and travels from vertex to vertex along edges of the graph. As the path travels along its edges, it visits the vertices along this path, that is, the endpoints of these edges.

**DEFINITION 1** Let n be a nonnegative integer and G an undirected graph. A path of length n from u to v in G is a sequence of n edges e1, . . . , en of G for which there exists a sequence x0 = u, x1, . . . , xn−1, xn = v of vertices such that ei has, for i = 1, . . . , n, the endpoints xi−1 and xi . When the graph is simple, we denote this path by its vertex sequence x0, x1, . . . , xn (because listing these vertices uniquely determines the path). The path is a circuit if it begins and ends at the same vertex, that is, if u = v, and has length greater than zero. The path or circuit is said to pass through the vertices x1, x2, . . . , xn−1 or traverse the edges e1, e2, . . . , en.

A path or circuit is simple if it does not contain the same edge more than once. When it is not necessary to distinguish between multiple edges, we will denote a path e1, e2, . . . , en, where ei is associated with {xi−1, xi } for i = 1, 2, . . . , n by its vertex sequence x0, x1, . . . , xn. This notation identifies a path only as far as which vertices it passes through. Consequently, it does not specify a unique path when there is more than one path that passes through this sequence of vertices, which will happen if and only if there are multiple edges between some successive vertices in the list. Note that a path of length zero consists of a single vertex.

**Remark:** There is considerable variation of terminology concerning the concepts defined in Definition 1. For instance, in some books, the term walk is used instead of path, where a walk is defined to be an alternating sequence of vertices and edges of a graph, v0, e1, v1, e2, . . . , vn−1, en, vn, where vi−1 and vi are the endpoints of ei for i = 1, 2, . . . , n. When this terminology is used, closed walk is used instead of circuit to indicate a walk that begins and ends at the same vertex, and trail is used to denote a walk that has no repeated edge (replacing the term simple path). When this terminology is used, the terminology path is often used for a trail with no repeated vertices, conflicting with the terminology in Definition 1. Because of this variation in terminology, you will need to make sure which set of definitions are used in a particular book or article when you read about traversing edges of a graph. The text [GrYe06] is a good reference for the alternative terminology described in this remark.

**EXAMPLE 1** In the simple graph shown in Figure 1, a, d, c, f , e is a simple path of length 4, because {a, d}, {d, c}, {c, f }, and {f, e} are all edges. However, d, e, c, a is not a path, because {e, c} is not an edge. Note that b, c, f , e, b is a circuit of length 4 because {b, c}, {c, f }, {f, e}, and {e, b} are edges, and this path begins and ends at b. The path a, b, e, d, a, b, which is of length 5, is not simple because it contains the edge {a, b} twice.

We now provide more general definitions.

**DEFINITION 2 **Let n be a nonnegative integer andGa directed graph.A path of length n from u to v inGis a sequence of edges e1, e2, . . . , en ofGsuch that e1 is associated with (x0, x1), e2 is associated with (x1, x2), and so on, with en associated with (xn−1, xn), where x0 = u and xn = v. When there are no multiple edges in the directed graph, this path is denoted by its vertex sequence x0, x1, x2, . . . , xn.A path of length greater than zero that begins and ends at the same vertex is called a circuit or cycle. A path or circuit is called simple if it does not contain the same edge more than once.

**Remark: **Terminology other than that given in Definition 2 is often used for the concepts defined there. In particular, the alternative terminology that uses walk, closed walk, trail, and path (described in the remarks following Definition 1) may be used for directed graphs. Note that the terminal vertex of an edge in a path is the initial vertex of the next edge in the path. When it is not necessary to distinguish between multiple edges, we will denote a path e1, e2, . . . , en, where ei is associated with (xi−1, xi) for i = 1, 2, . . . , n, by its vertex sequence x0, x1, . . . , xn. The notation identifies a path only as far as which the vertices it passes through.

There may be more than one path that passes through this sequence of vertices, which will happen if and only if there are multiple edges between two successive vertices in the list. Paths represent useful information in many graph models, as Examples 2–4 demonstrate.

**EXAMPLE 2** Paths in Acquaintanceship Graphs In an acquaintanceship graph there is a path between two people if there is a chain of people linking these people, where two people adjacent in the chain know one another. For example, in Figure 6 in Section 10.1, there is a chain of six people linking Kamini and Ching. Many social scientists have conjectured that almost every pair of people in the world are linked by a small chain of people, perhaps containing just five or fewer people. This would mean that almost every pair of vertices in the acquaintanceship graph containing all people in the world is linked by a path of length not exceeding four. The play Six Degrees of Separation by John Guare is based on this notion.

**EXAMPLE 3** Paths in Collaboration Graphs In a collaboration graph, two people a and b are connected by a path when there is a sequence of people starting with a and ending with b such that the endpoints of each edge in the path are people who have collaborated. We will consider two particular collaboration graphs here. First, in the academic collaboration graph of people who have written papers in mathematics, the Erd˝os number of a person m is the length of the shortest path between m and the extremely prolific mathematician Paul Erd˝os (who died in 1996). That is, the Erd˝os number of a mathematician is the length of the shortest chain of mathematicians that begins with Paul Erd˝os and ends with this mathematician, where each adjacent pair of mathematicians have written a joint paper. The number of mathematicians with each Erd˝os number as of early 2006, according to the Erd˝os Number Project, In the Hollywood graph two actors a and b are linked when there is a chain of actors linking a and b, where every two actors adjacent in the chain have acted in the same movie.

In the Hollywood graph, the Bacon number of an actor c is defined to be the length of the shortest path connecting c and the well-known actor Kevin Bacon. As new movies are made, including new ones with Kevin Bacon, the Bacon number of actors can change. In Table 2 we show the number of actors with each Bacon number as of early 2011 using data from the Oracle of Bacon website. The origins of the Bacon number of an actor dates back to the early 1990s, when Kevin Bacon remarked that he had worked with everyone in Hollywood or someone who worked with them.