Matchings

Written by Timon Gehr.

Contents

Computing Maximum Matchings in Bipartite Graphs
Implementation
The Minimum Vertex Cover Problem

Given a graph $G=(V,E)$ with a given set of vertices $V$ and a set of edges $E$ , a matching is a subset $M\subseteq E$ such that the edges in $M$ are pairwise disjoint (i.e., they do not have a common incident vertex).

For example, in the following graph, the highlighted edges form a matching:

A matching $M$ covers a vertex $v$ if there is some other vertex $u$ such that the edge $\{u,v\}$ is in $M$ . Each vertex is covered by zero or one edges in $M$ . In the example, all vertices that are covered by an edge are colored in grey.

A matching $M$ intersects an edge $e$ if it covers one of the two incident vertices of $e$ . In the example, the matching intersects all edges except $\{6,9\}$ , $\{8,9\}$ and $\{9,10\}$ .

If a matching $M$ covers each vertex, then we call $M$ a perfect matching. Not every graph has a perfect matching (for instance, the example graph cannot have a perfect matching as it has an odd number of vertices), but we can try to find a maximal matching.

A matching is called maximal with respect to inclusion if there is no matching $M'\neq M$ with $M\subset M'\subseteq E$ . In other words, it is impossible to add more edges to $M$ and still have a matching.

It is easy to find a matching that is maximal in this sense: We iterate over the edges once and add to our matching all edges that are not already incident to a covered vertex. This works because every matching can be extended to a maximal matching. For example, we can add the edge $\{8,9\}$ to our matching to obtain the following maximal matching:

Matchings $M$ that are maximal in this sense already enjoy a few useful properties. For example, the matching must intersect all edges. This means that each vertex $v$ is either covered by the matching $M$ , or all its incident vertices are covered by the matching $M$ . (In the example: each vertex is grey or all its neighbours are grey.)

A matching is called maximum (with respect to cardinality) if there is no matching $M'\subseteq E$ with $\lvert M\rvert < \lvert M'\rvert$ . In other words, there is no other matching with more edges.

For example, the following matching has cardinality $6$ , which is the largest possible in our graph, as any connected component can contain at most half as many matching edges as it contains vertices:

A maximum matching is a bit more tricky to find than one that is merely maximal, but already now it is easy, for any graph $G$ , to give lower and upper bounds on the size of a maximum matching which differ by at most a factor of two. On one hand, a maximum matching $M$ is at least as large as any inclusion-maximal matching $M'$ . On the other hand, $M$ cannot contain more than $2\cdot\lvert M'\rvert$ edges. This is because $M'$ intersects all edges of $M$ and each edge $e'$ in $M'$ intersects at most two edges of $M$ . (More formally, we have $M=\bigcup_{e'\in M'}\{e \in M\mid e\cap e'\neq \emptyset\}$ , therefore $\lvert M\rvert\le\sum_{e'\in M'}\lvert\{e\in M\mid e\cap e'\neq \emptyset\}\rvert\le\sum_{e'\in M'}2=2\lvert M'\rvert$ .)

But how do we actually find a maximum matching? Our idea will be to start with some matching $M$ (it could even be empty), and to progressively improve it until it has maximal cardinality. Note that while following this process, we may need to remove some edges from our matching that we already added to it in a previous step, because otherwise we can only hope to find an inclusion-maximal matching. We ensure progress by designing a single iteration of our algorithm such that it improves the cardinality of the matching by at least one.

Let’s now assume that $M$ is some matching which is not maximum. Consider some maximum matching $M'$ . (We do not know what $M'$ is and will therefore not be able to use it in our algorithm, but we know that it exists.) Let’s color the edges from $M$ blue and those from $M'$ red.

For example, we could have the following situation, where $M'$ is the maximum matching from above:

Consider the symmetric difference $M\triangle M'$ of $M$ and $M'$ . (The symmetric difference contains all edges that belong to either $M$ or $M'$ , but not to both.) Note that this symmetric difference contains more red edges than blue edges, because $M$ ’ is larger than $M$ . Now consider the graph $G'=(V,M\triangle M')$ .

For our example, this graph is as follows:

Each vertex in $G'$ has degree at most $2$ , because it can be incident to at most one blue and one red edge. Each connected component of a graph with maximum degree $2$ is either a path or a cycle. Even more is true: paths and cycles need to alternate between red and blue edges and therefore each cycle in $G'$ must have even length. In our example, there is one such alternating cycle: $(0,1,2,3,0)$ . It follows that each cycle in $G'$ contains equally many red and blue edges.

Therefore, there must exist a connected component in $G'$ which is a path that contains more red edges than blue edges, because overall, $G'$ contains more red edges than blue edges. Such a path must start with a red edge, then alternate between blue and red edges, until it stops with a red edge.

In our example, there is one such alternating path: $(7,8,9,10)$ .

Now note that we can improve the size of our matching $M$ by removing all blue edges from our path from it and adding all red edges from our path to it. In our example, this leads to the following improved matching $M$ :

Inspired by those observations, we say that for a matching $M$ , a path in $G$ is $M$ -augmenting if it starts and ends with vertices that are not covered by $M$ and alternates between edges in $M$ and edges that do not belong to $M$ . If there exists an $M$ -augmenting path, we can improve our matching by switching all edges on the path into or out of the matching $M$ . On the other hand, we have just shown that whenever $M$ is not maximum, there is an $M$ -augmenting path.

Therefore, our algorithm will proceed as follows: While there exists an $M$ -augmenting path, use it to improve $M$ . This algorithm terminates because $M$ is improved in each iteration and the maximal cardinality of a matching in $G$ is bounded. When the algorithm terminates, this means that there does not exist an $M$ -augmenting path in $G$ , which means that the returned matching $M$ is maximum.

Computing Maximum Matchings in Bipartite Graphs

It remains to find an algorithm that finds an $M$ -augmenting path or decides that it does not exist. We will only consider the case where our graph is bipartite. (In a general graph, we can still compute maximal matchings by repeatedly finding augmenting paths, but it is a bit more complicated and outside the scope of IOI.)

In this case, finding an augmenting path is in fact easy: we can use DFS or BFS, slightly adapted such that they alternate between edges outside and within the matching. We start the graph search from an uncovered vertex and try to find another uncovered vertex reachable on an alternating path. A path found this way is an augmenting path, and if we can’t find any such path, there is in fact no augmenting path.

Because the resulting matching has at most $\lvert V\rvert/2$ edges, we need to find at most $O(\lvert V\rvert)$ augmenting paths. The total running time of the algorithm is therefore $O(\lvert V\rvert\cdot(\lvert V\rvert+\lvert E\rvert))$ .

Implementation

We provide an implementation using DFS which combines finding augmenting paths and improving the matching in the same recursion. The first partition contains $a$ vertices and the second partition contains $b$ vertices. The graph is stored as an adjacency list, where the nodes in the first partition are numbered from $0$ to $a-1$ and the nodes in the second partition are numbered from $a$ to $a+b-1$ . The matching is stored as a vector of size a. The $i$ -th entry of the vector contains the number of the partner of node $i$ in the matching, or $-1$ if the matching does not cover the vertex.

The DFS is modified as follows: In the first partition, we may only use edges that are in the matching, and in the second partition, we may only use edges that are not in the matching. This can be interpreted as implicitly directing the graph, such that edges in the matching go from the first partition to the second partition, and edges outside the matching go from the second partition to the first partition. In this graph, any path from the second partition to the first partition between two non-covered vertices alternates between non-matching edges and matching edges and therefore is an augmenting path. This is also an easy way to see why DFS can indeed be used to decide whether there is an augmenting path.

int a, b;
vector<vector<int>> g; // bipartite graph on a+b nodes as adjacency list.
vector<int> matching; // the current matching, mapping nodes 0 to a-1 to their partner or -1

vector<bool> visited;
bool improveMatching(int i) {
    if (visited[i]) return false;
    visited[i] = true;
    if (i < a) { // in first partition, must use matching edge
        if (matching[i] == -1) return true; // found augmenting path
        if (improveMatching(matching[i]))
            return true;
    } else { // in second partition, may only use edges not in the matching
        for (int j : g[i]){
            if (matching[j] == i) continue; // matching edge
            if (improveMatching(j)){
                matching[j] = i; // update matching
                return true; // matching improved
            }
        }
    }
    return false;
}
vector<bool> covered;
bool improveMatching() {
    covered.assign(b, false);
    for (int i = 0; i < a; i++)
        if (matching[i] != -1)
            covered[matching[i] - a] = true;
    visited.assign(a + b, false);
    for (int i = a; i < a + b; i++)
        if (!covered[i - a] && improveMatching(i))
            return true;
    return false; // cannot improve matching, it is maximum
}
int maximumMatching() { // returns size of maximum matching
    matching.assign(a, -1);
    while (improveMatching()) {}
    return a - (int)count(matching.begin(), matching.end(), -1);
}

(Note that this algorithm would still compute the correct result even if, for the second partition, we didn’t track visited flags nor checked for matching edges.)

The Minimum Vertex Cover Problem

A vertex cover of a graph $G=(V,E)$ is a subset $C\subseteq V$ of the vertices such that each edge in $E$ is incident to at least one vertex in $C$ .

For example, the highlighted vertices in the following graph form a vertex cover; every edge connects two vertices such that at least one of them is highlighted in blue.

Because each edge in a matching needs to contain at least one vertex of a cover $C$ , each vertex cover is at least as large as each matching.

In particular, a vertex cover is at least as large as a maximum matching. For example, our vertex cover has cardinality $12$ , while a maximum matching has cardinality $8$ . Each red matching edge is incident to at least one blue cover vertex.

A minimum vertex cover is a vertex cover of minimal cardinality. As it is a vertex cover, even a minimum vertex cover is at least as large as a maximum matching.

Interestingly, even the opposite inequality holds: the size of a minimum vertex cover is exactly the size of a maximum matching. This means that we can select exactly one vertex from each matching edge and still obtain a vertex cover. For example, there is the following minimum vertex cover of cardinality $8$ :

For bipartite graphs, the fact that a minimum vertex cover has the same size as a maximum matching is known as Kőnig’s theorem. We can prove it by giving an algorithm which computes a minimum vertex cover from a given maximum bipartite matching.

Let $G=(A\cup B,E)$ be a bipartite graph with first partition $A$ and second partition $B$ . Let $M$ be a maximum matching in $G$ . Denote by $B'\subseteq B$ the set of vertices in $B$ that are not covered by $M$ . Let $R$ be the set of vertices that can be reached from the vertices in $B'$ using a path that alternates between edges not in the matching and edges in the matching. (Note that this is exactly the set of vertices that are visited by the last iteration of the improveMatching algorithm.)

Let $C=(A\cap R)\cup(B\setminus R)$ . I.e., a vertex in $A$ belongs to $C$ iff it is reachable by an alternating path, and a vertex in $B$ belongs to $C$ iff it is not reachable.

Those definitions are illustrated in the following example, where vertices in $A$ are circles while vertices in $B$ are squares. We have $B'=\{6,14\}$ and all vertices in $R$ are colored grey.

The set $C$ is in fact a vertex cover. To see this, we show that for each edge $e$ , either the incident vertex in $A$ is reachable using an alternating path from $B'$ or the incident vertex in $B$ is not reachable by such an alternating path. We argue for edges inside and outside $M$ separately.

If $e$ is in $M$ and its incident vertex in $A$ is not reachable from $B'$ by an alternating path, then neither is the incident vertex in $B$ , because such an alternating path would have to use $e$ . Note that the other direction is also true: if the vertex in $A$ is reachable, we can reach the one in $B$ by using the edge $e$ in the direction from $A$ to $B$ . Therefore, $C$ contains exactly one incident vertex of each edge in $M$ .

If $e$ is not in $M$ and its incident vertex in $B$ is reachable from $B'$ by an alternating path, then so is the incident vertex in $A$ (because we can just extend the path by using the edge $e$ in the direction from $B$ to $A$ ).

This shows that $C$ is a vertex cover. Note that every vertex of $C$ is covered by $M$ : By construction $C\cap B=B\setminus R\subseteq B\setminus B'$ does not contain any vertices that are not covered, because those vertices by definition belong to $B'$ . On the other hand, if there was some vertex in $C\cap A=A\cap R$ that was not covered by $M$ , then this vertex would be the end of an augmenting path, which is impossible because $M$ is maximum.

Therefore, each vertex in $C$ is incident to some edge in $M$ . On the other hand, we know that $C$ contains exactly one incident vertex of each edge in $M$ , which implies that $\lvert C\rvert=\lvert M\rvert$ . Therefore, $C$ is a minimum vertex cover.

Assuming the above implementation of maximumMatching, we can compute a minimum vertex cover as follows:

vector<int> cover;
int minimumVertexCover() {
    int size = maximumMatching();
    for (int i = 0; i < a; i++)
        if (visited[i]) cover.push_back(i);
    for (int i = a; i < a + b; i++)
        if (!visited[i]) cover.push_back(i);
    assert((int)cover.size() == size);
    return size;
}