Matchings

Written by Timon Gehr.

Given a graph G=(V,E)G=(V,E) with a given set of vertices VV and a set of edges EE, a matching is a subset MEM\subseteq E such that the edges in MM are pairwise disjoint (i.e., they do not have a common incident vertex).

For example, in the following graph, the highlighted edges form a matching:

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13

A matching MM covers a vertex vv if there is some other vertex uu such that the edge {u,v}\{u,v\} is in MM. Each vertex is covered by zero or one edges in MM. In the example, all vertices that are covered by an edge are colored in grey.

A matching MM intersects an edge ee if it covers one of the two incident vertices of ee. In the example, the matching intersects all edges except {6,9}\{6,9\}, {8,9}\{8,9\} and {9,10}\{9,10\}.

If a matching MM covers each vertex, then we call MM a perfect matching. Not every graph has a perfect matching (for instance, the example graph cannot have a perfect matching as it has an odd number of vertices), but we can try to find a maximal matching.

A matching is called maximal with respect to inclusion if there is no matching MMM'\neq M with MMEM\subset M'\subseteq E. In other words, it is impossible to add more edges to MM and still have a matching.

It is easy to find a matching that is maximal in this sense: We iterate over the edges once and add to our matching all edges that are not already incident to a covered vertex. This works because every matching can be extended to a maximal matching. For example, we can add the edge {8,9}\{8,9\} to our matching to obtain the following maximal matching:

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13

Matchings MM that are maximal in this sense already enjoy a few useful properties. For example, the matching must intersect all edges. This means that each vertex vv is either covered by the matching MM, or all its incident vertices are covered by the matching MM. (In the example: each vertex is grey or all its neighbours are grey.)

A matching is called maximum (with respect to cardinality) if there is no matching MEM'\subseteq E with M<M\lvert M\rvert < \lvert M'\rvert. In other words, there is no other matching with more edges.

For example, the following matching has cardinality 66, which is the largest possible in our graph, as any connected component can contain at most half as many matching edges as it contains vertices:

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13

A maximum matching is a bit more tricky to find than one that is merely maximal, but already now it is easy, for any graph GG, to give lower and upper bounds on the size of a maximum matching which differ by at most a factor of two. On one hand, a maximum matching MM is at least as large as any inclusion-maximal matching MM'. On the other hand, MM cannot contain more than 2M2\cdot\lvert M'\rvert edges. This is because MM' intersects all edges of MM and each edge ee' in MM' intersects at most two edges of MM. (More formally, we have M=eM{eMee}M=\bigcup_{e'\in M'}\{e \in M\mid e\cap e'\neq \emptyset\}, therefore MeM{eMee}eM2=2M\lvert M\rvert\le\sum_{e'\in M'}\lvert\{e\in M\mid e\cap e'\neq \emptyset\}\rvert\le\sum_{e'\in M'}2=2\lvert M'\rvert.)

But how do we actually find a maximum matching? Our idea will be to start with some matching MM (it could even be empty), and to progressively improve it until it has maximal cardinality. Note that while following this process, we may need to remove some edges from our matching that we already added to it in a previous step, because otherwise we can only hope to find an inclusion-maximal matching. We ensure progress by designing a single iteration of our algorithm such that it improves the cardinality of the matching by at least one.

Let’s now assume that MM is some matching which is not maximum. Consider some maximum matching MM'. (We do not know what MM' is and will therefore not be able to use it in our algorithm, but we know that it exists.) Let’s color the edges from MM blue and those from MM' red.

For example, we could have the following situation, where MM' is the maximum matching from above:

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 11--12 13 13 12--13

Consider the symmetric difference MMM\triangle M' of MM and MM'. (The symmetric difference contains all edges that belong to either MM or MM', but not to both.) Note that this symmetric difference contains more red edges than blue edges, because MM’ is larger than MM. Now consider the graph G=(V,MM)G'=(V,M\triangle M').

For our example, this graph is as follows:

%3 0 0 1 1 0--1 2 2 1--2 3 3 2--3 3--0 4 4 5 5 4--5 6 6 5--6 7 7 8 8 7--8 9 9 8--9 10 10 9--10 11 11 12 12 13 13

Each vertex in GG' has degree at most 22, because it can be incident to at most one blue and one red edge. Each connected component of a graph with maximum degree 22 is either a path or a cycle. Even more is true: paths and cycles need to alternate between red and blue edges and therefore each cycle in GG' must have even length. In our example, there is one such alternating cycle: (0,1,2,3,0)(0,1,2,3,0). It follows that each cycle in GG' contains equally many red and blue edges.

Therefore, there must exist a connected component in GG' which is a path that contains more red edges than blue edges, because overall, GG' contains more red edges than blue edges. Such a path must start with a red edge, then alternate between blue and red edges, until it stops with a red edge.

In our example, there is one such alternating path: (7,8,9,10)(7,8,9,10).

Now note that we can improve the size of our matching MM by removing all blue edges from our path from it and adding all red edges from our path to it. In our example, this leads to the following improved matching MM:

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13

Inspired by those observations, we say that for a matching MM, a path in GG is MM-augmenting if it starts and ends with vertices that are not covered by MM and alternates between edges in MM and edges that do not belong to MM. If there exists an MM-augmenting path, we can improve our matching by switching all edges on the path into or out of the matching MM. On the other hand, we have just shown that whenever MM is not maximum, there is an MM-augmenting path.

Therefore, our algorithm will proceed as follows: While there exists an MM-augmenting path, use it to improve MM. This algorithm terminates because MM is improved in each iteration and the maximal cardinality of a matching in GG is bounded. When the algorithm terminates, this means that there does not exist an MM-augmenting path in GG, which means that the returned matching MM is maximum.

Computing Maximum Matchings in Bipartite Graphs

It remains to find an algorithm that finds an MM-augmenting path or decides that it does not exist. We will only consider the case where our graph is bipartite. (In a general graph, we can still compute maximal matchings by repeatedly finding augmenting paths, but it is a bit more complicated and outside the scope of IOI.)

In this case, finding an augmenting path is in fact easy: we can use DFS or BFS, slightly adapted such that they alternate between edges outside and within the matching. We start the graph search from an uncovered vertex and try to find another uncovered vertex reachable on an alternating path. A path found this way is an augmenting path, and if we can’t find any such path, there is in fact no augmenting path.

Because the resulting matching has at most V/2\lvert V\rvert/2 edges, we need to find at most O(V)O(\lvert V\rvert) augmenting paths. The total running time of the algorithm is therefore O(V(V+E))O(\lvert V\rvert\cdot(\lvert V\rvert+\lvert E\rvert)).

Implementation

We provide an implementation using DFS which combines finding augmenting paths and improving the matching in the same recursion. The first partition contains aa vertices and the second partition contains bb vertices. The graph is stored as an adjacency list, where the nodes in the first partition are numbered from 00 to a1a-1 and the nodes in the second partition are numbered from aa to a+b1a+b-1. The matching is stored as a vector of size a. The ii-th entry of the vector contains the number of the partner of node ii in the matching, or 1-1 if the matching does not cover the vertex.

The DFS is modified as follows: In the first partition, we may only use edges that are in the matching, and in the second partition, we may only use edges that are not in the matching. This can be interpreted as implicitly directing the graph, such that edges in the matching go from the first partition to the second partition, and edges outside the matching go from the second partition to the first partition. In this graph, any path from the second partition to the first partition between two non-covered vertices alternates between non-matching edges and matching edges and therefore is an augmenting path. This is also an easy way to see why DFS can indeed be used to decide whether there is an augmenting path.

int a, b;
vector<vector<int>> g; // bipartite graph on a+b nodes as adjacency list.
vector<int> matching; // the current matching, mapping nodes 0 to a-1 to their partner or -1

vector<bool> visited;
bool improveMatching(int i) {
    if (visited[i]) return false;
    visited[i] = true;
    if (i < a) { // in first partition, must use matching edge
        if (matching[i] == -1) return true; // found augmenting path
        if (improveMatching(matching[i]))
            return true;
    } else { // in second partition, may only use edges not in the matching
        for (int j : g[i]){
            if (matching[j] == i) continue; // matching edge
            if (improveMatching(j)){
                matching[j] = i; // update matching
                return true; // matching improved
            }
        }
    }
    return false;
}
vector<bool> covered;
bool improveMatching() {
    covered.assign(b, false);
    for (int i = 0; i < a; i++)
        if (matching[i] != -1)
            covered[matching[i] - a] = true;
    visited.assign(a + b, false);
    for (int i = a; i < a + b; i++)
        if (!covered[i - a] && improveMatching(i))
            return true;
    return false; // cannot improve matching, it is maximum
}
int maximumMatching() { // returns size of maximum matching
    matching.assign(a, -1);
    while (improveMatching()) {}
    return a - (int)count(matching.begin(), matching.end(), -1);
}

(Note that this algorithm would still compute the correct result even if, for the second partition, we didn’t track visited flags nor checked for matching edges.)

The Minimum Vertex Cover Problem

A vertex cover of a graph G=(V,E)G=(V,E) is a subset CVC\subseteq V of the vertices such that each edge in EE is incident to at least one vertex in CC.

For example, the highlighted vertices in the following graph form a vertex cover; every edge connects two vertices such that at least one of them is highlighted in blue.

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13 14 14 15 15 14--15 16 16 15--16 17 17 15--17 18 18 17--18 19 19 17--19

Because each edge in a matching needs to contain at least one vertex of a cover CC, each vertex cover is at least as large as each matching.

In particular, a vertex cover is at least as large as a maximum matching. For example, our vertex cover has cardinality 1212, while a maximum matching has cardinality 88. Each red matching edge is incident to at least one blue cover vertex.

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13 14 14 15 15 14--15 16 16 15--16 17 17 15--17 18 18 17--18 19 19 17--19

A minimum vertex cover is a vertex cover of minimal cardinality. As it is a vertex cover, even a minimum vertex cover is at least as large as a maximum matching.

Interestingly, even the opposite inequality holds: the size of a minimum vertex cover is exactly the size of a maximum matching. This means that we can select exactly one vertex from each matching edge and still obtain a vertex cover. For example, there is the following minimum vertex cover of cardinality 88:

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13 14 14 15 15 14--15 16 16 15--16 17 17 15--17 18 18 17--18 19 19 17--19

For bipartite graphs, the fact that a minimum vertex cover has the same size as a maximum matching is known as Kőnig’s theorem. We can prove it by giving an algorithm which computes a minimum vertex cover from a given maximum bipartite matching.

Let G=(AB,E)G=(A\cup B,E) be a bipartite graph with first partition AA and second partition BB. Let MM be a maximum matching in GG. Denote by BBB'\subseteq B the set of vertices in BB that are not covered by MM. Let RR be the set of vertices that can be reached from the vertices in BB' using a path that alternates between edges not in the matching and edges in the matching. (Note that this is exactly the set of vertices that are visited by the last iteration of the improveMatching algorithm.)

Let C=(AR)(BR)C=(A\cap R)\cup(B\setminus R). I.e., a vertex in AA belongs to CC iff it is reachable by an alternating path, and a vertex in BB belongs to CC iff it is not reachable.

Those definitions are illustrated in the following example, where vertices in AA are circles while vertices in BB are squares. We have B={6,14}B'=\{6,14\} and all vertices in RR are colored grey.

%3 0 0 1 1 0--1 5 5 0--5 2 2 1--2 6 6 1--6 3 3 2--3 3--0 4 4 4--5 7 7 4--7 5--6 8 8 5--8 9 9 6--9 7--8 8--9 10 10 9--10 11 11 12 12 11--12 13 13 12--13 14 14 15 15 14--15 16 16 15--16 17 17 15--17 18 18 17--18 19 19 17--19

The set CC is in fact a vertex cover. To see this, we show that for each edge ee, either the incident vertex in AA is reachable using an alternating path from BB' or the incident vertex in BB is not reachable by such an alternating path. We argue for edges inside and outside MM separately.

If ee is in MM and its incident vertex in AA is not reachable from BB' by an alternating path, then neither is the incident vertex in BB, because such an alternating path would have to use ee. Note that the other direction is also true: if the vertex in AA is reachable, we can reach the one in BB by using the edge ee in the direction from AA to BB. Therefore, CC contains exactly one incident vertex of each edge in MM.

If ee is not in MM and its incident vertex in BB is reachable from BB' by an alternating path, then so is the incident vertex in AA (because we can just extend the path by using the edge ee in the direction from BB to AA).

This shows that CC is a vertex cover. Note that every vertex of CC is covered by MM: By construction CB=BRBBC\cap B=B\setminus R\subseteq B\setminus B' does not contain any vertices that are not covered, because those vertices by definition belong to BB'. On the other hand, if there was some vertex in CA=ARC\cap A=A\cap R that was not covered by MM, then this vertex would be the end of an augmenting path, which is impossible because MM is maximum.

Therefore, each vertex in CC is incident to some edge in MM. On the other hand, we know that CC contains exactly one incident vertex of each edge in MM, which implies that C=M\lvert C\rvert=\lvert M\rvert. Therefore, CC is a minimum vertex cover.

Assuming the above implementation of maximumMatching, we can compute a minimum vertex cover as follows:

vector<int> cover;
int minimumVertexCover() {
    int size = maximumMatching();
    for (int i = 0; i < a; i++)
        if (visited[i]) cover.push_back(i);
    for (int i = a; i < a + b; i++)
        if (!visited[i]) cover.push_back(i);
    assert((int)cover.size() == size);
    return size;
}