Hello and welcome. My name is William and today we're going to start diving a little deeper into network flow. We're going to talk about unweighted bipartite graph matching, and specifically how we can use max flow to find a matching for us. Before we get started though, I should mention what a bipartite graph is. a bipartite graph is one whose vertices can be split into two independent groups, u and v, such that every edge connects between u and v. Other definitions exists, such as the graph is to colorable or there is a cycle with an odd like bipartite graphs often arise when we're trying to match one group of items to another in some way. Think of situations such as matching suitable candidates to jobs.
There could be multiple jobs and multiple candidates, but not every candidate is suitable for each job. If jobs are red nodes and candidates are white nodes, then there would be an edge between the two if the candidate is good fit. Another situation could be matching surfers to surfboards. Suppose there are multiple servers and multiple surfboards, but the surfers have preferences and requirements for the boards, such as color size and so on. Then the same thing happens we placed an edge between the surfer and the surfboard to indicate that they are able to be matched. Generally when we're setting up a bipartite graph, we're interested in what's called a maximum cardinality bipartite matching.
This is when we've maximized the pairs that can be matched with each other. For example, we've maximize the number of candidates that can be matched to jobs or the number of surfers to surfboards. Finally, a matching is not unique to bipartite graphs. However, you can also have a matching on a non bipartite graph, this variant is a lot harder to solve and also much less common. Another variant is finding a maximum matching on a weighted graph where you can either maximize or minimize the cost of the matching. This variant is also much harder to solve than the unweighted version and the unweighted version.
No edge is better in any sense than any other edge. So it makes finding a matching much, much easier. We're mostly going to focus on the top black box, which is the easiest of the four variants, but hopefully we'll get to poke around and some of the other boxes as well. So if you want to find a maximum matching on an unweighted bipartite graph, you Have lots of options. You can either set up the graph as a flow problem and push flow through it, which is what we'll look at in this video. But you can also repeatedly find augmenting paths which maximize the matching using a depth first search.
Or you can use the specialized Hopcroft Karp algorithm to do the same thing a lot faster. If your edges are weighted, and your graph is still bipartite. You also have a lot of options. You can use a min cost max flow algorithm, or you can run the Hungarian algorithm. And lastly, there's the more sophisticated networks simplex algorithm which uses linear programming. If however, you graph is not bipartite, but your edges are unweighted, you can use Edmonds blossom algorithm.
And lastly, the hardest of the four variants is when your graph is non bipartite. And the edges are weighted. I didn't find much information about this one online, but The recommendation seems to be to use dynamic programming on small graphs. Now let's look at an example. This is going to be for the unweighted, bipartite case, the easiest of the four variants. So I want you to imagine that there are five people and five books in the library and that some people express interest in some of the books.
This results in a bipartite graph with people on one side and books on the other. So far, so good. Now suppose we want to find the maximum cardinality bipartite matching, or in other words, we want to match as many people with as many books as we can. Let's try the greedy approach to this matching problem. Let's start with person green. Their first edge connects to the second book on the right side.
The second book is unallocated so person green is matched with what is now book green. Next up Is person orange. The first book they want is the same book as person green, which is already matched. So we cannot select person greens book. Their next choice is the third book which is unallocated. So they get matched to that one.
Next up is person purple, they instantly matched to an unallocated book on the right hand side. Now person Read, read only has one edge, meaning that they're only willing to read that one book. However, that book has already been allocated to person orange so person red cannot have it. Next up is person Brown. They also want person oranges book but they also cannot have it. Fortunately, they have other options of books they're willing to read.
So person brown gets one of those. So in the end, the greedy approach only found a matching For only four people were able to be matched with books. But can we do any better? Is this the true maximum cardinality matching. Turns out that it's not a greedy approach to the maximum matching problem will not work as we just saw, we need a more sophisticated approach to ensure that we are able to get that maximum matching. So we're going to solve this maximum matching problem by turning our problem into a network flow problem and finding the max flow.
The first thing we're going to do is make every edge directed and add one unit capacity to each edge. The zero slash one besides each edge means zero flow and a maximum capacity of one. Next we're going to introduce two new nodes, the source and the sink and hook up as you Judges outwards from the source to the people with a capacity of one and hook up edges from books to the sink also with a capacity of one once that's all set up, use any maximal algorithm to push flow through the network. What this will do is show us what edges get populated with flow with that information, we will be able to reconstruct the maximum matching. Here's a graph after the flow algorithm has ran, you can see that some of the edges have one unit of flow. Those were the edges selected by the max flow algorithm.
The most interesting edges are the middle edges with one unit of flow. These are the edges which form the maximum cardinality matching. If we color in the middle edges, which have one unit of flow. You can see that this time everybody goes home with a book and no one is left empty handed. Okay? So now we understand how this basic setup works and how it leads to a matching.
Let's play around with this model a little bit to truly understand what all the weeds here mean, we originally set the capacity of each edge from the source to each person to be one. But what constraint is that really enforcing? I'll let you pause the video and think about that for a second because it's so important. The answer is that that capacity of one ensures that each person can get up to one book and no more. If we increase this number. For some people, we can allow them to possibly pick up more than one book.
If we rerun the max flow algorithm through this network, we see that it's now possible for one person to be matched with multiple books. The next thing we want to do is changed the flow network to allow a book to be selected multiple times. Pause the video and think about how we can modify this flow graph to support having multiple copies of the same book in the library. I'll give you a short moment. The number of copies of a book is controlled by the capacity of the edges leading to the sink T. Increasing this value will allow more flow to run through those designated edges. This effectively limits or controls the number of copies of a book.
Let's change the capacity of those edges leading into the sink to allow having multiple copies of the same book and see what happens if we rerun the Maxwell algorithm. Once again through the network. We see that we now have people matched with the same book multiple times because multiple copies exist. For example, Book Three and book five, both have two people grabbing a copy of them. The actual assignment of people to books would be as follows. I'll let the animation play after the flow algorithm has ran, if you want to know how many copies of each book were actually given out, you can inspect the flow value on the edge leading to the sink.
Currently, each person is only allowed to pick up one copy of each book, even though there are multiple copies of each book. How can we modify the flow network to support this? You've guessed it. We need to modify the edge capacity between a person and the book to allow that person to pick up multiple copies of that book. Also nuts all I got for this video. Thank you so much for watching.
Please like this video if you learned something, and please subscribe for more mathematics and computer science videos. I'll catch you next time.