Problem
Input: a directed graph $G = (V, E)$, a capacity $c_e \geq 0$ for each edge $e \in E$, a source vertex $s$, and a sink vertex $t$.
Output: a “feasible” flow, which is a map $f : E \to \mathbb{R}_{\geq 0}$, where the flow value $|f|$ is maximized.
Feasible flow
A flow $f$ is considered feasible if and only if both constraints are satisfied:
- Capacity constraint: for every edge $e \in E$, we have $0 \leq f(e) \leq c_e$. That is, the flow on an edge cannot be negative, and also cannot carry more than its given capacity.
- Flow conservation constraint: for every $v \in V \setminus \{s, t\}$, we have $\sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e)$. The flow of all the edges going into a vertex must equal the flow that comes out of it. The flow is conserved, and none can be created or destroyed. Only $s$ should have more flow going out than in, and only $t$ should have more flow going in than out.
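The two constraints can be checked mechanically. A minimal sketch, using a representation of my own choosing (edges as `(u, v)` tuples mapped to numbers; the name `is_feasible` is just illustrative):

```python
from collections import defaultdict

def is_feasible(cap, flow, s, t):
    """Check both feasibility constraints for a flow.

    cap:  dict mapping edge (u, v) -> capacity c_e
    flow: dict mapping edge (u, v) -> flow f(e)
    """
    # Capacity constraint: 0 <= f(e) <= c_e on every edge.
    for e, c in cap.items():
        if not (0 <= flow.get(e, 0) <= c):
            return False
    # Flow conservation: net flow is zero at every vertex except s and t.
    balance = defaultdict(int)  # in-flow minus out-flow per vertex
    for (u, v), f in flow.items():
        balance[u] -= f
        balance[v] += f
    return all(balance[v] == 0 for v in balance if v not in (s, t))

cap = {("s", "a"): 3, ("a", "t"): 3}
assert is_feasible(cap, {("s", "a"): 2, ("a", "t"): 2}, "s", "t")
assert not is_feasible(cap, {("s", "a"): 2, ("a", "t"): 1}, "s", "t")  # conservation fails at a
```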
Flow value
Flow value is represented by the notation $|f|$ and equals
$$|f| = \sum_{e \text{ out of } s} f(e) - \sum_{e \text{ into } s} f(e),$$
where $s$ is the source vertex. It’s the sum of the flow from all the edges coming out of $s$. Due to flow conservation, it’s also equivalent to the sum of the flow from all the edges going into $t$.
- We subtract out the flow that is going back into $s$ for completeness’ sake, but if we want to maximize $|f|$, this number should be zero.
- Same idea with the edges leaving $t$, the flow that is leaving the sink vertex. To maximize flow value, ideally we want our map to assign these edges zero flow.
The proof for this is noticing that for every “internal” vertex $v \neq s, t$, the flow on edges going into $v$ minus the flow on edges leaving $v$ is zero. Then we’re just left with the flow coming in and going out of $s$ and $t$, and under flow conservation these two must be equal.
Again, in our flow ideally we want $f(e)$ to be zero for all edges $e$ into $s$ and all edges $e$ out of $t$.
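The definition of $|f|$ translates directly to code. A small sketch with the same assumed edge-dict representation (the name `flow_value` is my own):

```python
def flow_value(flow, s):
    """|f|: total flow leaving the source minus any flow returning to it."""
    out_f = sum(f for (u, _), f in flow.items() if u == s)
    in_f = sum(f for (_, v), f in flow.items() if v == s)
    return out_f - in_f

# A unit of flow looping back into s cancels out of |f|.
assert flow_value({("s", "a"): 4, ("a", "s"): 1, ("a", "t"): 3}, "s") == 3
```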
Simple example: zero flow
Let $G$ be a directed graph with capacities $c_e$. Let $f$ be the flow such that we have $f(e) = 0$ for every edge $e$. We would like to find a path $P$ from $s$ to $t$ in $G$. Let $\delta$ equal the smallest capacity on the given path $P$. That is, we have
$$\delta = \min_{e \in P} c_e,$$
where $\delta > 0$. Then, the flow on all edges other than those of $P$ will remain zero, and we can construct a new flow $f'$ where $f'(e) = \delta$ for every $e \in P$. Furthermore, $|f'| = \delta$.
Flow algebra: lemma
When can we push flow along a particular path? We want to try adding additional flow to a preexisting feasible flow $f$ such that we increase the total amount we can send.
Let $G = (V, E)$ be a directed graph with capacities $c_e$, and we have a current feasible flow $f$. If there exists a path $P$ from $s$ to $t$ such that for each edge $e \in P$, we have two cases:
- $e \in E$, and $f(e) < c_e$. We can send more flow along that path, because we have a strictly positive amount $c_e - f(e)$ that can additionally be sent along $e$.
- $e \notin E$, where the reverse directed edge $\overleftarrow{e} \in E$, and $f(\overleftarrow{e}) > 0$. $e$ does not actually exist in $G$, only $\overleftarrow{e}$ does. We say that the original flow sent along $\overleftarrow{e}$ is $f(\overleftarrow{e})$, and the fact that $f(\overleftarrow{e}) > 0$ means that we’re changing our mind about our previously sent flow, and instead we’re choosing to reset it by reversing it.
We can now calculate a $\delta$ such that
$$\delta = \min_{e \in P} \begin{cases} c_e - f(e) & \text{if } e \in E \\ f(\overleftarrow{e}) & \text{if } \overleftarrow{e} \in E \end{cases}$$
This equals the maximum additional amount that we can send on this particular path $P$, given our current flow map $f$.
Consider the new flow $f'$: for each $e \in P$ with $e \in E$, set $f'(e) = f(e) + \delta$; for each $e \in P$ with $\overleftarrow{e} \in E$, set $f'(\overleftarrow{e}) = f(\overleftarrow{e}) - \delta$; leave all other edges unchanged.
If all of the above is true, and such a path $P$ exists, then we can construct a new flow $f'$ that remains feasible, and $|f'| = |f| + \delta$.
- If such a path exists, this means we can construct a new $f'$ with additional flow $\delta > 0$.
- Naturally leads to an algorithm that keeps trying to find such a path and updating the flow until it does not exist anymore.
Residual graph
We have a directed graph $G = (V, E)$ with capacities $c_e$. We also have a preexisting flow $f$ that is feasible.
Then, let $G_f$ be the residual graph, where the vertex set is identical, and the edges consist of the two cases from before. We either have
- $e \in E$ with $f(e) < c_e$, then we set the residual capacity $c_f(e) = c_e - f(e)$.
- $e \notin E$ such that $\overleftarrow{e} \in E$ with $f(\overleftarrow{e}) > 0$, then we set $c_f(e) = f(\overleftarrow{e})$.
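Building $G_f$ is a single pass over the edges. A sketch under the same assumed edge-dict representation, further assuming the graph has no pair of opposite edges (so the two cases never collide on the same residual edge):

```python
def residual_capacities(cap, flow):
    """Residual graph edges as a dict e -> c_f(e)."""
    cf = {}
    for (u, v), c in cap.items():
        f = flow.get((u, v), 0)
        if f < c:
            cf[(u, v)] = c - f   # case 1: leftover forward capacity
        if f > 0:
            cf[(v, u)] = f       # case 2: reverse edge to undo sent flow
    return cf

# One edge with 2 of 3 units used: 1 unit forward, 2 units reversible.
assert residual_capacities({("s", "a"): 3}, {("s", "a"): 2}) == {("s", "a"): 1, ("a", "s"): 2}
```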
Ford-Fulkerson algorithm
- Initialize $f(e) = 0$ for every edge $e \in E$.
- Write residual graph $G_f$ using $f$.
- While BFS finds a path $P$ from $s$ to $t$ in $G_f$ with non-zero $\delta$, we keep pushing flow and updating $f$. Then, update $G_f$ based on the new $f$.
- When the while loop ends, we return $f$.
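These steps can be sketched end to end. This is a minimal illustration, not a tuned implementation: it assumes the edge-dict representation from before, no opposite-edge pairs, and integer capacities; the function name is my own.

```python
from collections import deque

def ford_fulkerson(cap, s, t):
    """Repeat: build G_f, BFS for an s-t path, push delta along it."""
    flow = {e: 0 for e in cap}
    while True:
        # Residual capacities (case 1: forward leftovers, case 2: undo edges).
        cf, adj = {}, {}
        for (u, v), c in cap.items():
            if flow[(u, v)] < c:
                cf[(u, v)] = c - flow[(u, v)]
            if flow[(u, v)] > 0:
                cf[(v, u)] = flow[(u, v)]
        for (u, v) in cf:
            adj.setdefault(u, []).append(v)
        # BFS from s in G_f, recording parents to recover a path.
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path left: f is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(cf[e] for e in path)  # bottleneck residual capacity
        for (u, v) in path:
            if (u, v) in flow:
                flow[(u, v)] += delta  # forward edge: add delta
            else:
                flow[(v, u)] -= delta  # reverse edge: cancel delta

f = ford_fulkerson({("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 3}, "s", "t")
assert sum(v for (u, _), v in f.items() if u == "s") == 6  # max flow is 6
```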
Running time
We start with a flow where everything is zero. Since we define the capacities of the edges to be integers, the flow on every edge is also an integer, and on each run of BFS we will increase the flow value by at least 1 (because we look for a path where $\delta \geq 1$). So the worst case runtime is governed by how many iterations it takes to reach the maximum flow of the graph.
We can see $|f|$ as a “measure of progress” for our algorithm. We want to keep increasing $|f|$ until we reach the maximum flow. We multiply the number of iterations needed by the time we take within each iteration.
Running BFS takes $O(n + m)$, and creating the residual graph takes $O(m)$ because we simply update the flow for each edge in the graph.
Let the max flow be $F$. Let $C$ be the max capacity of any edge in the graph. At most, there are $n - 1$ edges leaving the source vertex $s$, so the size of the min cut, and hence $F$, is at most $nC$. Computing $\delta$ along a path also takes $O(n)$ time.
Therefore, the final runtime is given by $O(F \cdot (n + m)) = O(nmC)$.
Showing the runtime
An $s$-$t$ cut in $G$ is a set $S \subseteq V$ such that source $s \in S$ and sink $t \notin S$. Consider the set of edges $(u, v)$ where $u \in S$ and $v \notin S$; that is, all edges that cross out of $S$. Let’s also call this edge set $S$ (abuse of notation). Define the capacity of $S$ to be
$$c(S) = \sum_{e \text{ out of } S} c_e.$$
We can then make this claim: for any feasible flow $f$ and every $s$-$t$ cut $S$, it must be true that $|f| \leq c(S)$. The proof uses the definition of $|f|$.
By flow conservation, $|f| = \sum_{v \in S} \left( \sum_{e \text{ out of } v} f(e) - \sum_{e \text{ into } v} f(e) \right)$, since every term with $v \neq s$ is zero. For each edge $e = (u, v)$, we have four cases for its contribution to this sum:
- If $u, v \in S$, then this edge contributes zero, because both vertices are in $S$ and its flow appears once positively and once negatively.
- If $u, v \notin S$, then this edge also contributes zero, because both vertices are not in $S$.
- If $u \in S$ and $v \notin S$, the flow contributed is $+f(e)$.
- If $u \notin S$ and $v \in S$, then the flow is “backwards” and this contributes $-f(e)$.
Therefore, $|f|$ is equal to $\sum_{e \text{ out of } S} f(e) - \sum_{e \text{ into } S} f(e)$, and is upper bounded by $c(S)$: the first sum is at most the capacity of the edges leaving $S$, and we only subtract off the (non-negative) flow of edges that are directed into $S$.
Max-flow min-cut theorem
Then, our final claim: suppose $f$ is feasible and there exists no $s$-$t$ path in $G_f$. Then, this implies there exists an $s$-$t$ cut $S$ such that $|f| = c(S)$. This means $|f|$ is now equal to the maximum flow. Then, $S$ is also called the minimum cut.
Another way of saying it: we run Ford-Fulkerson until we create a residual graph $G_f$ that disconnects source $s$ and sink $t$. At this point, we’ll have created a min cut consisting of vertices $S$, with $s \in S$ and $t \notin S$.
Which cut do we analyze? Let $S = \{v \in V : v \text{ is reachable from } s \text{ in } G_f\}$. This is a valid $s$-$t$ cut because it’s impossible that $t \in S$, otherwise the algorithm would not have terminated (since this implies the existence of an $s$-$t$ path in $G_f$).
We want to show $|f| = c(S)$.
Consider an edge $e = (u, v)$ with $u \in S$ and $v \notin S$. It must be true that for all such edges that cross the cut, $f(e) = c_e$; otherwise, $e$ would appear in $G_f$ and $v$ would have been in $S$. Then $\sum_{e \text{ out of } S} f(e) = c(S)$. Furthermore, consider the edges $e = (u, v)$ where $u \notin S$ and $v \in S$. These are the edges flowing back into $S$, but it must be true that $f(e) = 0$; otherwise the reverse edge $\overleftarrow{e}$ would appear in $G_f$ and $u$ would have been in $S$. We have
$$|f| = \sum_{e \text{ out of } S} f(e) - \sum_{e \text{ into } S} f(e) = c(S) - 0 = c(S).$$
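The cut in this proof is easy to extract once a max flow is in hand: BFS the residual graph from $s$. A sketch, assuming the edge-dict representation from before and a max flow computed by hand for a tiny graph:

```python
from collections import deque

def min_cut(cap, flow, s):
    """S = set of vertices reachable from s in the residual graph G_f."""
    adj = {}
    for (u, v), c in cap.items():
        if flow.get((u, v), 0) < c:       # leftover forward capacity
            adj.setdefault(u, []).append(v)
        if flow.get((u, v), 0) > 0:       # reverse (undo) edge
            adj.setdefault(v, []).append(u)
    S, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in S:
                S.add(v)
                q.append(v)
    return S

# Hand-computed max flow of value 6: every edge leaving S is saturated.
cap = {("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 3}
flow = {("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 2}
S = min_cut(cap, flow, "s")
assert S == {"s"}
assert sum(c for (u, v), c in cap.items() if u in S and v not in S) == 6  # c(S) = |f|
```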
Input size causes exponential runtime
This algorithm depends on the maximum capacity $C$ of the edges. If $C = 2^n$, for example, the runtime becomes exponential.
- Assume we use a matrix to represent our graph. Then we need $n^2$ entries to represent all possible edges.
- To represent the capacity of each edge, this takes $O(\log C)$ bits.
- So, considering space requirements as well, the input size is $O(n^2 \log C)$ bits.
So this algorithm is exponential with respect to the encoding of the input: the runtime contains a factor of $C = 2^{\log C}$, which is exponential in the number of bits needed to write down $C$.
Capacity scaling
Algorithm
- Initialize the flow $f(e) = 0$ for every edge and $\Delta = C$, the maximum edge capacity (or the largest power of two at most $C$).
- Repeat until stuck: create the residual graph $G_f(\Delta)$, but only include edges where the residual capacity is at least $\Delta$. That is, $c_f(e) \geq \Delta$. Then, run BFS and push flow in $G_f(\Delta)$.
- When stuck, if $\Delta = 1$, then we finish and return $f$. Otherwise, we update $\Delta \leftarrow \Delta / 2$, and run step 2 again.
Step 3 is necessary because $s$ and $t$ may be connected by edges whose residual capacity is less than the current $\Delta$, so we scale $\Delta$ down over time to catch those too.
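The three steps above can be sketched as a thresholded variant of Ford-Fulkerson. Same assumptions as before (edge-dict representation, no opposite-edge pairs, integer capacities); the function name is my own:

```python
from collections import deque

def scaling_max_flow(cap, s, t):
    """Ford-Fulkerson, but only augment along residual edges with c_f(e) >= Delta."""
    flow = {e: 0 for e in cap}
    Delta = 1
    while Delta * 2 <= max(cap.values()):
        Delta *= 2  # largest power of two not exceeding C

    def augment(Delta):
        # Build c_f, keep only edges with c_f(e) >= Delta, BFS, push one path.
        cf, adj = {}, {}
        for (u, v), c in cap.items():
            if flow[(u, v)] < c:
                cf[(u, v)] = c - flow[(u, v)]
            if flow[(u, v)] > 0:
                cf[(v, u)] = flow[(u, v)]
        for (u, v), r in cf.items():
            if r >= Delta:
                adj.setdefault(u, []).append(v)
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return False  # stuck at this Delta
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        d = min(cf[e] for e in path)
        for (u, v) in path:
            if (u, v) in flow:
                flow[(u, v)] += d
            else:
                flow[(v, u)] -= d
        return True

    while Delta >= 1:
        while augment(Delta):  # one "stage": push while a Delta-wide path exists
            pass
        Delta //= 2
    return flow

f = scaling_max_flow({("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 3}, "s", "t")
assert sum(v for (u, _), v in f.items() if u == "s") == 6
```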
Proof of correctness
This algorithm computes the same answer as Ford-Fulkerson, because we will eventually consider every single edge as we decrease $\Delta$: when $\Delta = 1$, $G_f(\Delta)$ is exactly $G_f$. Then, correctness follows from the correctness of Ford-Fulkerson.
Runtime
The biggest difference here is that instead of updating flow by at least 1, here we update flow by at least $\Delta$ for each iteration. This dramatically improves the runtime.
How many times do we update $\Delta$ at most? Since we divide by 2 every time, starting from $C$, we have at most $O(\log C)$ of these “stages.”
Then the runtime is bounded by $O(\log C)$ times the number of iterations for each stage of $\Delta$ times $O(n + m)$, which is the runtime for BFS and writing $G_f(\Delta)$.
Example: when $\Delta = C$
How many iterations are there? We know the maximum flow is bounded by $nC$, and each iteration, we update flow by at least $C$. Therefore, the number of iterations has to be at most $n$, because otherwise we would surpass the max flow.
Furthermore, the amount of flow added is at least the number of iterations times $\Delta$. We have the following relationship:
Let the true maximum flow be represented by $f^*$. We claim that
$$|f^*| - |f| \leq 2m\Delta,$$
where $|f|$ is the current flow value at the start of a given stage of $\Delta$ and $m$ is the number of edges. Then, whenever we add additional flow to our current flow, the result must be at most the max flow. We have
$$|f| + (\text{number of iterations}) \cdot \Delta \leq |f^*|.$$
Subtract $|f|$ from both sides, and the left-hand side is bounded by the inequality from above. Substituting that in and multiplying by $\frac{1}{\Delta}$ on both sides gives us
$$\text{number of iterations} \leq 2m.$$
Then, our final runtime is made up of three parts.
- The runtime for BFS and creating $G_f(\Delta)$, both given by $O(n + m)$.
- The number of iterations at each stage of $\Delta$, which we just proved to always be $O(m)$.
- The number of stages, which we showed to be $O(\log C)$.
Multiply all of these together, and we get the runtime of $O(m^2 \log C)$. This runtime is polynomial with respect to the size of the input encoding.
Edmonds-Karp algorithm
- Initialize flow $f(e) = 0$ for all edges.
- Write the residual graph $G_f$ and, if there exists an $s$-$t$ path in $G_f$, pick the shortest $s$-$t$ path, defined as the path with the minimum number of edges.
- Push flow on that path, like in FF.
- Repeat until no such path exists in $G_f$ anymore; return $f$ as the final answer.
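Since BFS explores $G_f$ layer by layer, the first $s$-$t$ path it discovers already has the fewest hops, so a BFS-based augmenting loop implements these steps directly. A self-contained sketch under the same assumptions as before (edge-dict representation, no opposite-edge pairs; `edmonds_karp` is my own name):

```python
from collections import deque

def edmonds_karp(cap, s, t):
    """Max flow by always augmenting along a fewest-hops s-t path."""
    flow = {e: 0 for e in cap}
    while True:
        # Residual capacities: forward leftovers plus reverse undo edges.
        cf, adj = {}, {}
        for (u, v), c in cap.items():
            if flow[(u, v)] < c:
                cf[(u, v)] = c - flow[(u, v)]
            if flow[(u, v)] > 0:
                cf[(v, u)] = flow[(u, v)]
        for (u, v) in cf:
            adj.setdefault(u, []).append(v)
        # BFS by layers: the first path to reach t has the minimum hop count d.
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        d = min(cf[e] for e in path)  # saturates at least one edge on the path
        for (u, v) in path:
            if (u, v) in flow:
                flow[(u, v)] += d
            else:
                flow[(v, u)] -= d

cap = {("s", "a"): 10, ("s", "b"): 10, ("a", "b"): 2, ("a", "t"): 4, ("b", "t"): 9}
f = edmonds_karp(cap, "s", "t")
assert sum(v for (u, _), v in f.items() if u == "s") == 13  # max flow is 13
```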
Runtime
As before, the algorithm stops when no $s$-$t$ path exists in $G_f$ anymore. Here, the number of “hops” (number of edges) on the $s$-$t$ path we choose to push flow on is important. Define $d$ as the number of hops on the shortest $s$-$t$ path in $G_f$.
Divide the execution into stages according to $d$:
- Consider the BFS tree rooted at $s$.
- Edges in $G_f$ that are not in the tree are not relevant to the argument, because they either go between vertices in the same layer (a shortest path would never use them), or they’re not used at all.
- Whenever $d$ remains the same, we remain in the same “stage,” and we only move to the next stage when $d$ updates.
We have two claims:
- As the algorithm executes, $d$ never decreases.
- There can be at most $m$ iterations of pushing flow before $d$ must increase (by at least 1).
We showed this visually with an example in class, but while $d$ stays the same, we only push flow along “important” edges, that is, edges that exist in the BFS tree for this stage of $d$.
- Every time we push flow on a path, we saturate at least one edge (the one attaining the minimum $\delta$).
- This edge then becomes a back edge in the residual graph, which effectively removes it from consideration for the shortest $s$-$t$ path.
- We can remove at most $m$ edges before the graph is empty and there is no possibility of $s$ and $t$ being connected.
- This proves that there are at most $m$ iterations of pushing flow before $d$ must increase.
- When $d$ increases, at least one new edge is introduced as “important” and we have to redraw the BFS tree.
Final calculation
At most the shortest path will take $n - 1$ hops, and in the worst case $d$ increases by only 1 at a time.
- Here, $d$ is the measurement of progress, counting off the stages.
- There are at most $n$ stages, each stage has at most $m$ iterations, and we run BFS for each iteration, which takes $O(n + m)$ time.
- We have a final runtime of $O(nm^2)$.