Problem
Input: a directed graph $G = (V, E)$, a capacity $c_e \geq 0$ for each edge $e \in E$, a source vertex $s$, and a sink vertex $t$.
Output: a “feasible” flow, which is a map $f : E \to \mathbb{R}_{\geq 0}$, where the flow value $|f|$ is maximized.
Feasible flow
A flow $f$ is considered feasible if and only if both constraints are satisfied:
- Capacity constraint: for every edge $e \in E$, we have $0 \leq f(e) \leq c_e$. That is, the flow on an edge cannot be negative, and also cannot carry more than its given capacity.
- Flow conservation constraint: for every $v \in V \setminus \{s, t\}$, we have $\sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e)$. The flow of all the edges going into a vertex must equal the flow that comes out of it. The flow is conserved, and none can be created or destroyed. Only $s$ should have more flow going out than in, and only $t$ should have more flow going in than out.
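The two constraints can be checked mechanically. A minimal sketch, using a representation of my own choosing (edges as `(u, v)` tuples mapped to numbers; the name `is_feasible` is just illustrative):

```python
from collections import defaultdict

def is_feasible(cap, flow, s, t):
    """Check both feasibility constraints for a flow.

    cap:  dict mapping edge (u, v) -> capacity c_e
    flow: dict mapping edge (u, v) -> flow f(e)
    """
    # Capacity constraint: 0 <= f(e) <= c_e on every edge.
    for e, c in cap.items():
        if not (0 <= flow.get(e, 0) <= c):
            return False
    # Flow conservation: net flow is zero at every vertex except s and t.
    balance = defaultdict(int)  # in-flow minus out-flow per vertex
    for (u, v), f in flow.items():
        balance[u] -= f
        balance[v] += f
    return all(balance[v] == 0 for v in balance if v not in (s, t))

cap = {("s", "a"): 3, ("a", "t"): 3}
assert is_feasible(cap, {("s", "a"): 2, ("a", "t"): 2}, "s", "t")
assert not is_feasible(cap, {("s", "a"): 2, ("a", "t"): 1}, "s", "t")  # conservation fails at a
```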
Flow value
Flow value is represented by the notation $|f|$ and equals
$$|f| = \sum_{e \text{ out of } s} f(e) - \sum_{e \text{ into } s} f(e),$$
where $s$ is the source vertex. It’s the sum of the flow from all the edges coming out of $s$. Due to flow conservation, it’s also equivalent to the sum of the flow from all the edges going into $t$.
- We subtract out the flow that is going back into $s$ for completeness’ sake, but if we want to maximize $|f|$, this number should be zero.
- Same idea with the edges leaving $t$, the flow that is leaving the sink vertex. To maximize flow value, ideally we want our map to assign these edges zero flow.
The proof for this is noticing that for every “internal” vertex $v \neq s, t$, the flow on edges going into $v$ minus the flow on edges leaving $v$ is zero. Then we’re just left with the flow coming in and going out of $s$ and $t$, and under flow conservation these two must be equal.
Again, in our flow ideally we want $f(e)$ to be zero for all edges $e$ into $s$ and all edges $e$ out of $t$.
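The definition of $|f|$ translates directly to code. A small sketch with the same assumed edge-dict representation (the name `flow_value` is my own):

```python
def flow_value(flow, s):
    """|f|: total flow leaving the source minus any flow returning to it."""
    out_f = sum(f for (u, _), f in flow.items() if u == s)
    in_f = sum(f for (_, v), f in flow.items() if v == s)
    return out_f - in_f

# A unit of flow looping back into s cancels out of |f|.
assert flow_value({("s", "a"): 4, ("a", "s"): 1, ("a", "t"): 3}, "s") == 3
```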
Simple example: zero flow
Let $G$ be a directed graph with capacities $c_e$. Let $f$ be the flow such that we have $f(e) = 0$ for every edge $e$. We would like to find a path $P$ from $s$ to $t$ in $G$. Let $\delta$ equal the smallest capacity on the given path $P$. That is, we have
$$\delta = \min_{e \in P} c_e,$$
where $\delta > 0$. Then, the flow on all edges other than those of $P$ will remain zero, and we can construct a new flow $f'$ where $f'(e) = \delta$ for every $e \in P$. Furthermore, $|f'| = \delta$.
Flow algebra: lemma
When can we push flow along a particular path? We want to try adding additional flow to a preexisting feasible flow $f$ such that we increase the total amount we can send.
Let $G = (V, E)$ be a directed graph with capacities $c_e$, and we have a current feasible flow $f$. If there exists a path $P$ from $s$ to $t$ such that for each edge $e \in P$, we have two cases:
- $e \in E$, and $f(e) < c_e$. We can send more flow along that path, because we have a strictly positive amount $c_e - f(e)$ that can additionally be sent along $e$.
- $e \notin E$, where the reverse directed edge $\overleftarrow{e} \in E$, and $f(\overleftarrow{e}) > 0$. $e$ does not actually exist in $G$, only $\overleftarrow{e}$ does. We say that the original flow sent along $\overleftarrow{e}$ is $f(\overleftarrow{e})$, and the fact that $f(\overleftarrow{e}) > 0$ means that we’re changing our mind about our previously sent flow, and instead we’re choosing to reset it by reversing it.
We can now calculate a $\delta$ such that
$$\delta = \min_{e \in P} \begin{cases} c_e - f(e) & \text{if } e \in E \\ f(\overleftarrow{e}) & \text{if } \overleftarrow{e} \in E \end{cases}$$
This equals the maximum additional amount that we can send on this particular path $P$, given our current flow map $f$.
Consider the new flow $f'$: for each $e \in P$ with $e \in E$, set $f'(e) = f(e) + \delta$; for each $e \in P$ with $\overleftarrow{e} \in E$, set $f'(\overleftarrow{e}) = f(\overleftarrow{e}) - \delta$; leave all other edges unchanged.
If all of the above is true, and such a path $P$ exists, then we can construct a new flow $f'$ that remains feasible, and $|f'| = |f| + \delta$.
- If such a path exists, this means we can construct a new $f'$ with additional flow $\delta > 0$.
- Naturally leads to an algorithm that keeps trying to find such a path and updating the flow until it does not exist anymore.
Residual graph
We have a directed graph $G = (V, E)$ with capacities $c_e$. We also have a preexisting flow $f$ that is feasible.
Then, let $G_f$ be the residual graph, where the vertex set is identical, and the edges consist of the two cases from before. We either have
- $e \in E$ with $f(e) < c_e$, then we set the residual capacity $c_f(e) = c_e - f(e)$.
- $e \notin E$ such that $\overleftarrow{e} \in E$ with $f(\overleftarrow{e}) > 0$, then we set $c_f(e) = f(\overleftarrow{e})$.
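Building $G_f$ is a single pass over the edges. A sketch under the same assumed edge-dict representation, further assuming the graph has no pair of opposite edges (so the two cases never collide on the same residual edge):

```python
def residual_capacities(cap, flow):
    """Residual graph edges as a dict e -> c_f(e)."""
    cf = {}
    for (u, v), c in cap.items():
        f = flow.get((u, v), 0)
        if f < c:
            cf[(u, v)] = c - f   # case 1: leftover forward capacity
        if f > 0:
            cf[(v, u)] = f       # case 2: reverse edge to undo sent flow
    return cf

# One edge with 2 of 3 units used: 1 unit forward, 2 units reversible.
assert residual_capacities({("s", "a"): 3}, {("s", "a"): 2}) == {("s", "a"): 1, ("a", "s"): 2}
```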
Ford-Fulkerson algorithm
- Initialize $f(e) = 0$ for every edge $e \in E$.
- Write residual graph $G_f$ using $f$.
- While BFS finds a path $P$ from $s$ to $t$ in $G_f$ with non-zero $\delta$, we keep pushing flow and updating $f$. Then, update $G_f$ based on the new $f$.
- When the while loop ends, we return $f$.
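These steps can be sketched end to end. This is a minimal illustration, not a tuned implementation: it assumes the edge-dict representation from before, no opposite-edge pairs, and integer capacities; the function name is my own.

```python
from collections import deque

def ford_fulkerson(cap, s, t):
    """Repeat: build G_f, BFS for an s-t path, push delta along it."""
    flow = {e: 0 for e in cap}
    while True:
        # Residual capacities (case 1: forward leftovers, case 2: undo edges).
        cf, adj = {}, {}
        for (u, v), c in cap.items():
            if flow[(u, v)] < c:
                cf[(u, v)] = c - flow[(u, v)]
            if flow[(u, v)] > 0:
                cf[(v, u)] = flow[(u, v)]
        for (u, v) in cf:
            adj.setdefault(u, []).append(v)
        # BFS from s in G_f, recording parents to recover a path.
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path left: f is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(cf[e] for e in path)  # bottleneck residual capacity
        for (u, v) in path:
            if (u, v) in flow:
                flow[(u, v)] += delta  # forward edge: add delta
            else:
                flow[(v, u)] -= delta  # reverse edge: cancel delta

f = ford_fulkerson({("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 3}, "s", "t")
assert sum(v for (u, _), v in f.items() if u == "s") == 6  # max flow is 6
```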
Running time
We start with a flow where everything is zero. Since we define the capacities of the edges to be integers, the flow on every edge is also an integer, and on each run of BFS we will increase the flow value by at least 1 (because we look for a path where $\delta \geq 1$). So the worst case runtime is governed by how many iterations it takes to reach the maximum flow of the graph.
We can see $|f|$ as a “measure of progress” for our algorithm. We want to keep increasing $|f|$ until we reach the maximum flow. We multiply the number of iterations needed by the time we take within each iteration.
Running BFS takes $O(n + m)$, and creating the residual graph takes $O(m)$ because we simply update the flow for each edge in the graph.
Let the max flow be $F$. Let $C$ be the max capacity of any edge in the graph. At most, there are $n - 1$ edges leaving the source vertex $s$, so the size of the min cut, and hence $F$, is at most $nC$. Computing $\delta$ along a path also takes $O(n)$ time.
Therefore, the final runtime is given by $O(F \cdot (n + m)) = O(nmC)$.
Showing the runtime
An $s$-$t$ cut in $G$ is a set $S \subseteq V$ such that source $s \in S$ and sink $t \notin S$. Consider the set of edges $(u, v)$ where $u \in S$ and $v \notin S$; that is, all edges that cross out of $S$. Let’s also call this edge set $S$ (abuse of notation). Define the capacity of $S$ to be
$$c(S) = \sum_{e \text{ out of } S} c_e.$$
We can then make this claim: for any feasible flow $f$ and every $s$-$t$ cut $S$, it must be true that $|f| \leq c(S)$. The proof uses the definition of $|f|$.
By flow conservation, $|f| = \sum_{v \in S} \left( \sum_{e \text{ out of } v} f(e) - \sum_{e \text{ into } v} f(e) \right)$, since every term with $v \neq s$ is zero. For each edge $e = (u, v)$, we have four cases for its contribution to this sum:
- If $u, v \in S$, then this edge contributes zero, because both vertices are in $S$ and its flow appears once positively and once negatively.
- If $u, v \notin S$, then this edge also contributes zero, because both vertices are not in $S$.
- If $u \in S$ and $v \notin S$, the flow contributed is $+f(e)$.
- If $u \notin S$ and $v \in S$, then the flow is “backwards” and this contributes $-f(e)$.
Therefore, $|f|$ is equal to $\sum_{e \text{ out of } S} f(e) - \sum_{e \text{ into } S} f(e)$, and is upper bounded by $c(S)$: the first sum is at most the capacity of the edges leaving $S$, and we only subtract off the (non-negative) flow of edges that are directed into $S$.
Max-flow min-cut theorem
Then, our final claim: suppose $f$ is feasible and there exists no $s$-$t$ path in $G_f$. Then, this implies there exists an $s$-$t$ cut $S$ such that $|f| = c(S)$. This means $|f|$ is now equal to the maximum flow. Then, $S$ is also called the minimum cut.
Another way of saying it: we run Ford-Fulkerson until we create a residual graph $G_f$ that disconnects source $s$ and sink $t$. At this point, we’ll have created a min cut consisting of vertices $S$, with $s \in S$ and $t \notin S$.
Which cut do we analyze? Let $S = \{v \in V : v \text{ is reachable from } s \text{ in } G_f\}$. This is a valid $s$-$t$ cut because it’s impossible that $t \in S$, otherwise the algorithm would not have terminated (since this implies the existence of an $s$-$t$ path in $G_f$).
We want to show $|f| = c(S)$.
Consider an edge $e = (u, v)$ with $u \in S$ and $v \notin S$. It must be true that for all such edges that cross the cut, $f(e) = c_e$; otherwise, $e$ would appear in $G_f$ and $v$ would have been in $S$. Then $\sum_{e \text{ out of } S} f(e) = c(S)$. Furthermore, consider the edges $e = (u, v)$ where $u \notin S$ and $v \in S$. These are the edges flowing back into $S$, but it must be true that $f(e) = 0$; otherwise the reverse edge $\overleftarrow{e}$ would appear in $G_f$ and $u$ would have been in $S$. We have
$$|f| = \sum_{e \text{ out of } S} f(e) - \sum_{e \text{ into } S} f(e) = c(S) - 0 = c(S).$$
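The cut in this proof is easy to extract once a max flow is in hand: BFS the residual graph from $s$. A sketch, assuming the edge-dict representation from before and a max flow computed by hand for a tiny graph:

```python
from collections import deque

def min_cut(cap, flow, s):
    """S = set of vertices reachable from s in the residual graph G_f."""
    adj = {}
    for (u, v), c in cap.items():
        if flow.get((u, v), 0) < c:       # leftover forward capacity
            adj.setdefault(u, []).append(v)
        if flow.get((u, v), 0) > 0:       # reverse (undo) edge
            adj.setdefault(v, []).append(u)
    S, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in S:
                S.add(v)
                q.append(v)
    return S

# Hand-computed max flow of value 6: every edge leaving S is saturated.
cap = {("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 3}
flow = {("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 2}
S = min_cut(cap, flow, "s")
assert S == {"s"}
assert sum(c for (u, v), c in cap.items() if u in S and v not in S) == 6  # c(S) = |f|
```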
Input size causes exponential runtime
This algorithm depends on the maximum capacity $C$ of the edges. If $C = 2^n$, for example, the runtime becomes exponential.
- Assume we use a matrix to represent our graph. Then we need $n^2$ entries to represent all possible edges.
- To represent the capacity of each edge, this takes $O(\log C)$ bits.
- So, considering space requirements as well, the input size is $O(n^2 \log C)$ bits.
So this algorithm is exponential with respect to the encoding of the input: the runtime contains a factor of $C = 2^{\log C}$, which is exponential in the number of bits needed to write down $C$.
Capacity scaling
Algorithm
- Initialize the flow $f(e) = 0$ for every edge and $\Delta = C$, the maximum edge capacity (or the largest power of two at most $C$).
- Repeat until stuck: create the residual graph $G_f(\Delta)$, but only include edges where the residual capacity is at least $\Delta$. That is, $c_f(e) \geq \Delta$. Then, run BFS and push flow in $G_f(\Delta)$.
- When stuck, if $\Delta = 1$, then we finish and return $f$. Otherwise, we update $\Delta \leftarrow \Delta / 2$, and run step 2 again.
Step 3 is necessary because $s$ and $t$ may be connected by edges whose residual capacity is less than the current $\Delta$, so we scale $\Delta$ down over time to catch those too.
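The three steps above can be sketched as a thresholded variant of Ford-Fulkerson. Same assumptions as before (edge-dict representation, no opposite-edge pairs, integer capacities); the function name is my own:

```python
from collections import deque

def scaling_max_flow(cap, s, t):
    """Ford-Fulkerson, but only augment along residual edges with c_f(e) >= Delta."""
    flow = {e: 0 for e in cap}
    Delta = 1
    while Delta * 2 <= max(cap.values()):
        Delta *= 2  # largest power of two not exceeding C

    def augment(Delta):
        # Build c_f, keep only edges with c_f(e) >= Delta, BFS, push one path.
        cf, adj = {}, {}
        for (u, v), c in cap.items():
            if flow[(u, v)] < c:
                cf[(u, v)] = c - flow[(u, v)]
            if flow[(u, v)] > 0:
                cf[(v, u)] = flow[(u, v)]
        for (u, v), r in cf.items():
            if r >= Delta:
                adj.setdefault(u, []).append(v)
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return False  # stuck at this Delta
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        d = min(cf[e] for e in path)
        for (u, v) in path:
            if (u, v) in flow:
                flow[(u, v)] += d
            else:
                flow[(v, u)] -= d
        return True

    while Delta >= 1:
        while augment(Delta):  # one "stage": push while a Delta-wide path exists
            pass
        Delta //= 2
    return flow

f = scaling_max_flow({("s", "a"): 4, ("a", "t"): 4, ("s", "b"): 2, ("b", "t"): 3}, "s", "t")
assert sum(v for (u, _), v in f.items() if u == "s") == 6
```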
Proof of correctness
This algorithm computes the same answer as Ford-Fulkerson, because we will eventually consider every single edge as we decrease $\Delta$: when $\Delta = 1$, $G_f(\Delta)$ is exactly $G_f$. Then, correctness follows from the correctness of Ford-Fulkerson.
Runtime
The biggest difference here is that instead of updating flow by at least 1, here we update flow by at least $\Delta$ for each iteration. This dramatically improves the runtime.
How many times do we update $\Delta$ at most? Since we divide by 2 every time, starting from $C$, we have at most $O(\log C)$ of these “stages.”
Then the runtime is bounded by $O(\log C)$ times the number of iterations for each stage of $\Delta$ times $O(n + m)$, which is the runtime for BFS and writing $G_f(\Delta)$.
Example: when $\Delta = C$
How many iterations are there? We know the maximum flow is bounded by $nC$, and each iteration, we update flow by at least $C$. Therefore, the number of iterations has to be at most $n$, because otherwise we would surpass the max flow.
Furthermore, the amount of flow added is at least the number of iterations times $\Delta$. We have the following relationship:
Let the true maximum flow be represented by $f^*$. We claim that
$$|f^*| - |f| \leq 2m\Delta,$$
where $|f|$ is the current flow value at the start of a given stage of $\Delta$ and $m$ is the number of edges. Then, whenever we add additional flow to our current flow, the result must be at most the max flow. We have
$$|f| + (\text{number of iterations}) \cdot \Delta \leq |f^*|.$$
Subtract $|f|$ from both sides, and the left-hand side is bounded by the inequality from above. Substituting that in and multiplying by $\frac{1}{\Delta}$ on both sides gives us
$$\text{number of iterations} \leq 2m.$$
Then, our final runtime is made up of three parts.
- The runtime for BFS and creating $G_f(\Delta)$, both given by $O(n + m)$.
- The number of iterations at each stage of $\Delta$, which we just proved to always be $O(m)$.
- The number of stages, which we showed to be $O(\log C)$.
Multiply all of these together, and we get the runtime of $O(m^2 \log C)$. This runtime is polynomial with respect to the size of the input encoding.
Edmonds-Karp algorithm
- Initialize flow $f(e) = 0$ for all edges.
- Write the residual graph $G_f$ and, if there exists an $s$-$t$ path in $G_f$, pick the shortest $s$-$t$ path, defined as the path with the minimum number of edges.
- Push flow on that path, like in FF.
- Repeat until no such path exists in $G_f$ anymore; return $f$ as the final answer.
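Since BFS explores $G_f$ layer by layer, the first $s$-$t$ path it discovers already has the fewest hops, so a BFS-based augmenting loop implements these steps directly. A self-contained sketch under the same assumptions as before (edge-dict representation, no opposite-edge pairs; `edmonds_karp` is my own name):

```python
from collections import deque

def edmonds_karp(cap, s, t):
    """Max flow by always augmenting along a fewest-hops s-t path."""
    flow = {e: 0 for e in cap}
    while True:
        # Residual capacities: forward leftovers plus reverse undo edges.
        cf, adj = {}, {}
        for (u, v), c in cap.items():
            if flow[(u, v)] < c:
                cf[(u, v)] = c - flow[(u, v)]
            if flow[(u, v)] > 0:
                cf[(v, u)] = flow[(u, v)]
        for (u, v) in cf:
            adj.setdefault(u, []).append(v)
        # BFS by layers: the first path to reach t has the minimum hop count d.
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        d = min(cf[e] for e in path)  # saturates at least one edge on the path
        for (u, v) in path:
            if (u, v) in flow:
                flow[(u, v)] += d
            else:
                flow[(v, u)] -= d

cap = {("s", "a"): 10, ("s", "b"): 10, ("a", "b"): 2, ("a", "t"): 4, ("b", "t"): 9}
f = edmonds_karp(cap, "s", "t")
assert sum(v for (u, _), v in f.items() if u == "s") == 13  # max flow is 13
```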
Runtime
As before, the algorithm stops when no $s$-$t$ path exists in $G_f$ anymore. Here, the number of “hops” (number of edges) on the $s$-$t$ path we choose to push flow on is important. Define $d$ as the number of hops on the shortest $s$-$t$ path in $G_f$.
Divide the execution into stages according to $d$:
- Consider the BFS tree rooted at $s$.
- Edges in $G_f$ that are not in the tree are not relevant to the argument, because they either go between vertices in the same layer (a shortest path would never use them), or they’re not used at all.
- Whenever $d$ remains the same, we remain in the same “stage,” and we only move to the next stage when $d$ updates.
We have two claims:
- As the algorithm executes, $d$ never decreases.
- There can be at most $m$ iterations of pushing flow before $d$ must increase (by at least 1).
We showed this visually with an example in class, but while $d$ stays the same, we only push flow along “important” edges, that is, edges that exist in the BFS tree for this stage of $d$.
- Every time we push flow on a path, we saturate at least one edge (the one attaining the minimum $\delta$).
- This edge then becomes a back edge in the residual graph, which effectively removes it from consideration for the shortest $s$-$t$ path.
- We can remove at most $m$ edges before the graph is empty and there is no possibility of $s$ and $t$ being connected.
- This proves that there are at most $m$ iterations of pushing flow before $d$ must increase.
- When $d$ increases, at least one new edge is introduced as “important” and we have to redraw the BFS tree.
Final calculation
At most the shortest path will take $n - 1$ hops, and in the worst case $d$ increases by only 1 at a time.
- Here, $d$ is the measurement of progress, counting off the stages.
- There are at most $n$ stages, each stage has at most $m$ iterations, and we run BFS for each iteration, which takes $O(n + m)$ time.
- We have a final runtime of $O(nm^2)$.