\documentclass[12pt]{article}

\usepackage[pdftex]{graphicx}
\usepackage{amsmath,amssymb,url}
\usepackage{bbm}

\newfont{\bssdoz}{cmssbx10 scaled 1200}
\newfont{\bssten}{cmssbx10}


\oddsidemargin  0in \evensidemargin 0in \topmargin -0.5in
\headheight 0.25in \headsep 0.25in
\textwidth   6.5in \textheight 9in %\marginparsep 0pt \marginparwidth 0pt
\parskip 1.5ex  \parindent 0ex \footskip 20pt

\begin{document}
\parindent 0ex
\thispagestyle{empty} \vspace*{-0.75in}
{\bssdoz CME 305: Discrete Mathematics and Algorithms}
{\\ \bssten Instructor: Professor Amin Saberi (saberi@stanford.edu) }
{\\ \bssten HW\#4 --  Due 3/16/12}
\vspace{5pt}

\begin{enumerate}

\item
Given a connected, undirected graph $G = (V,E)$ and a set of terminals
$S=\{s_1, s_2,\ldots,s_k\} \subseteq V$, a multiway cut is a set of edges whose 
removal disconnects the terminals from each other. The multiway cut problem asks for the minimum weight such set.
The problem of finding a minimum weight multiway cut is NP-hard for any fixed $k \geq 3$.
Observe that the case $k = 2$ is precisely the minimum $s-t$ cut problem.

Define an \textit{isolating cut} for $s_i$ to be a set of edges whose removal disconnects $s_i$ from the rest of the terminals. Consider the following algorithm
\begin{itemize}
	\item For each $i = 1,\ldots, k$, compute a minimum weight isolating cut for $s_i$, say $C_i$.
	\item Discard the heaviest of these cuts, and output the union of the rest, say C.
\end{itemize}

Each computation in Step 1 can be accomplished by identifying the $terminals$ in $S - \{s_i\}$ into a single node, and finding a minimum cut separating this node from $s_i$; this takes one max-flow computation. Clearly, removing $C$ from the graph disconnects every pair of terminals, and so is a multiway cut.

\begin{enumerate}
\item Prove that this algorithm achieves a $2-2/k$ approximation.

\textbf{Solution.} This questions is from Vijay Vazirani's approximation algorithms book. Let $A$ be an optimal multiway cut in $G$. We can view $A$ as the union of $k$ cuts as follows: The removal of $A$ from $G$ will create $k$ connected components, each having one terminal (since $A$ is a minimum weight multiway cut, no more than $k$ components will be created). Let $A_i$ be the cut separating the component containing $s_i$ from the rest of the graph. Then $A = \cup_{i=1}^k A_i$.Since each edge of $A$ is incident at two of these components, each edge will be in two of the cuts $A_i$. Hence,

$$\sum_{i=1}^k w(A_i) = 2w(A)$$

Clearly, $A_i$ is an isolating cut for $s_i$. Since $C_i$ is a minimum weight isolating cut for $s_i$, $w(C_i) \leq w(A_i)$. Notice that this already gives a factor 2 algorithm, by taking the union of all $k$ cuts $C_i$. Finally, since $C$ is obtained by discarding the heaviest of the cuts $C_i$,

$$w(C) \leq \left(1-1/k\right) \sum_{i=1}^k w(C_i) \leq   \left(1-1/k\right) \sum_{i=1}^k w(A_i) = 2 \left(1-1/k\right)w(A) $$



\item Prove that this analysis is tight by providing an example graph where the approximation bound is exactly achieved.

\textbf{Solution}
Consider for $k=4$
\begin{center}
\includegraphics[width=4cm]{q1example.png}
\end{center}

For each terminal $s_i$, the minimum weight isolating cuts for $s_i$ is given by the edge incident to $s_i$. So, the cut $C$ returned by the algorithm has weight $(k -1)(2 -\epsilon)$. On the other hand, the optimal multiway cut is given by the cycle edges, and has weight $k$.

\end{enumerate}

\item Consider variants on the metric TSP problem in which the object is to find a simple path containing all the vertices of the graph. Three different problems arise, depending on the number (0, 1, or 2) of endpoints of the path that are specified. Obtain the following approximation algorithms.
\begin{enumerate}
\item  If zero or one endpoints are specified, obtain a 3/2 factor algorithm. 

\textbf{Solution.} We prove the 1 endpoint case, and the zero endpoints will follow by arbitrarily selecting an endpoint. Call the single endpoint $v$.

Construct a minimum spanning tree $T$ of $G$. Determine the set of odd degree vertices, call it $S$.
There will be an even number of them, i.e. $|S|$ = even. $v$ may or may not be in $S$. If $v$ is in $S$, take it out, along with 
another arbitrary vertex from $S$. If $v$ is not in $S$, don't do anything.

Find the minimum matching $M$ on $S$, and add it to $T$, call the result $G'$.
Notice that in $G'$,  all nodes have even degree, except for $v$ and some other node. Thus we can
find an eulerian path and shortcut it to obtain a hamiltonian path. It remains to bound the weight of $G'$.

Notice that $M$ is a matching on at most $n-1 \geq |S|$ nodes, with this 
bound being tight only when $n$ is odd and $v \notin S$, otherwise the bound is strict.
Also notice that the optimal hamiltonian path $P^*$ contains 
two edge-disjoint matchings on $n-1$ and $n-2$ nodes (just alternate edges along the path).
Thus the weight of a minimum matching on $S$ is at most half the weight of $P^*$, the optimal path.

Since $P^*$ is a spanning tree, we also have the weight of $T$ is less than $P^*$. Thus the union of the path and matching
have at most $3/2$ the weight of $P^*$.


\item If both endpoints are specified, obtain a 5/3 factor algorithm.

\textbf{Solution}
Construct a minimum spanning tree $T$ of $G$. Determine the set of odd degree vertices, call it $S$.
There will be an even number of them, i.e. $|S|$ = even. Call the two prescribed endpoints $s$ and $t$.
If $s$ or $t$ have wrong degree (even) add them to $S$. Run a perfect matching on $S$ and add the result to $T$, which can
then be shortcutted to return a hamiltonian path. It remains to prove the approximation factor.

First we prove a lemma. We prove that 
the  multiset $Q$,  containing  the  edges 
belonging  to  the  minimum spanning  tree  $T$ plus   the 
optimal Hamiltonian path with two fixed  endpoints $s$ 
and  $t$,  can  be partitioned  into  three  disjoint subsets 
$E_1, E_2$ and $E_3$,  each yielding a perfect   matching  on 
the set of  odd-degree  vertices in  $T$  after shortcutting. 

Proof of lemma: every vertex of wrong degree in $T$
has odd degree in $Q$, so $S$ contains all vertices of odd
degree in $Q$. Let the number of vertices in $S$ be equal to $2m$.
Renumber the vertices according their order of occurrence in $P_{st}^*$. After
renumbering, $P_{st}^*$ consists of the edges $(s,1), (1,2), \ldots, (n-2, t)$, where $n$
is the number of vertices.
The first subset, $E_1$, contains the edges between
the $(2k-1)$-st and the $2k$-th vertex in $S$ for $k=1 \ldots m$
(so if $s$ and $2$ are the first two vertices in $S$, then $E_1$ contains
the edges $(s,1)$ and $(1,2)$. The first perfect matching on the vertices
in $S$ is obtained by applying shortcuts to $E_1$.

Consider the graph defined by the vertex set $\{s,1,\ldots,n-2,t\}$ and the edge set
$Q-E$. Thus the graph is connected (is still contains $T$) and all of its vertices
have even degree due to removal of $E_1$. Thus it contains an eulerian cycle 
which can be partitioned into the subsets $E_2$ and $E_3$ by shorcutting to a regular
cycle and taking alternate edges. Thus we have proved the lemma.

Now with this lemma, as $P_{st}^*$ is a tree, we have
$c(T) \leq c(P_{st}^*)$. The lemma above asserts
that the multiset containing edges belonging to $T$ and $P_{st}^*$ can be split
in three edge disjoint subsets each yielding a perfect matching after shortcutting.
Hence $c(M) \leq 2/3 c(P_{st}^*)$. Thus
$$P_{st} \leq c(T) + c(M) \leq 5/3 c(P_{st}^*)$$

\end{enumerate}
\textbf{Hint.} Consider modifying Christofides algorithm for metric TSP.

\item Recall the \emph{minimum vertex cover} problem: given a graph
$G(V,E)$ find a subset $S^* \subseteq V$ with minimum cardinality such
that every edge in $E$ has at least one endpoint in $S^*$.
\begin{enumerate}
\item Consider the following greedy algorithm. Find the highest degree
vertex, add it to the vertex cover $S$ and remove it along with all
incident edges. Repeat iteratively. Prove that this algorithm has an
unbounded approximation factor i.e.  for any $c$ there exists an input
graph $G$ such that $|S|$ $\geq c$ OPT.

\item Prove that when the graph $G$ is bipartite we can find the minimum
vertex cover in polynomial time. State your algorithm.

\end{enumerate}

\textbf{Solution.}

\begin{enumerate}

\item Consider a bipartite graph with partition $(A,B)$. Let $|A| = n$
and partition $B$ into $n$ disjoint sets $\{B_i\}_1^n$ with $|B_i| =
\lfloor n/i \rfloor$.  Then $|B| = n + \lfloor n/2 \rfloor + \lfloor n/3
\rfloor + \cdots + 1 = O(n \log n)$. For each vertex $a \in A$ place an
edge to a vertex $b \in B_k$ such that each vertex in $B_k$ has degree
at least $k$. Repeat this for $k = 1,2,\dots,n$ and note that this
construction is possible since  each $|B_k| \leq n/k$. 

Clearly, OPT for this graph is just $|A| = n$. But the algorithm will
(depending on how degree ties are broken) remove all vertices in $B$.
The approximation ratio is ALG/OPT $=O(n \log(n))/n = O(\log(n))$ Since
the ratio depends on $n$ it is unbounded.


\item Let $(A,B)$ be the bipartite partitioning of the graph and $M$ be
the maximum matching. The following definition is useful. A path $P$ is
called an {\bf alternating path} $e_1, e_2, \dots, e_k$ if and only if
$e_i \in M$ implies $e_{i+1} \in E \backslash M$.

Let $U$ be the set of all unmatched vertices in $A$. Let $R$ be the set
of all vertices reachable by from $U$ by alternating paths (note that $U
\subseteq R$). Let $S = (A - R) \cup (B \cap R)$.  First we claim that
$S$ is a vertex cover. Indeed, suppose $(a,b) \in E$ with $a \in A$ and
$b \in B$ is not covered by any vertex in $S$. Then $a \notin A-R$ so $a
\in R$ and hence $b \in R$. Thus by definition of $S$ we must have $b
\in S$, contradiction. So $S$ is a vertex cover.

Next we claim that $|S| = |M|$. Note that each vertex in $A - R$ is an
endpoint of an edge in $M$. We can make the same claim for $B \cap R$
since suppose $b \in B \cap R$ is unmatched. Then there exists an
alternating path $(u,b_1),e_2,e_3,\dots,e_k,(a,b)$ with $u$ and $b$
unmatched. This contradicts the fact that $M$ is maximum since we can
increase its size by $1$ by removing $e_2,e_4, e_6 \dots, e_k$ from the
matching and adding $(u,b_1),e_3,e_5, \dots,(a,b)$ to the matching.
Therefore must have that each vertex of $S$ is an endpoint of an edge in
$M$. It is also easy to see by construction of $S$ that no edge $e \in
M$ has both endpoints in $S$. Thus $|S| = |M|$ as claimed.

Finally, recall from lecture that for any vertex cover $C$ we have that
$|M| \leq |C|$.  This implies that $S$ is the minimum vertex cover.
Finding $S$ takes polynomial time since we can find the maximum matching
in polynomial time (use network flow) and $R$ can be found in time
$O(|E|)$.

\end{enumerate}


\item Consider scheduling $n$ jobs to $m$ identical machines to minimize the time taken by the machine
with the heaviest load (i.e. to minimize the makespan). One algorithm we saw in class
was to order the jobs by decreasing processing times $t_1 \geq t_2 \geq \ldots \geq t_n$, then greedily assign jobs
to the machine whose load is the smallest so far (starting with the heaviest job).
\begin{enumerate}
\item Show that this algorithm is a $4/3$ approximation to the optimal makespan.
\item Show that the $4/3$ approximation is tight for this algorithm.
\end{enumerate}
\textbf{Solution} Will be in lecture notes.

\item Give a $O(\log n)$ factor approximation to the \emph{Asymmetric
Travelling Salesman Problem}. ATSP: We are given a set of points $V$ and
a cost function $c \, : \, V \times V \to [0,\infty)$ which defines a
cost of travel from point $u$ to point $v$ as $c(u,v)$. Note, here we
have asymmetry i.e. in general $c(u,v) \neq c(v,u)$. Assume further that
the triangle inequality is satisfied i.e. $c(u,v) \leq c(u,w) + c(w,v)$
for all $u,v,w \in V$.  ATSP asks to find a minimum cost cycle which
visits each vertex exactly once.

{\bf Hint:} Use the fact that a minimum cost cycle cover (i.e. set of
disjoint cycles covering all the vertices) can be found in polynomial
time. Shrink the cycles and recurse.

\textbf{Solution}
Note that the triangle inequality allows us the following. Any walk on
$G$ that visited each vertex at least once may be modified into a cycle
which visits each vertex exactly once and is of smaller cost than the walk. 
So w.l.o.g. we may look for a walk that is a $O(\log n)$ approximation.

The algorithm is as follows. On iteration $k$ find the min-cost cycle cover
of graph $G^k$ and shrink the cycles into a new set of vertices keeping all 
edges leaving or entering each cycle to output $G^{k+1}$. We start with
$G^0 = G$ and repeat for $K$ iterations until $G^K$ is a single vertex. 
The walk that visits each vertex is easily reconstructed by traversing
the cycles recursively.
Note that the min-cost cycle cover can be found in polynomial time 
by combining the result of Problem $5$ on Homework $2$ with along with
the fact that we can find minimum weight perfect matching as described in
Lecture $9$. Thus the algorithm running time is polynomial.

First note that $K$ is at most $\log n$ since on each iteration we 
reduce the number of vertices by at least half (worst case is all
$2$ cycles and $\log n$ is number of times we divide $n$ by $2$).
Letting OPT$(G)$ be the cost of the optimal tour on $G$ and $C(G)$
be the cost of the min-cost cycle cover of $G$ note that
$C(G^k) \leq $ OPT$(G^k) \leq $ OPT$(G)$. The first inequality 
follows from the fact that the optimal tour is a cycle cover and
the second from the fact that the edges of $G^k$ are a subset of 
the edges of $G$.

The above results imply the cost of the tour found by the algorithm
is bounded by
\[ \sum_{k=1}^{K} C(G^k) \leq \log(n) \text{OPT}(G) \] 
which yields the $O(\log n)$ approximation ratio.


\item {\bf (Kleinberg \& Tardos 13.1)} \emph{3-Coloring} is a yes/no
question, but we can phrase it as an optimization problem as follows.

Suppose we are given a graph $G(V,E)$, and we want to color each node
with one of three colors, even if we aren't necessarily able to give
different colors to every pair of adjacent node.  Rather, we say that
an edge $(u,v)$ is \emph{satisfied} if the colors assigned to $u$ and
$v$ are different.

Consider a 3-coloring that maximizes the number of satisfied edges,
and let $c^*$ denote this number.  Give a polynomial-time algorithm
that produces a 3-coloring that satisfies at least $\frac{2}{3}c^*$
edges.  If you want, your algorithm can be randomized; in this case,
the \emph{expected} number of edges it satisfies should be at least
$\frac{2}{3}c^*$.
\textbf{Solution}
Consider the simplest possible algorithm. For each uncolored vertex pick a 
color uniformly at random and color the vertex with this color. Its not hard 
to see that $\mathbbm{P}(e \mbox{ is satisfied}) = 2/3$. Let $X_e$ be the indicator 
random variable of $e$ being satisfied. Then $X = \sum_{e \in E} X_e$ is the number
of satisfied edges. By linearity of expectation
\[ \mathbbm{E}(X) =  \sum_{e \in E} \mathbbm{E}(X_e) = \sum_{e \in E}
\mathbbm{P}(e \mbox{ is satisfied}) = \frac{2}{3}|E|\,. \]
Since $|E| \geq c^*$ we have $|ALG| \geq 2/3 c^*$ in expectation.


\end{enumerate}

\end{document}
