\documentclass[twoside]{article}

\usepackage{amssymb, amsmath}
\usepackage{mathrsfs}

\oddsidemargin  0in \evensidemargin 0in \topmargin -0.5in
\headheight 0.2in \headsep 0.2in
\textwidth   6.5in \textheight 9in 
\parskip 1.5ex  \parindent 0ex \footskip 40pt


\begin{document}

\framebox[6.4in]{
\begin{minipage}{6.4in}
  \vspace{1mm}
  \center \makebox[6.2in]{{\bf CS369M: Algorithms for Modern Massive Data Set Analysis \hfill Lecture 10 - 10/26/2009}} 
  \vspace{2mm} \\
  \center \makebox[6.2in]{{\Large Expander Graphs in Algorithms Theory and in Data Applications  }} 
  \vspace{1mm} \\
  \center \makebox[6.2in]{{\it Lecturer: Michael Mahoney \hfill Scribes: Erik Goldman and Richa Bhayani}}
  \vspace{1mm}
\end{minipage}
} \vspace{2mm} \\
\mbox{{ \it *Unedited Notes}}

\section{ Expanders}

\subsection{Introduction}
Intuition: expanders reduce the need for randomness.

Definition: $(S, d)$ is a metric space if $d: S \times S \rightarrow \mathbb{R}^+$ satisfies $d(x,y) \geq 0$, $d(x,y) = 0$ iff $x = y$, $d(x,y) = d(y,x)$, and $d(x,y) \leq d(x, z) + d(z, y)$.

Distances: these can be given by a Gram matrix or a kernel, which allows algorithms to work in an infinite-dimensional space.

Can we replace $d$ with something ``nicer'' while approximately preserving distances?  If so, use a function $d'$ that is faster to compute but introduces a little error.

Given $(X, d) = (\mathbb{R}^n, L_2)$ and a mapping $f: X \rightarrow \mathbb{R}$:

expansion($f$) = $\max_{x_1 \neq x_2 \in X} \frac{\| f(x_1) - f(x_2) \|_2}{d(x_1, x_2)}$

contraction($f$) = $\max_{x_1 \neq x_2 \in X} \frac{d(x_1, x_2)}{\| f(x_1) - f(x_2) \|_2}$

distortion($f$) = expansion($f$) $\cdot$ contraction($f$)
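
As a quick illustration of these three quantities (an example that is not from the lecture), assume a hypothetical scaling map $g(x) = c\,x$ with $c > 0$ from $\mathbb{R}^n$ to itself. Then $\| g(x_1) - g(x_2) \|_2 = c \, d(x_1, x_2)$ for every pair, so
\begin{center}
expansion($g$) $= c$, \quad contraction($g$) $= 1/c$, \quad distortion($g$) $= c \cdot (1/c) = 1$.
\end{center}
Rescaling a map changes its expansion and contraction but not its distortion.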

NB: Johnson-Lindenstrauss says that we can map $x_i \mapsto f(x_i)$ such that every pairwise distance is within a factor of $1 \pm \epsilon$ of the original.
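
In the notation above, a sketch of what this guarantee says (the target dimension $k$ and its bound come from the standard statement of the lemma, not from the lecture): for a finite set of $N$ points in Euclidean space there is a map $f$ into $\mathbb{R}^k$ with $k = O(\log N / \epsilon^2)$ such that, for all pairs $i, j$,
\begin{center}
$(1 - \epsilon) \, \| x_i - x_j \|_2 \leq \| f(x_i) - f(x_j) \|_2 \leq (1 + \epsilon) \, \| x_i - x_j \|_2$.
\end{center}
Hence expansion($f$) $\leq 1 + \epsilon$, contraction($f$) $\leq 1/(1 - \epsilon)$, and distortion($f$) $\leq (1 + \epsilon)/(1 - \epsilon)$.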

Theorem: every $n$-point metric space can be embedded in $L_2$ with distortion $O(\log n)$. \\
Proof idea: given $(X, d)$, map each point $x \mapsto \phi(x)$ in $O(\log^2 n)$ dimensions, with each coordinate equal to the distance from $x$ to a subset $S \subseteq X$, where each $S$ is chosen randomly.

Q1: Is this tight? \\
Q2: $C_2(X, d)$ is $d'$ \\
Q3: Are there metrics that are better? (yes) \\

Theorem [LLR]: There exists a polynomial-time algorithm that, given a metric space $(X, d)$, computes $C_2(X, d)$, where $C_2(X, d)$ is the minimum distortion of any embedding of $(X, d)$ into $L_2$.

Proof: Let $|X| = n$. Without loss of generality, scale $f$ so that contraction($f$) $= 1$.  Then distortion($f$) $\leq \gamma$ iff $d(x_i, x_j)^2 \leq \| f(x_i) - f(x_j) \|_2^2 \leq \gamma^2 d(x_i, x_j)^2$ for all $i, j$.

Let $u_i = f(x_i)$ be the $i^{th}$ row of the embedding matrix $U$.  Let $Z = UU^T$, so that $Z_{ij} = u_i^T u_j$; then $Z$ is positive semidefinite (PSD).

Note: $\| f(x_i) - f(x_j) \|_2^2 = (u_i - u_j)^T (u_i - u_j) = u_i^T u_i + u_j^T u_j - 2u_i^T u_j$
$ = Z_{ii} + Z_{jj} - 2Z_{ij}$

So we can look for a PSD matrix $Z$ s.t. $d(x_i, x_j)^2 \leq Z_{ii} + Z_{jj} - 2Z_{ij} \leq \gamma^2 d(x_i, x_j)^2$ for all $i, j$; that is, $C_2(X, d) \leq \gamma$ iff $\exists Z \in \mathrm{PSD}$ s.t. these inequalities hold.

This can be solved as an optimization problem!
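
A minimal sketch of that optimization problem, assuming the squared form of the distance constraints and writing $t = \gamma^2$ (the symbol $t$ is introduced here only for illustration) so that the objective and constraints are linear in the unknowns $(Z, t)$:
\begin{center}
$\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & d(x_i,x_j)^2 \leq Z_{ii} + Z_{jj} - 2Z_{ij} \leq t \, d(x_i,x_j)^2 \quad \mbox{for all } i < j, \\
 & Z \in \mathrm{PSD}.
\end{array}$
\end{center}
This is a semidefinite program; under this formulation the optimal value $t^*$ would give $C_2(X, d) = \sqrt{t^*}$.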

\subsection {Dual}
Note: By looking at the dual, we get an idea about the lower bound. (This will come up again in graph partitioning.) We have to look at the constraints again to go from the primal to the dual.
Here, $Z \in \mathrm{PSD}$ is the constraint causing a problem.

\subsection{Lemma}
$Z \in \mathrm{PSD} \iff$ for all $Q \in \mathrm{PSD}$, $\sum_{i,j} q_{ij}z_{ij} \geq 0$. \\
Primal:
\renewcommand{\labelenumi}{\Roman{enumi}.}
\begin{enumerate}
    \item $\sum_{i,j} q_{ij}z_{ij} \geq 0$ for all $Q \in \mathrm{PSD}$
    \item $z_{ii} + z_{jj} - 2z_{ij} \geq d(x_i,x_j)^2$ for all $i, j$
    \item $\gamma^2 d(x_i,x_j)^2 \geq z_{ii} + z_{jj} - 2z_{ij}$ for all $i, j$
\end{enumerate}

\subsection{Theorem: (LLR contd.)}
\begin{center}
$\displaystyle C_2(X,d) = \max_{P \in \mathrm{PSD},\ P \cdot \mathbf{1} = 0} \sqrt{\frac{\sum_{ij: P_{ij} > 0} P_{ij} \, d(x_i,x_j)^2}{-\sum_{ij: P_{ij} < 0} P_{ij} \, d(x_i,x_j)^2}}$
\end{center}
Proof:
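Assume $\gamma < C_2(X,d)$, so the primal is infeasible. Look for a nonnegative linear combination of the constraints that yields a contradiction. Each constraint of the first type has the form
 $Q \cdot Z = \sum_{i,j} q_{ij}z_{ij} \geq 0$.
    Note the cone of PSD matrices is convex, so $\sum_k \alpha_k (Q_k \cdot Z) = P \cdot Z$ for some $P \in \mathrm{PSD}$ whenever all $\alpha_k \geq 0$. So, modifying the first constraint from the primal, you get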
\renewcommand{\labelenumi}{\Roman{enumi}'.}
\begin{enumerate}
\item $\sum_{i,j} P_{ij}z_{ij} \geq 0$ for some $P \in \mathrm{PSD}$ \\
Construct $P$: choose its elements so that the $z_{ij}$ terms cancel, i.e.,
if $P_{ij} > 0$ multiply the second constraint from the primal by $P_{ij}/2$, and if $P_{ij} < 0$ multiply the third constraint from the primal by $-P_{ij}/2$. \\
Therefore, you can modify the other constraints from the primal to be:
\item $\sum_{ij: P_{ij}>0} \frac{P_{ij}}{2} (z_{ii} + z_{jj} - 2z_{ij}) \geq \sum_{ij: P_{ij}>0} \frac{P_{ij}}{2} d(x_i,x_j)^{2}$
\item $\sum_{ij: P_{ij}<0} \frac{P_{ij}}{2} (z_{ii} + z_{jj} - 2z_{ij}) \geq \sum_{ij: P_{ij}<0} \frac{P_{ij}}{2} \gamma^2 d(x_i,x_j)^2$
\end{enumerate}

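Adding the second and third constraints above, and using the first constraint together with $P \cdot \mathbf{1} = 0$ (so that the $z$ terms on the left-hand side sum to $-\sum_{i,j} P_{ij} z_{ij} \leq 0$),
\begin{center}
$ 0 \geq \sum_{ij: P_{ij} > 0} P_{ij} \, d(x_i,x_j)^{2} + \gamma^2 \sum_{ij: P_{ij} < 0} P_{ij} \, d(x_i,x_j)^2 $
\end{center}
This is false if you choose $\gamma$ to be small: it fails whenever $\gamma^2 <$ top/bottom, where top and bottom are the numerator and denominator under the square root in the theorem.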
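
As a sanity check of the formula (a worked example that is not from the lecture, with the matrix $P$ chosen by hand): let $X = \{x_1, x_2, x_3, x_4\}$ be the vertices of a 4-cycle with the shortest-path metric, so adjacent points are at distance 1 and opposite points are at distance 2. Take $P = vv^T$ with $v = (1, -1, 1, -1)^T$, so that $P_{ij} = (-1)^{i+j}$; then $P$ is PSD and $P \cdot \mathbf{1} = v (v^T \mathbf{1}) = 0$. The diagonal entries contribute nothing since $d(x_i, x_i) = 0$. The positive off-diagonal entries are the 4 ordered opposite pairs and the negative entries are the 8 ordered adjacent pairs, so
\begin{center}
$\displaystyle \sqrt{\frac{4 \cdot 1 \cdot 2^2}{8 \cdot 1 \cdot 1^2}} = \sqrt{2}$.
\end{center}
Hence the minimum distortion of this metric is at least $\sqrt{2}$. Placing the four points at the corners of a unit square gives expansion 1 and contraction $2/\sqrt{2} = \sqrt{2}$, so the bound is attained for this metric.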

\section{Expanders: Some Q\&A}
\subsection{Why is this bidirectional?}
If it is not, then in one of the assumptions we make (it was not specified which one) you get a value greater than zero instead of less than zero.

\subsection{Why Expanders?}
They are extremal. So, if your algorithm works for expanders, hypercubes, and low-dimensional graphs, you're good.

\subsection{Can I draw a nice picture of an expander?}
No, because an expander has no good cuts.

\section{Expanders: Some definitions}
From now on, unless stated otherwise, a graph $G=(V,E)$ is undirected and $d$-regular (all vertices have the same degree $d$).

For $S,T \subset V$, where V is the vertex set, you can denote the set of edges from S to T as:
\begin{center}
 $E(S,T)=\{(u,v) \mid u \in S,\ v \in T,\ (u,v) \in E\}$
\end{center}

The edge expansion ratio of $G$, denoted $h(G)$, is defined as:
\begin{center}
$h(G)=\displaystyle{ \min_{S:\, 0 < |S| \leq n/2} \frac{|E(S,S')|}{|S|}}$
\end{center}
where $n = |V|$ and $S' = V \setminus S$ is the complement of $S$.
Extensions for this definition:
The numerator is known as the edge boundary of $S$. You could try a different definition of the edge boundary.
Another possibility is to look at expansion in terms of vertices instead of edges.
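
As a quick illustration of the edge expansion ratio (examples that are not from the lecture; $n$ is taken odd so that the size restriction on $S$ is unambiguous): in the cycle on $n$ vertices ($d = 2$), every nonempty $S \neq V$ has at least 2 edges leaving it, and a contiguous arc of $(n-1)/2$ vertices has exactly 2, so
\begin{center}
$h(G) = \displaystyle \frac{2}{(n-1)/2} = \frac{4}{n-1}$,
\end{center}
which tends to 0 as $n$ grows. In the complete graph on $n$ vertices ($d = n-1$), a set $S$ of size $k$ has $k(n-k)$ edges leaving it, so the ratio is $n - k$ and
\begin{center}
$h(G) = \displaystyle n - \frac{n-1}{2} = \frac{n+1}{2}$.
\end{center}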

\subsection{Definition}
A graph $G$ is a $(d,\epsilon)$-expander if
\renewcommand{\labelenumi}{\arabic{enumi}.}
\begin{enumerate}
    \item $G$ is $d$-regular
    \item $h(G)\geq \epsilon $ (Think of $\epsilon$ as a constant, e.g. $1/10$.)
 \end{enumerate}


\subsection{FACT}
(This will be expanded in the next lecture)\\
Cheeger's inequality from Spectral Graph Theory:
Let $G$ be a $d$-regular graph whose adjacency matrix has eigenvalues $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n$. Then,
\begin{center}
 $(d-\lambda_2)/2 \leq h(G) \leq \sqrt{2d(d-\lambda_2)}$
\end{center}
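If the graph is an expander, $h(G) \geq \epsilon$ for a constant $\epsilon$; in contrast, $h(G)$ on the order of $1/n$, as for a long cycle, means the graph has a very good cut and is not an expander.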
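
A small numerical check of the inequality, reading the eigenvalues of the adjacency matrix in non-increasing order (an example that is not from the lecture): for the complete graph on 5 vertices, $d = 4$ and the adjacency eigenvalues are $4, -1, -1, -1, -1$, so $d - \lambda_2 = 5$. A set $S$ of size $k \leq 2$ has $k(5-k)$ edges leaving it, so $h(G) = \min(4, 3) = 3$, and
\begin{center}
$5/2 \leq 3 \leq \sqrt{2 \cdot 4 \cdot 5} = \sqrt{40} \approx 6.32$.
\end{center}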



\end{document}
