\newcounter{lecture}\setcounter{lecture}{2}
\def\thetoday{January 13, 2009}
\def\thetitle{Trees and Cayley's Theorem}

\input{lec_hdr.tex}

A \defn{tree} is a connected graph with no cycles.  A \defn{forest} is a 
graph where each connected component is a tree.  A node in a forest with 
degree 1 is called a \defn{leaf}.


\begin{claim} (simple properties of trees)
\begin{itemize}
\item Every tree with $n \geq 2$ vertices has at least two leaves.
\item A tree with $n$ vertices has exactly $n-1$ edges.
\end{itemize}
\label{simple}
\end{claim}

\begin{proof}
For the first property, start a path at an arbitrary vertex and
repeatedly walk to a neighbor that has not been visited before. Since
there are only $n$ vertices, after at most $n-1$ steps we reach a vertex
$v$ with no unvisited neighbors. If $v$ had a neighbor other than its
predecessor on the path, that neighbor would already lie on the path and
would close a cycle, so the graph would not be a tree. Hence $v$ has
exactly one neighbor and is a leaf. Starting a new path at $v$ and
repeating the argument, the vertex where that path stops is a second
leaf, distinct from $v$ because the path has at least one edge.

We show the second property by induction on $n$. The base case $n=1$ is
trivial: a single vertex has no edges. By the first property, any tree
with $n \geq 2$ vertices has a leaf. Removing a leaf leaves a tree with
one fewer vertex and one fewer edge, since deleting a vertex of degree 1
neither disconnects the graph nor creates a cycle. By the induction
hypothesis the smaller tree has $n-2$ edges, so the original tree has
$n-1$ edges.
\end{proof} 

\subsubsection*{Graph and Tree Combinatorics}

How many graphs are there on a fixed set of $n$ vertices?  Since there are
${n \choose 2}$ possible edges, each of which is either present or absent,
there are $2^{n \choose 2 }$ graphs.
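
As a quick check of this count (the values of $n$ are chosen here purely
for illustration): for $n=3$ there are ${3 \choose 2} = 3$ possible edges
and hence $2^{3} = 8$ graphs, and for $n=4$ there are ${4 \choose 2} = 6$
possible edges and hence $2^{6} = 64$ graphs.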

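We call a graph \defn{labeled} if its vertices carry the distinct labels
$1, \dots, n$, so that two graphs with the same shape but different
edge sets on these labels count as different graphs.
How many labeled trees are there for a given number of vertices $n$?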

\begin{thm} Cayley's Theorem ($19^{th}$ century)

The number of labeled trees with $n$ vertices is $n^{n-2}$.

\end{thm}
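
As a sanity check for small $n$ (worked out here by hand as an
illustration, not as part of the proof):
\begin{itemize}
\item For $n=2$ the formula gives $2^{0} = 1$, the single edge joining
  vertices $1$ and $2$.
\item For $n=3$ the formula gives $3^{1} = 3$: every tree on three
  vertices is a path, determined by which label sits in the middle.
\item For $n=4$ the formula gives $4^{2} = 16$: every tree on four
  vertices is either a star or a path, and there are $4$ stars (one per
  choice of center) and $4!/2 = 12$ paths (orderings of the labels, up
  to reversal).
\end{itemize}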

The idea of our proof will be to use a bijective mapping between
labeled trees and something easier to count.  To prove Cayley's
theorem, we will use \emph{Pr\"ufer codes}.

To build the Pr\"ufer code of a tree $T(V,E)$, we use the following
iterative procedure.

At each step $t=1,2, \dots, n-1$:
\begin{itemize}
\item Let vertex $i$ be the leaf with the smallest label in $T$ at step $t$,
  and let vertex $j$ be its unique neighbor, which we call its parent.
\item Let $A'_t = j$, and let $B_t = i$.
\item Remove $i$ and the edge $\{i,j\}$ from $T$.
\end{itemize}
After step $n-1$ no edges remain.

We call the pair of strings $A'$ and $B$, each of length $n-1$, the
\emph{extended Pr\"ufer code} of $T$.  Note that the pairs $(A'_t, B_t)$
are just the edges of $T$, listed in the order in which they are removed.

By construction, the last entry of $A'$ must be $n$: there are always at
least two leaves (Claim \ref{simple}), so $n$ is never the leaf with the
smallest label and is never removed, which makes it the parent in the
final step. The string $A$ obtained by erasing the last entry of $A'$ is
called the \emph{Pr\"ufer code} of $T$.
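
To illustrate the procedure, consider a small tree chosen here as an
example (it is not one of the lecture figures): let $n=6$ and let $T$
have edges
\[
\{1,3\},\ \{1,6\},\ \{4,6\},\ \{2,4\},\ \{4,5\}.
\]
The leaves are $2$, $3$ and $5$. Pruning the smallest leaf at each step
gives:
\begin{itemize}
\item $t=1$: remove leaf $2$, whose parent is $4$, so $A'_1 = 4$, $B_1 = 2$.
\item $t=2$: the leaves are now $3,5$; remove $3$ with parent $1$, so $A'_2 = 1$, $B_2 = 3$.
\item $t=3$: the leaves are now $1,5$; remove $1$ with parent $6$, so $A'_3 = 6$, $B_3 = 1$.
\item $t=4$: the leaves are now $5,6$; remove $5$ with parent $4$, so $A'_4 = 4$, $B_4 = 5$.
\item $t=5$: the leaves are now $4,6$; remove $4$ with parent $6$, so $A'_5 = 6$, $B_5 = 4$.
\end{itemize}
Hence $A' = 4\,1\,6\,4\,6$ and $B = 2\,3\,1\,5\,4$, and dropping the final
entry of $A'$ leaves the string $4\,1\,6\,4$ of length $n-2$.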

%For an example on how to construct
%the Pr\"ufer code of a tree, psee figure 1 in the appendix.

\begin{lemma}
\label{lem1}

We can construct $B$ and $A'$ using only $A$.

\end{lemma}

\begin{proof}

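  Given $A$, we construct $A'$ by appending $n$ to $A$. Writing
  $A'=a_1a_2 \dots a_{n-2}n$, we construct $B=b_1b_2 \dots b_{n-1}$ as
  follows: for each $i$, $b_i$ is the smallest label that does not appear
  in $\{b_1,\dots,b_{i-1}\}\cup \{a_i,\dots,a_{n-2}\}$.

To see why, recall that Pr\"ufer codes are constructed by repeatedly
``pruning'' the smallest leaf from the tree.  At step $t=i$, the vertices
$\{b_1,\dots,b_{i-1}\}$ have already been removed. A remaining vertex that
appears in $\{a_i,\dots,a_{n-2}\}$ is the parent of a vertex that has not
yet been removed, so it still has at least two neighbors and is not a
leaf. Conversely, every remaining vertex other than $n$ that does not
appear in $\{a_i,\dots,a_{n-2}\}$ has no remaining children, so its only
neighbor is its parent and it is a leaf. The vertex $n$ may also be absent
from $\{a_i,\dots,a_{n-2}\}$, but it is never the smallest candidate, since
the remaining tree always has a leaf other than $n$. Hence $b_i$, the
smallest remaining leaf, is as stated.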

\end{proof}
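
As a worked instance of this reconstruction (the code below is chosen
here for illustration), take $n=6$ and $A = 4\,1\,6\,4$, so that
$a_1a_2a_3a_4 = 4\,1\,6\,4$ and $A' = 4\,1\,6\,4\,6$. Applying the rule
for $i = 1, \dots, 5$:
\begin{itemize}
\item $b_1$ is the smallest label not in $\{4,1,6\}$, so $b_1 = 2$.
\item $b_2$ is the smallest label not in $\{2\} \cup \{1,6,4\}$, so $b_2 = 3$.
\item $b_3$ is the smallest label not in $\{2,3\} \cup \{6,4\}$, so $b_3 = 1$.
\item $b_4$ is the smallest label not in $\{2,3,1\} \cup \{4\}$, so $b_4 = 5$.
\item $b_5$ is the smallest label not in $\{2,3,1,5\}$, so $b_5 = 4$.
\end{itemize}
Thus $B = 2\,3\,1\,5\,4$, and pairing $A'$ with $B$ entry by entry gives
the edge list
\[
\{4,2\},\ \{1,3\},\ \{6,1\},\ \{4,5\},\ \{6,4\}.
\]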

\begin{prop} The following hold for Pr\"ufer codes:

\begin{itemize}

\item The number of possible Pr\"ufer codes, that is, strings of length $n-2$ over the labels $1, \dots, n$, is $n^{n-2}$.
\item $\forall v \in V$, the degree of $v$ is the number of times it appears in the code plus one.
\item Each such string is the Pr\"ufer code of exactly one tree.

\end{itemize}

\end{prop}

\begin{proof}

The first statement is immediate from the Pr\"ufer code representation:
there are $n$ choices for each of the $n-2$ entries of the code.
To see the second statement, note that for a vertex $v \neq n$ the number
of times $v$ appears in the code is the number of ``child'' vertices it
has, and one more edge joins $v$ to its own parent at the step where $v$
is removed as a leaf. The vertex $n$ has no parent, but it appears once
more in $A'$ than in $A$, as the final entry, so the same count holds.

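To prove the third statement, note that each tree produces exactly one
Pr\"ufer code, since the construction is deterministic.
Also, as shown in Lemma \ref{lem1}, each Pr\"ufer code determines a unique
edge list $(a_t, b_t)$, $t = 1, \dots, n-1$, so two different trees cannot
share a code.  All that remains to be shown is that the edge list
recovered from an arbitrary code is a tree.  In the reconstruction of
Lemma \ref{lem1}, the set excluded when choosing $b_t$ has at most $n-2$
elements, so the $b_t$ are well defined, distinct and never equal to $n$;
they are therefore exactly the labels $1, \dots, n-1$. Moreover
$b_s \neq a_t$ whenever $s \leq t$, so each $a_t$ is either $n$ or one of
$b_{t+1}, \dots, b_{n-1}$. Reading the edge list from right to left,
starting from the single vertex $n$, each step therefore attaches a new
vertex $b_t$ of degree one to a vertex $a_t$ that is already present.
Since we start with a tree, and adding a node of degree one to a tree
produces another tree, the final graph must be a tree.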

\end{proof}
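
As an illustration of the degree property (with a code chosen here as an
example), take $n=6$ and the Pr\"ufer code $4\,1\,6\,4$. Reading off the
number of occurrences of each label and adding one gives
\[
\deg(1) = 2,\quad \deg(2) = 1,\quad \deg(3) = 1,\quad
\deg(4) = 3,\quad \deg(5) = 1,\quad \deg(6) = 2 .
\]
These degrees sum to $10 = 2(n-1)$, which agrees with Claim \ref{simple}:
a tree with $n$ vertices has $n-1$ edges, and each edge contributes to
two degrees. In general the code has $n-2$ entries and each of the $n$
vertices contributes one more, for a total of $(n-2) + n = 2(n-1)$.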

\newpage
\begin{figure}[htp]
\centering
\includegraphics[scale=0.4]{Lecture2_prufer1.png} 
\vskip 1in
\includegraphics[scale=0.4]{Lecture2_prufer2.png}
\end{figure}


\end{document}
