Thursday, May 26, 2016

Reconstructing a manifold from sample data, with noise

We follow the article [3] and add more background and clarifications. Some assumptions are made that are not explicitly mentioned in the article, to make calculations easier.

Background in probability, measure theory, topology

Let $X$ be a random variable over a space $A$. Recall that the expression $P(X)$ is a number in $[0,1]$ describing the probability of the event $X$ happening. This is called a probability distribution. Here we will consider continuous random variables, so $P(X=x)=0$ for any single element $x\in A$.

Definition: The probability density function of $X$ is the function $f:A\to \R$ satisfying
  • $f(x)\geqslant 0$ for all $x\in A$, and
  • $\int_B f(x)\ dx = P(X\in B)$ for any $B\subseteq A$.
The second condition implies $\int_A f(x)\ dx=1$.

Often authors use just $P$ instead of $f$, and write $P(x)$ instead of $P(X=x)$.

Definition: Let $Y=g(X)$ be another random variable. The expected value of $Y$ is
\[
E[Y] = E[g(X)] = \int_Ag(x)f(x)\ dx.
\]
The mean of $X$ is $\mu= E[X]$, and the variance of $X$ is $\sigma^2 = E[(X-\mu)^2]$. If $\vec X=(X_1\ \cdots\ X_n)^T$ is a multivariate random variable, then $\vec \mu=E[\vec X]$ is an $n$-vector, and the variance is an $(n\times n)$-matrix given as
\[
\Sigma = E[(\vec X-E[\vec X])(\vec X-E[\vec X])^T]
\hspace{1cm}
\text{or}
\hspace{1cm}
\Sigma_{ij} = E[(X_i-E[X_i])(X_j-E[X_j])].
\]
The covariance of $X$ and $Y$ is $E[(X-E[X])(Y-E[Y])]$. Note that the covariance of $X$ with itself is just the usual variance of $X$.

Example: One example of a probability distribution is the normal (or Gaussian) distribution, and we say a random variable with the normal distribution is normally distributed. If a random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then the probability density function of $X$ is
\[
f(x) = \frac{\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)}{\sigma\sqrt{2\pi}}.
\]
If $\vec X=(X_1\ \cdots\ X_n)^T$ is a normally distributed multivariate random variable, then $\vec \mu = (E[X_1]\ \cdots\ E[X_n])^T$ and the probability density function of $\vec X$ is
\[
f(\vec x) = \frac{\exp\left(-\frac 12 (\vec x-\vec \mu)^T\Sigma^{-1}(\vec x-\vec \mu)\right)}{\sqrt{(2\pi)^n\det(\Sigma)}}.
\]

Definition: A measure on $\R^D$ is a function $m:\{$subsets of $\R^D\}\to [0,\infty]$ such that $m(\emptyset) = 0$ and $m(\bigcup_{i\in I} E_i) = \sum_{i\in I} m(E_i)$ for $\{E_i\}_{i\in I}$ a countable sequence of disjoint subsets of $\R^D$. A probability measure on $\R^D$ is a measure $m$ on $\R^D$ with the added condition that $m(\R^D)=1$.

A probability distribution is an example of a probability measure.

Definition: Let $U= \{U_i\}_{i\in I}$ be a covering of a topological space $M$. The nerve of the covering $U$ is a set $N$ of subsets of $I$ given by
\[
N = \left\{J\subset I\ :\ \bigcap_{j\in J} U_j \neq\emptyset\right\}.
\]
Note that this makes $N$ into an abstract simplicial complex, as $J\in N$ implies $J'\in N$ for all $J'\subseteq J$.

Let $M$ be a smooth compact submanifold of $\R^d$. By the tubular neighborhood theorem (see Theorem 2.11.4 in [3]), every smooth compact submanifold $M$ of $\R^d$ has a tubular neighborhood for some $\epsilon>0$.

Definition: For a particular embedding of $M$, let the condition number of $M$ be $\tau=\sup\{\epsilon\ :$ $M$ has an $\epsilon-$tubular neighborhood$\}$.

Distributions on a manifold

Let $M$ be a $d$-dimensional manifold embedded in $\R^D$, with $D>d$. Recall that every element in $NM\subseteq \R^D$, the normal bundle of $M$, may be represented as a pair $(\vec x,\vec y)$, where $\vec x\in M$ and $\vec y\in T^\perp$ (since $M$ is a manifold, all the normal spaces are isomorphic). Hence we may consider a probability distribution $P$ on $NM$, with $\vec X$ the $d$-multivariate random variable representing points on $M$ and $\vec Y$ the $(D-d)$-multivariate random variable representing points on the space normal to $M$ at a point on $M$. We make the assumption that $\vec X$ and $\vec Y$ are independent, or that
\[
P(\vec X, \vec Y) = P_M(\vec X)P_{T^\perp}(\vec Y).
\]
That is, $P_{T^\perp}$ is a probability distribution that is the same at any point on the manifold.

Definition: Let $P$ be a probability distribution on $NM$ and $f_M$ the probability density function of $P_M$. In the context described above, $P$ satisfies the strong variance condition if
  • there exist $a,b>0$ such that $f_M(\vec x)\in [a,b]$ for all $\vec x\in M$, and
  • $P_{T^\perp}(\vec Y)$ is normally distributed with $\vec \mu = 0$ and $\Sigma = \sigma^2I$.
The second condition implies that the covariance of $Y_i$ with $Y_j$ is trivial iff $i\neq j$, and that the vairance of all the $Y_i$s is the same. From the normally distributed multivariate example above, this also tells us that the probability density function $f^\perp$ of $\vec Y$ is
\[
f^\perp(\vec y) = \frac{\exp\left(\displaystyle-\frac{\sigma^2}{2}\sum_{i=1}^{D-d}y_i^2\right)}{\sigma^{D-d}\sqrt{(2\pi)^{D-d}}}.
\]

Theorem:
In the context described above, let $P$ be a probability distribution on $NM$ satisfying the strong variance condition, and let $\delta>0$. If there is $c>1$ such that
\[
\sigma <\frac{c\tau(\sqrt9-\sqrt 8)}{9\sqrt{8(D-d)}},
\]
then there is an algorithm that computes the homology of $M$ from a random sample of $n$ points, with probability $1-\delta$. The number $n$ depends on $\tau,\delta,c,d,D$, and the diameter of $M$.

The homology computing algorithm

Below is a broad view of the algorithm described in sections 3, 4, and 5 of [1]. Let $M$ be a $d$-manifold embedded in $\R^D$, and $P$ a probability measure on $NM$ satisfying the strong variance condition.

1. Calculate the following numbers:
\begin{align*}
\tau & = \text{condition number of $M$}\\
\text{vol}(M) & = \text{volume of $M$}\\
\sigma^2 & = \text{variance of $P$}
\end{align*}
2. Define (or choose) the following numbers:
\begin{align*}
\delta & \in (0,1) \\
r & \in \left(2\sqrt{2(D-d)}\sigma,\textstyle\frac\tau9 (3-2\sqrt 2)\right) \\
n & > \text{function}(a,r,\tau,d,\delta,\text{vol}(M)) & (\max(A,B)\  \text{in Proposition 9 of [1])} \\
s & = 4r \\
deg & > \textstyle \frac{3a}4 \left(1-\left(\frac r{2\tau}\right)^2\right)^{d/2}\text{vol}\left(B^d(r,0)\right)\\
R & = (9r+\tau)/2
\end{align*}
3. Choose $n$ points randomly from $NM$ according to $P$.
4. From these $n$ points, construct the nearest neighbor graph $G$ with distance $s$.
5. Remove from $G$ all the vertices of degree $<deg$ to get a refined graph $G'$.
6. Set $U=\bigcup_{\vec x\in V(G')}B^D(R,\vec x)$ and construct the simplicial complex $K$ of its nerve.
7. Compute the homology of $K$, which is the homology of $M$, with probability $1-\delta$.

References:
[1] Niyogi, Smale, and Weinberger (A topological view of unsupervised learning from noisy data)
[2] Folland (Real analysis, Chapter 10.1)
[3] Bredon (Topology and Geometry, Chapter 2.11)

Thursday, May 19, 2016

Persistent homology (an example)

Here we follow the article "Persistent homology - a Survey," by Herbert Edelsbrunner and John Harer, published in 2008 in "Surveys on discrete and computational geometry," Volume 453.

Consider the sphere, which has known homology groups. Consider a slightly bent embedding of the sphere in $\R^3$, call it $M$, as in the diagram below (imagine it as a hollow blob, whose outline is drawn below). Let $f:M\to \R$ be the height function, measuring the distance from a point in $M$ to a plane just below $M$, coming out of the page. Then we have some critical values $t_0,t_1,t_2,t_3$, as indicated below. Note we have embedded the shape so that no two critical points of $f$ have the same value.
This is remniscent of Morse theory. Set $M_i = f^{-1}[0,t_i]$ and $b_i = \dim(H_i)$ the $i$th Betti number. Then we may easily calculate the Betti numbers of the $M_j$, as in the table below.
\[
\renewcommand\arraystretch{1.3}
\begin{array}{r|c|c|c|c|c}
& M_0 & M_1 & M_2 & M_3 & M \\\hline
b_0 & 1 & 2 & 1 & 1 & 1 \\\hline
b_1 & 0 & 0 & 0 & 0 & 0 \\\hline
b_2 & 0 & 0 & 0 & 1 & 1
\end{array}
\renewcommand\arraystretch{1}
\]
Definition: In the context above, suppose that there is some $p$ and $j>i$ such that:
  • $b_p(M_i)=b_p(M_{i-1})+1$,
  • $b_p(M_j)=b_p(M_{j-1})-1$, and
  •  the generator of $H_p$ introduced at $t_i$ is the same generator of $H_p$ that disappears at $t_j$.
Then $(i,j)$ (or ($t_i,t_j$)) is called a persistence pair and the persistence of $(i,j)$ is $j-i$ (or $f(j)-f(i)$).

For $i$ not in a persistence pair, we say that $i$ represents an essential cycle, or that the persistence of $i$ is infinite. In the example considered, the only persistence pair is $(1,2)$. This may be presented in a persistence diagram, with the indices of critical points on both axes, and the persistence measured as a vertical distance.
If we put a simplicial complex structure on $M$, we may also calculate the homology (and persistence pairs, although they may be different than the ones found above). To make calculations easier, we instead describe a CW structure on our embedded sphere $M$ (with $X_i$ the $i$-skeleton, and the ordering of the $i$-cells as indicated). The results will be the same as for a simplicial complex structure.
This gives one 0-cell, two 1-cells, and three 2-cells (with the obvious gluings), allowing us to construct the chain groups $C_p$ as well as maps between them. The map $d_p:C_p\to C_{p-1}$ as a matrix has size $\dim(C_{p-1})\times \dim(C_p)$, and has entry $(i,j)$ equal to the number of times, counting multiplicity, that the $i$th $(p-1)$-cell is a face of the $j$th $p$-cell. Calculations are done in $\Z/2\Z$.
\[
d_2\ :\ C_2\to C_1
\hspace{.5cm}\text{is}\hspace{.5cm}
\begin{bmatrix}
1 & 0 & 1 \\ 0 & 1 & 1
\end{bmatrix}
\hspace{2cm}
d_1\ :\ C_1\to C_0
\hspace{.5cm}\text{is}\hspace{.5cm}
\begin{bmatrix}
0 & 0
\end{bmatrix}
\]
The Betti numbers are then $b_p = \dim(C_p) - \text{rk}(d_p)-\text{rk}(d_{p+1})$. From above, it is immediate that $\text{rk}(d_1)=0$, $\text{rk}(d_2) = 2$, and $\text{rk}(d_p)=0$ for all other $p$. This tells us that
\begin{align*}
b_0 & = \dim(C_0) - \text{rk}(d_0) - \text{rk}(d_1) = 1 - 0 - 0 = 1, \\
b_1 & = \dim(C_1) - \text{rk}(d_1) - \text{rk}(d_2) = 2 - 0 - 2 = 0, \\
b_2 & = \dim(C_2) - \text{rk}(d_2) - \text{rk}(d_3) = 3 - 2 - 0 = 1, \\
\end{align*}
as expected. To find the persistence pairs, we introduce a filtration on the simplices (equivalently, on the cells) by always having the faces of a cell precede the cell, as well as lower-dimensional cells preceding higher-dimensional cells. Using the same ordering as described above, consider the following filtration:
\begin{align*}
K_0 & = \{\}, \\
K_1 & = \{e^0_1\}, \\
K_2 & = \{e^0_1,e^1_1,e^1_2\} ,\\
K_3 & = \{e^0_1,e^1_1,e^1_2,e^2_1,e^2_2,e^2_3\},
\end{align*}
so $\emptyset = K_0\subset K_1\subset K_2\subset K_3 = M$. This gives an ordering on all the cells of $M$, namely
\[
\sigma_1 = e^0_1,\
\sigma_2 = e^1_1,\
\sigma_3 = e^1_2,\
\sigma_4 = e^2_1,\
\sigma_5 = e^2_2,\
\sigma_6 = e^2_3.
\]
Construct the boundary matrix $D$, with the $(i,j)$ entry of $D$ equal to the number of times, counting multiplicity, modulo 2, that $\sigma_i$ is a codimension 1 face of $\sigma_j$. In the case of our example sphere, we get the matrix
\[
D = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\ \ \sim\ \
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
in its reduced form (call it $\tilde D$). With respect to the matrix $\tilde D$, define the following numbers:
\begin{align*}
low(j) & = \text{the row number of the lowest non-zero entry in column $j$} ,\\
zero(p) & = \text{the number of zero columns that correspond to $p$-simplices} ,\\
one(p) & = \text{the number of 1s in rows that correspond to $p$-simplices}.
\end{align*}
We calculate all the relevant values of these expressions to be as below.
\begin{align*}
low(1) & = 0 & zero(0) & = 1 & one(0) & = 0 \\
low(2) & = 0 & zero(1) & = 2 & one(1) & = 2 \\
low(3) & = 0 & zero(2) & = 1 & one(2) & = 0 \\
low(4) & = 2 \\
low(5) & = 3 \\
low(6) & = 0
\end{align*}
For persistence, we have
  • if $low(j)=i\neq 0$, then $(i,j)$ is a persistence pair, 
  • if $low(j)=0$ and there is no $k$ such that $low(k)=j$, then $j$ is an essential cycle.
For our sphere example, we get two persistence pairs $(2,4)$ and $(3,5)$, and two essential cycles 1 and 6. Note that this is different from the persistence pairs found by the height function $f:M\to \R$ earlier (but there are still two essential cycles), because there we were comparing the homologies $H_p(M_j)$, but here we are comparing $H_p(K_\ell)$. The persistence diagram is as below.
As an added feature, from the numbers above we may calculate the homology and relative homology groups. Construct the relative chain groups $C_p(M,K_\ell) = C_p(M)/C_p(K_\ell)$ and set $zero(p,\ell)$ to be $zero(p)$ for the lower right submatrix of $\tilde D$ corresponding to the cells in $M-K_\ell$ (and similarly for $one(p,\ell)$). We find these numbers for the bent sphere to be as below.
\begin{align*}
zero(0,0) & = 1 & zero(0,1) & = 0 & zero(0,2) & = 0 & zero(0,3) & = 0 \\
zero(1,0) & = 2 & zero(1,1) & = 2 & zero(1,2) & = 0 & zero(1,3) & = 0 \\
zero(2,0) & = 1 & zero(2,1) & = 1 & zero(2,2) & = 1 & zero(2,3) & = 0 \\[10pt]
one(0,0) & = 0 & one(0,1) & = 0 & one(0,2) & = 0 & one(0,3) & = 0 \\
one(1,0) & = 2 & one(1,1) & = 2 & one(1,2) & = 0 & one(1,3) & = 0 \\
one(2,0) & = 0 & one(2,1) & = 0 & one(2,2) & = 0 & one(2,3) & = 0
\end{align*}
Note that $zero(p,0)=zero(p)$ and $one(p,0)=one(p)$, as well as $zero(p,3)=one(p,3)=0$. The above numbers are useful in calculating
\begin{align*}
\dim(H_p(M)) & = zero(p)-one(p), \\
\dim(H_p(M,K_\ell)) & = zero(p,\ell) - one(p,\ell).
\end{align*}

References: Edelsbrunner and Harer (Persistent homology - a Survey)

Tuesday, May 17, 2016

Spectral sequences and filtrations

Definition: Let $C^\bullet\in C(A)$ be a cochain complex with boundary maps $d^\bullet$ over some category $A$. A filtration of $C^\bullet$ is a sequence of objects $F^nC^\bullet$ with boundary maps $d^{\bullet,n}_h$ in the category of cochain complexes $C(A)$ of $A$, either a
\begin{align*}
\text{decreasing filtration}\ C^\bullet & \supseteq \cdots \supseteq F^{n-1}C^\bullet \supseteq F^nC^\bullet\supseteq F^{n+1}C^\bullet\supseteq \cdots\ \text{or}\ \\
\text{increasing filtration}\ C^\bullet & \supseteq \cdots \supseteq F^{n+1}C^\bullet \supseteq F^nC^\bullet\supseteq F^{n-1}C^\bullet\supseteq \cdots,
\end{align*}
where ``$\supseteq$" is defined as necessary, along with maps $d_v^{\bullet,n}:F^nC^\bullet \to F^{n\pm1}C^\bullet$. These maps are compatible, in the sense that $d_v^{k\pm1,n}d_h^{k,n} = d_h^{k,n\mp1}d_v^{k,n}$.

Example:
Define ``$\supseteq$" as $X\supseteq Y$ iff $\Hom(Y,X)$ is non-empty. The bete (or brutal) filtration of $C^\bullet$ is a decreasing filtration
\[
\left(F^nC^\bullet\right)^i = \begin{cases} 0 &\ \text{if}\ i<n, \\ C^i & \ \text{if}\ i\geqslant n,\end{cases}
\hspace{1cm}\text{with}\hspace{1cm}
H^k(F^nC^\bullet) = \begin{cases} 0 & \ \text{if}\ k<n \\ Z^n & \ \text{if}\ k=n, \\ H^k(C^\bullet) & \ \text{if}\ k>n.\end{cases}
\]
This filtration may be represented by the diagram

which clearly commutes. The good filtration of $C^\bullet$ is also a decreasing filtration
\[
\left(F^nC^\bullet\right)^i = \begin{cases} C^i &\ \text{if}\ i<n, \\ Z^iC^\bullet & \ \text{if}\ i=n, \\ 0 & \ \text{if}\  i>n,\end{cases}
\hspace{1cm}\text{with}\hspace{1cm}
H^k(F^nC^\bullet) = \begin{cases} H^k(C^\bullet) & \ \text{if}\ k\leqslant n, \\ 0 & \ \text{if}\ k>n.\end{cases}
\]
This filtration may be represented by the diagram

which also commutes. Both of these are also called truncations. The good filtration is "better" because the cocycle groups $Z^n$ do not appear in the cohomology groups. The same may be done for homology groups.

Definition: Set $F^nC^k = (F^nC^\bullet)^k = F^nC^\bullet \cap C^k$, and let the zeroth page of the cohomology spectral sequence of $C^\bullet$ with the filtration $F$ be given by
\begin{align*}
E^{p,q}_0 & = F^pC^{p+q} / F^{p+1}C^{p+q} \ \ \text{if $F$ is decreasing,} \\
& = F^pC^{p+q} / F^{p-1}C^{p+q} \ \ \text{if $F$ is increasing.}
\end{align*}
Let the first page of the cohomology spectral sequence of $C^\bullet$ with the filtration $F$ be given by
\begin{align*}
E^{p,q}_1 & = H^{p+q}(F^pC^\bullet / F^{p+1}C^\bullet) \ \ \text{if $F$ is decreasing,} \\
& = H^{p+q}(F^pC^\bullet / F^{p-1}C^\bullet) \ \ \text{if $F$ is increasing.}
\end{align*}
From now on, assume that $F$ is an increasing filtration. Let the second page of the cohomology spectral sequence of $C^\bullet$ with the filtration $F$ be given by
\[
E^{p,q}_2 = \frac{\ker(E_1^{p,q} \to E_1^{p+1,q})}{\text{im}(E_1^{p-1,q}\to E_1^{p,q})}.
\]
Continue in this manner and let the $r$th page of the cohomology spectral sequence of $C^\bullet$ with the filtration $F$ be given by
\[
E_r^{p,q} = \frac{\{x\in F^pC^{p+q}\ :\ dx\in F^{p+r}C^{p+q+1}\}}{F^{p+1}C^{p+q} + dF^{p-r+1}C^{p+q-1}}.
\]

The same may be done for a homology spectral sequence. Note that a spectral sequence may also be defined without coming from a filtration.

Definition: A homology spectral sequence is a collection of objects $E^r_{p,q}$ and maps $d^r_{p,q}:E^r_{p,q}\to E^r_{p-r,q+r-1}$ with $d^rd^r=0$ such that
\[
E^{r+1}_{p,q}\cong \ker(d^r_{p,q})/\text{im}(d^r_{p+r,q-r+1}).
\]
Similarly, a cohomology spectral sequence is a collection of objects $E_r^{p,q}$ and maps $d_r^{p,q}:E_r^{p,q}\to E_r^{p+r,q-r+1}$ with $d^rd^r=0$ such that
\[
E_{r+1}^{p,q}\cong \ker(d_r^{p,q})/\text{im}(d_r^{p-r,q+r-1}).
\]

References: Weibel (An introduction to homological algebra, Chapter 1.2), McCleary (A user's guide to spectral sequences, Chapter 2.2), Hutchings (Algebraic topology lecture notes, see math.berkeley.edu/~hutching/teach/215b-2011)