mat-blag: persistence

Thursday, August 3, 2017

New directions in TDA

Conference topic

This post is informal, meant as a collection of (personally) new things from the workshop "Topological data analysis: Developing abstract foundations" at the Banff International Research Station, July 31 - August 4, 2017. New actual questions:

Does there exist a constructible sheaf valued in persistence modules over $\Ran^{\leqslant n}(M)$?
- On the stalks it should be the persistence module of $P\in \Ran^{\leqslant n}(M)$. What about arbitrary open sets?
- Is there such a thing as a colimit of persistence modules?
- Uli Bauer suggested something to do with ordering the elements of the sample and taking small open sets.
Can framed vector spaces be used to make the TDA pipeline functorial? Does Ezra Miller's work help?
- Should be a functor from $(\R,\leqslant)$, the reals as a poset, to $\text{Vect}$ or $\text{Vect}_{fr}$, the category of (framed) vector spaces. Filtration function $f:\R^n\to \R$ is assumed to be given.
- Framed perspective should not be too difficult, just need to find right definitions.
- Does this give an equivalence of categories (category of persistence modules and category of matchings)? Is that what we want? Do we want to keep only specific properties?
- Ezra's work is very dense and unpublished. But it seems to have a very precise functoriality (which is not the main thrust of the work, however).
Can the Bubenik-de Silva-Scott interleaving categorification be viewed as a (co)limit? Diagrams are suggestive.
- Reference is 1707.06288 on the arXiv.
- Probably not a colimit, because that would be very large, though the arrows suggest a colimit.
- Have to be careful, because the (co)limit should be in the category of posets, not just interleavings.

New things to learn about:

Algebraic geometry / homotopy theory: the etale space of a sheaf, Kan extensions, model categories, symmetric monoidal categories.
TDA related: Gromov-Hausdorff distance, the universal distance (Michael Lesnick's thesis and papers), merge trees, Reeb graphs, Mapper (the program).

Thursday, May 19, 2016

Persistent homology (an example)

Here we follow the article "Persistent homology - a Survey," by Herbert Edelsbrunner and John Harer, published in 2008 in "Surveys on discrete and computational geometry," Volume 453.

Consider the sphere, which has known homology groups. Consider a slightly bent embedding of the sphere in $\R^3$, call it $M$, as in the diagram below (imagine it as a hollow blob, whose outline is drawn below). Let $f:M\to \R$ be the height function, measuring the distance from a point in $M$ to a plane just below $M$, coming out of the page. Then we have some critical values $t_0,t_1,t_2,t_3$, as indicated below. Note we have embedded the shape so that no two critical points of $f$ have the same value.

This is remniscent of Morse theory. Set $M_i = f^{-1}[0,t_i]$ and $b_i = \dim(H_i)$ the $i$th Betti number. Then we may easily calculate the Betti numbers of the $M_j$, as in the table below.
\[
\renewcommand\arraystretch{1.3}
\begin{array}{r|c|c|c|c|c}
& M_0 & M_1 & M_2 & M_3 & M \\\hline
b_0 & 1 & 2 & 1 & 1 & 1 \\\hline
b_1 & 0 & 0 & 0 & 0 & 0 \\\hline
b_2 & 0 & 0 & 0 & 1 & 1
\end{array}
\renewcommand\arraystretch{1}
\]
Definition: In the context above, suppose that there is some $p$ and $j>i$ such that:

$b_p(M_i)=b_p(M_{i-1})+1$,
$b_p(M_j)=b_p(M_{j-1})-1$, and
the generator of $H_p$ introduced at $t_i$ is the same generator of $H_p$ that disappears at $t_j$.

Then $(i,j)$ (or ($t_i,t_j$)) is called a persistence pair and the persistence of $(i,j)$ is $j-i$ (or $f(j)-f(i)$).

For $i$ not in a persistence pair, we say that $i$ represents an essential cycle, or that the persistence of $i$ is infinite. In the example considered, the only persistence pair is $(1,2)$. This may be presented in a persistence diagram, with the indices of critical points on both axes, and the persistence measured as a vertical distance.

If we put a simplicial complex structure on $M$, we may also calculate the homology (and persistence pairs, although they may be different than the ones found above). To make calculations easier, we instead describe a CW structure on our embedded sphere $M$ (with $X_i$ the $i$-skeleton, and the ordering of the $i$-cells as indicated). The results will be the same as for a simplicial complex structure.

This gives one 0-cell, two 1-cells, and three 2-cells (with the obvious gluings), allowing us to construct the chain groups $C_p$ as well as maps between them. The map $d_p:C_p\to C_{p-1}$ as a matrix has size $\dim(C_{p-1})\times \dim(C_p)$, and has entry $(i,j)$ equal to the number of times, counting multiplicity, that the $i$th $(p-1)$-cell is a face of the $j$th $p$-cell. Calculations are done in $\Z/2\Z$.
\[
d_2\ :\ C_2\to C_1
\hspace{.5cm}\text{is}\hspace{.5cm}
\begin{bmatrix}
1 & 0 & 1 \\ 0 & 1 & 1
\end{bmatrix}
\hspace{2cm}
d_1\ :\ C_1\to C_0
\hspace{.5cm}\text{is}\hspace{.5cm}
\begin{bmatrix}
0 & 0
\end{bmatrix}
\]
The Betti numbers are then $b_p = \dim(C_p) - \text{rk}(d_p)-\text{rk}(d_{p+1})$. From above, it is immediate that $\text{rk}(d_1)=0$, $\text{rk}(d_2) = 2$, and $\text{rk}(d_p)=0$ for all other $p$. This tells us that
\begin{align*}
b_0 & = \dim(C_0) - \text{rk}(d_0) - \text{rk}(d_1) = 1 - 0 - 0 = 1, \\
b_1 & = \dim(C_1) - \text{rk}(d_1) - \text{rk}(d_2) = 2 - 0 - 2 = 0, \\
b_2 & = \dim(C_2) - \text{rk}(d_2) - \text{rk}(d_3) = 3 - 2 - 0 = 1, \\
\end{align*}
as expected. To find the persistence pairs, we introduce a filtration on the simplices (equivalently, on the cells) by always having the faces of a cell precede the cell, as well as lower-dimensional cells preceding higher-dimensional cells. Using the same ordering as described above, consider the following filtration:
\begin{align*}
K_0 & = \{\}, \\
K_1 & = \{e^0_1\}, \\
K_2 & = \{e^0_1,e^1_1,e^1_2\} ,\\
K_3 & = \{e^0_1,e^1_1,e^1_2,e^2_1,e^2_2,e^2_3\},
\end{align*}
so $\emptyset = K_0\subset K_1\subset K_2\subset K_3 = M$. This gives an ordering on all the cells of $M$, namely
\[
\sigma_1 = e^0_1,\
\sigma_2 = e^1_1,\
\sigma_3 = e^1_2,\
\sigma_4 = e^2_1,\
\sigma_5 = e^2_2,\
\sigma_6 = e^2_3.
\]
Construct the boundary matrix $D$, with the $(i,j)$ entry of $D$ equal to the number of times, counting multiplicity, modulo 2, that $\sigma_i$ is a codimension 1 face of $\sigma_j$. In the case of our example sphere, we get the matrix
\[
D = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\ \ \sim\ \
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
in its reduced form (call it $\tilde D$). With respect to the matrix $\tilde D$, define the following numbers:
\begin{align*}
low(j) & = \text{the row number of the lowest non-zero entry in column $j$} ,\\
zero(p) & = \text{the number of zero columns that correspond to $p$-simplices} ,\\
one(p) & = \text{the number of 1s in rows that correspond to $p$-simplices}.
\end{align*}
We calculate all the relevant values of these expressions to be as below.
\begin{align*}
low(1) & = 0 & zero(0) & = 1 & one(0) & = 0 \\
low(2) & = 0 & zero(1) & = 2 & one(1) & = 2 \\
low(3) & = 0 & zero(2) & = 1 & one(2) & = 0 \\
low(4) & = 2 \\
low(5) & = 3 \\
low(6) & = 0
\end{align*}
For persistence, we have

if $low(j)=i\neq 0$, then $(i,j)$ is a persistence pair,
if $low(j)=0$ and there is no $k$ such that $low(k)=j$, then $j$ is an essential cycle.

For our sphere example, we get two persistence pairs $(2,4)$ and $(3,5)$, and two essential cycles 1 and 6. Note that this is different from the persistence pairs found by the height function $f:M\to \R$ earlier (but there are still two essential cycles), because there we were comparing the homologies $H_p(M_j)$, but here we are comparing $H_p(K_\ell)$. The persistence diagram is as below.

As an added feature, from the numbers above we may calculate the homology and relative homology groups. Construct the relative chain groups $C_p(M,K_\ell) = C_p(M)/C_p(K_\ell)$ and set $zero(p,\ell)$ to be $zero(p)$ for the lower right submatrix of $\tilde D$ corresponding to the cells in $M-K_\ell$ (and similarly for $one(p,\ell)$). We find these numbers for the bent sphere to be as below.
\begin{align*}
zero(0,0) & = 1 & zero(0,1) & = 0 & zero(0,2) & = 0 & zero(0,3) & = 0 \\
zero(1,0) & = 2 & zero(1,1) & = 2 & zero(1,2) & = 0 & zero(1,3) & = 0 \\
zero(2,0) & = 1 & zero(2,1) & = 1 & zero(2,2) & = 1 & zero(2,3) & = 0 \\[10pt]
one(0,0) & = 0 & one(0,1) & = 0 & one(0,2) & = 0 & one(0,3) & = 0 \\
one(1,0) & = 2 & one(1,1) & = 2 & one(1,2) & = 0 & one(1,3) & = 0 \\
one(2,0) & = 0 & one(2,1) & = 0 & one(2,2) & = 0 & one(2,3) & = 0
\end{align*}
Note that $zero(p,0)=zero(p)$ and $one(p,0)=one(p)$, as well as $zero(p,3)=one(p,3)=0$. The above numbers are useful in calculating
\begin{align*}
\dim(H_p(M)) & = zero(p)-one(p), \\
\dim(H_p(M,K_\ell)) & = zero(p,\ell) - one(p,\ell).
\end{align*}

References: Edelsbrunner and Harer (Persistent homology - a Survey)