mat-blag: persistent homology


Sunday, November 25, 2018

Visualizing paths in configuration space

The goal of this post is to visualize how point configurations induce persistent homology, and how paths between point samples induce changes in the simplicial complexes producing the homology. We use the Čech simplicial complex construction of a finite subset of $\R^N$.

Definition: For $M$ a Riemannian manifold, $\Conf_n(M):= \{P\subseteq M : |P|=n\}$ is the configuration space of $n$ points on $M$.

The space $\Conf_n(M)$ is itself a topological space, with topology induced by the Hausdorff distance of subsets. Let $\SC$ be the set of abstract simplicial complexes $(V,S)$, where $V$ is a set and $S\subseteq P(V)$ is closed under taking subsets. Let $\uSC$ be the set of unlabeled abstract simplicial complexes, with the natural projection map $\SC\to \uSC$.

Definition: The Čech map is the function $\check C\colon \Conf_n(M)\times \R_{\geqslant 0}\to \SC$ given by $V(\check C(P,r))=P$ and $P'\in S(\check C(P,r))$ whenever $\bigcap_{p\in P'} B(p,r) \neq \emptyset$, for every $P'\subseteq P$. The unlabeled Čech map is the composition of $\check C$ with the projection to $\uSC$.

We will consider the case $M=\R^2$ and $n=4$. To describe an implementation of the Čech map, we only need to consider double and triple intersections: by Helly's theorem, four disks in the plane have a common point as soon as every three of them do. Finding if $B(P_1,r)\cap B(P_2,r)$ is empty or not is easy, but to determine if $B(P_1,r)\cap B(P_2,r)\cap B(P_3,r)$ is empty or not requires more care. Below is an implementation in Mathematica.


(* CechPt : Finds the point where balls of the same radius around a,b,c will first intersect *)
(* Input : 3 coordinates {x, y}. Output : 1 coordinate {x, y} *)
CechPt[a_,b_,c_] := Module[{
    cenx = Det[{{Norm[a]^2, a[[2]], 1}, {Norm[b]^2, b[[2]], 1}, {Norm[c]^2, c[[2]], 1}}],
    ceny = Det[{{a[[1]], Norm[a]^2, 1}, {b[[1]], Norm[b]^2, 1}, {c[[1]], Norm[c]^2, 1}}],
    scal = 2*Det[{{a[[1]], a[[2]], 1}, {b[[1]], b[[2]], 1}, {c[[1]], c[[2]], 1}}],
    cen},
   (* cen is declared local so that it does not leak into the global context; *)
   (* scal vanishes for collinear input, which is not handled here *)
   cen = {cenx/scal, ceny/scal};
   If[Max[ArcCos[(b-a).(c-a)/(Norm[b-a]*Norm[c-a])],
           ArcCos[(a-b).(c-b)/(Norm[a-b]*Norm[c-b])],
           ArcCos[(a-c).(b-c)/(Norm[a-c]*Norm[b-c])]] < Pi/2, cen,
     If[Norm[cen-(a+b)/2] < Norm[cen-(a+c)/2],
       If[Norm[cen-(a+b)/2] < Norm[cen-(b+c)/2], (a+b)/2, (b+c)/2],
       If[Norm[cen-(a+c)/2] < Norm[cen-(b+c)/2], (a+c)/2, (b+c)/2]]]];

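Here cen is the circumcenter of the input points, which corresponds to our desired point only if it lies within the convex hull of the points, that is, only if the triangle is acute. Otherwise the desired point is the midpoint of the longest side, which the nested If selects as the side midpoint closest to cen. Now $B(P_1,r)\cap B(P_2,r)\cap B(P_3,r)$ is non-empty if and only if the distance from each of $P_1$, $P_2$, $P_3$ to CechPt[$P_1$, $P_2$, $P_3$] is less than or equal to $r$.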

Let $\gamma\colon I\to \Conf_4(\R^2)$ be a path, and $\gamma(0)= \{P_1,P_2,P_3,P_4\}$. At each $t\in I$ and for every pair and triple $P'\subseteq\gamma(t)$, we can find the smallest $r$ such that $\bigcap_{p\in P'}B(p,r)\neq \emptyset$. This gives 6 curves for the pairs $P'$, and 4 curves for the triples $P'$, which we can plot all together in Mathematica.


PList[t_] := {P1[t],P2[t],P3[t],P4[t]};
(* Graphs of pairwise distances *)
DistGraph1 = Plot[Table[Norm[pair[[1]]-pair[[2]]]/2, {pair,Subsets[PList[t],{2}]}], 
               {t, 0, 1}, PlotRange -> {{0,1},{0,1.5}}, PlotStyle -> {Gray}, AspectRatio -> 1];
(* Graphs of the largest distance from the points of each triple to its CechPt *)
DistGraph2 = Plot[Table[Max[
                  Table[Norm[triple[[k]]-CechPt@@triple],{k,1,3}]], {triple,Subsets[PList[t],{3}]}],
               {t, 0, 1}, PlotRange -> {{0,1},{0,1.5}}, PlotStyle -> {Orange}, AspectRatio -> 1];

The code is given so that it may be easily generalized to more than 4 points. Next, use the Manipulate command to add interactivity to the graphs.


Manipulate[{
  Show[DistGraph2, DistGraph1],
  Show[
    ParametricPlot[PList[t],{t,0,X[[1]]},PlotRange -> {{-2,2},{-2,2}},PlotStyle -> {Black}],
    Graphics[Join[
      {Opacity[.2],Red}, Table[Disk[point,X[[2]]],{point,PList[X[[1]]]}],
      {Opacity[1],Red}, Table[Circle[point,X[[2]]],{point,PList[X[[1]]]}],
      {Red,Disk[P1[X[[1]]],.05]},
      {Blue,Disk[P2[X[[1]]],.05]},
      {Darker[Green],Disk[P3[X[[1]]],.05]},
      {Yellow,Disk[P4[X[[1]]],.05]}]]],
  Graphics[Join[
    {Black, Thick},
    Flatten[Table[{{Opacity[0], Opacity[.3]}[[Boole[X[[2]] >= Norm[pair[[1]][[1]][X[[1]]] 
               - pair[[2]][[1]][X[[1]]]]/2] + 1]], Line[#[[2]]&/@pair]},
               {pair,Subsets[{{P1,{0,0}},{P2,{2,0}},{P3,{0,2}},{P4,{2,2}}},{2}]}]],
    Flatten[Table[{{Opacity[0], Opacity[.3]}[[Boole[X[[2]] >= Max[Table[Norm[triple[[k]][[1]][X[[1]]]
               - CechPt@@(#[[1]][X[[1]]]&/@triple)], {k,1,3}]]] + 1]], Polygon[#[[2]]&/@triple]},
               {triple,Subsets[{{P1,{0,0}},{P2,{2,0}},{P3,{0,2}},{P4,{2,2}}},{3}]}]],
    {Opacity[1], Red, Disk[{0,0},.07], Blue, Disk[{2,0},.07], Darker[Green], Disk[{0,2},.07],
               Yellow, Disk[{2,2},.07]}]]
  }, {{X, {.1, .1}}, Locator}]

This produces the interactive visualization below, allowing the user to drag the crosshairs on the graph on the left (graphs of when double and triple intersections are reached). The paths of the individual points $P_1,P_2,P_3,P_4$ are in the middle and the image of the unlabeled Čech map is on the right.

The graphs on the left stratify the strip $I\times \R_{\geqslant 0}$, so that the unlabeled Čech map is constant on each stratum. Computing the Betti numbers of each simplicial complex gives the CROCKER plot (see TZH) of the stratified space. We use the Čech instead of the Rips complex, so perhaps this should be called the CROCKEČ plot. The stratified space, 0-dimensional, and 1-dimensional plots are given below.

Here the Betti numbers were computed by inspection, since the complexes are so small. An extension would be to make this computation automatic once the input path $\gamma$ is given.
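
One possible way to automate this step is sketched below in Python (not the post's Mathematica). It is only a sketch under stated assumptions: the names `miniball_radius`, `rank_gf2` and `cech_betti` are illustrative and do not appear in the post's code, the triple test uses the circumradius formula rather than the determinant-based CechPt, and Betti numbers are taken over $\Z/2\Z$. Given a point configuration and a radius, it builds the edges and triangles of the Čech complex and returns $(b_0,b_1)$.

```python
from itertools import combinations
import math

def miniball_radius(pts):
    """Smallest r such that closed disks of radius r around 2 or 3 planar points share a point."""
    if len(pts) == 2:
        return math.dist(pts[0], pts[1]) / 2
    a, b, c = pts
    sides = sorted([math.dist(a, b), math.dist(b, c), math.dist(a, c)])
    # Right, obtuse or collinear: the first common point is the midpoint of the longest side.
    if sides[2] ** 2 >= sides[0] ** 2 + sides[1] ** 2:
        return sides[2] / 2
    # Acute: the first common point is the circumcenter, at distance abc / (4 * area).
    area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2
    return sides[0] * sides[1] * sides[2] / (4 * area)

def rank_gf2(vectors):
    """Rank over GF(2) of a list of vectors, each encoded as an integer bitmask."""
    vectors = list(vectors)
    rank = 0
    while vectors:
        pivot = vectors.pop()
        if pivot:
            rank += 1
            lowest_bit = pivot & -pivot
            vectors = [v ^ pivot if v & lowest_bit else v for v in vectors]
    return rank

def cech_betti(points, r):
    """Return (b0, b1) of the Cech complex of planar points at radius r, over GF(2)."""
    n = len(points)
    edges = [e for e in combinations(range(n), 2)
             if miniball_radius([points[i] for i in e]) <= r]
    triangles = [t for t in combinations(range(n), 3)
                 if miniball_radius([points[i] for i in t]) <= r]
    edge_index = {e: k for k, e in enumerate(edges)}
    # Each edge is the sum of its two endpoints; each triangle the sum of its three edges.
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    d2 = [sum(1 << edge_index[f] for f in combinations(t, 2)) for t in triangles]
    rk1, rk2 = rank_gf2(d1), rank_gf2(d2)
    return n - rk1, len(edges) - rk1 - rk2

# Hypothetical input: the corners of a square of side 2.
square = [(0, 0), (2, 0), (0, 2), (2, 2)]
for r in (0.5, 1.0, 1.5):
    print(r, cech_betti(square, r))
```

Evaluating `cech_betti` along a grid of $(t,r)$ values for a given path $\gamma$ would then give the data for the two plots, without inspecting the complexes by hand.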

The Mathematica code for this post is available online.

References: Topaz, Ziegelmeier, Halverson (Topological Data Analysis of Biological Aggregation Models)

Sunday, May 21, 2017

Categories and the TDA pipeline

Conference topic

This post contains topics and ideas from ACAT at HIM, April 2017, as presented by Professor Ulrich Bauer (see slide 11 of his presentation, online at ulrich-bauer.org/persistence-bonn-talk.pdf). The central theme is to assign categories and functors to analyze the process
\[
\text{filtration}\ \longrightarrow\ \text{(co)homology}\ \longrightarrow\ \text{barcode.}
\hspace{3cm}(\text{pipe}) \] Remark: The categories we will use are below. For filtrations, we have the ordered reals (though any poset $P$ would work) and topological spaces:
\begin{align*}
R\ :\ & \Obj(R) = \R, & \Top\ :\ & \Obj(\Top) = \{\text{topological spaces}\}, \\[5pt]
& \Hom(r,s) = \begin{cases}
\{r \mapsto s\}, & \text{ if } r\leqslant s, \\ \emptyset, & \text{ else,}
\end{cases} && \Hom(X,Y) = \{\text{continuous functions }f:X\to Y\}.
\end{align*}
For (co)homology groups, we have the category of (framed) vector spaces. We write $V^n$ for $V^{\oplus n} = V\oplus V\oplus \cdots \oplus V$, and $e^n$ for a frame of $V^n$ (see below).
\begin{align*}
\Vect\ :\ & \Obj(\Vect) = \{V^{\oplus n}\ :\ 0\leqslant n< \infty\},\\
& \Hom(V^n,V^m) = \{\text{homomorphisms }f:V^n\to V^m\}, \\[5pt]
\Vect^{fr}\ :\ & \Obj(\Vect^{fr}) = \{V^n\times e^n\ :\ 0\leqslant n<\infty\}, \\
& \Hom(V^n\times e^n,V^m\times e^m) = \{\text{hom. }f:V^n\to V^m,\ g:e^n\to e^m,\ g\in \Mat(n,m)\}.
\end{align*}
Finally for barcodes, we have $\Delta$, the category of finite ordered sets, and its variants. A partial injective function, or matching $f:A\nrightarrow B$ is a bijection $A'\to B'$ for some $A'\subseteq A$, $B'\subseteq B$.
\begin{align*}
\Delta\ :\ & \Obj(\Delta) = \{[n]=(0,1,\dots,n)\ :\ 0\leqslant n<\infty\},\\
& \Hom([n],[m]) = \{ \text{order-preserving functions }f:[n]\to [m]\}, \\[5pt]
\Delta'\ :\
& \Obj(\Delta')= \{a=(a_0<a_1<\cdots<a_n)\ :\ a_i\in \Z_{\geqslant 0}, 0\leqslant n<\infty\},\\ & \Hom(a,b) = \{\text{order-preserving functions }f:a\to b\}, \\[5pt]
\Delta''\ :\
& \Obj(\Delta'')= \{a=(a_0<a_1<\cdots<a_n)\ :\ a_i\in \Z_{\geqslant 0}, 0\leqslant n<\infty\},\\ & \Hom(a,b) = \{\text{order-preserving partial injective functions }f:a\nrightarrow b\}.
\end{align*}

Definition: A frame $e$ of a vector space $V^n$ is equivalently:

an ordered basis of $V^n$,
a linear isomorphism $V^n\to V^n$, or
an element in the fiber of the principal rank $n$ frame bundle over a point.

Frames (of possibly different sizes) are related by full rank elements of $\Mat(n,m)$, which contains all $n\times m$ matrices over a given field.

Definition: Let $(P,\leqslant)$ be a poset. An (indexed topological) filtration is a functor $F:P\to \Top$, with
\[
\Hom(F(r),F(s)) = \begin{cases}
\{\iota:F(r) \hookrightarrow F(s)\}, & \text{ if }r\leqslant s, \\ \emptyset, & \text{ else,}
\end{cases}
\]
where $\iota$ is the inclusion map. That is, we require $F(r)\subseteq F(s)$ whenever $r\leqslant s$.

Definition: A persistence module is the composition of functors $M_i:P \tov{F} \Top \tov{H_i} \Vect$.

Homology will be taken over some field $k$. A framed persistence module is the same composition as above, but mapping into $\Vect^{fr}$ instead. The framing is chosen to describe how many different vector spaces have already been encountered in the filtration.

Definition: A barcode is a collection of intervals of $\R$. It may also be viewed as the composition of functors $B_i:P\tov{F}\Top\tov{H_i}\Vect \tov{\dim}\Delta$.

As above, we may talk about a framed barcode by instead mapping into $\Vect^{fr}$ and then to $\Delta''$, keeping track of which vector spaces we have already encountered. This allows us to interpret the process $(\text{pipe})$ in two different ways. First we have the unframed approach
\[
\begin{array}{r c c c l}
\Top & \to & \Vect & \to & \Delta, \\
X_t & \mapsto & H_i(X_t;k) & \mapsto & [\dim(H_i(X_t;k))].
\end{array}
\]
The problem here is interpreting the inclusion $X_t\hookrightarrow X_{t'}$ as a map in $\Delta$, for instance, in the case when $H_i(X_t;k)\cong H_i(X_{t'};k)$, but $H_i(X_t\hookrightarrow X_{t'}) \neq \id$. To fix this, we have the framed interpretation of $(\text{pipe})$
\[
\begin{array}{r c c c l}
\Top & \to & \Vect^{fr} & \to & \Delta'', \\
X_t & \mapsto & H_i(X_t;k)\times e & \mapsto & [e].
\end{array}
\]
The first map produces a frame $e$ of size $n$, where $n$ is the total number of different vector spaces encountered over all $t'\leqslant t$, by setting the first $\dim(H_i(X_t;k))$ coordinates to be the appropriate ones, and then the rest. This is done with the second map to $\Delta''$ in mind, as the size of $[e]$ is $\dim(H_i(X_t;k))$, with only the first $\dim(H_i(X_t;k))$ basis vectors taken from $e$. As usual, these maps are best understood by example.

Example: Given the closed curve $X$ in $\R^2$ below, let $\varphi:X\to \R$ be the height map from the line 0, with $X_i=\varphi^{-1}(-\infty,i]$, for $i=r,s,t,u,v$. Let $e_i$ be the standard $i$th basis vector in $\R^N$.

Remark: This seems to make $(\text{pipe})$ functorial, as the maps $X_t\hookrightarrow X_{t'}$ may be naturally viewed as partial injective functions in $\Delta''$, to account for the problem mentioned with the unframed interpretation. However, we have traded locality for functoriality, as the image of $X_t$ in $\Delta''$ cannot be calculated without having calculated $X_{t'}$ for all $t'<t$.

References: Bauer (Algebraic perspectives of persistence), Bauer and Lesnick (Induced matchings and the algebraic stability of persistence barcodes)

Sunday, April 9, 2017

Distance and persistence diagrams

We assume we have a Morse-type function $f:X\to \R$, whose associated persistence diagram is $D(f) = \{f_1,\dots,f_n\}$, which we will think of as a collection of persistence birth-death pairs $f_i$ in the extended real plane $(\R^*)^2$. If the topological space $X$ was filtered without such a function, define one by $x\mapsto i$ where $i$ is the smallest index such that $x\in X_i$.

Definition: Let $f,g:X\to \R$ be two Morse-type functions with associated persistence diagrams $D(f)$, $D(g)$. The (Wasserstein) $q$-distance between $f$ and $g$ is defined as
\[
W_q(f,g) := \inf_{\sigma\in S_n} \left(\sum_{i=1}^n ||f_i-g_{\sigma(i)}||^q_\infty\right)^{1/q}.\]The bottleneck distance between $f$ and $g$ is
\begin{align*}
W_\infty(f,g) & := \lim_{q\to\infty} \left\{W_q(f,g)\right\} & (\text{limit of $q$-distances}) \\
& = \inf_{\sigma\in S_n}\ \max_i\ ||f_i-g_{\sigma(i)}||_\infty. & (\text{length of longest edge in best matching})
\end{align*}
Example: Consider the torus of inner and outer radius 1 embedded in the natural way. Let $f,g:T^2\to \R$ be height functions of the torus, but projecting to the planes $z=-2$ and $z=x-4$, respectively. Note all critical points occur on the plane $y=0$. Below, the slice at this plane is given (distances along planes from the first critical point are shown), as well as $D(f), D(g)$ on the same diagram (degrees of homology classes are shown).

For $D(f) = \{(0,\infty),(2,\infty),(4,\infty),(6,\infty)\}$ and $D(g) = \{(0,\infty), (2,\infty),(2\sqrt 2,\infty),(2+2\sqrt 2,\infty)\}$, it is clear that $\sigma=\id$ will be the best matching. The $q$-distance between $f$ and $g$ is then given by
\[
W_q(f,g) = \left(||(4,\infty)-(2\sqrt 2,\infty)||^q_\infty+||(6,\infty)-(2+2\sqrt 2,\infty)||^q_\infty\right)^{1/q} = 2^{1/q}(4-2\sqrt 2),\]with bottleneck distance $4-2\sqrt2$. However, we would like to say that these two functions are the same in some way, as no critical points are switched, and extended persistence allows us to do that. The decomposed extended persistence module is given below.

The extended persistence classes have length 3 ($(1,4)$ for the 0-class, $(4,1)$ for the 2-class) and 1 ($(2,3)$ and $(3,2)$ for the 1-classes), no matter if we use $f$ or $g$ to define the $X_i$ and $X^j$.
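
The $q$-distance and bottleneck distance computed for $D(f)$ and $D(g)$ in this example can be checked numerically. The following is a brute-force Python sketch, not an efficient algorithm: it tries every permutation, does not allow matching to the diagonal, and treats $\infty-\infty$ as 0, as the computation above does. The function names are illustrative.

```python
from itertools import permutations
import math

def linf(p, q):
    """L-infinity distance between two birth-death pairs, treating inf - inf as 0."""
    return max(0.0 if a == b else abs(a - b) for a, b in zip(p, q))

def wasserstein(D1, D2, q):
    """q-Wasserstein distance between two diagrams of equal size, over all bijections."""
    n = len(D1)
    return min(
        sum(linf(D1[i], D2[s]) ** q for i, s in enumerate(sigma)) ** (1.0 / q)
        for sigma in permutations(range(n))
    )

def bottleneck(D1, D2):
    """Bottleneck distance: smallest possible longest edge over all bijections."""
    n = len(D1)
    return min(
        max(linf(D1[i], D2[s]) for i, s in enumerate(sigma))
        for sigma in permutations(range(n))
    )

inf = math.inf
s2 = math.sqrt(2)
Df = [(0, inf), (2, inf), (4, inf), (6, inf)]
Dg = [(0, inf), (2, inf), (2 * s2, inf), (2 + 2 * s2, inf)]

for q in (1, 2, 10):
    # Compare the brute-force value with the closed form 2^(1/q) * (4 - 2*sqrt(2)).
    print(q, wasserstein(Df, Dg, q), 2 ** (1 / q) * (4 - 2 * s2))
print(bottleneck(Df, Dg), 4 - 2 * s2)
```

As $q$ grows, the factor $2^{1/q}$ tends to 1, which is how the $q$-distances approach the bottleneck distance $4-2\sqrt 2$.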

Remark: An interesting question to ask is how long does it take for an essential homology class to be built? Some things to keep in mind while resolving this question:

The 0-class case should be treated separately because of reduced homology
A class may be encountered several times (like the first 1-class in the example above)
What does it mean for a class to "begin being built" (this is probably the key)
A class is certainly "done being built" (the first time) when it first appears in the persistence module

It seems that the extended persistence pair gives the length between when the class is "done being built" the first time $f$ encounters it fully and when it "begins to be built" the last time $f$ encounters it.

The bottleneck distance satisfies a nice stability condition for tame functions $f:X\to \R$, which have finite dimensional homology groups $H_k(f^{-1}(-\infty,a])$ for all $a\in \R$.

Theorem (Cohen-Steiner, Edelsbrunner, Harer 2007): Let $f,g:X\to \R$ be tame. Then $W_\infty(f,g) \leqslant ||f-g||_\infty$.

This bound is reached when $g=f+c$ for some constant $c$, and the Wasserstein distance is 0 when $g(p_i)=f(p_i)$ for all critical points $p_i$. Hence it seems that, without stronger assumptions about $f$ and $g$, this bound is as good as we can get.
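
A short check of the case $g=f+c$, as a sketch assuming all births are finite and using the convention $\infty-\infty=0$ for essential classes: every pair of $D(g)$ is a pair of $D(f)$ shifted by $(c,c)$, so the identity matching gives
\[
W_\infty(f,g) \leqslant \max_i ||f_i - (f_i+(c,c))||_\infty = |c| = ||f-g||_\infty.
\]
Conversely, writing $b_i$ for the birth coordinate of $f_i$, any matching $\sigma\in S_n$ satisfies
\[
\sum_{i=1}^n \left(b_{\sigma(i)}+c-b_i\right) = nc,
\]
so at least one index $i$ has $|b_{\sigma(i)}+c-b_i|\geqslant |c|$. Hence no matching has longest edge shorter than $|c|$, and $W_\infty(f,g)=|c|$.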

References: Edelsbrunner and Morozov (Persistent homology: theory and practice), Cohen-Steiner, Edelsbrunner and Harer (Stability of persistence diagrams)

Monday, March 27, 2017

Revisiting persistent homology

Here we revisit and expand on persistent homology, previously in the post "Persistent homology (an example)," 2016-05-19. All homology, except where noted, will be over a field $k$, and $X$ will be a topological space. Often a Morse-type function $f:X\to \R$ is introduced along with $X$, but we will try to take a more abstract view.

Definition: The space $X$ may be described as a filtered space with a filtration of sublevel sets
\[ \emptyset = X_0 \subseteq X_1\subseteq \cdots \subseteq X_m = X, \] whose persistence module is the (not necessarily exact) sequence
\[ 0 = H(X_0) \to H(X_1)\to\cdots \to H(X_m) = H(X) \] of homology groups of the filtration.

Remark: Every persistence module may be uniquely decomposed as a direct sum of sequences $0\to k\to \cdots\to k\to 0$, where every map is $\id$, except the first and last. The indices at which each sequence in the summand has its first and last non-zero map are called the birth and death of the homology class represented by the sequence.

In some cases a homology class may not die, so we consider the extended persistence module to make everything finite. We introduce the superlevel sets $X^i = X\setminus X_i$. If $f$ was our Morse-type function for $X$, with critical points $p_1<\cdots<p_m$, then for $t_0<p_1<t_1<\cdots<p_m<t_m$, we set $X_i = f^{-1}(-\infty,t_i]$ and $X^i = f^{-1}[t_i,\infty)$. The extended persistence module of $X$ is
\[
0 = H_k(X_0) \to H_k(X_1)\to\cdots \to H_k(X_m) \to H_k(X,X^m) \to H_k(X,X^{m-1}) \to \cdots \to H_k(X,X^0)=0.
\]
Definition: The persistence of a homology class in a persistence module conveys the idea of how long it is alive, presented by a persistence pair.

The persistence of all homology classes in a persistence module is often presented in a persistence diagram, the collection of persistence pairs $(i,j)$, or $(p_i,p_j)$ or $(f(p_i),f(p_j))$, as desired; or a linear barcode, the collection of persistence pairs $(i,j)$ as intervals $[i,j]$, ordered vertically.

Example: Let $X = T^n=(S^1)^n$ be the $n$-torus. One filtration of $X$ is $X_0=\emptyset$ and $X_i = T^i$ for $1\leqslant i\leqslant n$. Note that $H_k(T^n,T^n\setminus X_n)=H_k(T^n)$ and $H_k(T^n,T^n\setminus X_0)=H_k(\emptyset)$. The first $n+1$ modules of the extended persistence module at level $k$ split into $\binom nk$ sequences, as $H_k(T^n) = \Z^{\binom nk}$. Geometric considerations allow $X^i = T^n\setminus T^i$ to be simplified in some cases. For instance, when $n=3$ and $k=0,1$ we have that $\widetilde H_k(T^3,T^3\setminus T^2)\cong \widetilde H_k(T^3,T^2) \cong \widetilde H_k(T^3/T^2)$, and knowing that $X^1=T^3\setminus T^1\simeq (S^1\vee S^1)\times S^1$, the relevant part of the long exact sequence for relative homology is

The two 1-cycles from $S^1\vee S^1\subset X^1$ map via $f$ to the same 1-cycle in $T^3$, hence $\text{im}(g)=\Z^2$. By exactness, $\text{ker}(g)=\Z^2$, and as $g$ is surjective, $A=\Z$. Hence the extended persistence $k$-modules decompose as

The persistence pairs are $(1,3)$ with multiplicity 2 and $(2,3)$, $(3,1)$ with multiplicity 1. The persistence diagrams and barcodes of the degree 0 and 1 homology classes are given below.

The diagonal $y=x$ is often given to indicate how short a lifespan a class has. Barcodes are usually not given for extended persistence diagrams, as length of a class (birth to death) is less important than position (above or below the diagonal).

Now we consider some generalizations of the ideas presented above.

Remark: A filtration can also be viewed as a diagram $X_0 \to X_1 \to \cdots \to X_m$, where each arrow is the inclusion map. We could generalize and consider a zigzag diagram, a sequence $X_0 \leftrightarrow X_1 \leftrightarrow \cdots \leftrightarrow X_m$, where $\leftrightarrow$ represents either $\to$ or $\leftarrow$. Homology can be applied and the resulting sequence can also be uniquely decomposed into summands $k \leftrightarrow \cdots \leftrightarrow k$ where every arrow is the identity, giving zigzag persistent homology.

Remark: A filtration could also be viewed as a functor $F:\{0,\dots,m\}\to \text{Top}$, where $F(i)=X_i$ and $F(i\to j)$, for $j>i$, is the composition of maps $X_i\to \cdots \to X_j$. Hence the degree-$k$ persistent homology of $X_i$ can be defined as the image of the maps $H_kF(i\to j)$, for all $j>i$, and the functor $H_kF:\{0,\dots,m\}\to \text{Vec}$ may be viewed as the $k$th persistence module. This is a categorification of persistent homology.
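
To make the functorial description concrete, here is a small Python sketch that takes a persistence module as a list of matrices over $\Z/2\Z$, computes the ranks of the composite maps $H_kF(i\to j)$, and recovers interval multiplicities by inclusion-exclusion on those ranks. The module used is a made-up toy example, not one from this post, and the function names are illustrative.

```python
def matmul_gf2(A, B):
    """Product A*B over GF(2); matrices are lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

def rank_gf2(M):
    """Rank over GF(2) by Gaussian elimination on a copy of M."""
    M = [row[:] for row in M]
    rank = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((r for r in range(rank, len(M)) if M[r][c]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(len(M)):
            if r != rank and M[r][c]:
                M[r] = [x ^ y for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

def persistent_ranks(dims, maps):
    """beta[i][j] = rank of the composite V_i -> V_j for i <= j; maps[j] sends V_j to V_{j+1}."""
    n = len(dims)
    beta = [[0] * n for _ in range(n)]
    for i in range(n):
        beta[i][i] = dims[i]
        comp = None
        for j in range(i + 1, n):
            comp = maps[j - 1] if comp is None else matmul_gf2(maps[j - 1], comp)
            beta[i][j] = rank_gf2(comp)
    return beta

def interval_multiplicities(beta):
    """Number of summands alive exactly on the indices i..j, by inclusion-exclusion."""
    n = len(beta)
    get = lambda i, j: beta[i][j] if 0 <= i <= j < n else 0
    mult = {}
    for i in range(n):
        for j in range(i, n):
            m = get(i, j) - get(i - 1, j) - get(i, j + 1) + get(i - 1, j + 1)
            if m:
                mult[(i, j)] = m
    return mult

# Toy module k -> k^2 -> k (hypothetical): include the first coordinate, then add coordinates.
dims = [1, 2, 1]
maps = [[[1], [0]], [[1, 1]]]
beta = persistent_ranks(dims, maps)
print(interval_multiplicities(beta))
```

For this toy module the result is one summand alive on indices 0 through 2 and one alive only at index 1, which accounts for the dimensions 1, 2, 1.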

Remark: A space $X$ can be filtered in several different ways. A multifiltration $X_\alpha$, for $\alpha$ a multi-index, is a collection of filtrations such that fixing all but one of the indices in $\alpha$ gives a (one-dimensional) filtration of $X$. The multidimensional persistence of $X_\alpha$ is a $|\alpha|$-dimensional grid of homology groups, with the barcode generalizing to the rank invariant, a map on the grid.

Another generalization, viewing filtrations as quivers, will not be discussed here, but rather presented as a separate post later.

References: Edelsbrunner and Morozov (Persistent homology: theory and practice), Carlsson, de Silva, and Morozov (Zigzag persistent homology and real-valued functions), Bubenik and Scott (Categorification of persistent homology), Carlsson and Zomorodian (The theory of multidimensional persistence)

Thursday, May 19, 2016

Persistent homology (an example)

Here we follow the article "Persistent homology - a Survey," by Herbert Edelsbrunner and John Harer, published in 2008 in "Surveys on discrete and computational geometry," Volume 453.

Consider the sphere, which has known homology groups. Consider a slightly bent embedding of the sphere in $\R^3$, call it $M$, as in the diagram below (imagine it as a hollow blob, whose outline is drawn below). Let $f:M\to \R$ be the height function, measuring the distance from a point in $M$ to a plane just below $M$, coming out of the page. Then we have some critical values $t_0,t_1,t_2,t_3$, as indicated below. Note we have embedded the shape so that no two critical points of $f$ have the same value.

This is reminiscent of Morse theory. Set $M_i = f^{-1}[0,t_i]$ and $b_i = \dim(H_i)$ the $i$th Betti number. Then we may easily calculate the Betti numbers of the $M_j$, as in the table below.
\[
\renewcommand\arraystretch{1.3}
\begin{array}{r|c|c|c|c|c}
& M_0 & M_1 & M_2 & M_3 & M \\\hline
b_0 & 1 & 2 & 1 & 1 & 1 \\\hline
b_1 & 0 & 0 & 0 & 0 & 0 \\\hline
b_2 & 0 & 0 & 0 & 1 & 1
\end{array}
\renewcommand\arraystretch{1}
\]
Definition: In the context above, suppose that there is some $p$ and $j>i$ such that:

$b_p(M_i)=b_p(M_{i-1})+1$,
$b_p(M_j)=b_p(M_{j-1})-1$, and
the generator of $H_p$ introduced at $t_i$ is the same generator of $H_p$ that disappears at $t_j$.

Then $(i,j)$ (or $(t_i,t_j)$) is called a persistence pair and the persistence of $(i,j)$ is $j-i$ (or $t_j-t_i$).

For $i$ not in a persistence pair, we say that $i$ represents an essential cycle, or that the persistence of $i$ is infinite. In the example considered, the only persistence pair is $(1,2)$. This may be presented in a persistence diagram, with the indices of critical points on both axes, and the persistence measured as a vertical distance.

If we put a simplicial complex structure on $M$, we may also calculate the homology (and persistence pairs, although they may be different from the ones found above). To make calculations easier, we instead describe a CW structure on our embedded sphere $M$ (with $X_i$ the $i$-skeleton, and the ordering of the $i$-cells as indicated). The results will be the same as for a simplicial complex structure.

This gives one 0-cell, two 1-cells, and three 2-cells (with the obvious gluings), allowing us to construct the chain groups $C_p$ as well as maps between them. The map $d_p:C_p\to C_{p-1}$ as a matrix has size $\dim(C_{p-1})\times \dim(C_p)$, and has entry $(i,j)$ equal to the number of times, counting multiplicity, that the $i$th $(p-1)$-cell is a face of the $j$th $p$-cell. Calculations are done in $\Z/2\Z$.
\[
d_2\ :\ C_2\to C_1
\hspace{.5cm}\text{is}\hspace{.5cm}
\begin{bmatrix}
1 & 0 & 1 \\ 0 & 1 & 1
\end{bmatrix}
\hspace{2cm}
d_1\ :\ C_1\to C_0
\hspace{.5cm}\text{is}\hspace{.5cm}
\begin{bmatrix}
0 & 0
\end{bmatrix}
\]
The Betti numbers are then $b_p = \dim(C_p) - \text{rk}(d_p)-\text{rk}(d_{p+1})$. From above, it is immediate that $\text{rk}(d_1)=0$, $\text{rk}(d_2) = 2$, and $\text{rk}(d_p)=0$ for all other $p$. This tells us that
\begin{align*}
b_0 & = \dim(C_0) - \text{rk}(d_0) - \text{rk}(d_1) = 1 - 0 - 0 = 1, \\
b_1 & = \dim(C_1) - \text{rk}(d_1) - \text{rk}(d_2) = 2 - 0 - 2 = 0, \\
b_2 & = \dim(C_2) - \text{rk}(d_2) - \text{rk}(d_3) = 3 - 2 - 0 = 1, \\
\end{align*}
as expected. To find the persistence pairs, we introduce a filtration on the simplices (equivalently, on the cells) by always having the faces of a cell precede the cell, as well as lower-dimensional cells preceding higher-dimensional cells. Using the same ordering as described above, consider the following filtration:
\begin{align*}
K_0 & = \{\}, \\
K_1 & = \{e^0_1\}, \\
K_2 & = \{e^0_1,e^1_1,e^1_2\} ,\\
K_3 & = \{e^0_1,e^1_1,e^1_2,e^2_1,e^2_2,e^2_3\},
\end{align*}
so $\emptyset = K_0\subset K_1\subset K_2\subset K_3 = M$. This gives an ordering on all the cells of $M$, namely
\[
\sigma_1 = e^0_1,\
\sigma_2 = e^1_1,\
\sigma_3 = e^1_2,\
\sigma_4 = e^2_1,\
\sigma_5 = e^2_2,\
\sigma_6 = e^2_3.
\]
Construct the boundary matrix $D$, with the $(i,j)$ entry of $D$ equal to the number of times, counting multiplicity, modulo 2, that $\sigma_i$ is a codimension 1 face of $\sigma_j$. In the case of our example sphere, we get the matrix
\[
D = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\ \ \sim\ \
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
in its reduced form (call it $\tilde D$). With respect to the matrix $\tilde D$, define the following numbers:
\begin{align*}
low(j) & = \text{the row number of the lowest non-zero entry in column $j$} ,\\
zero(p) & = \text{the number of zero columns that correspond to $p$-simplices} ,\\
one(p) & = \text{the number of 1s in rows that correspond to $p$-simplices}.
\end{align*}
We calculate all the relevant values of these expressions to be as below.
\begin{align*}
low(1) & = 0 & zero(0) & = 1 & one(0) & = 0 \\
low(2) & = 0 & zero(1) & = 2 & one(1) & = 2 \\
low(3) & = 0 & zero(2) & = 1 & one(2) & = 0 \\
low(4) & = 2 \\
low(5) & = 3 \\
low(6) & = 0
\end{align*}
For persistence, we have

if $low(j)=i\neq 0$, then $(i,j)$ is a persistence pair,
if $low(j)=0$ and there is no $k$ such that $low(k)=j$, then $j$ is an essential cycle.

For our sphere example, we get two persistence pairs $(2,4)$ and $(3,5)$, and two essential cycles 1 and 6. Note that this is different from the persistence pairs found by the height function $f:M\to \R$ earlier (but there are still two essential cycles), because there we were comparing the homologies $H_p(M_j)$, but here we are comparing $H_p(K_\ell)$. The persistence diagram is as below.
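
The reduction and pairing steps can be sketched in Python as follows. This assumes the usual left-to-right column reduction over $\Z/2\Z$ (add an earlier column whenever two columns share the same lowest 1); the post does not spell out which reduction was used, so treat that choice, and the function names, as assumptions.

```python
def reduce_boundary(D):
    """Column-reduce D over Z/2 and return low as a list (1-indexed rows, 0 for a zero column)."""
    n = len(D)
    # Store each column as the set of (0-indexed) rows holding a 1.
    cols = [{i for i in range(n) if D[i][j]} for j in range(n)]
    owner = {}      # lowest row -> index of the column that already has that lowest 1
    low = [0] * n
    for j in range(n):
        # Adding columns over Z/2 is a symmetric difference of row sets.
        while cols[j] and max(cols[j]) in owner:
            cols[j] ^= cols[owner[max(cols[j])]]
        if cols[j]:
            owner[max(cols[j])] = j
            low[j] = max(cols[j]) + 1
    return low

def pairs_and_essential(low):
    """Persistence pairs (i, j) with low(j) = i != 0, and the unpaired (essential) indices."""
    pairs = [(i, j + 1) for j, i in enumerate(low) if i]
    paired = {i for i, _ in pairs} | {j for _, j in pairs}
    essential = [j for j in range(1, len(low) + 1) if j not in paired]
    return pairs, essential

# Boundary matrix of the bent sphere, rows and columns ordered sigma_1, ..., sigma_6.
D = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
low = reduce_boundary(D)
pairs, essential = pairs_and_essential(low)
print(low)        # [0, 0, 0, 2, 3, 0]
print(pairs)      # [(2, 4), (3, 5)]
print(essential)  # [1, 6]
```

The only column that changes is the sixth: it shares its lowest 1 with the fifth column, and after adding the fifth and then the fourth column it becomes zero.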

As an added feature, from the numbers above we may calculate the homology and relative homology groups. Construct the relative chain groups $C_p(M,K_\ell) = C_p(M)/C_p(K_\ell)$ and set $zero(p,\ell)$ to be $zero(p)$ for the lower right submatrix of $\tilde D$ corresponding to the cells in $M-K_\ell$ (and similarly for $one(p,\ell)$). We find these numbers for the bent sphere to be as below.
\begin{align*}
zero(0,0) & = 1 & zero(0,1) & = 0 & zero(0,2) & = 0 & zero(0,3) & = 0 \\
zero(1,0) & = 2 & zero(1,1) & = 2 & zero(1,2) & = 0 & zero(1,3) & = 0 \\
zero(2,0) & = 1 & zero(2,1) & = 1 & zero(2,2) & = 3 & zero(2,3) & = 0 \\[10pt]
one(0,0) & = 0 & one(0,1) & = 0 & one(0,2) & = 0 & one(0,3) & = 0 \\
one(1,0) & = 2 & one(1,1) & = 2 & one(1,2) & = 0 & one(1,3) & = 0 \\
one(2,0) & = 0 & one(2,1) & = 0 & one(2,2) & = 0 & one(2,3) & = 0
\end{align*}
Note that $zero(p,0)=zero(p)$ and $one(p,0)=one(p)$, as well as $zero(p,3)=one(p,3)=0$. The above numbers are useful in calculating
\begin{align*}
\dim(H_p(M)) & = zero(p)-one(p), \\
\dim(H_p(M,K_\ell)) & = zero(p,\ell) - one(p,\ell).
\end{align*}
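
The counts $zero(p,\ell)$ and $one(p,\ell)$ can also be read off mechanically. Below is a Python sketch for the bent sphere; the lists `dim` and `k_size`, which record the dimension of each $\sigma_i$ and the number of cells in each $K_\ell$, are helper names introduced here for illustration.

```python
# Reduced boundary matrix of the bent sphere (rows and columns sigma_1, ..., sigma_6).
D_red = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
dim = [0, 1, 1, 2, 2, 2]   # dimension of sigma_1, ..., sigma_6
k_size = [0, 1, 3, 6]      # number of cells in K_0, K_1, K_2, K_3

def zero_one(p, ell=0):
    """Return (zero(p, ell), one(p, ell)) from the lower right submatrix for M - K_ell."""
    idx = range(k_size[ell], len(dim))
    # Zero columns of the submatrix that belong to p-cells.
    zero = sum(1 for j in idx if dim[j] == p and not any(D_red[i][j] for i in idx))
    # 1s of the submatrix lying in rows that belong to p-cells.
    one = sum(D_red[i][j] for i in idx if dim[i] == p for j in idx)
    return zero, one

def dim_H(p, ell=0):
    """dim H_p(M, K_ell) = zero(p, ell) - one(p, ell); ell = 0 gives dim H_p(M)."""
    zero, one = zero_one(p, ell)
    return zero - one

print([dim_H(p) for p in range(3)])     # [1, 0, 1]
print([dim_H(p, 1) for p in range(3)])  # [0, 0, 1]
```

With $\ell=0$ this recovers the Betti numbers $1,0,1$ of the sphere, and with $\ell=1$ the relative groups of the pair $(M,K_1)$, that is, of the sphere relative to a point.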

References: Edelsbrunner and Harer (Persistent homology - a Survey)