
Thursday, August 3, 2017

New directions in TDA

 Conference topic

This post is informal, meant as a collection of (personally) new things from the workshop "Topological data analysis: Developing abstract foundations" at the Banff International Research Station, July 31 - August 4, 2017. New questions that came up:
  1. Does there exist a constructible sheaf valued in persistence modules over $\Ran^{\leqslant n}(M)$?
    • On the stalks it should be the persistence module of $P\in \Ran^{\leqslant n}(M)$. What about arbitrary open sets?
    • Is there such a thing as a colimit of persistence modules?
    • Uli Bauer suggested something to do with ordering the elements of the sample and taking small open sets.
  2. Can framed vector spaces be used to make the TDA pipeline functorial? Does Ezra Miller's work help?
    • Should be a functor from $(\R,\leqslant)$, the reals as a poset, to $\text{Vect}$ or $\text{Vect}_{fr}$, the category of (framed) vector spaces. Filtration function $f:\R^n\to \R$ is assumed to be given. (A concrete sketch is given after this list.)
    • Framed perspective should not be too difficult, just need to find right definitions.
    • Does this give an equivalence of categories (category of persistence modules and category of matchings)? Is that what we want? Do we want to keep only specific properties?
    • Ezra's work is very dense and unpublished. But it seems to have a very precise functoriality (which is not the main thrust of the work, however).
  3. Can the Bubenik-de Silva-Scott interleaving categorification be viewed as a (co)limit? Diagrams are suggestive.
    • Reference is 1707.06288 on the arXiv.
    • Probably not a colimit, because that would be very large, though the arrows suggest a colimit.
    • Have to be careful, because the (co)limit should be in the category of posets, not just interleavings.
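On question 2, the bare functor $(\R,\leqslant)\to\text{Vect}$ is easy to model concretely when the persistence module is a finite sum of interval modules. Below is a minimal sketch (my own illustration, not something from the workshop) that also lets one check functoriality $M(t\leqslant u)\circ M(s\leqslant t) = M(s\leqslant u)$ numerically.

import numpy as np

# A persistence module that is a finite sum of interval modules [b, d):
# M(t) = k^(number of intervals containing t), and M(s <= t) matches an
# interval alive at s to itself if it is still alive at t, else to 0.

def alive(intervals, t):
    return [i for i, (b, d) in enumerate(intervals) if b <= t < d]

def structure_map(intervals, s, t):
    """Matrix of M(s <= t) in the bases indexed by intervals alive at s, t."""
    assert s <= t
    rows, cols = alive(intervals, t), alive(intervals, s)
    A = np.zeros((len(rows), len(cols)))
    for j, i in enumerate(cols):
        if i in rows:
            A[rows.index(i), j] = 1.0
    return A

intervals = [(0.0, 2.0), (1.0, 3.0)]
M1 = structure_map(intervals, 0.0, 1.5)
M2 = structure_map(intervals, 1.5, 2.5)
assert np.allclose(M2 @ M1, structure_map(intervals, 0.0, 2.5))  # functoriality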

New things to learn about:
  1. Algebraic geometry / homotopy theory: the étale space of a sheaf, Kan extensions, model categories, symmetric monoidal categories.
  2. TDA related: Gromov-Hausdorff distance, the universal distance (Michael Lesnick's thesis and papers), merge trees, Reeb graphs, Mapper (the program).

Sunday, May 21, 2017

Categories and the TDA pipeline

 Conference topic

This post contains topics and ideas from ACAT at HIM, April 2017, as presented by Professor Ulrich Bauer (see slide 11 of his presentation, online at ulrich-bauer.org/persistence-bonn-talk.pdf). The central theme is to assign categories and functors to analyze the process
\[
\text{filtration}\ \longrightarrow\ \text{(co)homology}\ \longrightarrow\ \text{barcode.}
\hspace{3cm}(\text{pipe}) \] Remark: The categories we will use are below. For filtrations, we have the ordered reals (though any poset $P$ would work) and topological spaces:
\begin{align*}
R\ :\ & \Obj(R) = \R,  & \Top\ :\ & \Obj(\Top) = \{\text{topological spaces}\}, \\[5pt]
& \Hom(r,s) = \begin{cases}
\{r \mapsto s\}, & \text{ if } r\leqslant s, \\ \emptyset, & \text{ else,}
\end{cases} && \Hom(X,Y) = \{\text{functions }f:X\to Y\}.
\end{align*}
For (co)homology groups, we have the category of (framed) vector spaces. We write $V^n$ for $V^{\oplus n} = V\oplus V\oplus \cdots \oplus V$, and $e^n$ for a frame of $V^n$ (see below).
\begin{align*}
\Vect\ :\ & \Obj(\Vect) = \{V^{\oplus n}\ :\ 0\leqslant n< \infty\},\\
& \Hom(V^n,V^m) = \{\text{homomorphisms }f:V^n\to V^m\}, \\[5pt]
\Vect^{fr}\ :\ & \Obj(\Vect^{fr}) = \{V^n\times e^n\ :\ 0\leqslant n<\infty\}, \\
& \Hom(V^n\times e^n,V^m\times e^m) = \{\text{hom. }f:V^n\to V^m,\ g:e^n\to e^m,\ g\in \Mat(n,m)\}.
\end{align*}
Finally, for barcodes, we have $\Delta$, the category of finite ordered sets, and its variants. A partial injective function, or matching, $f:A\nrightarrow B$ is a bijection $A'\to B'$ for some $A'\subseteq A$, $B'\subseteq B$.
\begin{align*}
\Delta\ :\ & \Obj(\Delta) = \{[n]=(0,1,\dots,n)\ :\ 0\leqslant n<\infty\},\\
& \Hom([n],[m]) = \{ \text{order-preserving functions }f:[n]\to [m]\}, \\[5pt]
\Delta'\ :\
& \Obj(\Delta')= \{a=(a_0<a_1<\cdots<a_n)\ :\ a_i\in \Z_{\geqslant 0}, 0\leqslant n<\infty\},\\ & \Hom(a,b) = \{\text{order-preserving functions }f:a\to b\}, \\[5pt]
\Delta''\ :\
& \Obj(\Delta'')= \{a=(a_0<a_1<\cdots<a_n)\ :\ a_i\in \Z_{\geqslant 0}, 0\leqslant n<\infty\},\\ & \Hom(a,b) = \{\text{order-preserving partial injective functions }f:a\nrightarrow b\}.
\end{align*}
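The matchings of $\Delta''$ can be modeled concretely as dictionaries that are injective on their domain; composing two matchings gives another matching. A small Python sketch (illustrative, not from the talk):

def is_matching(f):
    """A matching a -/-> b as a dict, injective on its domain."""
    return len(set(f.values())) == len(f)

def compose(g, f):
    """Composition g∘f, defined wherever both maps are defined."""
    return {x: g[f[x]] for x in f if f[x] in g}

f = {0: 1, 2: 2}   # defined on 0 and 2, undefined on 1
g = {1: 0, 2: 3}
assert is_matching(f) and is_matching(g) and is_matching(compose(g, f))
assert compose(g, f) == {0: 0, 2: 3}
# morphisms of Delta'' are additionally order-preserving; that check is omitted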

Definition: A frame $e$ of a vector space $V^n$ is equivalently:
  • an ordered basis of $V^n$,
  • a linear isomorphism $V^n\to V^n$, or
  • an element in the fiber of the principal rank $n$ frame bundle over a point.
Frames (of possibly different sizes) are related by full-rank elements of $\Mat(n,m)$, the set of all $n\times m$ matrices over a given field.

Definition: Let $(P,\leqslant)$ be a poset. An (indexed topological) filtration is a functor $F:P\to \Top$, with
\[
\Hom(F(r),F(s)) = \begin{cases}
\{\iota:F(r) \hookrightarrow F(s)\}, & \text{ if }r\leqslant s, \\ \emptyset, & \text{ else,}
\end{cases}
\]
where $\iota$ is the inclusion map. That is, we require $F(r)\subseteq F(s)$ whenever $r\leqslant s$.
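As a toy model of this definition (a sketch, with all names my own), a sublevel-set filtration of a function on a finite set satisfies the inclusion condition by construction:

def sublevel_filtration(f):
    """F(t) = f^{-1}((-inf, t]) for f given as a dict {point: value}."""
    def F(t):
        return frozenset(p for p, val in f.items() if val <= t)
    return F

F = sublevel_filtration({'a': 0.0, 'b': 1.0, 'c': 2.0})
assert F(0.5) <= F(1.5) <= F(2.5)   # the inclusions F(r) ⊆ F(s) for r ≤ s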

Definition: A persistence module is the composition of functors $M_i:P \tov{F} \Top \tov{H_i} \Vect$.

Homology will be taken over some field $k$. A framed persistence module is the same composition as above, but mapping into $\Vect^{fr}$ instead. The framing is chosen to describe how many different vector spaces have already been encountered in the filtration.

Definition: A barcode is a collection of intervals of $\R$. It may also be viewed as the composition of functors $B_i:P\tov{F}\Top\tov{H_i}\Vect \tov{\dim}\Delta$.

As above, we may talk about a framed barcode by instead mapping into $\Vect^{fr}$ and then to $\Delta''$, keeping track of which vector spaces we have already encountered. This allows us to interpret the process $(\text{pipe})$ in two different ways. First we have the unframed approach
\[
\begin{array}{r c c c l}
\Top & \to & \Vect & \to & \Delta, \\
X_t & \mapsto & H_i(X_t;k) & \mapsto & [\dim(H_i(X_t;k))].
\end{array}
\]
The problem here is interpreting the inclusion $X_t\hookrightarrow X_{t'}$ as a map in $\Delta$, for instance, in the case when $H_i(X_t;k)\cong H_i(X_{t'};k)$, but $H_i(X_t\hookrightarrow X_{t'}) \neq \id$. To fix this, we have the framed interpretation of $(\text{pipe})$
\[
\begin{array}{r c c c l}
\Top & \to & \Vect^{fr} & \to & \Delta'', \\
X_t & \mapsto & H_i(X_t;k)\times e & \mapsto & [e].
\end{array}
\]
The first map produces a frame $e$ of size $n$, where $n$ is the total number of distinct vector spaces encountered over all $t'\leqslant t$: the first $\dim(H_i(X_t;k))$ coordinates are the ones spanning the current homology, and the remaining coordinates record the spaces encountered earlier. This is done with the second map to $\Delta''$ in mind, as $[e]$ has size $\dim(H_i(X_t;k))$, consisting of only the first $\dim(H_i(X_t;k))$ basis vectors of $e$. As usual, these maps are best understood by example.

Example: Given the closed curve $X$ in $\R^2$ below, let $\varphi:X\to \R$ be the height function measured from the line $0$, with $X_i=\varphi^{-1}(-\infty,i]$ for $i=r,s,t,u,v$. Let $e_i$ be the standard $i$th basis vector in $\R^N$.

[Figure: the closed curve $X$ with the height function $\varphi$ and the levels $r<s<t<u<v$.]

Remark: This seems to make $(\text{pipe})$ functorial, as the maps $X_t\hookrightarrow X_{t'}$ may be naturally viewed as partial injective functions in $\Delta''$, accounting for the problem mentioned with the unframed interpretation. However, we have traded locality for functoriality, as the image of $X_t$ in $\Delta''$ cannot be calculated without having calculated $X_{t'}$ for all $t'<t$.
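To see the pipeline on an example of this kind, here is a minimal computational sketch (my own, not from the talk): the degree-0 barcode of a vertex-filtered cycle graph, a discrete stand-in for the curve $X$ with its height function, computed by union-find with the elder rule.

import numpy as np

def h0_barcode(values, edges):
    """Degree-0 barcode of a vertex-filtered graph: vertex v enters at
    values[v], edge (u, v) at max(values[u], values[v]); at each merge the
    younger component dies (elder rule)."""
    parent = list(range(len(values)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    bars = []
    for u, v in sorted(edges, key=lambda e: max(values[e[0]], values[e[1]])):
        t = max(values[u], values[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if values[ru] > values[rv]:   # keep the older root, born lower
            ru, rv = rv, ru
        if values[rv] < t:            # record only bars of positive length
            bars.append((values[rv], t))
        parent[rv] = ru
    roots = {find(i) for i in range(len(values))}
    return sorted(bars) + sorted((values[r], np.inf) for r in roots)

# a closed curve, discretized as a cycle, with two local minima of height
n = 200
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
heights = np.cos(2 * theta)
edges = [(i, (i + 1) % n) for i in range(n)]
print(h0_barcode(heights, edges))   # [(-1.0, 1.0), (-1.0, inf)]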

References: Bauer (Algebraic perspectives of persistence), Bauer and Lesnick (Induced matchings and the algebraic stability of persistence barcodes)

Sunday, February 12, 2017

Generalizing planar detection to k-plane detection

In this post, the planar detection algorithm in $\R^3$ of Bauer and Polthier, from Detection of Planar Regions in Volume Data for Topology Optimization, is generalized to detect $k$-planes with largest density in $\R^n$. Let $\Omega\subset \R^n$ be the compact support of a piecewise-constant probability density function $\rho:\R^n\to \R_{\geqslant 0}$.

Definition: Let $(G,\rho)$ be a grid, where $G \subset \lambda \Z^n + c \subset \R^n$ is a lattice in $\Omega$. The cell of the grid at $x\in G$ is $B_\infty(x,\lambda/2) = \{y\in \R^n\ :\ ||x-y||_\infty\leqslant\lambda/2\}$. Every cell is assigned a value
\[
\int_{B_\infty(x,\lambda/2)}\rho\ dx,
\]
called the mass of the cell, which may be thought of as a type of Radon transform of $\rho$.
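As a sketch of this discretization (the Monte Carlo rule and all names are my own choices, not from the paper), the masses can be approximated cell by cell:

import numpy as np

def cell_masses(centers, lam, rho, samples=64, seed=0):
    """Approximate the mass of each cell B_inf(x, lam/2) by Monte Carlo:
    the mean of rho over uniform points in the cell, times the cell volume."""
    rng = np.random.default_rng(seed)
    centers = np.asarray(centers, dtype=float)
    n = centers.shape[1]
    offsets = rng.uniform(-lam / 2, lam / 2, size=(samples, n))
    return np.array([rho(x + offsets).mean() * lam ** n for x in centers])

# example: an (unnormalized) density concentrated near the plane x_0 = 0 in R^3
rho = lambda pts: np.exp(-50 * pts[:, 0] ** 2)
grid = np.stack(np.meshgrid(*[np.linspace(-1, 1, 9)] * 3), axis=-1).reshape(-1, 3)
masses = cell_masses(grid, lam=0.25, rho=rho)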

Assuming that $k$ is a global variable, running $Recursive(G,w,n-1)$ will give the desired result. This algorithm is a naive generalization of Bauer and Polthier, and suffers from calculating mass along the same $k$-plane several times whenever $k<n-1$ (as a $k$-plane does not lie in a unique $(k+1)$-plane).

Algorithm: $k$PlaneFinder
$Recursive(G,w,k')$:
Input: A grid $(G,\rho)$, a width $w$ of fattened planes, the current plane dimension $k'$ with $k\leqslant k'<n$
Output: A $k$-planar connected component covering the most mass in $G$

discretize the unit $k'$-hemisphere in an appropriate manner
order the vertices by a Hamiltonian path
for each vertex $\textbf{n}$:
    sort the grid cells in direction $\textbf{n}$
    discretize the range in direction $\textbf{n}$ equidistantly
    for each $k'$-plane $(\textbf{n},d)$:
        collect the cells closer than $w$ to the $k'$-plane into a graph $G'$
        if $k'\neq k$:
            run $Recursive(G',w,k'-1)$
        else:
            compute the connected component having the most mass in $G'$
return the connected $k$-planar component having the most mass (and the corresponding $k$-plane)
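For a fixed direction $\textbf{n}$, the inner steps of sorting the cells and collecting those within $w$ of each plane $(\textbf{n},d)$ might look as follows; this is a sketch with all names my own, not Bauer and Polthier's code.

import numpy as np

def slab_masses(centers, masses, normal, w):
    """For a fixed unit direction `normal`, bin the cells into fattened
    planes {x : |<x, normal> - d| <= w} at equidistant offsets d, and
    return (offset, cell indices, total mass) per slab."""
    dist = centers @ normal                  # signed offsets of the cell centers
    slabs = []
    for d in np.arange(dist.min(), dist.max() + w, 2 * w):
        idx = np.flatnonzero(np.abs(dist - d) <= w)
        slabs.append((d, idx, masses[idx].sum()))
    return slabs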


Measuring along connected components of a $k$-plane works the same way as in the original version, as the grid on $\R^n$ similarly induces a connectivity graph.

Remark: Bauer and Polthier cite Katanforoush and Shahshahani for evenly sampling points on the unit 2-sphere, but it is not clear how their method (using the inscribed icosahedron) generalizes. Another approach is to sample points uniformly at random on the relevant unit sphere and keep those on one hemisphere. A Hamiltonian path could then be built from an arbitrary starting point by greedily choosing consecutive vertices (with respect to Euclidean distance), to keep down the cost of the successive sorting operations.
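A minimal sketch of this alternative (my own illustration; folding onto a hemisphere uses the fact that antipodal normals define the same plane):

import numpy as np

def hemisphere_sample(m, count, seed=0):
    """Uniform directions on the unit sphere in R^m (normalized Gaussians),
    folded onto one hemisphere."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((count, m))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    v[v[:, -1] < 0] *= -1          # fold onto the hemisphere with last coord >= 0
    return v

def greedy_path(points):
    """Order the points by greedy nearest-neighbor from an arbitrary start,
    approximating a Hamiltonian path that keeps consecutive sorts cheap."""
    remaining = list(range(1, len(points)))
    path = [0]
    while remaining:
        last = points[path[-1]]
        nxt = min(remaining, key=lambda i: np.linalg.norm(points[i] - last))
        remaining.remove(nxt)
        path.append(nxt)
    return path

directions = hemisphere_sample(4, 50)   # candidate normals in R^4
order = greedy_path(directions)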

Recall the Grassmannian $Gr(n,k)$ of all $k$-planes in $\R^n$ through the origin, a compact manifold of dimension $k(n-k)$. Note that any $k$-plane $P\subset \R^n$ is a translation of an element $Q\in Gr(n,k)$ by an element of $Q^\perp$ (we conflate notation for $Q$ and its natural embedding in $\R^n$).

Remark: $Gr(n,k)$ is parametrizable, so by choosing directions in the unit $(n-k)$-hemisphere, the process of choosing $k$-planes in the algorithm may be completely parametrized. The quick sorting of points that was available in Bauer and Polthier's $n=3$, $k=2$ case may be replaced by an iterated restriction of the original data set through a complete flag $P\subset \cdots \subset \R^n$.

References: Bauer and Polthier (Detection of Planar Regions in Volume Data for Topology Optimization), Katanforoush and Shahshahani (Distributing points on the sphere, I)

Thursday, May 26, 2016

Reconstructing a manifold from sample data, with noise

We follow the article [1] and add more background and clarifications. Some assumptions that are not explicitly mentioned in the article are made to simplify calculations.

Background in probability, measure theory, topology

Let $X$ be a random variable over a space $A$. Recall that for $B\subseteq A$, the expression $P(X\in B)$ is a number in $[0,1]$ describing the probability of the event $X\in B$ happening; the assignment $B\mapsto P(X\in B)$ is called a probability distribution. Here we will consider continuous random variables, so $P(X=x)=0$ for any single element $x\in A$.

Definition: The probability density function of $X$ is the function $f:A\to \R$ satisfying
  • $f(x)\geqslant 0$ for all $x\in A$, and
  • $\int_B f(x)\ dx = P(X\in B)$ for any $B\subseteq A$.
The second condition implies $\int_A f(x)\ dx=1$.

Often authors use just $P$ instead of $f$, and write $P(x)$ instead of $P(X=x)$.

Definition: Let $Y=g(X)$ be another random variable. The expected value of $Y$ is
\[
E[Y] = E[g(X)] = \int_Ag(x)f(x)\ dx.
\]
The mean of $X$ is $\mu= E[X]$, and the variance of $X$ is $\sigma^2 = E[(X-\mu)^2]$. If $\vec X=(X_1\ \cdots\ X_n)^T$ is a multivariate random variable, then $\vec \mu=E[\vec X]$ is an $n$-vector, and the variance is an $(n\times n)$-matrix given as
\[
\Sigma = E[(\vec X-E[\vec X])(\vec X-E[\vec X])^T]
\hspace{1cm}
\text{or}
\hspace{1cm}
\Sigma_{ij} = E[(X_i-E[X_i])(X_j-E[X_j])].
\]
The covariance of $X$ and $Y$ is $E[(X-E[X])(Y-E[Y])]$. Note that the covariance of $X$ with itself is just the usual variance of $X$.

Example: One example of a probability distribution is the normal (or Gaussian) distribution, and we say a random variable with the normal distribution is normally distributed. If a random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then the probability density function of $X$ is
\[
f(x) = \frac{\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)}{\sigma\sqrt{2\pi}}.
\]
If $\vec X=(X_1\ \cdots\ X_n)^T$ is a normally distributed multivariate random variable, then $\vec \mu = (E[X_1]\ \cdots\ E[X_n])^T$ and the probability density function of $\vec X$ is
\[
f(\vec x) = \frac{\exp\left(-\frac 12 (\vec x-\vec \mu)^T\Sigma^{-1}(\vec x-\vec \mu)\right)}{\sqrt{(2\pi)^n\det(\Sigma)}}.
\]
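For concreteness, the multivariate density can be transcribed directly into code (a sketch in numpy; scipy.stats.multivariate_normal computes the same thing):

import numpy as np

def gaussian_density(x, mu, Sigma):
    """Density of N(mu, Sigma) at x, straight from the formula above."""
    n = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)   # (x-mu)^T Sigma^{-1} (x-mu)
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-quad / 2) / norm

# sanity check against the one-dimensional formula
x, mu, s2 = 0.3, 0.1, 2.0
assert np.isclose(gaussian_density(np.array([x]), np.array([mu]), np.array([[s2]])),
                  np.exp(-(x - mu) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2))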

Definition: A measure on $\R^D$ is a function $m:\{$subsets of $\R^D\}\to [0,\infty]$ such that $m(\emptyset) = 0$ and $m(\bigcup_{i\in I} E_i) = \sum_{i\in I} m(E_i)$ for $\{E_i\}_{i\in I}$ a countable sequence of disjoint subsets of $\R^D$. A probability measure on $\R^D$ is a measure $m$ on $\R^D$ with the added condition that $m(\R^D)=1$.

A probability distribution is an example of a probability measure.

Definition: Let $U= \{U_i\}_{i\in I}$ be a covering of a topological space $M$. The nerve of the covering $U$ is a set $N$ of subsets of $I$ given by
\[
N = \left\{J\subset I\ :\ \bigcap_{j\in J} U_j \neq\emptyset\right\}.
\]
Note that this makes $N$ into an abstract simplicial complex, as $J\in N$ implies $J'\in N$ for all $J'\subseteq J$.
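When each cover element is represented by the finite set of sample points it contains, the nerve can be computed directly. A small sketch (my own, truncated in dimension for practicality):

from itertools import combinations

def nerve(cover, max_dim=2):
    """Nerve of a cover given as a list of finite sets: all index tuples J
    (up to dimension max_dim) whose cover elements share a point."""
    N = []
    for k in range(1, max_dim + 2):
        for J in combinations(range(len(cover)), k):
            if set.intersection(*(set(cover[j]) for j in J)):
                N.append(J)
    return N

cover = [{1, 2}, {2, 3}, {3, 4}, {4, 1}]   # a cover of a "circle" of 4 points
print(nerve(cover))   # four vertices and four edges: a combinatorial circle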

Let $M$ be a smooth compact submanifold of $\R^D$. By the tubular neighborhood theorem (see Theorem 2.11.4 in [3]), $M$ has an $\epsilon$-tubular neighborhood for some $\epsilon>0$.

Definition: For a particular embedding of $M$, let the condition number of $M$ be $\tau=\sup\{\epsilon\ :$ $M$ has an $\epsilon$-tubular neighborhood$\}$.

Distributions on a manifold

Let $M$ be a $d$-dimensional manifold embedded in $\R^D$, with $D>d$. Recall that every element in $NM\subseteq \R^D$, the normal bundle of $M$, may be represented as a pair $(\vec x,\vec y)$, where $\vec x\in M$ and $\vec y\in T^\perp$ (since $M$ is a manifold, all the normal spaces are isomorphic). Hence we may consider a probability distribution $P$ on $NM$, with $\vec X$ the $d$-multivariate random variable representing points on $M$ and $\vec Y$ the $(D-d)$-multivariate random variable representing points on the space normal to $M$ at a point on $M$. We make the assumption that $\vec X$ and $\vec Y$ are independent, or that
\[
P(\vec X, \vec Y) = P_M(\vec X)P_{T^\perp}(\vec Y).
\]
That is, $P_{T^\perp}$ is a probability distribution that is the same at any point on the manifold.

Definition: Let $P$ be a probability distribution on $NM$ and $f_M$ the probability density function of $P_M$. In the context described above, $P$ satisfies the strong variance condition if
  • there exist $a,b>0$ such that $f_M(\vec x)\in [a,b]$ for all $\vec x\in M$, and
  • $P_{T^\perp}(\vec Y)$ is normally distributed with $\vec \mu = 0$ and $\Sigma = \sigma^2I$.
The second condition implies that the covariance of $Y_i$ and $Y_j$ is trivial whenever $i\neq j$, and that the variance of all the $Y_i$ is the same. From the multivariate normal example above, this also tells us that the probability density function $f^\perp$ of $\vec Y$ is
\[
f^\perp(\vec y) = \frac{\exp\left(\displaystyle-\frac{1}{2\sigma^2}\sum_{i=1}^{D-d}y_i^2\right)}{\sigma^{D-d}\sqrt{(2\pi)^{D-d}}}.
\]

Theorem: In the context described above, let $P$ be a probability distribution on $NM$ satisfying the strong variance condition, and let $\delta>0$. If there is $c>1$ such that
\[
\sigma <\frac{c\tau(\sqrt9-\sqrt 8)}{9\sqrt{8(D-d)}},
\]
then there is an algorithm that computes the homology of $M$ from a random sample of $n$ points, with probability $1-\delta$. The number $n$ depends on $\tau,\delta,c,d,D$, and the diameter of $M$.

The homology computing algorithm

Below is a broad view of the algorithm described in sections 3, 4, and 5 of [1]. Let $M$ be a $d$-manifold embedded in $\R^D$, and $P$ a probability measure on $NM$ satisfying the strong variance condition.

1. Calculate the following numbers:
\begin{align*}
\tau & = \text{condition number of $M$}\\
\text{vol}(M) & = \text{volume of $M$}\\
\sigma^2 & = \text{variance of $P$}
\end{align*}
2. Define (or choose) the following numbers:
\begin{align*}
\delta & \in (0,1) \\
r & \in \left(2\sqrt{2(D-d)}\sigma,\textstyle\frac\tau9 (3-2\sqrt 2)\right) \\
n & > \text{function}(a,r,\tau,d,\delta,\text{vol}(M)) & (\max(A,B)\ \text{in Proposition 9 of [1]}) \\
s & = 4r \\
deg & > \textstyle \frac{3a}4 \left(1-\left(\frac r{2\tau}\right)^2\right)^{d/2}\text{vol}\left(B^d(r,0)\right)\\
R & = (9r+\tau)/2
\end{align*}
3. Choose $n$ points randomly from $NM$ according to $P$.
4. From these $n$ points, construct the nearest neighbor graph $G$ with distance $s$ (steps 4 and 5 are sketched in code after this list).
5. Remove from $G$ all the vertices of degree $<deg$ to get a refined graph $G'$.
6. Set $U=\bigcup_{\vec x\in V(G')}B^D(R,\vec x)$ and construct the simplicial complex $K$ of its nerve.
7. Compute the homology of $K$, which is the homology of $M$, with probability $1-\delta$.
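Steps 4 and 5 translate directly into code; the following sketch (my own, with illustrative names) uses a dense distance matrix, and step 6 can then be fed the surviving centers:

import numpy as np

def filter_low_degree(points, s, deg):
    """Steps 4 and 5: build the nearest-neighbor graph at scale s, then
    discard vertices of degree < deg, keeping the likely on-manifold points."""
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adjacency = (D <= s) & ~np.eye(len(points), dtype=bool)
    keep = adjacency.sum(axis=1) >= deg
    return points[keep]

# Two balls B^D(R, x), B^D(R, y) intersect iff |x - y| <= 2R, which gives the
# 1-skeleton of the nerve in step 6; higher simplices need a check for common
# intersection points, e.g. via the sampled points as in the nerve sketch above.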

References:
[1] Niyogi, Smale, and Weinberger (A topological view of unsupervised learning from noisy data)
[2] Folland (Real analysis, Chapter 10.1)
[3] Bredon (Topology and Geometry, Chapter 2.11)