Showing posts with label nerve. Show all posts
Showing posts with label nerve. Show all posts

Friday, April 20, 2018

Exit paths and entry paths through $\infty$-categories

Let $X$ be a topological space, $(A,\leqslant)$ a poset, and $f: X\to (A,\leqslant)$ a continuous map.

Definition: An exit path in an $A$-stratified space $X$ is a continuous map $\sigma: |\Delta^n|\to X$ for which there exists a chain $a_0\leqslant \cdots \leqslant a_n$ in $A$ such that $f(\sigma(t_0,\dots,t_i,0,\dots,0))=a_i$ for $t_i\neq 0$. An entry path is a continuous map $\tau: |\Delta^n|\to X$ for which there exists a chain $b_0\leqslant \cdots \leqslant b_n$ in $A$ such that $f(\tau(0,\dots,0,t_i,\dots,t_n))=b_i$ for $t_i\neq 0$.

Up to reordering of vertices of $\Delta^n$ and induced reordering of the realization $|\Delta^n|$, an exit path is the same as an entry path. The next example describes this equivalence.

Example: The standard 2-simplex $|\Delta^2|$ is uniquely an exit path and an entry path with a chain of 3 distinct elements, stratfied in the ways described below.
Recall the following algebraic constructions, through Joyal's quasi-category model:
  • A simplicial set is a functor $\Delta^{op}\to \Set$.
  • A Kan complex is a simplicial set satisfying the inner horn condition for all $0\leqslant k\leqslant n$. That is, the $k$th $n$-horn lifts (can be filled in) to a map on $\Delta^n$.
  • An $\infty$-category is a simplicial set satisfying the inner horn condition for all $0<k<n$.
Moreover, if the lift is unique, then the Kan complex is the nerve of some category. Recall also the category $\Sing(X) = \{$continuous $\sigma: |\Delta^n|\to X\}$, which can be combined with the stratification $f: X\to A$ of $X$

Remark: The subcategory $\Sing^A(X)$ of exit paths and the subcategory $\Sing_A(X)$ of entry paths are full subcategories of $\Sing(X)$, with $(\Sing^A(X))^{op} = \Sing_A(X)$. If the stratification is conical, then these two categories are $\infty$-categories.
Recall the nerve construction of a category. Here we are interested in the nerve of the category $SC$ of simplicial complexes, so $N(SC)_n = \{$sequences of $n$ composable simplicial maps$\}$. Recall the $k$th $n$-horns, which are compatible diagrams of elements of $N(SC)_n$. In general, they are colimits of a diagram in the category $\Delta$. That is, \[ \Lambda^n_k := \colim \left(\bigsqcup_{0\leqslant i<j\leqslant n} \Delta^{n-2} \rightrightarrows \bigsqcup_{0\leqslant i\leqslant n \atop i\neq k} \Delta^{n-1}\right). \] Example: The images of the 3 different types of 2-horns and 4 different types of 3-horns in $SC$ are given below. Note that they are not unique, and depend on the choice of simplices $S_i$ (equivalently, on the choice of functor $\Delta^{op}\to SC$).
For example, the 0th 2-horn $\Lambda^2_0$ can be filled in if there exists a simplicial map $h: S_1\to S_2$ in $SC$ (that is, an element of $N(SC)_1$) such that $h\circ f = g$. Similarly, the 1st 3-horn $\Lambda^3_1$ can be filled in if there exists a functor $F: [0<1<2]\to SC$ for which $F(0<1)=f_{02}$, $F(0<2)=f_{03}$, and $F(1<2)=f_{23}$ (equivalently, a compatible collection of elements of $N(SC)_2$).

Definition: Let $A,B$ be $\infty$-categories. A functor $F: A\to B$ is a morphism of the simplicial sets $A,B$. That is, $F:A\to B$ is a natural transformation for $A,B\in \text{Fun}(\Delta^{op},\Set)$.

A functor of simplicial sets of a particular type can be identified with a functor of 1-categories. Recall the nerve of a 1-category, which turns it into an $\infty$-category. This construction has a left adjoint.

Definition: Let $\mathcal C$ be an $\infty$-category. The homotopy category $h\mathcal C$ of $\mathcal C$ has objects $\mathcal C_0$ and morphisms $\Hom_{h\mathcal C}(X,Y) = \pi_0(\text{Map}_{\mathcal C}(X,Y))$.

By Lurie, $h$ is left-adjoint to $N$. That is, $h : \sSet \rightleftarrows \text{Cat} : N$, or $\text{Map}_{\sSet}(\mathcal C,N(\mathcal D)) \cong \text{Map}_{\text{Cat}}(h\mathcal C, \mathcal D)$, for any $\infty$-category $\mathcal C$ and any 1-category $\mathcal D$. Our next goal is to describe a functor $\Sing_A(X)\to N(SC)$, maybe through this adjunction, where $SC$ is the 1-category of simplicial complexes and simplicial maps.

References: Lurie (Higher topos theory, Sections 1.1.3 and 1.2.3), Lurie (Higher algebra, Appendix A.6), Goerss and Jardine (Simplicial homotopy theory, Section I.3), Joyal (Quasi-categories and Kan complexes)

Thursday, August 31, 2017

Exit paths, part 1

This post is meant to set up all the necessary ideas to define the category of exit paths.

Preliminaries

 Let $X$ be a topological space and $C$ a category. Recall the following terms:
  • $\Delta$: The category whose objects are finite ordered sets $[n]=(1,\dots,n)$ and whose morphisms are non-decreasing maps. It has several full subcategories, including
    • $\Delta_s$, comprising the same objects of $\Delta$ and only injective morphisms, and
    • $\Delta_{\leqslant n}$, comprising only the objects $[0],\dots,[n]$ with the same morphisms.
  • equalizer: An object $E$ and a universal map $e:E\to X$, with respect to two maps $f,g:X\to Y$. It is universal in the sense that all maps into $X$ whose compositions with $f,g$ are equal factor through $e$. Equalizers and coequalizers are described by the diagram below, with universality given by existence of the dotted maps.
  • fibered product or pullback: The universal object $X\times_Z Y$ with maps to $X$ and $Y$, with respect to maps $X\to Z$ and $Y\to Z$.
  • fully faithful: A functor $F$ whose morphism restriction $\Hom(X,Y)\to \Hom(F(X),F(Y))$ is surjective (full) and injective (faithful).
  • locally constant sheaf: A sheaf $\mathcal F$ over $X$ for which every $x\in X$ has a neighborhood $U$ such that $\mathcal F|_U$ is a constant sheaf. For example, constructible sheaves are locally constant on every stratum. 
  • simplicial object: A contravariant functor from $\Delta$ to any other category. When the target category is $\text{Set}$, it is called a simplicial set. They may also be viewed as a collection $S = \{S_n\}_{\geqslant 0}$ for $S_n=S([n])$ the value of the functor on each $[n]$. Simplicial sets come with two natural maps:
    • face maps $d_i:S_n\to S_{n-1}$ induced by the map $[n-1]\to [n]$ which skips the $i$th piece, and
    • degeneracy maps $s_i:S_n\to S_{n+1}$ induced by the map $[n+1]\to[n]$ which repeats the $i$th piece.
  • stratification: A property of a cover $\{U_i\}$ of $X$ for which consecutive differences $U_{i+1}\setminus U_i$ have ``nicer" properties than all of $X$. For example, $E_i\to U_{i+1}\setminus U_i$ is a rank $i$ vector bundle, but there is no vector bundle $E\to X$ that restricts to every $E_i$.

Now we get into new territory.

Definition: The nerve of a category $C$ is the collection $N(C) = \{N(C)_n = Fun([n],C)\}_{n\geqslant 0}$, where $[n]$ is considered as a category with objects $0,\dots,n$ and a single morphism in $\Hom_{[n]}(s,t)$ iff $s\leqslant t$.

Note that the nerve of $C$ is a simplicial set, as it is a functor from $\Delta^{op}\to Fun(\Delta,C)$. Moreover, the pieces $N(C)_0$ are the objects of $C$ and $N(C)_1$ are the morphisms of $C$, so all the information about $C$ is contained in its nerve. There is more in the higher pieces $N(C)_n$, so the nerve (and simplicial sets in general) may be viewed as a generalization of a category.

Kan structures


Let $\text{sSet}$ be the category of simplicial sets. We may consider $\Delta^n = \Hom_\Delta(-,[n])$ as a contravariant functor $\Delta\to \text{Set}$, so it is an object of $\text{sSet}$.

Definition: Fix $n\geqslant 0$ and choose $0\leqslant i\leqslant n$. Then the $i$th $n$-horn of a simplicial set is the functor $\Lambda^n_i\subset \Delta^n$ generated by all the faces $\Delta^n(d_j)$, for $j\neq i$.

We purposefully do not describe what "$\subset$" or "generated by" mean for functors, hoping that intuition fills in the gaps. In some sense the horn feels like a partially defined functor (though it is a true simplicial set), well described by diagrams, for instance with $n=2$ and $i=1$ we have

Definition: A simplicial set $S$ is a Kan complex whenever every map $f:\Lambda^n_i\to S$ factors through $\Delta^n$. That is, when there exists a

The map $\iota$ is the inclusion. Moreover, $S$ is an $\infty$-category, or quasi-category, if the extending map $f'$ is unique.

Example: Some basic examples of $\infty$-categories, for $X$ a topological space, are
  • $Sing(X)$, made up of pieces $Sing(X)_n = \Hom(\Delta^n,X)$, and
  • $LCS(X)$, the category of locally constant sheaves over $X$. Here $LCS(X)_n$ over an object $A$, whose objects are $B\to A$ and morphisms are the appropriate commutative diagrams

Definition: A morphism $p\in \Hom_{\text{sSet}}(S,T)$ is a Kan fibration if for every commutative diagram (of solid arrows)

the dotted arrow exists, making the new diagram commute.

Definition: Let $C,D,A$ be categories with functors $F:C\to D$ and $G:C\to A$.
  • The left Kan extension of $F$ along $G$ is a functor $A\xrightarrow L D$ and a universal natural transformation $F\stackrel \lambda \rightsquigarrow L\circ G$.
  • The right Kan extension of $F$ along $G$ is a functor $A\xrightarrow R D$ and a universal natural transformation $R\circ G \stackrel \rho\rightsquigarrow F$.

Exit paths


The setting for this section is constructible sheaves over a topological space $X$. We begin with a slightly more technical definition of a stratification.

Definition: Let $(A,\leqslant)$ be a partially ordered set with the upset topology. That is, if $x\in U$ is open and $x\leqslant y$, then $y\in A$. An $A$-stratification of $X$ is a continuous function $f:X\to A$.

We now begin with a Treumann's definition of an exit path, combined with Lurie's stratified setting.

Definition: An exit path in an $A$-stratified space $X$ is a continuous map $\gamma:[0,1]\to X$ for which there exists a pair of chains $a_1\leqslant \cdots \leqslant a_n$ in $A$ and $0=t_0\leqslant \cdots \leqslant t_n=1$ in $[0,1]$ such that $f(\gamma(t))=a_i$ whenever $t\in (t_{i-1},t_i]$.

This really is a path, and so gives good intuition for what is happening. Recall that the geometric realization of the functor $\Delta^n$ is $|\Delta^n| = \{(t_0,\dots,t_n)\in \R^{n+1}\ :\ t_0+\cdots+t_n=1\}$. Oserving that $[0,1]\cong|\Delta^1|$, Lurie's definition of an exit path is more general by instead considering maps from $|\Delta^n|$.

Definition: The category of exit paths in an $A$-stratified space $X$ is the simplicial subset $Sing^A(X)\subset Sing(X)$ consisting of those simplices $\gamma:|\Delta^n|\to X$ for which there exists a chain $a_0\leqslant \cdots \leqslant a_n$ in $A$ such that $f(\gamma(t_0,\dots,t_i,0,\dots,0))=a_i$ for $t_i\neq 0$.

Example: As with all new ideas, it is useful to have an example. Consider the space $X=\Ran^{\leqslant 2}(M)\times \R_{\geqslant 0}$ of a closed manifold $M$ (see post "A constructible sheaf over the Ran space" 2017-06-24 for more). With the poset $(A,\leqslant)$ being $(a\leqslant b\leqslant c)$ and stratifying map
\[
\begin{array}{r c l}
f\ :\ X & \to & A, \\
(P,t) & \mapsto & \begin{cases}
a & \text{ if } P\in \Ran^1(M), \\
b & \text{ if } P\in \Ran^2(M), t\leqslant d(P_1,P_2), \\
c & \text{ else,}
\end{cases}
\end{array}
\]
we can make a continuous map $\gamma:\Delta^3\to X$ by
\[
\begin{array}{r c l}
(1,0,0) & \mapsto & (P\in \Ran^1(M),0), \\
(t_0,t_1\neq 0,0) & \mapsto & (P\in \Ran^2(M), d(P_1,P_2)), \\
(t_0,t_1,t_2\neq 0) & \mapsto & (P\in \Ran^2(M), t>d(P_1,P_2)).
\end{array}
\]
Then $f(\gamma(t_0\neq 0,0,0))=a$, and $f(\gamma(t_0,t_1\neq 0,0))=b$, and $f(\gamma(t_0,t_1,t_2\neq 0))=c$, as desired. The embedding of such a simplex $\gamma$ is described by the diagram below.


Both the image of $(1,0,0)$ and the 1-simplex from $(1,0,0)$ to $(0,1,0)$ lie in the singularity set of $\Ran^{\leqslant 2}(M)\times \R_{\geqslant 0}$, which is pairs $(P,t)$ where $t=d(P_i,P_j)$ for some $i,j$. The idea that the simplex "exits" a stratum is hopefully made clear by this image.

References: Lurie (Higher algebra, Appendix A), Lurie (What is... an $\infty$-category?), Groth (A short course on $\infty$-categories, Section 1), Joyal (Quasi-categories and Kan complexes), Goerss and Jardine (Simplicial homotopy theory, Chapter 1), Treumann (Exit paths and constructible stacks)

Sunday, May 28, 2017

Čech (co)homology

In this post we briefly recall the construction of Čech cohomology as well as compute a few examples. Let $X$ be a topological space with a cover $\mathcal U = \{U_i\}$, $\mathcal F$ a $C$-valued sheaf on $X$, and $\widehat{\mathcal F}$ a $C$-valued cosheaf on $X$, for some category $C$ (usually abelian groups).

Definition: The nerve $N$ of $\mathcal U$ is the simplicial complex that has an $r$-simplex $\rho$ for every non-empty intersection of $r+1$ opens of $\mathcal U$. The support $U_\rho$ of $\rho$ is this non-empty intersection. The $r$-skeleton $N_r$ of $N$ is the collection of all $r$-simplices.

Remark: The sheaf $\mathcal F$ and cosheaf $\widehat {\mathcal F}$ may be viewed as being defined either on the opens of $\mathcal U$ over $X$, or on the nerve $N$ of $\mathcal U$. Indeed, the inclusion map $V\hookrightarrow U$ on opens is given by the forgetful map $\partial$. That is, $\partial_i:N_r\to N_{r-1}$ forgets the $i$th open defining $\rho\in N_r$, so if $U_\rho = U_0\cap \cdots \cap U_r$, then $U_{\partial_0\rho} = U_1\cap\cdots \cap U_r$.

The Čech (co)homology will be defined as the (co)homology of a particular complex, whose boundary maps will be induced by, equivalently, the inclusion map on opens or $\partial_i$ on simplices.

Definition: In the context above:
  • a $p$-chain is a finite formal sum of elements $a_{\sigma_i}\in \widehat{\mathcal F}(U_{\sigma_i})$, for every $\sigma_i$ a $p$-simplex,
  • a $q$-cochain is a finite formal sum of elements $b_{\tau_j}\in \mathcal F(U_{\tau_j})$, for every $\tau_j$ a $q$-simplex,
  • the $p$-differential is the map $d_p:\check C_p(\mathcal U,\mathcal F) \to \check C_{p-1}(\mathcal U,\mathcal F)$ given by
\[
d_p(a_\sigma) = \sum_{i=0}^p (-1)^i \widehat{\mathcal F}(\partial_i)(a_\sigma),\]
  • the $q$-codifferential is the map $\delta^q:\check C^q(\mathcal U,\mathcal F) \to \check C^{q+1}(\mathcal U,\mathcal F)$ given by
\[
\delta^q(b_\tau) = \sum_{j=0}^{q+1} (-1)^j \mathcal F(\partial_j)(b_\tau).\]The collection of $p$-chains form a group $\check C_p(\mathcal U,\mathcal F)$ and the collection of $q$-cochains also form a group $\check C^q(\mathcal U,\mathcal F)$, both under the respective group operation in each coordinate. The Čech homology $H_*(\mathcal U,\mathcal F)$ is the homology of the chain complex of $\check C_p$ groups, and the Čech cohomology $H^*(\mathcal U,\mathcal F)$ is the cohomology of the cochain complex of $\check C^q$ groups.

Example: Let $X=S^1$ with a cover $\mathcal U = \{U,V,W\}$ and associated nerve $N_{\mathcal U}$ as below.
The cover is chosen so that all intersections are contractible. Let $k$ be a field. Let $\widehat{\mathcal F}$ be a cosheaf over $N$ and $\mathcal F$ a sheaf over $N$, with $\widehat {\mathcal F}(\text{0-cell})=\mathcal F(\text{1-cell}) = (1,1)\in k^2$ and $\widehat{\mathcal F}(\text{1-cell})=\mathcal F(\text{0-cell})=1\in k$, so that the natural extension and restriction maps work. Then all the degree 0 and 1 chain and cochain groups are $k^3$. Giving a counter-clockwise orientation to $X$, we easily see that
\begin{align*}
d_1\sigma_{U\cap V} & = \sigma_V-\sigma_U, & \delta^0\sigma_U & = \sigma_{U\cap V}-\sigma_{W\cap U}, \\
d_1\sigma_{V\cap W} & = \sigma_W-\sigma_V, & \delta^0\sigma_V & = \sigma_{V\cap W}-\sigma_{U\cap V}, \\
d_1\sigma_{W\cap U} & = \sigma_U-\sigma_W, & \delta^0\sigma_W & = \sigma_{W\cap U}-\sigma_{V\cap W}.\end{align*}If we give an ordered basis of $(\sigma_{U\cap V},\sigma_{V\cap W},\sigma_{W\cap U})$ to $\check C_1(\mathcal U,\widehat{\mathcal F})$ and $\check C^1(\mathcal U,\mathcal F)$, and $(\sigma_U,\sigma_V,\sigma_W)$ to $\check C_0(\mathcal U,\widehat{\mathcal F})$ and $\check C^0(\mathcal U,\mathcal F)$, we find that
\[
d_1 = \begin{bmatrix}
-1 & 0 & 1 \\ 1 & -1 & 0 \\ 0 & 1 & -1
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0
\end{bmatrix},
\hspace{1cm}
\delta^0 = \begin{bmatrix}
-1 & 1 & 0 \\ 0 & -1 & 1 \\ 1 & 0 & -1
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0
\end{bmatrix}.
\]
The Čech chain and cochain complexes are then
\[
0 \to \check C_1(\mathcal U,\widehat{\mathcal F}) \tov{d_1} \check C_0(\mathcal U,\widehat{\mathcal F}) \to 0,
\hspace{1cm}
0 \to \check C^0(\mathcal U,\mathcal F) \tov{\delta^0} \check C^1(\mathcal U,\mathcal F) \to 0,\]for which
\begin{align*}
H_1(\mathcal U,\widehat{\mathcal F}) & = \ker(d_1) = k,
& H^0(\mathcal U,\mathcal F) & = \ker(\delta^0) = k, \\
H_0(\mathcal U,\widehat{\mathcal F}) & = k^3/\im(d_1) = k^3/k^2 = k,
& H^1(\mathcal U,\mathcal F) & = k^3/\im(\delta^0) = k^3/k^2 = k.\end{align*}By the Čech-de Rham theorem, we know that the (co)homology groups should agree with the usual groups for $S^1$, as $\mathcal U$ was a good cover, which they do. Next we compute another example with a view towards persistent homology.

Definition: Let $X$ be a topological space and $f:X\to Y$ a map with $\mathcal U$ covering $f(X)$. The Leray sheaf $L^i$ of degree $i$ over $N_{\mathcal U}$ is defined by $L^i(\sigma) = H^i(f^{-1}(U_\sigma))$ and $L^i(\sigma\hookrightarrow \tau) = H^i(f^{-1}(U_\tau)\hookrightarrow f^{-1}(U_\sigma))$, whenever $\sigma$ is a face of $\tau$.

Theorem (Curry, Theorem 8.2.21): In the context above, if $N_{\mathcal U}$ is at most 1-dimensional, then for any $t\in \R$,
\[
H^i(f^{-1}(-\infty,t])\cong H^0((-\infty,t],L^i)\oplus H^1((-\infty,t],L^{i-1}).\]
The idea is to apply this theorem in a filtration, for different values of $t$, but in the example below we will have $t$ large enough so that $X\subset f^{-1}(-\infty,t]$.

Example: Let $f:S^1\to \R$ be a projection map, and let $X = f(S^1)$ with a cover $\mathcal U = \{U,V\}$ as below.
Note that although $f^{-1}(U)\cap f^{-1}(V)$ is not contractible, $U\cap V$ is, and the Čech cohomology will be over $\mathcal U\subset \R$, so we are fine in applying the Čech-de Rham theorem. It is immediate that the only non-zero Leray sheaves are $L^0$, for which
\[
L^0(\sigma_U) = k,\hspace{1cm}
L^0(\sigma_V) = k,\hspace{1cm}
L^0(\sigma_{U\cap V}) = k^2,\]hence $\check C^0(\mathcal U,L^0)=\check C^1(\mathcal U,L^0) = k^2$. Giving $\check C^0(\mathcal U,L^0)$ the ordered basis $(\sigma_U,\sigma_V)$ and noting the homology maps $H^0(f^{-1}(U)\hookrightarrow f^{-1}(U\cap V))$ and $H^0(f^{-1}(V)\hookrightarrow f^{-1}(U\cap V))$ are simply $1\mapsto (1,1)$, the \v Cech complex is
\[
0 \to \check C^0(\mathcal U,L^0) \tov{\left[\begin{smallmatrix}-1 & -1 \\ 1 & 1 \end{smallmatrix}\right]} \check C^1(\mathcal U,L^0) \to 0.
\]
Hence $H^0(\mathcal U,L^0)=\ker(\delta^0)=k$ and $H^1(\mathcal U,L^0)=k^2/\im(\delta^0)=k^2/k=k$, allowing us to conclude, using Curry's and the Čech--de Rham theorems, that
\begin{align*}
H^0(S^1) & \cong H^0(\mathcal U,L^0) \oplus H^1(\mathcal U,L^{-1}) = k\oplus 0 = k, \\
H^1(S^1) & \cong H^0(\mathcal U,L^1) \oplus H^1(\mathcal U,L^0) = 0\oplus k = k, \\
H^2(S^1) & \cong H^0(\mathcal U,L^2) \oplus H^1(\mathcal U,L^1) = 0\oplus 0=0,\end{align*}as expected.

References: Bott and Tu (Differential forms in algebraic topology, Section 10), Bredon (Sheaf theory, Section VI.4), Curry (Sheaves, cosheaves, and applications, Section 8)

Thursday, May 26, 2016

Reconstructing a manifold from sample data, with noise

We follow the article [3] and add more background and clarifications. Some assumptions are made that are not explicitly mentioned in the article, to make calculations easier.

Background in probability, measure theory, topology

Let $X$ be a random variable over a space $A$. Recall that the expression $P(X)$ is a number in $[0,1]$ describing the probability of the event $X$ happening. This is called a probability distribution. Here we will consider continuous random variables, so $P(X=x)=0$ for any single element $x\in A$.

Definition: The probability density function of $X$ is the function $f:A\to \R$ satisfying
  • $f(x)\geqslant 0$ for all $x\in A$, and
  • $\int_B f(x)\ dx = P(X\in B)$ for any $B\subseteq A$.
The second condition implies $\int_A f(x)\ dx=1$.

Often authors use just $P$ instead of $f$, and write $P(x)$ instead of $P(X=x)$.

Definition: Let $Y=g(X)$ be another random variable. The expected value of $Y$ is
\[
E[Y] = E[g(X)] = \int_Ag(x)f(x)\ dx.
\]
The mean of $X$ is $\mu= E[X]$, and the variance of $X$ is $\sigma^2 = E[(X-\mu)^2]$. If $\vec X=(X_1\ \cdots\ X_n)^T$ is a multivariate random variable, then $\vec \mu=E[\vec X]$ is an $n$-vector, and the variance is an $(n\times n)$-matrix given as
\[
\Sigma = E[(\vec X-E[\vec X])(\vec X-E[\vec X])^T]
\hspace{1cm}
\text{or}
\hspace{1cm}
\Sigma_{ij} = E[(X_i-E[X_i])(X_j-E[X_j])].
\]
The covariance of $X$ and $Y$ is $E[(X-E[X])(Y-E[Y])]$. Note that the covariance of $X$ with itself is just the usual variance of $X$.

Example: One example of a probability distribution is the normal (or Gaussian) distribution, and we say a random variable with the normal distribution is normally distributed. If a random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then the probability density function of $X$ is
\[
f(x) = \frac{\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)}{\sigma\sqrt{2\pi}}.
\]
If $\vec X=(X_1\ \cdots\ X_n)^T$ is a normally distributed multivariate random variable, then $\vec \mu = (E[X_1]\ \cdots\ E[X_n])^T$ and the probability density function of $\vec X$ is
\[
f(\vec x) = \frac{\exp\left(-\frac 12 (\vec x-\vec \mu)^T\Sigma^{-1}(\vec x-\vec \mu)\right)}{\sqrt{(2\pi)^n\det(\Sigma)}}.
\]

Definition: A measure on $\R^D$ is a function $m:\{$subsets of $\R^D\}\to [0,\infty]$ such that $m(\emptyset) = 0$ and $m(\bigcup_{i\in I} E_i) = \sum_{i\in I} m(E_i)$ for $\{E_i\}_{i\in I}$ a countable sequence of disjoint subsets of $\R^D$. A probability measure on $\R^D$ is a measure $m$ on $\R^D$ with the added condition that $m(\R^D)=1$.

A probability distribution is an example of a probability measure.

Definition: Let $U= \{U_i\}_{i\in I}$ be a covering of a topological space $M$. The nerve of the covering $U$ is a set $N$ of subsets of $I$ given by
\[
N = \left\{J\subset I\ :\ \bigcap_{j\in J} U_j \neq\emptyset\right\}.
\]
Note that this makes $N$ into an abstract simplicial complex, as $J\in N$ implies $J'\in N$ for all $J'\subseteq J$.

Let $M$ be a smooth compact submanifold of $\R^d$. By the tubular neighborhood theorem (see Theorem 2.11.4 in [3]), every smooth compact submanifold $M$ of $\R^d$ has a tubular neighborhood for some $\epsilon>0$.

Definition: For a particular embedding of $M$, let the condition number of $M$ be $\tau=\sup\{\epsilon\ :$ $M$ has an $\epsilon-$tubular neighborhood$\}$.

Distributions on a manifold

Let $M$ be a $d$-dimensional manifold embedded in $\R^D$, with $D>d$. Recall that every element in $NM\subseteq \R^D$, the normal bundle of $M$, may be represented as a pair $(\vec x,\vec y)$, where $\vec x\in M$ and $\vec y\in T^\perp$ (since $M$ is a manifold, all the normal spaces are isomorphic). Hence we may consider a probability distribution $P$ on $NM$, with $\vec X$ the $d$-multivariate random variable representing points on $M$ and $\vec Y$ the $(D-d)$-multivariate random variable representing points on the space normal to $M$ at a point on $M$. We make the assumption that $\vec X$ and $\vec Y$ are independent, or that
\[
P(\vec X, \vec Y) = P_M(\vec X)P_{T^\perp}(\vec Y).
\]
That is, $P_{T^\perp}$ is a probability distribution that is the same at any point on the manifold.

Definition: Let $P$ be a probability distribution on $NM$ and $f_M$ the probability density function of $P_M$. In the context described above, $P$ satisfies the strong variance condition if
  • there exist $a,b>0$ such that $f_M(\vec x)\in [a,b]$ for all $\vec x\in M$, and
  • $P_{T^\perp}(\vec Y)$ is normally distributed with $\vec \mu = 0$ and $\Sigma = \sigma^2I$.
The second condition implies that the covariance of $Y_i$ with $Y_j$ is trivial iff $i\neq j$, and that the vairance of all the $Y_i$s is the same. From the normally distributed multivariate example above, this also tells us that the probability density function $f^\perp$ of $\vec Y$ is
\[
f^\perp(\vec y) = \frac{\exp\left(\displaystyle-\frac{\sigma^2}{2}\sum_{i=1}^{D-d}y_i^2\right)}{\sigma^{D-d}\sqrt{(2\pi)^{D-d}}}.
\]

Theorem:
In the context described above, let $P$ be a probability distribution on $NM$ satisfying the strong variance condition, and let $\delta>0$. If there is $c>1$ such that
\[
\sigma <\frac{c\tau(\sqrt9-\sqrt 8)}{9\sqrt{8(D-d)}},
\]
then there is an algorithm that computes the homology of $M$ from a random sample of $n$ points, with probability $1-\delta$. The number $n$ depends on $\tau,\delta,c,d,D$, and the diameter of $M$.

The homology computing algorithm

Below is a broad view of the algorithm described in sections 3, 4, and 5 of [1]. Let $M$ be a $d$-manifold embedded in $\R^D$, and $P$ a probability measure on $NM$ satisfying the strong variance condition.

1. Calculate the following numbers:
\begin{align*}
\tau & = \text{condition number of $M$}\\
\text{vol}(M) & = \text{volume of $M$}\\
\sigma^2 & = \text{variance of $P$}
\end{align*}
2. Define (or choose) the following numbers:
\begin{align*}
\delta & \in (0,1) \\
r & \in \left(2\sqrt{2(D-d)}\sigma,\textstyle\frac\tau9 (3-2\sqrt 2)\right) \\
n & > \text{function}(a,r,\tau,d,\delta,\text{vol}(M)) & (\max(A,B)\  \text{in Proposition 9 of [1])} \\
s & = 4r \\
deg & > \textstyle \frac{3a}4 \left(1-\left(\frac r{2\tau}\right)^2\right)^{d/2}\text{vol}\left(B^d(r,0)\right)\\
R & = (9r+\tau)/2
\end{align*}
3. Choose $n$ points randomly from $NM$ according to $P$.
4. From these $n$ points, construct the nearest neighbor graph $G$ with distance $s$.
5. Remove from $G$ all the vertices of degree $<deg$ to get a refined graph $G'$.
6. Set $U=\bigcup_{\vec x\in V(G')}B^D(R,\vec x)$ and construct the simplicial complex $K$ of its nerve.
7. Compute the homology of $K$, which is the homology of $M$, with probability $1-\delta$.

References:
[1] Niyogi, Smale, and Weinberger (A topological view of unsupervised learning from noisy data)
[2] Folland (Real analysis, Chapter 10.1)
[3] Bredon (Topology and Geometry, Chapter 2.11)