
Thursday, December 22, 2016

Sampling points uniformly on parametrized manifolds

Here I'll describe how to sample points uniformly on a (parametrized) manifold, along with an actual implementation in Python. Let $M$ be an $m$-dimensional manifold embedded in $\R^n$ via $f:\R^m\to \R^n$. Moreover, assume that $f$ is Lipschitz (true if $M$ is compact), injective (true if $M$ is embedded), and is a parameterization, in the sense that there is an $m$-rectangle $A = [a_1,b_1]\times \cdots \times [a_m,b_m]$ such that $f(A) = M$ (the intervals need not be closed). Set $\widetilde Jf = \sqrt{\det(Df\cdot Df^T)}$, the $m$-dimensional Jacobian, and calculate\[c = \int_{a_m}^{b_m}\cdots \int_{a_1}^{b_1} \widetilde J f\ dx_1\cdots dx_m.\]Recall the brief statistical background presented in a previous blog post ("Reconstructing a manifold from sample data, with noise," 2016-05-26). A uniform or constant probability density function takes the same value at every point of its domain.

Proposition: In the setting above:
  • (completely separable) Let $g_1,\dots,g_m$ be probability density functions on $[a_1,b_1],\dots,[a_m,b_m]$, respectively. If $g_1\cdots g_m = \widetilde J f / c$, then sampling each coordinate independently from the $g_i$ and applying $f$ gives points distributed uniformly on $M$ with respect to the metric induced from $\R^n$.
  • (non-separable) Let $g$ be a probability density function on $[a_1,b_1]\times\cdots\times[a_m,b_m]$. If $g=\widetilde J f/c$, then sampling from $g$ and applying $f$ gives points distributed uniformly on $M$ with respect to the metric induced from $\R^n$.

A much more abstract statement and proof are given in [2], Section 3.2.5, but assuming $f$ is injective and $M$ lies in $\R^n$, we avoid the worst of the notation. Section 2.2 of [1] gives a brief explanation of how the statement above follows, while Section 2 of [3] goes into more detail about why it is true.

Example:
Let $M = S^2$, the sphere of radius $r$, and $f:[0,2\pi)\times [0,\pi) \to \R^3$ the natural embedding given by
\[
(\theta,\varphi) \mapsto (r\cos(\theta)\sin(\varphi),r\sin(\theta)\sin(\varphi), r\cos(\varphi)),
\]with\begin{align*}
Df & = \begin{bmatrix}
-r \sin(\varphi) \sin(\theta) & r \cos(\theta)\sin(\varphi) &  0 \\
r \cos(\varphi) \cos(\theta) & r \cos(\varphi) \sin(\theta) & -r \sin(\varphi)
\end{bmatrix}, & \widetilde Jf & = r^2\sin(\varphi), \\
Df\cdot Df^T & = \begin{bmatrix}
r^2 \sin^2(\varphi) & 0 \\ 0 & r^2
\end{bmatrix}, & c & = 4\pi r^2.
\end{align*}
Let $g_1(\theta) = 1/(2\pi)$ be the uniform distribution over $[0,2\pi)$, so that $g_2(\varphi) = \sin(\varphi)/2$ over $[0,\pi)$. Sampling points randomly from these two distributions and applying $f$ will give uniformly sampled points on $S^2$.
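
Concretely, the CDF of $g_2$ is $\int_0^\varphi \sin(t)/2\ dt = (1-\cos(\varphi))/2$, which inverts to $\varphi = \arccos(1-2u)$ for $u$ uniform on $[0,1)$. Here is a minimal NumPy sketch of this (my own addition; the function name and parameters are not from any library):

import numpy as np

def sample_sphere(r, n, rng=None):
    'Sample n points uniformly on the sphere of radius r (inverse-CDF for the polar angle).'
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0, 2 * np.pi, n)           # g_1: uniform on [0, 2*pi)
    phi = np.arccos(1 - 2 * rng.uniform(0, 1, n))  # g_2: inverse CDF of sin(phi)/2
    return np.column_stack((r * np.cos(theta) * np.sin(phi),
                            r * np.sin(theta) * np.sin(phi),
                            r * np.cos(phi)))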

Example: Let $M = T^2$, the torus of major radius $R$ and minor radius $r$, and $f:[0,2\pi)\times [0,2\pi) \to \R^3$ the natural embedding given by
\[
(\theta,\varphi) \mapsto ((R+r\cos(\theta))\cos(\varphi),(R+r\cos(\theta))\sin(\varphi), r\sin(\theta)),
\]with\begin{align*}
Df & = \begin{bmatrix}
-r \cos(\varphi)\sin(\theta) & -r \sin(\varphi) \sin(\theta) & r \cos(\theta) \\
-(R + r \cos(\theta)) \sin(\varphi) & \cos(\varphi) (R + r \cos(\theta)) & 0
\end{bmatrix}, & \widetilde Jf & = r(R+r\cos(\theta)), \\
Df\cdot Df^T & = \begin{bmatrix}
r^2 & 0 \\ 0 & (R + r \cos(\theta))^2
\end{bmatrix}, & c & = 4\pi^2rR.
\end{align*}
Let $g_2(\varphi) = 1/(2\pi)$ be the uniform distribution over $[0,2\pi)$, so that $g_1(\theta) = (1+r\cos(\theta)/R)/(2\pi)$ over $[0,2\pi)$. Sampling points randomly from these two distributions and applying $f$ will give uniformly sampled points on $T^2$.
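
Here the CDF of $g_1$ has no closed-form inverse, so one simple option (a sketch of my own, ahead of the general SciPy approach below) is rejection sampling against the uniform proposal on $[0,2\pi)$, using the bound $g_1(\theta)\leqslant (1+r/R)/(2\pi)$, valid for the usual ring torus with $r<R$:

import numpy as np

def sample_torus(R, r, n, rng=None):
    'Sample n points uniformly on the torus (rejection sampling for theta), assuming r < R.'
    rng = np.random.default_rng() if rng is None else rng
    phi = rng.uniform(0, 2 * np.pi, n)  # g_2: uniform on [0, 2*pi)
    theta = np.empty(n)
    count = 0
    # Accept a uniform proposal theta with probability g_1(theta) / bound,
    # where bound = (1 + r/R) / (2 pi).
    while count < n:
        prop = rng.uniform(0, 2 * np.pi, n - count)
        u = rng.uniform(0, 1, n - count)
        accepted = prop[u < (1 + (r / R) * np.cos(prop)) / (1 + r / R)]
        theta[count:count + accepted.size] = accepted
        count += accepted.size
    return np.column_stack(((R + r * np.cos(theta)) * np.cos(phi),
                            (R + r * np.cos(theta)) * np.sin(phi),
                            r * np.sin(theta)))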

Below is a simple implementation, in Python using the SciPy package, of how to actually sample such points in general. The functions $f,g_1,\dots,g_m$ and the bounds $a_1,b_1,\dots,a_m,b_m$ are all assumed to be given.

import scipy.stats as st

# Each density g_i is wrapped in an rv_continuous subclass by overriding _pdf;
# SciPy then handles sampling (rvs) by numerically inverting the CDF.
class var_g1(st.rv_continuous):
    'Variable with density g_1'
    def _pdf(self, x):
        return g1(x)
...
class var_gm(st.rv_continuous):
    'Variable with density g_m'
    def _pdf(self, x):
        return gm(x)

# Restrict each variable to its interval [a_i, b_i].
dist_g1 = var_g1(a=a1, b=b1, name='distribution 1')
...
dist_gm = var_gm(a=am, b=bm, name='distribution m')

# Draw one coordinate from each distribution and map the result onto the manifold via f.
def mfld_sample():
    return f(dist_g1.rvs(), ..., dist_gm.rvs())
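
As a usage example (my own instantiation, not part of the original sketch), here is the pattern above specialized to the torus example, with classes torus_g1 and torus_g2 playing the roles of var_g1 and var_gm. SciPy's generic rvs numerically integrates and inverts the CDF built from _pdf, which is convenient but slow compared to a hand-written sampler.

import numpy as np
import scipy.stats as st

R, r = 2.0, 0.5  # example torus radii

class torus_g1(st.rv_continuous):
    'Density g_1(theta) = (1 + (r/R) cos(theta)) / (2 pi) on [0, 2 pi)'
    def _pdf(self, x):
        return (1 + (r / R) * np.cos(x)) / (2 * np.pi)

class torus_g2(st.rv_continuous):
    'Density g_2(phi) = 1 / (2 pi) on [0, 2 pi)'
    def _pdf(self, x):
        return 0 * x + 1 / (2 * np.pi)

dist_g1 = torus_g1(a=0, b=2 * np.pi, name='torus theta')
dist_g2 = torus_g2(a=0, b=2 * np.pi, name='torus phi')

def f(theta, phi):
    'The torus parameterization from the example above.'
    return ((R + r * np.cos(theta)) * np.cos(phi),
            (R + r * np.cos(theta)) * np.sin(phi),
            r * np.sin(theta))

def mfld_sample():
    return f(dist_g1.rvs(), dist_g2.rvs())
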
A further application would be to understand how to sample points uniformly on projective manifolds, with the Grassmannian, embedded via Plücker coordinates, as a leading example.

References:
[1] Diaconis, Holmes, and Shahshahani (Sampling from a manifold, Section 2.2)
[2] Federer (Geometric measure theory, Section 3.2.5)
[3] Rhee, Zhou, and Qiu (An iterative algorithm for sampling from manifolds, Section 2)

Thursday, May 26, 2016

Reconstructing a manifold from sample data, with noise

We follow the article [1] and add more background and clarifications. Some assumptions are made that are not explicitly mentioned in the article, to make calculations easier.

Background in probability, measure theory, topology

Let $X$ be a random variable over a space $A$. Recall that for a subset $B\subseteq A$, the expression $P(X\in B)$ is a number in $[0,1]$ describing the probability that $X$ takes a value in $B$; the assignment $B\mapsto P(X\in B)$ is called the probability distribution of $X$. Here we will consider continuous random variables, so $P(X=x)=0$ for any single element $x\in A$.

Definition: The probability density function of $X$ is the function $f:A\to \R$ satisfying
  • $f(x)\geqslant 0$ for all $x\in A$, and
  • $\int_B f(x)\ dx = P(X\in B)$ for any $B\subseteq A$.
The second condition implies $\int_A f(x)\ dx=1$.

Often authors use just $P$ instead of $f$, writing $P(x)$ for the density value $f(x)$.

Definition: Let $Y=g(X)$ be another random variable. The expected value of $Y$ is
\[
E[Y] = E[g(X)] = \int_Ag(x)f(x)\ dx.
\]
The mean of $X$ is $\mu= E[X]$, and the variance of $X$ is $\sigma^2 = E[(X-\mu)^2]$. If $\vec X=(X_1\ \cdots\ X_n)^T$ is a multivariate random variable, then $\vec \mu=E[\vec X]$ is an $n$-vector, and the variance is an $(n\times n)$-matrix given as
\[
\Sigma = E[(\vec X-E[\vec X])(\vec X-E[\vec X])^T]
\hspace{1cm}
\text{or}
\hspace{1cm}
\Sigma_{ij} = E[(X_i-E[X_i])(X_j-E[X_j])].
\]
The covariance of $X$ and $Y$ is $E[(X-E[X])(Y-E[Y])]$. Note that the covariance of $X$ with itself is just the usual variance of $X$.
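
To make these definitions concrete, here is a short NumPy check (my own addition) that estimates $\vec\mu$ and $\Sigma$ from samples using exactly the formula above:

import numpy as np

rng = np.random.default_rng(0)
# Draw samples of a 2-dimensional random variable X = (X_1, X_2)^T.
samples = rng.multivariate_normal(mean=[1.0, -2.0],
                                  cov=[[2.0, 0.5], [0.5, 1.0]],
                                  size=100_000)

mu = samples.mean(axis=0)                     # estimate of E[X]
centered = samples - mu
Sigma = centered.T @ centered / len(samples)  # Sigma_ij = E[(X_i - E[X_i])(X_j - E[X_j])]

print(mu)     # close to [1, -2]
print(Sigma)  # close to [[2, 0.5], [0.5, 1]]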

Example: One example of a probability distribution is the normal (or Gaussian) distribution, and we say a random variable with the normal distribution is normally distributed. If a random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then the probability density function of $X$ is
\[
f(x) = \frac{\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)}{\sigma\sqrt{2\pi}}.
\]
If $\vec X=(X_1\ \cdots\ X_n)^T$ is a normally distributed multivariate random variable, then $\vec \mu = (E[X_1]\ \cdots\ E[X_n])^T$ and the probability density function of $\vec X$ is
\[
f(\vec x) = \frac{\exp\left(-\frac 12 (\vec x-\vec \mu)^T\Sigma^{-1}(\vec x-\vec \mu)\right)}{\sqrt{(2\pi)^n\det(\Sigma)}}.
\]
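
As a quick sanity check of this formula (again my own addition), one can compare a direct evaluation with SciPy's built-in multivariate normal density:

import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
x = np.array([0.5, -1.0])

# Evaluate the density directly from the formula above.
diff = x - mu
by_hand = np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff) \
          / np.sqrt((2 * np.pi) ** len(mu) * np.linalg.det(Sigma))

# Compare with SciPy's built-in multivariate normal density; the two should agree.
print(by_hand, multivariate_normal(mean=mu, cov=Sigma).pdf(x))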

Definition: A measure on $\R^D$ is a function $m:\{$subsets of $\R^D\}\to [0,\infty]$ such that $m(\emptyset) = 0$ and $m(\bigcup_{i\in I} E_i) = \sum_{i\in I} m(E_i)$ for $\{E_i\}_{i\in I}$ a countable sequence of disjoint subsets of $\R^D$. A probability measure on $\R^D$ is a measure $m$ on $\R^D$ with the added condition that $m(\R^D)=1$.

A probability distribution is an example of a probability measure.

Definition: Let $U= \{U_i\}_{i\in I}$ be a covering of a topological space $M$. The nerve of the covering $U$ is a set $N$ of subsets of $I$ given by
\[
N = \left\{J\subset I\ :\ \bigcap_{j\in J} U_j \neq\emptyset\right\}.
\]
Note that this makes $N$ into an abstract simplicial complex, as $J\in N$ implies $J'\in N$ for all $J'\subseteq J$.
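
Since the definition is purely combinatorial, it is easy to compute for small coverings. A brute-force sketch of my own (only suitable for small index sets $I$), representing each $U_i$ as a Python set and building $N$ directly from the definition:

from itertools import combinations

def nerve(cover):
    'Nerve of a covering given as a dict {i: U_i}, with each U_i a finite set.'
    indices = list(cover)
    N = []
    for k in range(1, len(indices) + 1):
        for J in combinations(indices, k):
            # J is a simplex of the nerve when the U_j have a common element.
            if set.intersection(*(cover[j] for j in J)):
                N.append(set(J))
    return N

# Three sets covering {0,...,4}; only consecutive ones overlap.
print(nerve({1: {0, 1, 2}, 2: {2, 3}, 3: {3, 4}}))
# [{1}, {2}, {3}, {1, 2}, {2, 3}]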

Let $M$ be a smooth compact submanifold of $\R^D$. By the tubular neighborhood theorem (see Theorem 2.11.4 in [3]), $M$ has an $\epsilon$-tubular neighborhood for some $\epsilon>0$.

Definition: For a particular embedding of $M$, let the condition number of $M$ be $\tau=\sup\{\epsilon\ :$ $M$ has an $\epsilon$-tubular neighborhood$\}$.

Distributions on a manifold

Let $M$ be a $d$-dimensional manifold embedded in $\R^D$, with $D>d$. Recall that every element in $NM\subseteq \R^D$, the normal bundle of $M$, may be represented as a pair $(\vec x,\vec y)$, where $\vec x\in M$ and $\vec y\in T^\perp$ (since $M$ is a manifold, all the normal spaces are isomorphic). Hence we may consider a probability distribution $P$ on $NM$, with $\vec X$ the $d$-multivariate random variable representing points on $M$ and $\vec Y$ the $(D-d)$-multivariate random variable representing points on the space normal to $M$ at a point on $M$. We make the assumption that $\vec X$ and $\vec Y$ are independent, or that
\[
P(\vec X, \vec Y) = P_M(\vec X)P_{T^\perp}(\vec Y).
\]
That is, $P_{T^\perp}$ is a probability distribution that is the same at any point on the manifold.

Definition: Let $P$ be a probability distribution on $NM$ and $f_M$ the probability density function of $P_M$. In the context described above, $P$ satisfies the strong variance condition if
  • there exist $a,b>0$ such that $f_M(\vec x)\in [a,b]$ for all $\vec x\in M$, and
  • $P_{T^\perp}(\vec Y)$ is normally distributed with $\vec \mu = 0$ and $\Sigma = \sigma^2I$.
The second condition implies that the covariance of $Y_i$ and $Y_j$ is zero whenever $i\neq j$, and that all the $Y_i$ have the same variance. From the normally distributed multivariate example above, this also tells us that the probability density function $f^\perp$ of $\vec Y$ is
\[
f^\perp(\vec y) = \frac{\exp\left(\displaystyle-\frac{1}{2\sigma^2}\sum_{i=1}^{D-d}y_i^2\right)}{\sigma^{D-d}\sqrt{(2\pi)^{D-d}}}.
\]
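
For instance (my own illustration), with $M = S^1\subset\R^2$ the unit circle, the normal space at each point is the radial line, and a distribution of the form $P_M\cdot P_{T^\perp}$ satisfying the strong variance condition can be sampled as follows:

import numpy as np

def noisy_circle_sample(n, sigma, rng=None):
    'Sample n points near the unit circle: uniform on the circle plus N(0, sigma^2) radial noise.'
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0, 2 * np.pi, n)     # P_M: uniform on the manifold M = S^1
    noise = rng.normal(0, sigma, n)          # P_{T^perp}: N(0, sigma^2) in the normal direction
    radius = 1 + noise                       # push each point off M along its normal line
    return np.column_stack((radius * np.cos(theta), radius * np.sin(theta)))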

Theorem:
In the context described above, let $P$ be a probability distribution on $NM$ satisfying the strong variance condition, and let $\delta>0$. If there is $c>1$ such that
\[
\sigma <\frac{c\tau(\sqrt9-\sqrt 8)}{9\sqrt{8(D-d)}},
\]
then there is an algorithm that computes the homology of $M$ from a random sample of $n$ points, with probability $1-\delta$. The number $n$ depends on $\tau,\delta,c,d,D$, and the diameter of $M$.
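
To get a sense of scale (ignoring the constant $c$), for a hypersurface, where $D-d=1$, the bound as stated is
\[
\frac{\sqrt 9-\sqrt 8}{9\sqrt 8} = \frac{3-2\sqrt 2}{18\sqrt 2} \approx 0.0067,
\]
so the noise level $\sigma$ must be smaller than roughly $0.7\%$ of the condition number $\tau$.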

The homology computing algorithm

Below is a broad view of the algorithm described in sections 3, 4, and 5 of [1]. Let $M$ be a $d$-manifold embedded in $\R^D$, and $P$ a probability measure on $NM$ satisfying the strong variance condition.

1. Calculate the following numbers:
\begin{align*}
\tau & = \text{condition number of $M$}\\
\text{vol}(M) & = \text{volume of $M$}\\
\sigma^2 & = \text{variance of $P$}
\end{align*}
2. Define (or choose) the following numbers:
\begin{align*}
\delta & \in (0,1) \\
r & \in \left(2\sqrt{2(D-d)}\sigma,\textstyle\frac\tau9 (3-2\sqrt 2)\right) \\
n & > \text{function}(a,r,\tau,d,\delta,\text{vol}(M)) & (\max(A,B)\ \text{in Proposition 9 of [1]}) \\
s & = 4r \\
deg & > \textstyle \frac{3a}4 \left(1-\left(\frac r{2\tau}\right)^2\right)^{d/2}\text{vol}\left(B^d(r,0)\right)\\
R & = (9r+\tau)/2
\end{align*}
3. Choose $n$ points randomly from $NM$ according to $P$.
4. From these $n$ points, construct the nearest neighbor graph $G$ with distance $s$.
5. Remove from $G$ all the vertices of degree $<deg$ to get a refined graph $G'$.
6. Set $U=\bigcup_{\vec x\in V(G')}B^D(R,\vec x)$ and construct the simplicial complex $K$ given by the nerve of this covering.
7. Compute the homology of $K$, which is the homology of $M$, with probability $1-\delta$.
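
A minimal Python sketch of steps 3 through 5 (my own, using noisy samples of a circle as the input data; the nerve and homology computations of steps 6 and 7 would require a TDA library and are omitted):

import numpy as np
from scipy.spatial.distance import cdist

def denoised_vertices(points, s, deg):
    'Steps 4 and 5: nearest neighbor graph with distance s, then drop vertices of degree < deg.'
    dists = cdist(points, points)
    adjacency = (dists <= s) & ~np.eye(len(points), dtype=bool)  # edge when distance <= s
    degrees = adjacency.sum(axis=1)
    return points[degrees >= deg]

# Step 3 (illustration): noisy samples near the unit circle, as in the strong variance condition.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 2000)
points = np.column_stack(((1 + rng.normal(0, 0.05, 2000)) * np.cos(theta),
                          (1 + rng.normal(0, 0.05, 2000)) * np.sin(theta)))

kept = denoised_vertices(points, s=0.2, deg=20)
# Steps 6 and 7: the nerve of the balls B^D(R, x) around the kept vertices and its homology
# are left to a TDA library.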

References:
[1] Niyogi, Smale, and Weinberger (A topological view of unsupervised learning from noisy data)
[2] Folland (Real analysis, Chapter 10.1)
[3] Bredon (Topology and Geometry, Chapter 2.11)