Thursday, December 22, 2016

Sampling points uniformly on parametrized manifolds

Here I'll describe how to sample points uniformly on a (parametrized) manifold, along with an actual implementation in Python. Let $M$ be a $m$-dimensional manifold embedded in $\R^n$ via $f:\R^m\to \R^n$. Moreover, assume that $f$ is Lipschitz (true if $M$ is compact), injective (true if $M$ is embedded), and is a parameterization, in the sense that there is an $m$-rectangle $A = [a_1,b_1]\times \cdots \times [a_m,b_m]$ such that $f(A) = M$ (the intervals need not be closed). Set $(\widetilde Jf)^2 = \det(Df\cdot Df^T)$ to be the $m$-dimensional Jacobian, and calculate\[c = \int_{a_m}^{b_m}\cdots \int_{a_1}^{b_1} \widetilde J f\ dx_1\cdots dx_m.\]Recall the brief statistical background presented in a previous blog post ("Reconstructing a manifold from sample data, with noise," 2016-05-26). A uniform or constant probability density function is valued the same at every point on its domain.

Proposition: In the setting above:
  • (completely separable) Let $g_1,\dots,g_m$ be probability density functions on $[a_1,b_1],\dots,[a_m,b_m]$, respectively. If $g_1\cdots g_m = \widetilde J f / c$, then the joint probability density function of $g_1,\dots,g_m$ is uniform on $M$ with respect to the metric induced from $\R^n$.
  • (non-separable) Let $g$ be a probability density function on $[a_1,b_1]\times\cdots\times[a_m,b_m]$. If $g=\widetilde J f/c$, then $g$ is uniform on $M$ with respect to the metric induced from $\R^n$.

A much more abstract statement and proof are given in [2], Section 3.2.5, but assuming $f$ is injective and $M$ is in $\R^n$, we evade the worst notation. Section 2.2 of [1] gives a brief explanation of how the given statement follows, while Section 2 of [3] goes into more detail of why the above is true.

Example:
Let $M = S^2$, the sphere of radius $r$, and $f:[0,2\pi)\times [0,\pi) \to \R^3$ the natural embedding given by
\[
(\theta,\varphi) \mapsto (r\cos(\theta)\sin(\varphi),r\sin(\theta)\sin(\varphi), r\cos(\varphi)),
\]with\begin{align*}
Df & = \begin{bmatrix}
-r \sin(\varphi) \sin(\theta) & r \cos(\theta)\sin(\varphi) &  0 \\
r \cos(\varphi) \cos(\theta) & r \cos(\varphi) \sin(\theta) & -r \sin(\varphi)
\end{bmatrix}, & \widetilde Jf & = r^2\sin(\varphi), \\
Df\cdot Df^T & = \begin{bmatrix}
r^2 \sin^2(\varphi) & 0 \\ 0 & r^2
\end{bmatrix}, & c & = 4\pi r^2.
\end{align*}
Let $g_1(\theta) = 1/2\pi$ be the uniform distribution over $[0,2\pi)$, meaning that $g_2(\varphi) = \sin(\varphi)/2$ over $[0,\pi)$. Sampling points randomly from these two distributions and applying $f$ will give uniformly sampled points on $S^2$.

Example: Let $M = T^2$, the torus of major radius $R$ and minor radius $r$, and $f:[0,2\pi)\times [0,2\pi) \to \R^3$ the natural embedding given by
\[
(\theta,\varphi) \mapsto ((R+r\cos(\theta))\cos(\varphi),(R+r\cos(\theta))\sin(\varphi), r\sin(\theta)),
\]with\begin{align*}
Df & = \begin{bmatrix}
-r \cos(\varphi)\sin(\theta) & -r \sin(\varphi) \sin(\theta) & r \cos(\theta) \\
-(R + r \cos(\theta)) \sin(\varphi) & \cos(\varphi) (R + r \cos(\theta)) & 0
\end{bmatrix}, & \widetilde Jf & = r(R+r\cos(\theta)), \\
Df\cdot Df^T & = \begin{bmatrix}
r^2 & 0 \\ 0 & (R + r \cos(\theta))^2
\end{bmatrix}, & c & = 4\pi^2rR.
\end{align*}
Let $g_2(\varphi) = 1/2\pi$ be the uniform distribution over $[0,2\pi)$, meaning that $g_1(\theta) = (1+r\cos(\theta)/R)/(2\pi)$ over $[0,2\pi)$. Sampling points randomly from these two distributions and applying $f$ will give uniformly sampled points on $T^2$.

Below I give a simple implementation of how to actually sample points, in Python using the SciPy package. The functions $f,g_1,\dots,g_m$ are all assumed to be given.

import scipy.stats as st

class var_g1(st.rv_continuous):
    'Uniform variable 1'
    def _pdf(self, x):
        return g1(x)
...
class var_gm(st.rv_continuous):
    'Uniform variable m'
    def _pdf(self, x):
        return gm(x)

dist_g1 = var_g1(a=a1, b=b1, name='Uniform distribution 1')
...
dist_gm = var_gm(a=am, b=bm, name='Uniform distribution m')

def mfld_sample():
    return f(dist_g1.rvs(),...,dist_gm.rvs())
A further application for this would be to understand how to sample points uniformly on projective manifolds, with a leading example the Grassmannian, embedded via Plucker coordinates.

References:
[1] Diaconis, Holmes, and Shahshahani (Sampling from a manifold, Section 2.2)
[2] Federer (Geometric measure theory, Section 3.2.5)
[3] Rhee, Zhou, and Qiu (An iterative algorithm for sampling from manifolds, Section 2)

Thursday, December 8, 2016

The conditioning number of a helix, part 2

Recall the previous attempt to find the conditioning number of a helix (see post "The conditioning number of a helix, part 1," 2016-10-31). Here we complete the approach and although exact solutions are hard to find, we give close approximations.

The setting was a helix $C$ of radius $r$ and stretch $c$, so given as the zero locus of $x-r\cos(z/c)$ and $y-r\sin(z/c)$, and we wanted to find where the normal plane at a point $p\in C$ intersects $C$ again. It may intersect $C$ several times, but we are only interested in the shortest distances. Without loss of generality, assume that $p=(r,0,0)$. The normal plane at $p$ is then given by
\[
0 =
\det\begin{bmatrix}
x-r\cos(p_z/c) & y-r\sin(p_z/c) & z-p_z \\
1 & 0 & r\sin(p_z/c)/c \\
0 & 1 & -r\cos(p_z/c)/c
\end{bmatrix}
=
\det\begin{bmatrix}
x-r & y & z \\
1 & 0 & 0 \\
0 & 1 & -r/c
\end{bmatrix}
=
\frac rc y + z.\]Since the cylinder on which the helix $C$ lies is $x^2+y^2=r^2$, the curve $C'$ representing the intersection of the plane with the cylinder is given by the zero locus of $\pm r\sqrt{x^2-r^2} + cz$. This allows us to find the intersection with the helix. However, since $C$ is parametrized with $z$ the free variable and $C'$ with $x$ free, its is easier to switch to cylindrical coordinates
\[
\left(r = \sqrt{x^2+y^2},\theta = \arctan(y/x), z = z\right).\]Doing so gives a nice description of $C$ and $C'$ as below.\[
\begin{array}{r l c l}
C : &  (r\cos(z/c),r\sin(z/c),z) & = & (r,\theta,\theta c) \\
C' : &  (x,-\sqrt{r^2-x^2},r\sqrt{r^2-x^2}/c) & = & (r.\theta,r^2\sin(\theta)/c)
\end{array}\]The switch in coordinates is represented by the diagram below, where we have only used the top half of $C'$.
Finding $C\cap C'$ is equivalent to solving $\frac{c^2}{r^2} = \frac{\sin(\theta)}\theta$ for $\theta$, a task that can not be solved exactly. Instead we take the tangent lines to $C'$ on the unrolled cylinder at its base, and see where those intersect the line $\theta c$. Inspecting the areas of the tangent lines closer and calculating the euclidean distances in $\R^3$ from $p$ to $a$ and $b$, which is, I can't believe I'm saying this, a great exercise for the reader, we get the distances to be
\[
d(p,a) = \sqrt{2r^2\left(1+\cos\left(\frac{\pi c^2}{r^2-c^2}\right)\right) + \left(\frac{\pi cr^2}{r^2-c^2}\right)^2},
\hspace{.2cm}
d(p,b) = \sqrt{2r^2\left(1-\cos\left(\frac{2\pi c^2}{r^2+c^2}\right)\right) + \left(\frac{2\pi c r^2}{r^2+c^2}\right)^2}.\]Truthfully, the diagrams are tricky to draw in TikZ and I don't want to simply have a scan of some rough work. More importantly, $d(p,a) = d(p,b)$ implies $c = r/\sqrt 3$, meaning that when the stretch $c$ is larger than $r/\sqrt 3$, the normal planes certainly do not intersect the helix again.