
Sunday, July 31, 2016

(Co)fibrations, suspensions, and loop spaces

Seminar topic

Recall the exponential object Z^Y, which, in the category of topological spaces, is the set of all continuous functions Y\to Z. In general, the definition involves a commuting diagram and gives an isomorphism \mathrm{Hom}(X\times Y,Z)\cong\mathrm{Hom}(X,Z^Y). The subspace F(Y,Z) of Z^Y consists of based functions Y\to Z.
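Concretely, the isomorphism is currying. A worked statement (standard, not specific to this post): to f:X\times Y\to Z associate \hat{f}:X\to Z^Y, and to g:X\to Z^Y associate \check{g}:X\times Y\to Z, via
\hat{f}(x)(y) = f(x,y), \qquad \check{g}(x,y) = g(x)(y).
With the compact-open topology on Z^Y, these are mutually inverse natural bijections when Y is locally compact Hausdorff, or when one works in a convenient category such as compactly generated spaces.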

Definition: Let F,E,B,X be topological spaces. A map i:F\to E is a cofibration if for every map f:E\to X and every homotopy h:F\times I\to X, there exists a homotopy \tilde h:E\times I\to X (extending h) making either of the equivalent diagrams below commute.

The horizontal maps on the left are the natural inclusion maps x\mapsto(x,0) and the map on the right is the natural evaluation map \varphi\mapsto\varphi(0). Similarly, a map p:E\to B is a fibration if for every map g:X\to E and every homotopy h:X\times I\to B, there exists a homotopy \tilde h:X\times I\to E (lifting h) making either of the equivalent diagrams below commute.

The horizontal maps on the right are the natural evaluation maps and the map on the left is the natural inclusion map.
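Since the diagrams did not survive the conversion, here is a sketch of the standard squares (after May, Chapters 6 and 7), in tikz-cd notation; in the second diagram, h also denotes the adjoint map F\to X^I of the homotopy.

% Requires \usepackage{tikz-cd}.
% Homotopy extension property for i : F -> E, and its adjoint form:
\begin{tikzcd}
F \arrow[r, "x\mapsto(x,0)"] \arrow[d, "i"'] & F\times I \arrow[d, "i\times\mathrm{id}"] \arrow[ddr, bend left, "h"] & \\
E \arrow[r, "x\mapsto(x,0)"'] \arrow[drr, bend right, "f"'] & E\times I \arrow[dr, dashed, "\tilde h"] & \\
& & X
\end{tikzcd}
\qquad
\begin{tikzcd}
F \arrow[r, "h"] \arrow[d, "i"'] & X^I \arrow[d, "\varphi\mapsto\varphi(0)"] \\
E \arrow[r, "f"'] \arrow[ur, dashed, "\tilde h"] & X
\end{tikzcd}
% Homotopy lifting property for p : E -> B:
\begin{tikzcd}
X \arrow[r, "g"] \arrow[d, "x\mapsto(x,0)"'] & E \arrow[d, "p"] \\
X\times I \arrow[r, "h"'] \arrow[ur, dashed, "\tilde h"] & B
\end{tikzcd}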

Instead of this terminology, often we say the pair (F,E) has the homotopy extension property and the pair (E,B) has the homotopy lifting property. Now, let (X,x) be a pointed topological space.

Definition: The (reduced) suspension \Sigma X of X is
\Sigma X := X\times I/(X\times\{0\}\,\cup\, X\times\{1\}\,\cup\,\{x\}\times I).
The unreduced suspension SX of X is
SX := X\times I/{\sim}, where X\times\{0\} and X\times\{1\} are each collapsed to a point.
The loop space \Omega X of X is
\Omega X := F(S^1,X).
Remark: If X is well-pointed (the inclusion i:\{x\}\hookrightarrow X is a cofibration), then the natural quotient map SX\to\Sigma X is a homotopy equivalence. Moreover, there is an adjunction F(\Sigma X,Y)\cong F(X,\Omega Y). Passing to based homotopy classes, this gives the adjunction
[\Sigma X,Y]\cong[X,\Omega Y],
where [A,B] is the set of based homotopy classes of maps A\to B.
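As a quick consistency check (mine, not from the post), taking X=S^n and using \Sigma S^n\cong S^{n+1}, the adjunction recovers the homotopy groups of a loop space:
\pi_{n+1}(Y) = [S^{n+1},Y] \cong [\Sigma S^n,Y] \cong [S^n,\Omega Y] = \pi_n(\Omega Y).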

References: May (A concise course in algebraic topology, Chapters 6, 7, 8), Aguilar, Gitler, and Prieto (Algebraic topology from a homotopical viewpoint, Chapter 2.10)

Monday, July 25, 2016

Connections, curvature, and Higgs bundles

Recall (from a previous post) that a Kähler manifold M is a complex manifold (with natural complex structure J) with a Hermitian metric g whose fundamental form \omega is closed; in this context \omega is called the Kähler form. Previously we used upper-case letters V,W to denote vector fields on M, but here we use lower-case letters s,u,v and call them sections (to consider vector bundles more generally as sheaves).

Definition: A connection on M is a \mathbb{C}-linear homomorphism \nabla:A^0_M\to A^1_M satisfying the Leibniz rule \nabla(fs) = (df)\otimes s + f\nabla(s), for s a section of TM and f\in C^\infty(M). Here A^0_M denotes the smooth sections of TM and A^1_M the smooth TM-valued 1-forms.

For ease of notation, we often write \nabla_u s for (\nabla s)(u), where s,u are sections of TM. On Kähler manifolds there is a special connection that we will consider.
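For orientation (standard material, not from the post): in local coordinates, a connection on TM is determined by its Christoffel symbols \Gamma^k_{ij}, and the Leibniz rule forces
\nabla_{\partial_i}\partial_j = \sum_k \Gamma^k_{ij}\,\partial_k, \qquad \nabla_u s = \sum_k\Big(u(s^k) + \sum_{i,j} u^i s^j\,\Gamma^k_{ij}\Big)\partial_k,
for u=\sum_i u^i\partial_i and s=\sum_j s^j\partial_j.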

Proposition:
On M there is a unique connection \nabla that is (for any u,v\in A^0_M)
  1. Hermitian (satisfies d(g(u,v)) = g(\nabla u,v) + g(u,\nabla v)),
  2. torsion-free (satisfies \nabla_u v - \nabla_v u - [u,v] = 0), and
  3. compatible with the complex structure J (satisfies \nabla_u(Jv) = J\nabla_u v, that is, \nabla J = 0).

If \nabla satisfies the first two conditions, it is called the Levi-Civita connection, and if it satisfies the first and third conditions, it is called the Chern connection. If g is not necessarily Hermitian, \nabla is called metric if it satisfies the first condition. From here on out, \nabla denotes the unique connection described in the proposition above.

Definition: The curvature tensor of M is defined by
R(u,v) = \nabla_u\nabla_v - \nabla_v\nabla_u - \nabla_{[u,v]}.
Since R(u,v) is C^\infty(M)-linear in each argument and antisymmetric in u and v, the curvature may be viewed as a 2-form with values in \mathrm{End}(TM). The Ricci tensor of M is defined by
r(u,v) = \mathrm{trace}(w\mapsto R(w,u)v) = \sum_i g(R(a_i,u)v,a_i),
for the a_i a local orthonormal frame of A^0_M = \Gamma(TM). This is a symmetric bilinear form on vector fields. The Ricci curvature of M is defined by
\mathrm{Ric}(u,v) = r(Ju,v).
This is a real 2-form on M (the Ricci form).

Definition: An Einstein manifold is a pair (M,g) that is Riemannian and for which the Ricci tensor is directly proportional to the Riemannian metric. That is, there exists a constant \lambda\in\mathbb{R} such that r(u,v)=\lambda g(u,v) for any u,v\in A^0_M.
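A standard example (mine, not from the post): the round sphere of radius \rho in \mathbb{R}^{n+1} is Einstein with positive constant, since
r(u,v) = \frac{n-1}{\rho^2}\,g(u,v), \qquad\text{so}\qquad \lambda = \frac{n-1}{\rho^2}.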

Recall that a holomorphic vector bundle \pi:E\to M has complex fibers and holomorphic projection map \pi. Here we consider two special vector bundles (as sheaves), defined on open sets U\subseteq M by
\mathcal{E}nd(E)(U) = \{f:\pi^{-1}(U)\to\pi^{-1}(U)\ :\ f|_{\pi^{-1}(x)}\ \text{is a homomorphism for each}\ x\in U\},
\Omega_M(U) = \left\{\sum_{i=1}^n f_i\,dz_i\ :\ f_i\in\mathcal{O}(U)\right\},
where z_1,\dots,z_n are local coordinates on U. The first is the endomorphism sheaf of E and the second is the sheaf of holomorphic differential forms of M, or the holomorphic cotangent sheaf. The cotangent sheaf as defined is a presheaf, so we sheafify to get \Omega_M.

Definition: A Higgs vector bundle over a complex manifold M is a pair (E,\theta), where \pi:E\to M is a holomorphic vector bundle and \theta is a holomorphic section of \mathcal{E}nd(E)\otimes\Omega_M with \theta\wedge\theta=0, called the Higgs field.
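Unpacking the integrability condition in local coordinates (a standard computation, not in the original post): writing \theta = \sum_i \theta_i\,dz_i with \theta_i local holomorphic sections of \mathcal{E}nd(E),
\theta\wedge\theta = \sum_{i<j} [\theta_i,\theta_j]\; dz_i\wedge dz_j, \qquad\text{so}\qquad \theta\wedge\theta = 0 \iff [\theta_i,\theta_j]=0 \text{ for all } i,j.
In particular, over a Riemann surface \Omega_M is a line bundle, so the condition \theta\wedge\theta=0 is automatic.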

References: Huybrechts (Complex Geometry, Chapters 4.2, 4.A), Kobayashi and Nomizu (Foundations of Differential Geometry, Volume 1, Chapter 6.5)

Saturday, July 2, 2016

On the separation of nearest neighbors

We work through Lemma 3 (called the "AB Lemma" or the "cleaning procedure") of [2], adopting a cleaner and more thorough approach.

Necessary tools

Definition: The inverse of the complex-valued function f(z)=ze^z is called the Lambert W-function and denoted by W=f^{-1}. When restricted to the real numbers, it is multi-valued on the interval [-1/e,0), so it is split up into two branches: the principal branch W_0 (taking values in [-1,\infty)) and the lower branch W_{-1} (taking values in (-\infty,-1]).
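The two real branches are available in SciPy; a minimal sketch (the sample point x = -0.1 is an illustrative choice, not from the post):

# Evaluating the two real branches of the Lambert W-function.
import numpy as np
from scipy.special import lambertw

x = -0.1  # a point in (-1/e, 0), where both real branches are defined
w0 = lambertw(x, k=0).real    # principal branch W_0, value >= -1
wm1 = lambertw(x, k=-1).real  # lower branch W_{-1}, value <= -1

# Both satisfy the defining equation w * exp(w) = x.
assert np.isclose(w0 * np.exp(w0), x)
assert np.isclose(wm1 * np.exp(wm1), x)
print(w0, wm1)  # approximately -0.1118 and -3.5772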

Hoeffding's inequality gives an upper bound on the probability that a sum of random variables deviates from its mean. The authors of [2] use a similar inequality, the Chernoff bound, but Hoeffding gives a tighter bound on the event in question.

Proposition:
(Hoeffding - Theorem 2 and Equation (1.4) of [1])
Let X1,,Xn be independent random variables, with Xi bounded on the interval [ai,bi]. Then
P\left(\left|\frac{1}{n}\sum_{i=1}^n X_i - \frac{1}{n}\sum_{i=1}^n E[X_i]\right| \geqslant t\right) \leqslant 2\exp\left(\frac{-2n^2t^2}{\sum_{i=1}^n(b_i-a_i)^2}\right) \hspace{.5cm}\text{for all}\ t>0.
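A quick empirical sanity check of the bound (the Bernoulli distribution, n, and t below are illustrative choices, not from [1] or [2]):

# Comparing the observed deviation frequency of a sample mean
# against the Hoeffding bound 2 exp(-2 n t^2) for [0,1]-valued variables.
import numpy as np

rng = np.random.default_rng(1)
n, t, trials = 100, 0.1, 20_000

# Sample means of n Bernoulli(0.3) variables, over many independent trials.
means = rng.binomial(1, 0.3, size=(trials, n)).mean(axis=1)
empirical = np.mean(np.abs(means - 0.3) >= t)

bound = 2 * np.exp(-2 * n * t**2)
print(empirical, bound)  # the empirical frequency stays below the bound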

The union bound (or Boole's inequality) says that the probability of one of a collection of events happening is no larger than the sum of the probabilities of each of the events happening.

Proposition:
Let A_1,A_2,\dots be a countable collection of events. Then P(\bigcup_i A_i) \leqslant \sum_i P(A_i).

The setup

Let P be a probability distribution on \R^n and X=\{x_1,\dots,x_k\}\subseteq \R^n a finite set of points drawn according to P. These points may be considered as random variables X_1,X_2,\dots,X_k on the sample space \R^n, with X_i evaluating to 1 only on x_i, and 0 otherwise. Choose s>0 and construct the nearest neighbor graph G on X with parameter s, in which distinct vertices x_i,x_j are joined by an edge iff x_j\in B^n(s,x_i), the ball of radius s centered at x_i. Write X=A\cup B and set
\eta := \inf_{a\in A,b\in B}\left\{|| a-b||\right\} \hspace{1cm},\hspace{1cm} \alpha_s := \inf_{a\in A}\left\{P(B^n(s,a))\right\} \hspace{1cm},\hspace{1cm} \beta_s := \sup_{b\in B}\left\{P(B^n(s,b))\right\},
with h = (\alpha_s-\beta_s)/2. We assume that
  • \eta>0, so A and B are disjoint;
  • s<\eta/2, so A and B are in separate components of G; and
  • \alpha_s >\beta_s, so the ball of radius s about any point of A carries more P-mass than the ball about any point of B.
Proposition: Choose \delta\in (0,1). If |X| >-W_{-1}(-\delta h^2e^{-2h^2})/(2h^2), then with probability at least 1-\delta, for all a\in A and b\in B,
\frac{\deg_G(a)}{k - 1} > \frac{\alpha_s+\beta_s}2 \hspace{1cm}\text{and}\hspace{1cm} \frac{\deg_G(b)}{k - 1} < \frac{\alpha_s+\beta_s}2.
The statement also holds with \alpha,\beta in place of \alpha_s,\beta_s, for any \alpha,\beta such that \alpha_s\geqslant \alpha >\beta \geqslant \beta_s, which may be useful to bound the degree of vertices in G.
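To make the setup concrete, here is a small sketch (the two-cluster sample, s, and all numbers are my illustrative choices, not from [2]) that builds the graph with parameter s and compares normalized degrees across the two pieces:

# Building the graph G (edges between points within distance s) and
# computing the normalized degrees deg_G(x_i)/(k-1) from the proposition.
import numpy as np

rng = np.random.default_rng(0)

# A dense cluster (playing the role of A) and a sparse one (B), far apart.
k_A, k_B = 80, 20
A = rng.normal(loc=0.0, scale=0.1, size=(k_A, 2))
B = rng.normal(loc=5.0, scale=0.1, size=(k_B, 2))
X = np.vstack([A, B])
k = len(X)

s = 0.5  # graph parameter

# Adjacency matrix: x_i ~ x_j iff ||x_i - x_j|| <= s (no self-loops).
dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
adj = (dist <= s) & ~np.eye(k, dtype=bool)
deg = adj.sum(axis=1)

# Points of the dense cluster sit above the threshold, sparse ones below.
print("A:", deg[:k_A].mean() / (k - 1))  # roughly (k_A - 1)/(k - 1)
print("B:", deg[k_A:].mean() / (k - 1))  # roughly (k_B - 1)/(k - 1)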

The proof

For each i=1,\dots,k and each j\neq i, define new random variables Y_{ij} on the sample space X, with Y_{ij} evaluating to 1 on x_j iff x_j\in B^n(s,x_i), and evaluating to 0 otherwise. The mean of Y_{ij} is P(B^n(s,x_i)). Since, for fixed i, the Y_{ij} are independent with the same mean and take values in [0,1], Hoeffding's inequality (with n=k-1 and t=h) gives that
\left(\begin{array}{c} \text{the probability that the fraction of the sampled $x_j$}\\ \text{lying in $B^n(s,x_i)$ deviates from the mass} \\ \text{of $B^n(s,x_i)$ by at least $h$} \end{array}\right) = P\Bigg(\underbrace{\left|\frac{1}{k-1}\sum_{j\neq i} Y_{ij} - P(B^n(s,x_i))\right| \geqslant h}_{\text{event}\ A_i}\Bigg)  \leqslant 2e^{-2h^2(k-1)}.
The union bound gives that
\left(\begin{array}{c} \text{the probability that at}\\ \text{least one $A_i$ occurs} \end{array}\right) = P\left(\bigcup_{i=1}^k A_i\right) \leqslant \sum_{i=1}^k P(A_i) \leqslant 2ke^{-2h^2(k-1)}.
Note that \sum_{j\neq i} Y_{ij} = \deg_G(x_i) for every i, so whenever \delta>2ke^{-2h^2(k-1)}, with probability at least 1-\delta no event A_i occurs; that is, for every i,
\left| \frac{\deg_G(x_i)}{k-1} - P(B^n(s,x_i))\right| < h \hspace{1cm}\text{or equivalently}\hspace{1cm} P(B^n(s,x_i)) -h < \frac{\deg_G(x_i)}{k-1} < P(B^n(s,x_i)) + h.
When x_i\in A (x_i\in B) we have a lower (upper) bound of \alpha_s (\beta_s) on P(B^n(s,x_i)). Indeed:
\frac{\deg_G(a)}{k-1} > \alpha_s- h =  \frac{\alpha_s+\beta_s}2 \hspace{1cm}\text{and}\hspace{1cm} \frac{\deg_G(b)}{k-1} < \beta_s + h = \frac{\alpha_s+\beta_s}2.
To find how many points we need to sample, we solve for k in the inequality \delta > 2ke^{-2h^2(k-1)}. Substituting w = -2h^2k turns this into we^w > -\delta h^2e^{-2h^2}, which on the branch w\leqslant -1 holds precisely when w < W_{-1}(-\delta h^2e^{-2h^2}); that is,
k > \frac{-1}{2h^2}W_{-1}\left(-\delta h^2e^{-2h^2}\right),
completing the proof.
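In practice the bound is easy to evaluate numerically; a sketch (the values of \delta and h are illustrative choices, not from [2]):

# Smallest sample size k satisfying k > -W_{-1}(-delta h^2 e^{-2h^2})/(2h^2).
import numpy as np
from scipy.special import lambertw

def min_sample_size(delta: float, h: float) -> int:
    w = lambertw(-delta * h**2 * np.exp(-2 * h**2), k=-1).real
    return int(np.ceil(-w / (2 * h**2)))

delta, h = 0.05, 0.1
k = min_sample_size(delta, h)
print(k)  # 496 for these parameters

# Consistency check: the failure probability 2k e^{-2 h^2 (k-1)} is below delta.
assert 2 * k * np.exp(-2 * h**2 * (k - 1)) < delta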

References:
[1] Hoeffding (Probability inequalities for sums of bounded random variables)
[2] Niyogi, Smale, and Weinberger (A topological view of unsupervised learning from noisy data)