Polar Decomposition and Real Powers of a Symmetric Matrix

The main purpose of this post is to prove the polar decomposition theorem for invertible matrices. As an application, we extract some information about the topology of {SL(2,\mathbb C)}, namely that { SL(2,\mathbb C)\cong S^3\times \mathbb R^3}. Along the way, we recall a few facts and also define real powers of a positive definite Hermitian matrix. We assume that the matrices below are nonsingular, even though some of the results hold without this assumption. We first prove that every Hermitian matrix has only real eigenvalues. Notice that, with respect to the standard Hermitian inner product, {|v|^2 = v^*v} for a column vector {v}; more generally, {\langle v,w \rangle = v^* w}.

Proposition 1 Let {A} be a Hermitian matrix, i.e. {A^*=A}. Then {A} has only real eigenvalues.

Proof: Over the complex numbers, every polynomial of degree {n\geq 1} has exactly {n} roots, counted with multiplicity. Thus, {A} has an eigenvalue. Let {\lambda} be an eigenvalue of {A} with an eigenvector {v\neq 0}, that is, {Av = \lambda v}. Taking the conjugate transpose of both sides, we get

\displaystyle \begin{array}{rcl} v^*A^* &=& \lambda^* v^* \\ v^*A &=& \lambda^* v^* \\ v^*Av &=& \lambda^* v^*v \\ v^*\lambda v &=& \lambda^* v^*v \\ \lambda |v|^2 &=& \lambda^* |v|^2. \end{array}

Since {v\neq 0}, {|v|^2\neq 0} and we get {\lambda=\lambda^*} which is only possible if {\lambda} is real. \Box
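As a quick numerical illustration of Proposition 1 (a NumPy sketch, not part of the proof), we can build a random Hermitian matrix and check that a general-purpose eigenvalue routine returns essentially real eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random Hermitian matrix: A = B + B^* satisfies A^* = A.
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T

# np.linalg.eigvals makes no symmetry assumption, yet every eigenvalue
# of a Hermitian matrix comes out real up to roundoff.
eigenvalues = np.linalg.eigvals(A)
assert np.allclose(eigenvalues.imag, 0.0, atol=1e-10)
```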

Recall that a matrix is called positive definite if {v^*Av>0} for all {v\neq 0}. Clearly, such a matrix must be nonsingular.

Proposition 2 If {A} is a positive definite matrix, then it has only positive real eigenvalues.

Proof: As above, let {\lambda} be an eigenvalue of {A} with an eigenvector {v\neq 0}. Then,

\displaystyle \begin{array}{rcl} Av &=& \lambda v \\ v^*Av &=& v^*\lambda v \\ 0 < v^*Av &=& \lambda v^*v \\ 0 < v^*Av &=& \lambda |v|^2 \end{array}

Thus, {\lambda} must be positive as well. \Box
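Proposition 2 can be illustrated the same way (again a NumPy sketch; the matrix {BB^*} is positive definite whenever {B} is invertible, by the computation in the proof of Theorem 4 below):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
R = B @ B.conj().T        # Hermitian, and positive definite for invertible B

# v^* R v is (numerically) real and positive for a random nonzero v ...
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert (v.conj() @ R @ v).real > 0

# ... and every eigenvalue of R is positive.
assert np.all(np.linalg.eigvalsh(R) > 0)
```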

If {A} is Hermitian, it has an eigenspace decomposition. Here is a sketch of the proof. We apply induction on the dimension {n}. The case {n=1} is clear, so we may assume {n>1}. Let {\lambda} and {v} be as above. We consider the one-dimensional space {V} spanned by {v} and its orthogonal complement {V^\perp}. By definition, for {w\in V^\perp}, {\langle w,v\rangle =0 }, or equivalently {w^*v=0}. Thus,

\displaystyle \begin{array}{rcl} 0 &=& \lambda w^*v \\ &=& w^*\lambda v \\ &=& w^*Av \\ &=& w^*A^*v \\ &=& (Aw)^*v \end{array}

or, in a more familiar form, {\langle Aw, v \rangle = 0}. This means that {A} preserves {V^\perp}, which has dimension {n-1}. So, by induction, {V^\perp} has an eigenspace decomposition and we are essentially done. The reason this argument is not entirely precise is that in this post we are using a concrete definition of being Hermitian, so we would also have to argue that {A} is still Hermitian when restricted to {V^\perp}. Of course, using the abstract definition (a self-adjoint operator on an inner product space), this is trivial. In fact, it is easy to translate every argument presented here into an abstract one.

Lemma 3 If {A} is Hermitian, then it is diagonalizable by a unitary matrix.

Proof: Since {A} has an eigenspace decomposition, we can choose a basis consisting of eigenvectors only. Eigenvectors corresponding to distinct eigenvalues are orthogonal, and within each eigenspace we may apply Gram–Schmidt, so we may choose this eigenbasis to be orthonormal. Consider the matrix {U} that takes the standard basis to this eigenbasis. Then, it is clear that {U^{-1}AU} is a diagonal matrix, and since the columns of {U} are orthonormal, {U^*U = I}. \Box
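Lemma 3 is exactly what `numpy.linalg.eigh` computes for a Hermitian input: real eigenvalues together with a unitary matrix of eigenvectors. A small sanity check (an illustration, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = B + B.conj().T                     # Hermitian

# eigh returns real eigenvalues w and a unitary eigenvector matrix U.
w, U = np.linalg.eigh(A)

assert np.allclose(U.conj().T @ U, np.eye(3))       # U^*U = I
assert np.allclose(U.conj().T @ A @ U, np.diag(w))  # U^{-1}AU is diagonal
```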

Next, we prove the polar decomposition for invertible matrices. In this proof, we also define the square root of a matrix.

Theorem 4 Given an invertible matrix {A}, there is a positive definite Hermitian matrix {P} and a unitary matrix {U} such that {A=PU}.

Proof: Let {R=AA^*}. Clearly, {R^*=R}, i.e. {R} is Hermitian. Also, for nonzero {v}, {A^*v} is nonzero since {A^*} is invertible, thus

\displaystyle \begin{array}{rcl} 0<\langle A^*v,A^*v\rangle &=& (A^*v)^*(A^*v) \\ &=& v^*AA^*v \\ &=& v^*Rv. \end{array}

So, {R} is also positive definite. By the above lemma, since {R} is Hermitian, there is a unitary matrix {K} which diagonalizes {R}, i.e. {K^{-1}RK=D}. Since {K} is unitary, {K^{-1}=K^*} and hence {K^*RK=D}. Also, since {R} is positive definite, all the eigenvalues of {R}, and hence the diagonal entries of {D}, are positive, by the above proposition. So, we define {\sqrt{R}} to be {K\sqrt{D}K^*}, where {\sqrt{D}} is defined by taking the square root of each entry on the diagonal. In fact, using this idea, we can define any real power of {R} by {R^p=KD^pK^*}. Note that {D^p} is also diagonal with positive diagonal entries; in particular, it is Hermitian. Clearly, a diagonal matrix with positive diagonal entries is positive definite, so {\sqrt{D}} is positive definite.
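The definition {R^p=KD^pK^*} translates directly into code. The sketch below (in NumPy; the helper name `matrix_power_pd` is ours) computes real powers of a positive definite Hermitian matrix and checks that {\sqrt{R}\sqrt{R}=R} and {R^{p}R^{q}=R^{p+q}}:

```python
import numpy as np

def matrix_power_pd(R, p):
    """Real power R^p = K D^p K^* of a positive definite Hermitian R."""
    d, K = np.linalg.eigh(R)          # d > 0 since R is positive definite
    return K @ np.diag(d ** p) @ K.conj().T

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
R = A @ A.conj().T                    # positive definite Hermitian

S = matrix_power_pd(R, 0.5)           # sqrt(R)
assert np.allclose(S @ S, R)          # sqrt(R)^2 = R
assert np.allclose(S, S.conj().T)     # sqrt(R) is Hermitian
assert np.allclose(matrix_power_pd(R, 0.3) @ matrix_power_pd(R, 0.7), R)
```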

We set {P=\sqrt{R}}. It is easy to check that {P} is positive definite:

\displaystyle \begin{array}{rcl} x^*Px &=& x^*K\sqrt{D}K^*x \\ &=& (K^*x)^*\sqrt{D}(K^*x) > 0 \end{array}

as {\sqrt{D}} is positive definite.

Finally, we let {U=P^{-1}A}. Of course, here {P} is invertible because it is a product of nonsingular matrices. Now, we just need to check that {U^*U=I}.

\displaystyle \begin{array}{rcl} U^*U &=& (P^{-1}A)^*(P^{-1}A) \\ &=& A^*(P^{-1})^*P^{-1}A \\ &=& A^*((K\sqrt{D}K^*)^{-1})^*(K\sqrt{D}K^*)^{-1}A \\ &=& A^*(K(\sqrt{D})^{-1}K^*)^*K(\sqrt{D})^{-1}K^*A \\ &=& A^*K(D^{-1/2})^{*}K^*KD^{-1/2}K^*A \\ &=& A^*K(D^{-1/2})^{*}(D^{-1/2})K^*A \\ &=& A^*K(D^{-1/2})D^{-1/2}K^*A \\ &=& A^*KD^{-1}K^*A \\ &=& A^*R^{-1}A \\ &=& A^*(AA^*)^{-1}A \\ &=& A^*(A^*)^{-1}A^{-1}A \\ &=& I \end{array}

which was to be shown. \Box
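The whole proof of Theorem 4 condenses into a few lines of code, which we can then test against a randomly generated invertible matrix (a NumPy sketch; for comparison, SciPy ships a production implementation as `scipy.linalg.polar`):

```python
import numpy as np

def polar(A):
    """Polar decomposition A = P U with P positive definite Hermitian
    and U unitary, following the proof: P = sqrt(A A^*), U = P^{-1} A."""
    d, K = np.linalg.eigh(A @ A.conj().T)
    P = K @ np.diag(np.sqrt(d)) @ K.conj().T
    U = np.linalg.solve(P, A)              # U = P^{-1} A
    return P, U

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
P, U = polar(A)

assert np.allclose(P @ U, A)                      # A = P U
assert np.allclose(P, P.conj().T)                 # P is Hermitian
assert np.all(np.linalg.eigvalsh(P) > 0)          # P is positive definite
assert np.allclose(U.conj().T @ U, np.eye(3))     # U is unitary
```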

Now, we will apply our knowledge to understand the topology of {SL(2,\mathbb C)}. Given {A\in SL(2,\mathbb C)}, the positive definite Hermitian part automatically satisfies {\det(P)=1}: indeed, {\det(P)>0}, {|\det(U)|=1} and {\det(P)\det(U)=\det(A)=1}. Hence, {\det(U)=1}; in other words, {U} is an element of {SU(2)}. Again, in our proof, we explained that one may take any real power of a positive definite Hermitian matrix. So we can define a path of matrices {A_t=P^tU}. We see that {A_0=U} and {A_1 = A}. This defines a deformation retract of {SL(2,\mathbb C)} onto {SU(2)}. It is easy to see that the space of {2\times 2} positive definite Hermitian matrices of determinant 1 is homeomorphic to {\mathbb R^3}. More concretely, to write down any such matrix, we need {a\in \mathbb R^+} and {b,c\in \mathbb R}. Setting {d = (b^2+c^2+1)/a},

\displaystyle \begin{pmatrix} a & b+ic \\ b-ic & d \end{pmatrix}

is positive definite Hermitian of determinant 1.

It is also not hard to check that the identity is the only matrix in {SU(2)} which is also positive definite Hermitian. Thus, {SL(2,\mathbb C)\cong SU(2)\times \mathbb R^3}. We leave it as an exercise to prove that {SU(2)\cong S^3}.
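The retraction can be watched numerically. In the sketch below (NumPy; the sample matrix is an arbitrary element of {SL(2,\mathbb C)}), we compute the polar factors of a determinant-one matrix and follow the path {A_t=P^tU} from {SU(2)} back to {A}, staying inside {SL(2,\mathbb C)} the whole way:

```python
import numpy as np

def polar(A):
    """P = sqrt(A A^*), U = P^{-1} A, as in the proof of Theorem 4."""
    d, K = np.linalg.eigh(A @ A.conj().T)
    P = K @ np.diag(np.sqrt(d)) @ K.conj().T
    return P, np.linalg.solve(P, A)

def pd_power(P, t):
    """P^t for positive definite Hermitian P."""
    d, K = np.linalg.eigh(P)
    return K @ np.diag(d ** t) @ K.conj().T

A = np.array([[1.0, 2.0j],
              [0.0, 1.0]])              # det(A) = 1, so A is in SL(2,C)
P, U = polar(A)

assert np.isclose(np.linalg.det(P), 1)  # det(P) = 1 automatically
assert np.isclose(np.linalg.det(U), 1)  # hence U lies in SU(2)
assert np.allclose(U.conj().T @ U, np.eye(2))

# The path A_t = P^t U joins U (t = 0) to A (t = 1) inside SL(2,C).
assert np.allclose(pd_power(P, 0.0) @ U, U)
assert np.allclose(pd_power(P, 1.0) @ U, A)
for t in (0.25, 0.5, 0.75):
    assert np.isclose(np.linalg.det(pd_power(P, t) @ U), 1)
```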

G-Structures

Let {M} be a smooth {n}-manifold and {p} a point in {M}. Consider the set {S_p} of all linear isomorphisms {L_p:T_pM\rightarrow {\mathbb R}^n} between the tangent space at {p} and {{\mathbb R}^n}. Note that there is a natural left action of {GL(n,{\mathbb R})} on {S_p}. Since this action may be seen as function composition, we denote it by {\circ}, though we will often drop the notation altogether when it is clear from context. The disjoint union {F = \sqcup_p S_p} is called the frame bundle of {M}. (Of course, {F} carries more structure than this, but we will not go into the details in this post.) The action of {GL(n,{\mathbb R})} on each {S_p} induces a natural action on {F}.

It is easy to define a bijection between any fiber of {F} and {GL(n,{\mathbb R})}. Using this bijection, we may define a group structure on the fiber which makes it isomorphic to {GL(n,{\mathbb R})}. In general, the bijection is not canonical, as we will see below. First, we fix an isomorphism {L_p} in {S_p}. Then, we send {K_p \in S_p} to {K_p \circ L_p^{-1} \in GL(n,{\mathbb R})}. Clearly, this map is injective, and {L_p} is sent to the identity matrix under this identification. Also, for any {N\in GL(n,{\mathbb R})}, {N \circ L_p \in S_p} and {N\circ L_p \circ L_p^{-1} = N}, i.e. the identification is onto. So, we have a bijection. As we can see, once an identity element {L_p} is fixed, the fiber {S_p} becomes a group isomorphic to {GL(n,{\mathbb R})}. In other words, {S_p} is a {GL(n,{\mathbb R})}-torsor.

Let {G} be a Lie subgroup of {GL(n,{\mathbb R})} and {P} be a subbundle of {F} whose fibers (which we still denote by {S_p}) are isomorphic to {G} in the above sense. Then, {P} is called a {G}-structure on {M}. Clearly, the frame bundle {F} is a {GL(n,{\mathbb R})}-structure on {M}.

Next, we discuss two examples of proper subbundles inducing various structures on {M}. In our first example, we take {G} to be the orthogonal group {O(n)}. Recall that the standard Euclidean metric {g_0} on {{\mathbb R}^n} is fixed by {O(n)}; in other words, for any {N\in O(n)}, {N^*g_0 =g_0}. We can use this property together with {P} to define a Riemannian metric on {M}. Let {p\in M}, {L_p\in S_p}, and define the metric {g_p} as the pullback {L_p^*(g_0)}. We need to show that this definition is independent of the choice of {L_p}. Let {K_p\in S_p}; then {K_p\circ L_p^{-1} \in O(n)} as {P} is an {O(n)}-structure. Hence, {K_p^{*}(g_0) = (K_p\circ L_p^{-1}\circ L_p)^* (g_0)= L_p^*\circ (K_p\circ L_p^{-1})^* (g_0) = L_p^* (g_0)}. So, we can choose any isomorphism in the fiber in order to define {g_p}, and we see that an {O(n)}-structure gives us a Riemannian metric. Next, we go the other way around: given a Riemannian metric {g}, we construct an {O(n)}-structure on {M}. Each tangent space {T_pM} is equipped with an inner product, and we consider {{\mathbb R}^n} equipped with the standard inner product. We define the fiber of {P} over {p} to be the set of linear isometries between {T_pM} and {{\mathbb R}^n}. Next, we need to check that a fiber is isomorphic to {O(n)}. Again, first, we fix an isomorphism {L_p}. Then, given another isometry {K_p} from {T_pM} to {{\mathbb R}^n}, {K_p\circ L^{-1}_p} is an isometry from {{\mathbb R}^n} to itself, i.e. {K_p \circ L^{-1}_p \in O(n)}. Also, for {N\in O(n)}, {N\circ L_p} is an isometry from {T_pM} to {{\mathbb R}^n} and {N \circ L_p \circ L_p^{-1} = N\in O(n)}. Hence, as above, we have an isomorphism. It is easy to see that this correspondence between {O(n)}-structures and Riemannian metrics is one-to-one.
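The frame-independence computation {K_p^*(g_0)=L_p^*(g_0)} can be checked concretely: in coordinates, the pullback of the Euclidean metric under a frame {L} is the Gram matrix {L^TL}, and replacing {L} by {NL} with {N} orthogonal leaves it unchanged. A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
L = rng.standard_normal((n, n))   # a frame L_p : T_pM -> R^n (invertible a.s.)

# Another frame in the same O(n)-fiber: K = N L with N orthogonal.
N, _ = np.linalg.qr(rng.standard_normal((n, n)))
K = N @ L

# Pullback metric in coordinates: g(v, w) = (L v).(L w), i.e. the matrix L^T L.
assert np.allclose(L.T @ L, K.T @ K)   # independent of the chosen frame
```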

This example can be generalized easily: any structure on {{\mathbb R}^n} which is fixed by a Lie subgroup {G} of {GL(n,{\mathbb R})} can be carried to a manifold which admits a {G}-structure. In fact, our second example will again be of this type. We consider the correspondence between an almost complex structure and a {GL(m,{\mathbb C})}-structure, where {n=2m}. Before we discuss the correspondence, let us clarify a few things. We view {GL(m,{\mathbb C})} as a subgroup of {GL(2m,{\mathbb R})} using the monomorphism

\displaystyle \begin{array}{rcl} N \mapsto \begin{pmatrix} Re(N) & -Im(N) \\ Im(N) & Re(N) \end{pmatrix} \end{array}

Let {J_0:{\mathbb C}^m\rightarrow {\mathbb C}^m} denote the action of {i} on {{\mathbb C}^m}. In other words, {J_0 = iI}, where {I} denotes the {m\times m} identity matrix. Or, using the monomorphism defined above,

\displaystyle \begin{array}{rcl} J_0 = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}, \end{array}

in matrix block form, as a real {n\times n} matrix. Of course, for any matrix {N \in GL(m,{\mathbb C})}, {J_0 N = iN=Ni=NJ_0}. Equivalently, we have {N^{-1}J_0N=J_0}. On the other hand, let {N = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in GL(n,{\mathbb R})}. Then,

\displaystyle \begin{array}{rcl} J_0 N &=& \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} \\ &=& \begin{pmatrix} -C & -D \\ A & B \end{pmatrix} \end{array}

and

\displaystyle \begin{array}{rcl} N J_0 &=& \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix} \\ &=& \begin{pmatrix} B & -A \\ D & -C \end{pmatrix}. \end{array}

Therefore, {NJ_0 = J_0N} if and only if {A=D} and {C = -B}. Hence, we see that a real matrix {N} can be identified with a complex matrix if and only if {NJ_0=J_0N}.
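The block computation above is easy to verify in code. The sketch below (NumPy; `to_real` is our name for the monomorphism) embeds a complex matrix as a real {2m\times 2m} matrix, checks that the image commutes with {J_0}, and checks that the embedding respects matrix multiplication:

```python
import numpy as np

def to_real(N):
    """The monomorphism GL(m,C) -> GL(2m,R): N -> [[Re N, -Im N], [Im N, Re N]]."""
    return np.block([[N.real, -N.imag], [N.imag, N.real]])

m = 2
Z, I = np.zeros((m, m)), np.eye(m)
J0 = np.block([[Z, -I], [I, Z]])

rng = np.random.default_rng(6)
M = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
N = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))

# The image of a complex matrix commutes with J0 ...
assert np.allclose(to_real(N) @ J0, J0 @ to_real(N))
# ... and the embedding is multiplicative.
assert np.allclose(to_real(M @ N), to_real(M) @ to_real(N))
```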

Now, we go back to the correspondence between a {GL(m,{\mathbb C})}-structure and an almost complex structure. First, let us assume that we have a {GL(m,{\mathbb C})}-structure and construct an almost complex structure {J:TM\rightarrow TM}. Let {L_p\in S_p}. Then, we define {J:T_pM\rightarrow T_pM} by {J = L_p^{-1} J_0 L_p}. Next, we show that {J} is well defined. Let {K_p\in S_p}. Since {K_pL_p^{-1} \in GL(m,{\mathbb C})}, it commutes with {J_0}, so {K_pL_p^{-1} J_0 L_pK_p^{-1} = J_0}. Hence,

\displaystyle \begin{array}{rcl} J &=& L_p^{-1}J_0L_p \\ &=& K_p^{-1}K_pL_p^{-1}J_0L_pK_p^{-1}K_p \\ &=& K_p^{-1}J_0K_p. \end{array}

Moreover, clearly, we have {J^2 = -I}. Thus, we have constructed an almost complex structure.

Next, we go in the other direction. Given an almost complex structure {J}, we form a subbundle {P} of {F} which consists of the linear isomorphisms {L_p} that satisfy {L_pJ=J_0L_p}. Take a basis of {T_pM} of the form {\left\{ e_1,\dots,e_m,Je_1,\dots,Je_m \right\}} and then define {L_p(e_i) = (0,\dots,0,1,0,\dots,0)} with {1} in the {i^{th}} position and {L_p(Je_i)=(0,\dots,0,1,0,\dots,0)} with {1} in the {(m+i)^{th}} position. It is easy to verify that {L_pJ=J_0L_p}; hence, the fibers are non-empty. Next, we want to show that the fibers are isomorphic to {GL(m,{\mathbb C})}. Let {K_p} be another isomorphism satisfying {K_pJ = J_0 K_p}. Then, {J = K_p^{-1}J_0K_p}. Thus, {L_p K_p^{-1}J_0K_p = J_0L_p}, or equivalently, {L_p K_p^{-1}J_0 = J_0 L_p K_p^{-1}}. Therefore, {L_pK_p^{-1}\in GL(m,{\mathbb C})} by the remarks above. Also, given {N \in GL(m,{\mathbb C})}, {NL_p J = NJ_0L_p = J_0NL_p}, that is, {NL_p} is also an element of {P}. So, the fibers are isomorphic to {GL(m,{\mathbb C})}.
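The construction of {L_p} from a basis {\left\{ e_i, Je_i \right\}} can be made concrete. In the sketch below (NumPy, with {m=1}; the matrix `S` is an arbitrary invertible matrix used only to manufacture a sample {J} conjugate to {J_0}), we build {L_p} by sending {e\mapsto(1,0)} and {Je\mapsto(0,1)} and verify {L_pJ=J_0L_p}:

```python
import numpy as np

J0 = np.array([[0.0, -1.0],
               [1.0,  0.0]])

# A sample almost complex structure on R^2: conjugate J0 by an
# arbitrary invertible matrix S, so that J^2 = -I still holds.
S = np.array([[2.0, 1.0],
              [1.0, 1.0]])
J = np.linalg.inv(S) @ J0 @ S
assert np.allclose(J @ J, -np.eye(2))

# Build L_p from the basis {e, Je}: L_p e = (1,0), L_p (Je) = (0,1).
e = np.array([1.0, 0.0])
B = np.column_stack([e, J @ e])        # change-of-basis matrix
L = np.linalg.inv(B)

assert np.allclose(L @ J, J0 @ L)      # L_p J = J0 L_p
```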

Our third example will be a {G_2}-structure. However, I want to have a more detailed discussion of {G_2}-structures before I present it in this context. So, I will include it in a future post.

Some Comments on Linear Complex Structures via an Example

Consider the real vector space {V} generated by {\partial_x} and {\partial_y}. There is an obvious identification {L:V\rightarrow\mathbb C} of {V} with the complex plane {\mathbb C} such that {L(\partial_x) = 1} and {L(\partial_y) = i}. Define a linear complex structure on {V} by setting {J(\partial_x) = \partial_y} and {J(\partial_y)=-\partial_x}. With the identification mentioned above, since {\mathbb C} is a complex vector space, {V} can be viewed as a complex vector space, too. Furthermore, the action of {J} can be viewed as multiplication by {i} on {V}, but we will see below why this viewpoint does not extend to the complexification.

Next, we complexify {V} by taking the tensor product with {\mathbb C} over {\mathbb R}. The (real) dimension of {V_{\mathbb C} = V\otimes \mathbb C} is {4}, and it is generated by {\partial_x\otimes 1, \partial_y \otimes 1, \partial_x \otimes i} and {\partial_y \otimes i}. We can view {V_{\mathbb C}} as a complex vector space and, for notational simplicity, write {v = v \otimes 1} and {iv = v \otimes i}. Note that over the complex numbers {V_{\mathbb C}} is {2}-dimensional and generated by {\partial_x} and {\partial_y}. However, this is not the “natural” basis to work with, as we will see. Next, we extend (complexify) {J:V\rightarrow V} to get {J_{\mathbb C}:V_{\mathbb C}\rightarrow V_{\mathbb C}}, which we will still denote by {J} for notational simplicity. Let {\partial_z = \frac{1}{2}(\partial_x - i \partial_y)} and {\partial_{\bar z} = \frac{1}{2}(\partial_x + i\partial_y)}. Now, we see that

\displaystyle \begin{array}{rcl} J(\partial_z) &=& \frac{1}{2}\left( J(\partial_x) -i J(\partial_y) \right) \\ &=& \frac{1}{2}\left( \partial_y +i \partial_x \right) \\ &=& i\frac{1}{2}\left(\partial_x - i \partial_y \right) \\ &=& i \partial_z \end{array}

and also,

\displaystyle \begin{array}{rcl} J(\partial_{\bar z}) &=& \frac{1}{2}\left( J(\partial_x) +i J(\partial_y) \right) \\ &=& \frac{1}{2}\left( \partial_y -i \partial_x \right) \\ &=& -i\frac{1}{2}\left(\partial_x + i \partial_y \right) \\ &=& -i \partial_{\bar z}. \end{array}

This means that {\partial_z} is an eigenvector of {J} corresponding to the eigenvalue {i}. Similarly, {\partial_{\bar z}} is an eigenvector corresponding to the eigenvalue {-i}. So, the set {\left\{ \partial_z, \partial_{\bar z} \right\}} is an eigenbasis for {J}, and it gives us an eigenspace decomposition of {V_{\mathbb C}}. Computing with {J} in this basis is clearly more convenient, and hence this is a “natural” choice of basis. Furthermore, from this viewpoint, it is also clear why the action of {J} can no longer be viewed as multiplication by {i}: on the {\partial_{\bar z}} eigenspace, {J} acts as multiplication by {-i}.
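The eigenvalue computation can be replayed numerically. In the basis {\left\{\partial_x,\partial_y\right\}} of {V_{\mathbb C}}, {J} is the matrix {\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}}, and the coordinate vectors of {\partial_z} and {\partial_{\bar z}} are eigenvectors for {i} and {-i} (a NumPy sketch):

```python
import numpy as np

# J in the basis {d_x, d_y}: J d_x = d_y, J d_y = -d_x.
J = np.array([[0.0, -1.0],
              [1.0,  0.0]], dtype=complex)

# Coordinates of d_z = (d_x - i d_y)/2 and d_zbar = (d_x + i d_y)/2.
dz    = np.array([0.5, -0.5j])
dzbar = np.array([0.5,  0.5j])

assert np.allclose(J @ dz, 1j * dz)         # J d_z    =  i d_z
assert np.allclose(J @ dzbar, -1j * dzbar)  # J d_zbar = -i d_zbar
assert np.allclose(J @ J, -np.eye(2))       # J^2 = -I, as required
```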