Polar Decomposition and Real Powers of a Symmetric Matrix

The main purpose of this post is to prove the polar decomposition theorem for invertible matrices. As an application, we extract some information about the topology of {SL(2,\mathbb C)}, namely that { SL(2,\mathbb C)\cong S^3\times \mathbb R^3}. Along the way, we recall a few facts and also define real powers of a positive definite Hermitian matrix. We assume that the matrices below are nonsingular, even though some of the results are true without this assumption. Throughout, we use the standard Hermitian inner product {\langle v,w \rangle = v^* w} on column vectors, so in particular {|v|^2 = v^*v}. We first prove that every Hermitian matrix has only real eigenvalues.

Proposition 1 Let {A} be a Hermitian matrix, i.e. {A^*=A}. Then {A} has only real eigenvalues.

Proof: Over the complex numbers, every polynomial of degree {n\geq 1} has {n} roots counted with multiplicity. Applying this to the characteristic polynomial, we see that {A} has eigenvalues. Let {\lambda} be an eigenvalue of {A} with an eigenvector {v\neq 0}. That is, {Av = \lambda v}. Taking the conjugate transpose of both sides, using {A^*=A}, and then multiplying by {v} on the right, we get

\displaystyle \begin{array}{rcl} v^*A^* &=& \lambda^* v^* \\ v^*A &=& \lambda^* v^* \\ v^*Av &=& \lambda^* v^*v \\ v^*\lambda v &=& \lambda^* v^*v \\ \lambda |v|^2 &=& \lambda^* |v|^2. \end{array}

Since {v\neq 0}, {|v|^2\neq 0} and we get {\lambda=\lambda^*} which is only possible if {\lambda} is real. \Box
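
As a quick sanity check of the proposition (a worked example, not part of the proof), consider a general {2\times 2} Hermitian matrix with diagonal entries {a,d\in\mathbb R} and off-diagonal entry {b\in\mathbb C}. The symbols {a,b,d} are introduced here only for illustration. Its characteristic polynomial and the discriminant of that polynomial are

\displaystyle \begin{array}{rcl} \det\begin{pmatrix} a-\lambda & b \\ b^* & d-\lambda \end{pmatrix} &=& \lambda^2-(a+d)\lambda+(ad-|b|^2), \\ (a+d)^2-4(ad-|b|^2) &=& (a-d)^2+4|b|^2 \geq 0. \end{array}

Since the discriminant is non-negative, both roots are real, as the proposition predicts.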

Recall that a matrix is called positive definite if {v^*Av>0} for all {v\neq 0}. Clearly, such a matrix must be nonsingular.

Proposition 2 If {A} is a positive definite matrix, then it has only positive real eigenvalues.

Proof: As above let {\lambda} be an eigenvalue of {A} with an eigenvector {v}. Then,

\displaystyle \begin{array}{rcl} Av &=& \lambda v \\ v^*Av &=& v^*\lambda v \\ 0 < v^*Av &=& \lambda v^*v \\ 0 < v^*Av &=& \lambda |v|^2 \end{array}

Thus, {\lambda} must be positive as well. \Box

If {A} is Hermitian, it has an eigenspace decomposition. Here is a sketch of the proof. We apply induction on the dimension {n}. The claim is clear for {n=1}, so we may assume {n>1}. Let {\lambda} and {v} be as above. We consider the one dimensional space {V} generated by {v} and its orthogonal complement {V^\perp}. By definition, for {w\in V^\perp}, {\langle w,v\rangle =0 }, or equivalently {w^*v=0}. Thus,

\displaystyle \begin{array}{rcl} 0 &=& \lambda w^*v \\ &=& w^*\lambda v \\ &=& w^*Av \\ &=& w^*A^*v \\ &=& (Aw)^*v \end{array}

or in a more familiar form {\langle Aw, v \rangle = 0}. This means that {A} preserves {V^\perp}, which has dimension {n-1}. So, by induction, {V^\perp} has an eigenspace decomposition and we are basically done. The reason that this argument is not very precise is that in this post, we are using a concrete definition of being Hermitian. So, we also have to argue that the matrix {A} is still Hermitian when restricted to {V^\perp}. Of course, using the abstract definition, this is trivial. In fact, it is pretty easy to translate every argument presented here into an abstract one.

Lemma 3 If {A} is Hermitian, then it is diagonalizable by a unitary matrix.

Proof: Since {A} has an eigenspace decomposition, we can choose a basis consisting of eigenvectors only. Furthermore, by the inductive construction above, each new eigenvector is orthogonal to the previous ones, so after rescaling we may take this basis to be orthonormal. Consider the matrix {U} that takes the standard basis to this eigenbasis. Then, it is clear that {U^{-1}AU} is a diagonal matrix. Since the columns of {U} are orthonormal, we also have {U^*U = I}. \Box

Next, we prove the polar decomposition for invertible matrices. In this proof, we also define the square root of a matrix.

Theorem 4 Given an invertible matrix {A}, there is a positive definite Hermitian matrix {P} and a unitary matrix {U} such that {A=PU}.

Proof: Let {R=AA^*}. Clearly, {R^*=R} i.e. {R} is Hermitian. Also, since {A^*} is invertible, {A^*v} is nonzero for nonzero {v}, thus

\displaystyle \begin{array}{rcl} 0<\langle A^*v,A^*v\rangle &=& (A^*v)^*(A^*v) \\ &=& v^*AA^*v \\ &=& v^*Rv. \end{array}

So, {R} is also positive definite. By the above lemma, as {R} is Hermitian, there is a unitary matrix {K} which diagonalizes {R} i.e. {K^{-1}RK=D}. Since {K} is unitary, {K^{-1}=K^*} and hence, {K^*RK=D}. Also, since {R} is positive definite, all the eigenvalues of {R} and hence of {D} are positive, by the above proposition. So, we define {\sqrt{R}} to be {K\sqrt{D}K^*} where {\sqrt{D}} is defined by taking square root of each entry on the diagonal. In fact, using this idea, we can define any power of {R} by {R^p=KD^pK^*}. Note that {D^p} is also diagonal with positive diagonal entries. Hence, in particular, it is Hermitian. Clearly, a diagonal matrix with positive diagonal entries is positive definite. So, {\sqrt{D}} is positive definite.
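
To see this definition in action, here is a small worked example. The particular matrix is chosen only for illustration. Take

\displaystyle \begin{array}{rcl} R = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad K = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad D = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}. \end{array}

The columns of {K} are unit eigenvectors of {R} for the eigenvalues {3} and {1}, so {K^*RK=D}, and

\displaystyle \begin{array}{rcl} \sqrt{R} &=& K\sqrt{D}K^* \\ &=& \frac{1}{2}\begin{pmatrix} \sqrt 3+1 & \sqrt 3-1 \\ \sqrt 3-1 & \sqrt 3+1 \end{pmatrix}. \end{array}

Squaring the last matrix gives {R} back: the diagonal entries are {\frac{1}{4}\left((\sqrt 3+1)^2+(\sqrt 3-1)^2\right)=2} and the off-diagonal entries are {\frac{1}{2}(\sqrt 3+1)(\sqrt 3-1)=1}.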

We set {P=\sqrt{R}}. It is easy to check that {P} is positive definite.

\displaystyle \begin{array}{rcl} x^*Px &=& x^*K\sqrt{D}K^*x \\ &=& (K^*x)^*\sqrt{D}(K^*x) > 0 \end{array}

as {\sqrt{D}} is positive definite.

Finally, we let {U=P^{-1}A}. Of course, here {P} is invertible because it is a product of nonsingular matrices. Now, we just need to check that {U^*U=I}.

\displaystyle \begin{array}{rcl} U^*U &=& (P^{-1}A)^*(P^{-1}A) \\ &=& A^*(P^{-1})^*P^{-1}A \\ &=& A^*((K\sqrt{D}K^*)^{-1})^*(K\sqrt{D}K^*)^{-1}A \\ &=& A^*(K(\sqrt{D})^{-1}K^*)^*K(\sqrt{D})^{-1}K^*A \\ &=& A^*K(D^{-1/2})^{*}K^*KD^{-1/2}K^*A \\ &=& A^*K(D^{-1/2})^{*}(D^{-1/2})K^*A \\ &=& A^*K(D^{-1/2})D^{-1/2}K^*A \\ &=& A^*KD^{-1}K^*A \\ &=& A^*R^{-1}A \\ &=& A^*(AA^*)^{-1}A \\ &=& A^*(A^*)^{-1}A^{-1}A \\ &=& I \end{array}

which was to be shown. \Box
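
The construction in the proof can be tried numerically. The following is a minimal sketch for {2\times 2} matrices only, using nothing beyond the Python standard library. The helper names (`matmul`, `dagger`, `inverse`, `hermitian_power`, `polar`) are made up for this illustration. Instead of forming {K} and {D} explicitly, it sums {\lambda^p vv^*} over unit eigenvectors {v}, which is the same matrix as {KD^pK^*}. It assumes the input to `hermitian_power` is positive definite Hermitian.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    """Conjugate transpose X^*."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def inverse(X):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def hermitian_power(R, p):
    """R^p = K D^p K^* for a positive definite Hermitian 2x2 matrix R.

    Written as a sum of lam**p * v v^* over unit eigenvectors v.
    """
    a, d, b = R[0][0].real, R[1][1].real, R[0][1]
    if b == 0:  # R is already diagonal
        return [[complex(a ** p), 0j], [0j, complex(d ** p)]]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    result = [[0j, 0j], [0j, 0j]]
    for lam in (mean + radius, mean - radius):
        v = (b, lam - a)  # eigenvector of R for the eigenvalue lam
        norm2 = abs(v[0]) ** 2 + abs(v[1]) ** 2
        for i in range(2):
            for j in range(2):
                result[i][j] += lam ** p * v[i] * v[j].conjugate() / norm2
    return result

def polar(A):
    """Return (P, U) with A = P U, where P = sqrt(A A^*) and U = P^{-1} A."""
    R = matmul(A, dagger(A))
    P = hermitian_power(R, 0.5)
    U = matmul(inverse(P), A)
    return P, U
```

For an invertible input such as `[[1 + 2j, 3], [0, 1j]]`, one can check up to rounding error that `matmul(dagger(U), U)` is the identity and that `matmul(P, U)` recovers the input. Calling `hermitian_power(P, t)` for other real `t` gives the real powers {P^t} defined in the proof.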

Now, we will apply our knowledge to understand the topology of {SL(2,\mathbb C)}. Given {A\in SL(2,\mathbb C)}, the positive definite Hermitian part {P=\sqrt{AA^*}} from our proof satisfies {\det(P)^2=\det(AA^*)=|\det(A)|^2=1} and {\det(P)>0}, so {\det(P)=1}. Hence, {\det(U)=1}, in other words, {U} is an element of {SU(2)}. Again, in our proof, we have explained that we may take any real power of a positive definite Hermitian matrix. So we can define a path of matrices {A_t = P^tU} for {t\in[0,1]}. We see that {A_0=U} and {A_1 = A}. This defines a deformation retraction of {SL(2,\mathbb C)} onto {SU(2)}. It is easy to see that the space of {2\times 2} positive definite Hermitian matrices of determinant 1 is homeomorphic to {\mathbb R^3}. More concretely, to write down any such matrix, we need {a\in \mathbb R^+} and {b,c\in \mathbb R}. Also, we set {d = (b^2+c^2+1)/a}. Then,

\begin{pmatrix} a & b+ic \\ b-ic & d \end{pmatrix}

is positive definite Hermitian of determinant 1.
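
For completeness, here is the short computation behind this claim, written out as a sketch. The determinant is

\displaystyle \begin{array}{rcl} ad-(b+ic)(b-ic) &=& (b^2+c^2+1)-(b^2+c^2) \\ &=& 1. \end{array}

The matrix is Hermitian, so its two eigenvalues are real by Proposition 1. Their product is the determinant {1} and their sum is the trace {a+d>0}, so both eigenvalues are positive. Diagonalizing by a unitary matrix as in Lemma 3 then shows that the matrix is positive definite.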

It is also not very hard to check that the identity matrix is the only matrix in {SU(2)} which is also positive definite Hermitian of determinant 1. Thus, {SL(2,\mathbb C)\cong SU(2)\times \mathbb R^3}. We leave it as an exercise to prove that {SU(2)\cong S^3}.

G-Structures 2

In this post, we briefly introduce the Lie group {G_2}, {G_2}-structures on a manifold and a {G_2}-manifold. Let us denote the three form {dx^i\wedge dx^j \wedge dx^k} on {{\mathbb R}^7} by {dx^{ijk}}. We set {\phi_0 = dx^{123}+ dx^{145}-dx^{167}+dx^{246}+dx^{257}+dx^{347}-dx^{356}}. This three form is non-degenerate in the sense that whenever we have two linearly independent vectors in {{\mathbb R}^7}, we can find a third vector such that the evaluation of {\phi_0} on these vectors is non-zero. We define {G_2 = \left\{ M\in GL(7,{\mathbb R}) \big| M^*\phi_0 = \phi_0 \right\}}. One may prove that {G_2} is a {14}-dimensional Lie subgroup of {SO(7)}.

Let us give a different description of {\phi_0}, so that it does not look completely arbitrary. The highest dimension in which one may define a cross product is {7}. After we identify {{\mathbb R}^8} with the octonions {\mathbb O} equipped with some octonion product, for any two imaginary octonions {x,y \in {\mathbb R}^7 \cong im(\mathbb O)} we define the cross product to be

\displaystyle \begin{array}{rcl} x \times y = \frac{1}{2} [x,y] = \frac{1}{2}(xy-yx). \end{array}

Then, we may define the {3}-form on {{\mathbb R}^7} by {\phi_0(x,y,z) = \left< x \times y, z\right>} where the inner product is the standard inner product. Of course, there is a choice of the octonion product and hence, {\phi_0} may be different from the one we explicitly wrote above. However, we show that they agree for a suitable choice of the octonion product. To do so, we first prove that {\left<x\times y, z\right>} is indeed a {3}-form and then evaluate it on the basis elements to see how the octonion product should be defined.

Using {im(xy)=-im(yx)}, we obtain {x\times y = im(xy)} and thus, the above definition is equivalent to

\displaystyle \begin{array}{rcl} \phi_0(x,y,z) &=& \left< xy, z\right>. \end{array}

To prove that {\phi_0} is alternating, it is enough to prove {\phi_0(x,x,y)=0, \phi_0(x,y,x)=0} and {\phi_0(y,x,x)=0} as we may replace {x} by {x+z} to get the desired equalities. However, also note that {x \times y = - y\times x}. Therefore, the first two equalities are enough. It is clear that {x\times x = 0}. Hence, we have the first equality. Furthermore,

\displaystyle \begin{array}{rcl} \phi_0(x,y,x) &=& \left< xy, x\right> \\ &=& |x|^2\left< y,1\right> \\ &=& 0. \end{array}

Thus, we have shown that {\phi_0} is alternating.

Our next goal is to define the octonion product. Clearly, from the explicit definition, we want {\phi_0(x_1,x_2,x_3)=1}. In other words, {\left<x_1 x_2, x_3\right> = 1}. So, a natural choice for the product {x_1x_2} is {x_3}. Similarly, we can choose {x_1x_4=x_5}, {x_1x_6=-x_7}, {x_2x_4=x_6}, {x_2x_5=x_7}, {x_3x_4=x_7} and {x_3x_5=-x_6}. Of course, as we are describing octonion multiplication, we should also define the multiplication with the {8}th generator but it is the generator of {Re(\mathbb O)={\mathbb R}} part. So, it is just the trivial multiplication i.e. the multiplication coming from the vector space structure. We do not show that this indeed defines an octonion product.

Next, we need to show that the two descriptions of {\phi_0} agree and to do that, it is enough to evaluate on the basis elements. It is an easy computation which we omit.

Note that this definition makes an earlier claim more plausible, namely that {\phi_0} is non-degenerate. Indeed, {\phi_0(x,y,x\times y) = \left<x\times y,x\times y\right>}, so we only need to show that {x\times y} is non-zero for linearly independent {x} and {y}. However, that is a built-in property of a cross product.
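
The non-degeneracy remark can be explored with a few lines of code. The sketch below goes in the opposite direction from the text: it starts from the explicit coefficients of {\phi_0} and recovers the cross product through {\left<x\times y,z\right>=\phi_0(x,y,z)}, without building the octonion product. All function names are made up for this illustration. With integer vectors the arithmetic is exact, so one can test the identity {|x\times y|^2=|x|^2|y|^2-\left<x,y\right>^2}, whose right hand side is positive exactly when {x} and {y} are linearly independent.

```python
from itertools import permutations

# Signed index triples read off from the explicit formula for phi_0
# (1-based indices, as in the text).
TERMS = [((1, 2, 3), 1), ((1, 4, 5), 1), ((1, 6, 7), -1), ((2, 4, 6), 1),
         ((2, 5, 7), 1), ((3, 4, 7), 1), ((3, 5, 6), -1)]

def permutation_sign(perm):
    """Sign of a permutation of (0, 1, 2), computed by counting inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def build_phi():
    """phi_0 as a dict {(i, j, k): coefficient}, 0-based and antisymmetrized."""
    phi = {}
    for triple, sign in TERMS:
        for perm in permutations(range(3)):
            key = tuple(triple[p] - 1 for p in perm)
            phi[key] = sign * permutation_sign(perm)
    return phi

PHI = build_phi()

def phi0(x, y, z):
    """Evaluate the three form phi_0 on three vectors of R^7."""
    return sum(c * x[i] * y[j] * z[k] for (i, j, k), c in PHI.items())

def cross(x, y):
    """Cross product defined by <cross(x, y), z> = phi_0(x, y, z)."""
    out = [0] * 7
    for (i, j, k), c in PHI.items():
        out[k] += c * x[i] * y[j]
    return out

def dot(x, y):
    """Standard inner product on R^7."""
    return sum(a * b for a, b in zip(x, y))
```

For instance, with `x = [1, 0, 0, 0, 0, 1, 0]` and `y = [0, 1, 0, 0, 1, 0, 0]`, the vector `cross(x, y)` is twice the third basis vector, and `phi0(x, y, cross(x, y))` equals `dot(cross(x, y), cross(x, y))`, in line with the argument above.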

A {G_2}-structure on a manifold {M} can be defined as a subbundle of the frame bundle of {M} whose fibers are isomorphic to {G_2}. However, there is an equivalent, more convenient definition. In fact, this definition will follow the scheme of the previous post. More explicitly, since {G_2} fixes {\phi_0} on {{\mathbb R}^7}, we may pull it back to each space {T_pM} to have a three form {\phi} on the manifold and similarly, if we have such a three form on the manifold, then we may find a subbundle of the frame bundle whose fibers are {G_2}. So, having a three form {\phi} on {M} such that for any point {p\in M}, {\phi_p} and {\phi_0} can be identified by an isomorphism between {{\mathbb R}^7} and {T_pM}, means that we can find a {G_2}-structure on {M}. By an abuse of notation, we call {(M,\phi)} a manifold with a {G_2}-structure. Furthermore, since {G_2} is a subgroup of {SO(7)}, it also fixes the standard metric and orientation on {{\mathbb R}^7}, giving rise to a Riemannian metric and orientation on the manifold. This immediately implies that a non-orientable manifold does not admit a {G_2}-structure.

Next, we introduce {G_2}-manifolds. Given a manifold {M} with a {G_2}-structure {\phi} and the induced metric {g}, let {\nabla} be the Levi-Civita connection on {(M,g)}. If {\nabla \phi =0}, we call {\phi} a torsion free {G_2}-structure. A manifold with a torsion free {G_2}-structure is called a {G_2}-manifold. In fact, there are a number of ways to define {G_2}-manifolds, as we can see in the following proposition.

Proposition 1 Let {(M^7,\phi)} be a manifold with a {G_2}-structure, with the induced metric {g} and the Levi-Civita connection {\nabla}. Then, the following are equivalent:

  1. {\nabla \phi = 0}
  2. {Hol(g) \subseteq G_2}
  3. {d\phi = 0} and {d^*\phi = 0}.

If any one of the conditions of the proposition holds (and hence, all), we call {M} a {G_2}-manifold. The first example of a metric with {G_2} holonomy is given by Bryant. The metric in his example is incomplete. Later, Bryant and Salamon constructed complete metrics with {G_2} holonomy on non-compact manifolds. Then, Joyce constructed complete examples on compact manifolds.


Let {M} be a smooth {n}-manifold and {p} be a point in {M}. Consider the set { S_p} of all linear isomorphisms {L_p:T_pM\rightarrow {\mathbb R}^n} between the tangent space at {p} and {{\mathbb R}^n}. Note that there is a natural left action of {GL(n,{\mathbb R})} on {S_p}. Since this action may be seen as a function composition, we denote the action by {\circ}. Though, we will quite often drop the notation altogether hoping that it is clear. The disjoint union {F = \sqcup_p S_p} is called the frame bundle of {M}. (Of course, we need to have more conditions on {F} but we will not go into details in this post.) The action of {GL(n,{\mathbb R})} on {S_p} induces a natural action on {F}.

It is easy to define a bijection between any fiber of {F} and {GL(n,{\mathbb R})}. Using the bijection, we may define a group multiplication on the fiber which will, in turn, make the fiber isomorphic to {GL(n,{\mathbb R})}, trivially. In general, this bijection is not canonical as we will see below. First, we fix an isomorphism {L_p} in {S_p}. Then, we send {K_p \in S_p} to {K_p \circ L_p^{-1} \in GL(n,{\mathbb R})}. Clearly, this map is injective and {L_p} is sent to the identity matrix under this identification. Also, for any {N\in GL(n,{\mathbb R})}, {N \circ L_p \in S_p} and {N\circ L_p \circ L_p^{-1} = N} i.e. the identification is onto. So, we have a bijection. As we can see, once an identity element {L_p} is fixed, the fiber {S_p} becomes a group isomorphic to {GL(n,{\mathbb R})}. In other words, {S_p} is a {GL(n,{\mathbb R})}-torsor.

Let {G} be a Lie subgroup of {GL(n,{\mathbb R})} and {P} be a subbundle of {F} whose fibers (which we still denote by {S_p}) are isomorphic to {G} in the above sense. Then, {P} is called a {G}-structure on {M}. Clearly, the frame bundle {F} is a {GL(n,{\mathbb R})}-structure on {M}.

Next, we discuss two examples of proper subbundles inducing various structures on {M}. In our first example, we consider {G} to be the orthogonal group {O(n)}. Recall that the standard Euclidean metric {g_0} on {{\mathbb R}^n} is fixed by {O(n)}. In other words, for any {N\in O(n)}, {N^*g_0 =g_0}. We can use this property together with {P} to define a Riemannian metric on {M}. Let {p\in M}, {L_p\in S_p} and define the metric {g_p} as the pullback {L_p^*(g_0)}. We need to show that this definition is independent of the choice of {L_p}. Let {K_p\in S_p}, then {K_p\circ L_p^{-1} \in O(n)} as {P} is an {O(n)}-structure. Hence, {K_p^{*}(g_0) = (K_p\circ L_p^{-1}\circ L_p)^* (g_0)= L_p^*\circ (K_p\circ L_p^{-1})^* (g_0) = L_p^* (g_0)}. So, we can choose any isomorphism in the fiber in order to define {g_p}, and we see that an {O(n)}-structure gives us a Riemannian metric.

Next, we will go the other way around i.e. given a Riemannian metric {g}, we construct an {O(n)}-structure on {M}. Each tangent space {T_pM} is equipped with an inner product and we consider {{\mathbb R}^n} equipped with the standard inner product. We define a fiber of {P} to be the set of linear isometries between {T_pM} and {{\mathbb R}^n}. Next, we need to check that a fiber is isomorphic to {O(n)}. Again, first, we fix an isometry {L_p}. Then, given another isometry {K_p} from {T_pM} to {{\mathbb R}^n}, {K_p\circ L^{-1}_p} is an isometry from {{\mathbb R}^n} to itself i.e. {K_p \circ L^{-1}_p \in O(n)}. Also, for {N\in O(n)}, {N\circ L_p} is an isometry from {T_pM} to {{\mathbb R}^n} and {N \circ L_p \circ L_p^{-1} = N\in O(n)}. Hence, as above, we have an isomorphism. It is easy to see that this correspondence between {O(n)}-structures and Riemannian metrics is one to one.
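
The independence of {g_p} from the chosen frame is easy to test in coordinates. In the sketch below, a frame {L_p} is represented by an invertible matrix `L` with respect to some basis of {T_pM}, so the pullback metric {L_p^*(g_0)} has Gram matrix `L^T L`. The matrices `L`, `N` and `S` are arbitrary choices made for illustration, and exact rational arithmetic is used so that equality can be tested directly.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*X)]

def pullback_metric(L):
    """Gram matrix of L^* g_0, i.e. g(v, w) = <Lv, Lw> = v^T (L^T L) w."""
    return matmul(transpose(L), L)

# A frame T_pM -> R^2, written in some chosen basis of T_pM.
L = [[F(2), F(1)], [F(0), F(3)]]

# A rotation with rational entries, hence an element of O(2).
N = [[F(3, 5), F(-4, 5)], [F(4, 5), F(3, 5)]]

# A shear, which is invertible but not orthogonal.
S = [[F(1), F(1)], [F(0), F(1)]]

K = matmul(N, L)  # another frame in the same O(2)-fiber as L
same_metric = pullback_metric(K) == pullback_metric(L)
shear_changes_metric = pullback_metric(matmul(S, L)) != pullback_metric(L)
```

Here `same_metric` reflects the computation {K_p^*(g_0)=L_p^*(g_0)} above, while `shear_changes_metric` illustrates why the fiber has to be restricted to an {O(n)}-orbit for the metric to be well defined.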

This example can be generalized easily. Any structure on {{\mathbb R}^n} which is fixed by a Lie subgroup {G} of {GL(n,{\mathbb R})} can be carried to a manifold which admits a {G}-structure. In fact, our second example will be of this type, again. We consider the correspondence between an almost complex structure and a {GL(m,{\mathbb C})}-structure where {n=2m}. Before we discuss the correspondence, let us clarify a few things. We view {GL(m,{\mathbb C})} as a subgroup of {GL(2m,{\mathbb R})} using the monomorphism

\displaystyle \begin{array}{rcl} N \mapsto \begin{pmatrix} Re(N) & -Im(N) \\ Im(N) & Re(N) \end{pmatrix} \end{array}

Let {J_0:{\mathbb C}^m\rightarrow {\mathbb C}^m} denote the action of {i} on {{\mathbb C}^m}. In other words, {J_0 = iI} where {I} denotes the {m\times m} identity matrix. Or, using the monomorphism defined above,

\displaystyle \begin{array}{rcl} J_0 = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}, \end{array}

in matrix block form, as a real {n\times n} matrix. Of course, for any matrix {N \in GL(m,{\mathbb C})}, {J_0 N = iN=Ni=NJ_0}. Equivalently, we have {N^{-1}J_0N=J_0}. On the other hand, let {N = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in GL(n,{\mathbb R})}. Then,

\displaystyle \begin{array}{rcl} J_0 N &=& \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} \\ &=& \begin{pmatrix} -C & -D \\ A & B \end{pmatrix} \end{array}


\displaystyle \begin{array}{rcl} N J_0 &=& \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix} \\ &=& \begin{pmatrix} B & -A \\ D & -C \end{pmatrix}. \end{array}

Therefore, {NJ_0 = J_0N} if and only if {A=D} and {C = -B}. Hence, we see that a real matrix {N} can be identified with a complex matrix if and only if {NJ_0=J_0N}.
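
The block computation above can be checked on examples. The following sketch implements the monomorphism {N \mapsto \begin{pmatrix} Re(N) & -Im(N) \\ Im(N) & Re(N) \end{pmatrix}} for {m=2}. The names `realify` and `commutes_with_J0` are made up for this illustration, and the matrices have small integer entries so that floating point arithmetic is exact.

```python
def matmul(X, Y):
    """Product of two matrices given as nested lists (real or complex entries)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def realify(N):
    """Send a complex m x m matrix to the real 2m x 2m block matrix
    [[Re N, -Im N], [Im N, Re N]]."""
    top = [[z.real for z in row] + [-z.imag for z in row] for row in N]
    bottom = [[z.imag for z in row] + [z.real for z in row] for row in N]
    return top + bottom

# J_0 is the image of i * I; for m = 2 this is the 4 x 4 matrix [[0, -I], [I, 0]].
J0 = realify([[1j, 0], [0, 1j]])

def commutes_with_J0(M):
    """Check whether the real 4 x 4 matrix M satisfies M J_0 = J_0 M."""
    return matmul(M, J0) == matmul(J0, M)

N1 = [[1 + 2j, 3j], [2, 1 - 1j]]
N2 = [[0, 1], [1j, 2 + 1j]]

# A real matrix with A != D in block form, so it is not in the image of realify.
M = [[1, 2, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

With these definitions, `commutes_with_J0(realify(N1))` holds while `commutes_with_J0(M)` does not, and `realify(matmul(N1, N2))` agrees with `matmul(realify(N1), realify(N2))`, as expected from a homomorphism.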

Now, we go back to the correspondence between a {GL(m,{\mathbb C})}-structure and an almost complex structure. First, let us assume that we have a {GL(m,{\mathbb C})}-structure and construct an almost complex structure {J:TM\rightarrow TM}. Let {L_p\in S_p}. Then, we define {J:T_pM\rightarrow T_pM} by {J = L_p^{-1} J_0 L_p}. Next, we show that {J} is well defined. Let {K_p\in S_p}. Since {L_p K_p^{-1} \in GL(m,{\mathbb C})}, we have {K_pL_p^{-1} J_0 L_pK_p^{-1} = J_0}. Hence,

\displaystyle \begin{array}{rcl} J &=& L_p^{-1}J_0L_p \\ &=& K_p^{-1}K_pL_p^{-1}J_0L_pK_p^{-1}K_p \\ &=& K_p^{-1}J_0K_p. \end{array}

Moreover, clearly, we have {J^2 = -I}. Thus, we have constructed an almost complex structure.
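
Here is a small coordinate check of this construction for {m=1}, again only as a sketch. A frame is represented by an invertible {2\times 2} rational matrix and the element {3+2i} of {GL(1,{\mathbb C})} by its real {2\times 2} form. The specific matrices and the name `induced_J` are assumptions made for this example.

```python
from fractions import Fraction as F

J0 = [[F(0), F(-1)], [F(1), F(0)]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(X):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def induced_J(L):
    """J = L^{-1} J_0 L, the almost complex structure read off from the frame L."""
    return matmul(matmul(inverse(L), J0), L)

L = [[F(2), F(1)], [F(1), F(1)]]      # a frame, written in some basis of T_pM
N = [[F(3), F(-2)], [F(2), F(3)]]     # real form of the complex number 3 + 2i
S = [[F(1), F(1)], [F(0), F(1)]]      # a shear, which does not commute with J_0

K = matmul(N, L)                      # another frame in the same fiber as L
J = induced_J(L)
J_squared = matmul(J, J)
```

One finds that `induced_J(K)` equals `J` and that `J_squared` is minus the identity, while replacing `N` by the shear `S` changes the resulting `J`.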

Next, we go in the other direction. Given an almost complex structure {J}, we form a subbundle {P} of {F} which consists of linear isomorphisms {L_p} that satisfy {L_pJ=J_0L_p}. Take a basis of {T_pM} of the form {\left\{ e_1,\dots,e_m,Je_1,\dots,Je_m \right\}} and then define {L_p(e_i) = (0,\dots,0,1,0,\dots,0)} with {1} in the {i^{th}} position and {L_p(Je_i)=(0,\dots,0,1,0,\dots,0)} with {1} in the {(m+i)^{th}} position. It is easy to verify that {L_pJ=J_0L_p}. Hence, the subbundle is non-empty. Next, we want to show that the fibers are isomorphic to {GL(m,{\mathbb C})}. Let {K_p} be another isomorphism satisfying {K_pJ = J_0 K_p}. Then, {J = K_p^{-1}J_0K_p}. Thus, {L_p K_p^{-1}J_0K_p = J_0L_p}, or equivalently, {L_p K_p^{-1}J_0 = J_0 L_p K_p^{-1}}. Therefore, {L_pK_p^{-1}\in GL(m,{\mathbb C})} by the above remarks. Also, given {N \in GL(m,{\mathbb C})}, {NL_p J = NJ_0L_p = J_0NL_p}, that is, {NL_p} is also an element of {P}. So, the fibers are isomorphic to {GL(m,{\mathbb C})}.

Our third example will be a {G_2}-structure. However, I want to have a more detailed discussion of {G_2}-structures before I present it in this context. So, I will include it in a future post.

Gronwall Inequality

There are a number of different statements of Gronwall’s inequality. In this post, we will consider only one of them, perhaps the weakest of all.

Proposition 1 Let {f(t)} be a non-negative continuous function on {\left[ a,b \right]} such that there are positive constants {C} and {K} satisfying

\displaystyle \begin{array}{rcl} f(t)\le C + K\int_{a}^{t}f(s)ds \end{array}

for all {t\in\left[ a,b \right]}. Then,

\displaystyle \begin{array}{rcl} f(t)\le Ce^{K(t-a)} \end{array}

for all {t \in \left[ a,b \right]}.

Proof: Define {U(t) = C + K\int_{a}^{t}f(s)ds}. Note that, by definition, {f(t)\le U(t)} and {U} is a strictly positive differentiable function. Also, we have {U'(t) = Kf(t)\le KU(t)}. In other words, {\frac{U'(t)}{U(t)}\le K}, which means the relative rate of change of {U} is at most {K}. Hence, {U} grows no faster than an exponential function with relative rate of change {K}. That is, {U(t) \le U(a) e^{K(t-a)}} (if you did not like this reasoning, you may integrate both sides of the previous inequality from {a} to {t}). So, we have the desired result {f(t) \le U(t) \le U(a)e^{K(t-a)}= Ce^{K(t-a)}}. \Box
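
For readers who prefer the integration route mentioned in the proof, here is that step written out. Since {U} is strictly positive, {\frac{U'}{U}} is the derivative of {\ln U}, so

\displaystyle \begin{array}{rcl} \ln U(t) - \ln U(a) &=& \int_{a}^{t}\frac{U'(s)}{U(s)}ds \\ &\le& \int_{a}^{t}K\,ds \\ &=& K(t-a). \end{array}

Exponentiating both sides gives {U(t)\le U(a)e^{K(t-a)}}, and {U(a)=C} by the definition of {U}.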


Some Comments on Linear Complex Structures via an Example

Consider a real vector space {V} generated by {\partial_ x} and {\partial_ y}. There is an obvious identification {L:V\rightarrow\mathbb C} of {V} with the complex plane {\mathbb C} such that {L(\partial_ x) = 1} and {L(\partial_ y) = i}. Define a linear complex structure on {V} by setting {J(\partial_ x) = \partial_ y} and {J(\partial_ y)=-\partial_ x}. With the identification mentioned above, since {\mathbb C} is a complex vector space, {V} can be viewed as a complex vector space, too. Furthermore, the action of {J} can be viewed as multiplication by {i} on {V} but we will see below why this view does not extend further.

Next, we complexify {V} by taking a tensor product with {\mathbb C} over {\mathbb R}. We know that the (real) dimension of {V_{\mathbb C} = V\otimes \mathbb C} is {4} and it is generated by {\partial_ x\otimes 1, \partial_ y \otimes 1, \partial_ x \otimes i} and {\partial_ y \otimes i}. We can view {V_{\mathbb C}} as a complex vector space and, for notational simplicity, write {v = v \otimes 1} and {iv = v \otimes i}. Note that over the complex numbers {V_{\mathbb C}} is {2} dimensional and generated by {\partial_ x} and {\partial_ y}. However, this is not the “natural” basis to work with, as we will see. Next, we extend (complexify) {J:V\rightarrow V} to get {J_{\mathbb C}:V_{\mathbb C}\rightarrow V_{\mathbb C}} which we will still denote by {J} for notational simplicity. Let {\partial_ z = \frac{1}{2}(\partial_ x - i \partial_ y)} and {\partial_ {\bar z} = \frac{1}{2}(\partial_ x + i\partial_ y)}. Now, we see that

\displaystyle \begin{array}{rcl} J(\partial_ z) &=& \frac{1}{2}\left( J(\partial_ x) -i J(\partial_ y) \right) \\ &=& \frac{1}{2}\left( \partial_ y +i \partial_ x \right) \\ &=& i\frac{1}{2}\left(\partial_ x - i \partial_ y \right) \\ &=& i \partial_ z \end{array}

and also,

\displaystyle \begin{array}{rcl} J(\partial_ {\bar z}) &=& \frac{1}{2}\left( J(\partial_ x) +i J(\partial_ y) \right) \\ &=& \frac{1}{2}\left( \partial_ y -i \partial_ x \right) \\ &=& -i\frac{1}{2}\left(\partial_ x + i \partial_ y \right) \\ &=& -i \partial_ {\bar z}. \end{array}

This means that {\partial_ z} is an eigenvector of {J} corresponding to the eigenvalue {i}. Similarly, {\partial_ {\bar z}} is an eigenvector corresponding to the eigenvalue {-i}. So, the set {\left\{ \partial_ z, \partial_ {\bar z} \right\}} is an eigenbasis for {J} and it gives us an eigenspace decomposition of {V_{\mathbb C}}. Computing {J}, using this basis, is clearly more convenient and hence, this is a “natural” choice as a basis. Furthermore, from this viewpoint, it is also clear why the action of {J} cannot be viewed as multiplication by {i} any more.
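
The two eigenvector computations can be replayed in a few lines. In this sketch, vectors of {V_{\mathbb C}} are written as pairs of complex coordinates with respect to {\partial_ x} and {\partial_ y}. The variable and function names are chosen only for this illustration.

```python
# Matrix of J in the basis (d/dx, d/dy): its columns are the coordinates of
# J(d/dx) = d/dy and J(d/dy) = -d/dx.
J = [[0, -1], [1, 0]]

def apply(M, v):
    """Apply the 2x2 matrix M to the coordinate pair v."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def scale(c, v):
    """Multiply the coordinate pair v by the complex scalar c."""
    return [c * v[0], c * v[1]]

d_z = [0.5, -0.5j]     # (d/dx - i d/dy) / 2
d_zbar = [0.5, 0.5j]   # (d/dx + i d/dy) / 2

is_i_eigenvector = apply(J, d_z) == scale(1j, d_z)
is_minus_i_eigenvector = apply(J, d_zbar) == scale(-1j, d_zbar)
```

Both flags come out true, matching the two displayed computations. Since the same {J} acts as {i} on one vector and as {-i} on the other, it cannot be multiplication by {i} on all of {V_{\mathbb C}}.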

Perception of Difficulty: Change Colors

Disclaimer: this post is opinion based.

A few years ago, I tried to write some JS code to draw fractals. However, I was not able to come up with an original and beautiful result. So, I modified the code a little and made it draw a bunch of lines with random colors (and repeat that every second). You can view it here.

Back then, I did not see anything particularly nice about it but after a couple of years, when I look at it again, it makes me think about our perception of difficulty. Let me emphasize that it draws exactly the same lines at every run, it is just the colors that are random (unless, of course, I have made an error.) So, in theory, no matter which set of colors you start with, if you diligently trace the lines, you can understand the pattern of the lines. Here is a sample picture:

[image1]

I do not know about you but the above picture does not give me a lot of hints about its pattern other than a possible point of symmetry. So, I would say that it is hard to recognize a pattern, if it exists at all.

However, if you change the colors a little bit, a spiral in the middle becomes clearly visible.

[image2]

Tweaking a little bit more, we also get

[image3]

which makes it more clear that these lines are polygonal chains approximating some circle-like path. Well, it is an approximation of a spiral.

As you saw, it became a lot easier to notice a pattern once you changed the way you look at the lines. A quite similar situation happens often in my daily life: I spend many hours working on a seemingly hard mathematical problem, only to realize that I have been using the wrong “colors”. Of course, in general, you do not know which “colors” to start with but if you find something challenging, it is often rewarding to change “colors”.